
job-retry mechanism broken in version 7.0.0 #4960

@Brend-Smits

Description

Yesterday we upgraded from version 6.9.0 to 7.0.0 and started noticing an increase in job start time. After investigating, we found that the job-retry mechanism is no longer functioning. Here are some screenshots:

Image Image

You can clearly see that the invocations and logs stopped at 14:00 UTC, which is when we did the update.

Investigation

We're not sure yet what caused it and are still trying to figure it out. Any help is appreciated.

Edit 09:46:

  • Version 6.9.0 calls publishRetryMessage(payload) in the scale-up function.
  • Version 7.0.0 no longer calls publishRetryMessage(payload) in the scale-up function; it is not called anywhere except in tests.
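
To make the difference concrete, here is a minimal sketch of the kind of check that would catch this regression. It is not the project's actual code: the JobMessage shape, both scaleUp variants, and countRetryPublishes are hypothetical names made up for illustration, and everything is synchronous for brevity. Only the publishRetryMessage(payload) call in the scale-up function comes from the findings above.

```typescript
// Hypothetical message shape; the issue only names publishRetryMessage(payload).
interface JobMessage {
  id: number;
  retryCounter: number;
}

type Publisher = (payload: JobMessage) => void;
type ScaleUp = (payloads: JobMessage[], publishRetryMessage: Publisher) => number[];

// Sketch of the 6.9.0 behaviour: every handled payload is also
// passed to the retry publisher.
const scaleUpWithRetry: ScaleUp = (payloads, publishRetryMessage) => {
  const handled: number[] = [];
  for (const payload of payloads) {
    handled.push(payload.id); // stand-in for runner creation (assumption)
    publishRetryMessage(payload);
  }
  return handled;
};

// Sketch of the 7.0.0 regression: same loop, but the retry call is gone.
const scaleUpWithoutRetry: ScaleUp = (payloads, _publishRetryMessage) => {
  const handled: number[] = [];
  for (const payload of payloads) {
    handled.push(payload.id); // stand-in for runner creation (assumption)
  }
  return handled;
};

// Count how often a scale-up implementation invokes the retry publisher.
function countRetryPublishes(scaleUp: ScaleUp, payloads: JobMessage[]): number {
  let calls = 0;
  scaleUp(payloads, () => {
    calls += 1;
  });
  return calls;
}

const samplePayloads: JobMessage[] = [
  { id: 1, retryCounter: 0 },
  { id: 2, retryCounter: 0 },
];

console.log(countRetryPublishes(scaleUpWithRetry, samplePayloads)); // 2
console.log(countRetryPublishes(scaleUpWithoutRetry, samplePayloads)); // 0
```

A test along these lines, run against the real scale-up function with a mocked publisher, would presumably have failed on 7.0.0, since the call only survives in tests.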

Edit 10:06:

  • Testing a fix locally; we will open a PR for it if it works properly.
  • Confirmed the fix is working as expected:
Image

Metadata

Labels: bug (Something isn't working)
