In some applications, a job may raise certain classes of errors that the
developer wants to retry forever. These are most likely infrastructure
problems that the developer knows will be resolved eventually but may
take a variable amount of time, or cases where application business
logic temporarily blocks the job from executing, such as a resource the
job needs being locked for a long time.
While an arbitrarily large number of attempts could previously be passed,
that does not express the intent: sometimes the developer simply needs
the job to be retried until it eventually succeeds. Without this,
developers would need additional code to handle the case where the job
exhausts its attempts limit and has to be re-enqueued manually.
As with many things, this should be used with caution and only for errors
that the developer knows will eventually be resolved, allowing the job
to continue.
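
The difference between a large attempts limit and retrying without a limit can be sketched in plain Ruby. This is a simplified model, not Active Job's implementation: `TransientError` and `run_with_retries` are hypothetical names, and `:unlimited` is used as the "retry until success" marker, mirroring `retry_on SomeError, attempts: :unlimited`.

```ruby
# Hypothetical error class standing in for a temporary infrastructure failure.
class TransientError < StandardError; end

# Runs the block until it succeeds. `attempts` is either an Integer limit
# or :unlimited, in which case the block is retried until it stops raising.
def run_with_retries(attempts:)
  executions = 0
  begin
    executions += 1
    yield executions
  rescue TransientError
    retry if attempts == :unlimited || executions < attempts
    raise # limit reached: the error propagates to the caller
  end
end

# The resource stays "locked" for the first nine executions.
outcome = run_with_retries(attempts: :unlimited) do |execution|
  raise TransientError, "resource locked" if execution < 10
  "succeeded on execution #{execution}"
end
puts outcome
```

With an Integer limit the same block would raise once the limit is reached, which is the situation that otherwise requires manual re-enqueueing.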
[Daniel Morton + Rafael Mendonça França]
There is presently no clean way of telling a caller of `perform_later`
the reason why a job failed to enqueue. When the job is enqueued
successfully, the job object itself is returned, but when the job
cannot be enqueued, only `false` is returned. This does not allow callers
to distinguish between classes of failures.
One important class of failures is when the job backend experiences a
network partition when communicating with its underlying datastore. It
is entirely possible for that network partition to recover and as such,
code attempting to enqueue a job may wish to take action to reenqueue
that job after a brief delay. This is distinct from the class of
failures where, due to a business rule defined in a callback in the
application, a job fails to enqueue and should not be retried.
This PR changes the following:
- Allows a block to be passed to the `perform_later` method. After the
  `enqueue` method is executed, but before the result is returned, the
  job will be yielded to the block. This allows the code invoking the
  `perform_later` method to inspect the job object, even in failure
  scenarios.
- Adds an exception `EnqueueError` which job adapters can raise if they
  detect a problem specific to their underlying implementation or
  infrastructure during the enqueue process.
- Adds two properties to the job base class: `successfully_enqueued` and
  `enqueue_error`. `enqueue_error` will be populated by the `enqueue`
  method if it rescues an `EnqueueError` raised by the job backend.
  `successfully_enqueued` will be true if the job is not rejected by
  callbacks and does not cause the job backend to raise an
  `EnqueueError` and will be `false` otherwise.
This will allow developers to do something like the following:

    MyJob.perform_later do |job|
      unless job.successfully_enqueued?
        if job.enqueue_error&.message == "Redis was unavailable"
          # invoke some code that will retry the job after a delay
        end
      end
    end
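
How the two properties get populated can be modelled with a few plain Ruby classes. This is only a sketch under stated assumptions: `FlakyAdapter` and `EnqueueSketchJob` are hypothetical stand-ins, `perform_later` is an instance method here for brevity, and the local `EnqueueError` merely imitates the exception described above.

```ruby
# Local stand-in for the exception that adapters may raise.
class EnqueueError < StandardError; end

# Hypothetical adapter whose backing datastore is unreachable.
class FlakyAdapter
  def enqueue(_job)
    raise EnqueueError, "Redis was unavailable"
  end
end

# Hypothetical job class modelling only the enqueue bookkeeping.
class EnqueueSketchJob
  attr_reader :enqueue_error

  def initialize(adapter)
    @adapter = adapter
    @successfully_enqueued = false
  end

  def successfully_enqueued?
    @successfully_enqueued
  end

  # Enqueues, records the outcome, yields the job to the caller's block,
  # then returns the job on success or false on failure.
  def perform_later
    begin
      @adapter.enqueue(self)
      @successfully_enqueued = true
    rescue EnqueueError => error
      @enqueue_error = error
    end
    yield self if block_given?
    successfully_enqueued? ? self : false
  end
end

EnqueueSketchJob.new(FlakyAdapter.new).perform_later do |job|
  puts job.enqueue_error.message unless job.successfully_enqueued?
end
```

Because the block runs before the return value is produced, the caller can inspect the failure even though the call itself still returns `false`.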
Before this commit, only StandardError exceptions could be handled by
rescue_from handlers.
This changes the rescue clause to catch all Exception objects, allowing
rescue handlers to be defined for Exception classes not inheriting from
StandardError.
This means that handlers registered for exception classes outside the
StandardError hierarchy may now rescue exceptions that were not
rescued before this change.
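
The underlying Ruby behaviour can be shown without Active Job: a bare `rescue` only matches `StandardError` and its subclasses, while `rescue Exception` matches everything. `NonStandardFailure` and the two helper methods are hypothetical names used for illustration.

```ruby
# Inherits directly from Exception, so it is NOT a StandardError.
class NonStandardFailure < Exception; end

# A bare `rescue` is equivalent to `rescue StandardError`.
def handled_by_bare_rescue?
  yield
  false
rescue
  true
end

# Rescuing Exception matches every exception class.
def handled_by_rescue_exception?
  yield
  false
rescue Exception
  true
end

puts handled_by_rescue_exception? { raise NonStandardFailure }

begin
  handled_by_bare_rescue? { raise NonStandardFailure }
rescue NonStandardFailure
  puts "the bare rescue did not catch it"
end
```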
Co-authored-by: Adrianna Chang <adrianna.chang@shopify.com>
The implementation of `instrument` in `ActiveJob::Instrumentation` was
not keeping the API of `ActiveSupport::Notifications.instrument` and
returning the value of the block.
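
A minimal sketch of this kind of bug, assuming a wrapper that does some bookkeeping after calling the caller's block. `NotifierSketch`, `lossy_instrument` and `faithful_instrument` are hypothetical names, not the actual Active Job code.

```ruby
# Hypothetical notifications backend that returns its block's value.
module NotifierSketch
  def self.instrument(_name, payload = {})
    yield payload
  end
end

# The bookkeeping line is the last expression, so its result is returned
# instead of the caller's value.
def lossy_instrument(name, &block)
  NotifierSketch.instrument(name) do |payload|
    block.call
    payload[:finished] = true
  end
end

# Captures the caller's value and makes it the last expression.
def faithful_instrument(name, &block)
  NotifierSketch.instrument(name) do |payload|
    value = block.call
    payload[:finished] = true
    value
  end
end

p lossy_instrument("perform") { 42 }
p faithful_instrument("perform") { 42 }
```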
Fixes #40931.
Before #34953, when using the `:async` Active Job queue adapter, jobs
enqueued in `db/seeds.rb`, such as Active Storage analysis jobs, would
cause a hang (see #34939). Therefore, #34953 changed all jobs enqueued
in `db/seeds.rb` to use the `:inline` queue adapter instead. (This
behavior was later limited to only take effect when the `:async` adapter
was configured, see #35905.) However, inline jobs in `db/seeds.rb`
cleared `CurrentAttributes` values (see #37526). Therefore, #37568
changed the `:inline` adapter to wrap each job in its own thread, for
isolation. However, wrapping a job in its own thread affects which
database connection it uses. Thus inline jobs can no longer execute
within the calling thread's database transaction, including seeing any
uncommitted changes. Additionally, if the calling thread is not wrapped
with the executor, the inline job thread (which is wrapped with the
executor) can deadlock on the load interlock. And when testing (with
`connection_pool.lock_thread = true`), the inline job thread can
deadlock on one of the locks added by #28083.
Therefore, this commit reverts the solutions of #34953 and #37568, and
instead wraps evaluation of `db/seeds.rb` with the executor. This
eliminates the original hang from #34939, which was also due to running
multiple threads and not wrapping all of them with the executor. And,
because nested calls to `executor.wrap` are ignored, any inline jobs in
`db/seeds.rb` will not clear `CurrentAttributes` values.
Alternative fix for #34939.
Reverts #34953.
Reverts #35905.
Partially reverts #35896.
Alternative fix for #37526.
Reverts #37568.
Fixes #40552.
- ### Problem
If we use `perform_enqueued_jobs` without a block,
a job that raises an error is not appended to
the list of `performed_jobs`.
### Solution
Push the job onto the array before it is actually performed.
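
The ordering issue can be sketched with a tiny stand-in adapter. `RecordingAdapterSketch` is a hypothetical class and jobs are modelled as plain lambdas; only the position of the push relative to the call matters.

```ruby
# Hypothetical adapter that records jobs it has been asked to perform.
class RecordingAdapterSketch
  attr_reader :performed_jobs

  def initialize
    @performed_jobs = []
  end

  def perform(job)
    performed_jobs << job # record first, so a raising job is still listed
    job.call
  end
end

adapter = RecordingAdapterSketch.new
begin
  adapter.perform(-> { raise "boom" })
rescue RuntimeError
  # the job failed, but it has already been recorded
end
puts adapter.performed_jobs.size
```

Had the push come after `job.call`, the exception would have skipped it and the list would stay empty.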
- ### Problem
Given the example below, the test adapter retries the job
indefinitely:
```ruby
class BuggyJob < ActiveJob::Base
  retry_on(Exception, attempts: 2)

  def perform
    raise "error"
  end
end

BuggyJob.perform_later
perform_enqueued_jobs
```
The problem is that when the job gets retried, the
`exception_executions` variable is not serialized/deserialized,
so Active Job cannot determine how many times the job has been
retried.
The solution in this PR is to deserialize the whole job in the test
adapter, and reserialize it before retrying.
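
Why a lost counter leads to endless retries can be modelled as follows. This is a sketch, not the test adapter's code: `RoundTripJob`, `run_until_limit`, the `"Exception"` key and the `cap` safety limit are all assumptions made for illustration.

```ruby
# Hypothetical stand-in for a job; only the retry counter is modelled.
class RoundTripJob
  attr_accessor :exception_executions

  def initialize
    @exception_executions = {}
  end

  def serialize(include_counters:)
    data = { "job_class" => self.class.name }
    data["exception_executions"] = exception_executions.dup if include_counters
    data
  end

  def self.deserialize(data)
    job = new
    job.exception_executions = data.fetch("exception_executions", {}).dup
    job
  end
end

# Simulates a job that fails on every run. Returns the run on which the
# attempts limit is reached, or :never if that does not happen in `cap` runs.
def run_until_limit(include_counters:, attempts: 2, cap: 10)
  payload = RoundTripJob.new.serialize(include_counters: include_counters)
  cap.times do |index|
    job = RoundTripJob.deserialize(payload)
    count = job.exception_executions.fetch("Exception", 0) + 1
    job.exception_executions["Exception"] = count
    return index + 1 if count >= attempts
    payload = job.serialize(include_counters: include_counters)
  end
  :never
end

p run_until_limit(include_counters: true)
p run_until_limit(include_counters: false)
```

When the counter survives the round trip, the limit is reached on the second run; when it is dropped, every run starts from zero.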
Fix #38391
* Add failing ActiveJob exceptions test for "disable retry jitter"
Thanks to @kaspth for the starting point.
* Update ActiveJob retry jitter to correctly use zero value
* Simplify "disable retry jitter" test
We don't need to repeat this many times. Fewer is shorter.
* Refactor determine_delay with jitter
* Fix indentation
* Close the curtains and give JITTER_DEFAULT some privacy
* Use .zero? instead of == to check jitter value
* Add ActiveJob test for explicit zero jitter
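
A rough sketch of the idea behind these commits, assuming a sentinel default so that an explicit zero can be told apart from "no value given". `JitterSketch`, `CONFIGURED_JITTER` and the simplified `determine_delay` signature are assumptions, not the actual Active Job method.

```ruby
module JitterSketch
  # Sentinel meaning "the caller did not pass a jitter value".
  JITTER_DEFAULT = Object.new
  private_constant :JITTER_DEFAULT

  # Assumed stand-in for an application-wide default jitter.
  CONFIGURED_JITTER = 0.15

  def self.determine_delay(base_delay, jitter: JITTER_DEFAULT)
    jitter = jitter.equal?(JITTER_DEFAULT) ? CONFIGURED_JITTER : (jitter || 0.0)
    return base_delay.to_f if jitter.zero? # explicit zero disables jitter
    base_delay + Kernel.rand * base_delay * jitter
  end
end

p JitterSketch.determine_delay(10, jitter: 0)
p JitterSketch.determine_delay(10, jitter: 0.0)
```

Using `zero?` treats the Integer `0` and the Float `0.0` the same way, and the private sentinel keeps callers from depending on it.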
Co-authored-by: Kasper Timm Hansen <hey@kaspth.com>
Co-authored-by: Cliff Pruitt <cliff.pruitt@cliffpruitt.com>
- ### Problem
```ruby
class MyJob < ApplicationJob
  before_enqueue { throw(:abort) }

  after_enqueue do
    # enters here
  end
end
```
I find Active Job's behaviour around the after_enqueue and after_perform
callbacks weird, as they run even when the callback chain is halted.
It's counterintuitive to run the after_enqueue callbacks even
though the job wasn't even enqueued.
### Solution
In Rails 6.2, I propose to make the new behaviour the default
and stop running after callbacks when the chain is halted.
Applications that want this behaviour now or in 6.1 can opt in
by adding `config.active_job.skip_after_callbacks_if_terminated = true`
to their configuration file.
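
The effect of the flag can be modelled with `catch`/`throw` in plain Ruby. `CallbackChainSketch` is a hypothetical class that only imitates the halting behaviour; it is not how Active Job's callbacks are implemented.

```ruby
# Hypothetical model of an enqueue callback chain.
class CallbackChainSketch
  attr_reader :log

  def initialize(skip_after_callbacks_if_terminated:)
    @skip_after = skip_after_callbacks_if_terminated
    @log = []
  end

  # `before` is a callable standing in for the before_enqueue callbacks.
  def enqueue(before:)
    completed = catch(:abort) do
      before.call
      @log << :enqueued
      true
    end
    # catch returns nil when :abort was thrown, so `completed` is falsy then.
    @log << :after_enqueue unless !completed && @skip_after
    @log
  end
end

halting = -> { throw(:abort) }
p CallbackChainSketch.new(skip_after_callbacks_if_terminated: false).enqueue(before: halting)
p CallbackChainSketch.new(skip_after_callbacks_if_terminated: true).enqueue(before: halting)
```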
- ### Problem
ActiveJob will always log "Enqueued MyJob (Job ID) ..." even
if the job doesn't get enqueued through the adapter.
The same problem happens when performing a job: "Performed MyJob
(Job ID) ..." is logged even when the job wasn't performed at all.
This can happen either when the callback chain is terminated
(a before_enqueue callback throwing `:abort`) or when an exception is raised.
### Solution
Check whether the callback chain was aborted or an exception was raised,
and log accordingly.

    class SensitiveJob < ApplicationJob
      self.log_arguments = false

      def perform(my_sensitive_argument)
      end
    end

When dealing with sensitive arguments such as passwords and tokens, it is
now possible to configure the job to not put the sensitive arguments
in the logs.
Closes #34438.
Previously, tests extending ActiveJob::TestCase always used the test
adapter, even in runs that were supposedly exercising different
adapters. As a consequence, some bugs visible only with certain adapters
might have gone undetected. This commit changes that, skipping queue
adapters for which we can't test scheduling jobs with a delay.
Also, make tests and examples for individual execution counters
clearer, as it wasn't entirely clear what would happen in this case:
```
retry_on CustomException, OtherException, attempts: 3
```
The job would be retried at most 3 times in total, for both
CustomException and OtherException. To have the job retry 3 times at
most for each exception individually, the following retry_on
declarations are necessary:
```
retry_on CustomException, attempts: 3
retry_on OtherException, attempts: 3
```
* Keep executions for each specific declaration
Fixes #34337
ActiveJob used the global executions counter to control the number of
times a job should be retried. The problem with this approach was that
if a job raised different exceptions across its executions, they
weren't retried the number of times defined by their `attempts` option.
**Example:**
Having the following job:
```ruby
class BuggyJob < ActiveJob::Base
  retry_on CustomException, attempts: 3
  retry_on OtherException, attempts: 3
end
```
If the job raised `CustomException` in the first two executions and then
it raised `OtherException`, the job wasn't retried anymore because the
global executions counter was already indicating 3 attempts.
With this patch each `retry_on` declaration has its specific counter so
that the first two executions that raise `CustomException` don't affect
the retries count that future exceptions may have.
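
The per-declaration counting can be modelled in a few lines. `RetryCounterSketch` is a hypothetical class; the exception classes are defined locally so the sketch runs on its own, and the counting rule (retry while the declaration's own count is below its limit) is a simplification of the behaviour described above.

```ruby
class CustomException < StandardError; end
class OtherException < StandardError; end

# Hypothetical model: one counter per declared exception class.
class RetryCounterSketch
  def initialize(limits)
    @limits = limits # e.g. { CustomException => 3, OtherException => 3 }
    @exception_executions = Hash.new(0)
  end

  # Returns true when the job should be retried after `error`.
  def retry?(error)
    declared = @limits.keys.find { |klass| error.is_a?(klass) }
    return false unless declared

    @exception_executions[declared] += 1
    @exception_executions[declared] < @limits[declared]
  end
end

counters = RetryCounterSketch.new(CustomException => 3, OtherException => 3)
p counters.retry?(CustomException.new) # first CustomException
p counters.retry?(CustomException.new) # second CustomException
p counters.retry?(OtherException.new)  # first OtherException, counted separately
```

With a single global counter, the third failure would already count as the third attempt and the job would not be retried.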
* Revert "clarifies documentation around the attempts arugment to retry_on"
This reverts commit 86aa8f8c5631f77ed9a208e5107003c01512133e.
Currently, the execution count is incremented after the arguments are
deserialized. Therefore, if an error occurs during deserialization, the
job is retried indefinitely.
To prevent this, the increment is moved before deserialization.
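
The ordering problem can be sketched as follows. `ExecutionSketch`, `DeserializationError` and the `count_first:` switch are hypothetical; they only illustrate why a counter incremented after a failing step never advances.

```ruby
# Hypothetical error raised when arguments cannot be deserialized.
class DeserializationError < StandardError; end

# Hypothetical job whose arguments always fail to deserialize.
class ExecutionSketch
  attr_reader :executions

  def initialize
    @executions = 0
  end

  def perform_now(count_first:)
    @executions += 1 if count_first
    deserialize_arguments
    @executions += 1 unless count_first # never reached when deserialization raises
  end

  private

  def deserialize_arguments
    raise DeserializationError, "record no longer exists"
  end
end

job = ExecutionSketch.new
3.times do
  begin
    job.perform_now(count_first: true)
  rescue DeserializationError
    # a retry mechanism would re-run the job here
  end
end
p job.executions
```

With `count_first: false` the counter would stay at zero after the same three runs, so a limit based on it could never be reached.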
Fixes #33344.
Record the time zone in effect when the job was enqueued and restore
it when the job is executed, in the same way that the current locale
is recorded and restored.
Active Job logging instrumentation is changed to log errors (with
backtrace) when a job raises an exception in #perform. This improves
debugging during development and test with the default configuration.
Prior to Rails 5, the default development configuration ran jobs with
InlineAdapter, which would raise exceptions to the caller and be
shown in the development log. In Rails 5, the default adapter was
changed to AsyncAdapter, which would silently swallow exceptions
and log a "Performed SomeJob from Async..." info message. This could
be confusing to a developer, as it would seem that the job was
performed successfully.
This patch removes the "Performed..." info message from the log
and adds an error-level "Error performing SomeJob..." log message
which includes the exception backtrace for jobs that raise an
exception within the #perform method. It provides this behavior for
all adapters.
`ActiveJob::TestHelper` replaces the adapter with the test adapter in
`before_setup`. It finds the target classes using `descendants`, but if
the job class under test is not loaded yet, its adapter is not
replaced.
Therefore, instead of replacing it in `before_setup`, the adapter is now
replaced when it is set.
Fixes#26360