When you use Sidekiq to handle asynchronous jobs, sometimes there are exceptions and failing jobs. And I say sometimes because your environment is probably perfect: there is no lag, every service your jobs depend on is always up and responsive, and you probably write better code than most other developers 😛 otherwise it probably happens quite often…
But Sidekiq will retry that job for you, and of course this is configurable:
# Retry 5 times before giving up
sidekiq_options retry: 5

# Don't retry at all
sidekiq_options retry: false
The default number of retries is 25, spread over roughly three weeks of exponential backoff, and with that many attempts most transient problems will probably fix themselves.
But sometimes not even being that insistent is enough to get past the problem, and in that case Sidekiq will send your job to the DeadSet, where all the job ghosts live.
For most applications, it is good to know when a job ends up in that DeadSet, meaning Sidekiq will not retry it anymore.
In this situation you can log the error, send an email so a human can fix the issue by hand, or handle it in whatever way works best for your business.
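As a side note, even after a job dies you can still reach it through Sidekiq's Ruby API and retry it by hand, for example from a Rails console. Here is a minimal sketch (HardWorker is just a placeholder class name):

require 'sidekiq/api'

dead_set = Sidekiq::DeadSet.new
puts "#{dead_set.size} jobs in the DeadSet"

# Retry every dead job that belonged to a specific worker class
dead_set.each do |job|
  job.retry if job.klass == 'HardWorker'
end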
To intercept this, Sidekiq provides the sidekiq_retries_exhausted hook, which you can configure per worker class as below:
class ImportantWorker
  include Sidekiq::Worker
  sidekiq_options retry: 5

  sidekiq_retries_exhausted do |msg, exception|
    # example using Rails' logger
    Rails.logger.warn("Failed #{msg['class']} with #{msg['args']}: #{msg['error_message']} (#{exception.class})")
  end

  def perform(important_arguments)
    # do some work
  end
end
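Logging is the simplest option, but if you want the "send an email for a human" approach mentioned above, the same hook can do it. In the sketch below, AdminMailer and its dead_job_alert action are hypothetical; use whatever mailer or notifier your application already has:

sidekiq_retries_exhausted do |msg, exception|
  # AdminMailer and dead_job_alert are hypothetical; any notification works here
  AdminMailer.dead_job_alert(
    job_class: msg['class'],
    args: msg['args'],
    error: msg['error_message']
  ).deliver_later
end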
Or you can configure a global handler for the entire application by adding a death handler to Sidekiq, like below:
Sidekiq.configure_server do |config|
  # other config stuff...
  config.death_handlers << ->(job, ex) do
    Rails.logger.error "Surprise, an error! #{job['class']} #{job['jid']} just died with error #{ex.message}."
  end
end
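A detail worth knowing when trying this out: retry: false discards the job on failure without ever touching the DeadSet, while retry: 0 skips the retries but still sends the job to the DeadSet and fires the death handlers. So a quick way to exercise the handler locally is a throwaway worker like this (DoomedWorker is just an example name):

class DoomedWorker
  include Sidekiq::Worker
  # no retries, but the job still goes to the DeadSet on failure,
  # so the death handler above gets called
  sidekiq_options retry: 0

  def perform
    raise 'boom'
  end
end

DoomedWorker.perform_async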
I usually add this code to the config/initializers/sidekiq.rb file, where all the Sidekiq-related configuration lives together.
Of course, just logging like this will not solve your problem, but intercepting the death of a job will allow you to write more robust asynchronous job processing applications.
Please add a comment below if you have any other issues with asynchronous jobs that I could help with.