
A LiveView dashboard that shows how many background jobs are still processing can fall victim to a subtle race condition that leaves the count stuck off by one until the page reloads. Here's how I ran into it and how Oban's telemetry system solved it cleanly.

The Setup

The app has an Oban worker — ProcessExternalLinkWorker — that fetches a URL, extracts content, and creates a post. The LiveView index page shows a "X post(s) currently being processed" banner while jobs are in flight.

The job count is a straightforward Oban query:

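# Count this worker's jobs that have not reached a terminal state.
# "cancelled" is terminal too, so it is excluded alongside "completed" and "discarded".
from(j in Job,
  where: j.state not in ["completed", "discarded", "cancelled"] and j.worker == ^worker
)
|> Repo.aggregate(:count, :id)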

The LiveView subscribes to a "posts" PubSub topic and refreshes this count whenever a "post_updated" message arrives. That message is broadcast from inside Posts.create_post/1, which is called from within the worker.

The Bug

Here's the execution sequence that causes the problem:

1. Oban picks up a job and its state transitions to executing
2. The worker calls ProcessExternalLink.process_url/1
3. That calls Posts.create_post/1, which broadcasts "post_updated"
4. The LiveView receives the broadcast and re-queries the job count
5. The job is still executing because it hasn't returned :ok yet
6. The count includes this job, showing 1 item "still processing" even though the work is done
7. The worker returns :ok and Oban marks the job completed, but no one tells the LiveView

The post list refreshes correctly, but the processing counter stays at 1 until the next page load.
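The sequence above can be replayed in a few lines of plain Elixir. This is only a simulation, not the app's code: the RaceSketch module and its map of job states are hypothetical stand-ins for the Oban jobs table, and the count uses the same "not in a terminal state" rule as the query.

```elixir
defmodule RaceSketch do
  # Same terminal states the post's query excludes.
  @terminal ["completed", "discarded"]

  # Stand-in for the Ecto query: count jobs not in a terminal state.
  def count_processing(jobs) do
    Enum.count(jobs, fn {_id, state} -> state not in @terminal end)
  end

  def run do
    # Oban has picked up job 1.
    jobs = %{1 => "executing"}

    # "post_updated" arrives mid-job and the LiveView re-queries here.
    count_on_post_updated = count_processing(jobs)

    # Only afterwards does the worker return :ok and the job get marked completed.
    jobs = Map.put(jobs, 1, "completed")

    # What a refresh triggered after completion would see.
    count_after_completion = count_processing(jobs)

    {count_on_post_updated, count_after_completion}
  end
end

IO.inspect(RaceSketch.run())
# prints {1, 0}
```

The first number is what the banner keeps showing, because nothing prompts a second query once the job has actually completed.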

Subtracting 1 from the count isn't a fix — with multiple concurrent jobs, you'd need to know exactly how many are in this "just finished broadcasting but not yet completed" state, which is unknowable from the outside.

The Fix: Oban Telemetry

Oban emits telemetry events throughout the job lifecycle. The key one here is [:oban, :job, :stop], which fires after the job state has been updated to completed in the database. There's also [:oban, :job, :exception] for failed jobs.

The fix is to decouple the job count refresh from the "post_updated" broadcast. Instead, attach a telemetry handler that broadcasts a separate "jobs_updated" message when a job finishes:

defmodule MyWebApp.ObanTelemetryHandler do
  require Logger

  def attach do
    :telemetry.detach("oban-job-lifecycle")

    :telemetry.attach_many(
      "oban-job-lifecycle",
      [[:oban, :job, :stop], [:oban, :job, :exception]],
      &__MODULE__.handle_event/4,
      nil
    )
  end

  def handle_event([:oban, :job, event], _measurements, meta, _config) do
    Logger.debug("Oban job #{event}: worker=#{meta[:worker]}")
    Phoenix.PubSub.broadcast(MyWebApp.PubSub, "posts", %{event: "jobs_updated"})
  end
end

Two design decisions worth noting:

detach before attach: Calling attach_many with a handler ID that's already registered doesn't attach anything; it returns {:error, :already_exists}. A fresh VM starts with no handlers, but if the application is stopped and started again inside the same VM (from IEx or a test, say), the second call to attach/0 would hit that error. Calling detach first makes attach/0 idempotent at the cost of one no-op call.

No worker filter: An earlier version filtered on the worker name in the handler's pattern match. That's redundant — the Ecto query in the LiveView already scopes the count to the specific worker. Removing the filter keeps the handler simpler and avoids fragility around how Oban formats the worker name in telemetry metadata.
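The detach-then-attach behaviour from the first point can be checked in isolation. The script below is a sketch under a few assumptions: it needs network access so Mix.install can fetch :telemetry, and the AttachSketch module and the "attach-sketch" handler ID are made up for the example. The return values in the comments are the ones documented for :telemetry.attach_many/4 and :telemetry.detach/1.

```elixir
# Assumption: run as a standalone .exs script with network access.
Mix.install([{:telemetry, "~> 1.0"}])

defmodule AttachSketch do
  # A do-nothing handler; only the attach/detach return values matter here.
  def handle_event(_event, _measurements, _meta, _config), do: :ok

  def attach(id) do
    :telemetry.attach_many(
      id,
      [[:oban, :job, :stop], [:oban, :job, :exception]],
      &__MODULE__.handle_event/4,
      nil
    )
  end
end

id = "attach-sketch"

first = AttachSketch.attach(id)    # :ok
second = AttachSketch.attach(id)   # {:error, :already_exists}
detached = :telemetry.detach(id)   # :ok
third = AttachSketch.attach(id)    # :ok again, because the ID was freed

IO.inspect({first, second, detached, third})
```

Detaching an ID that was never attached returns {:error, :not_found} rather than failing, which is why the unconditional detach at the top of attach/0 is safe on a first start.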

Call attach/0 in application.ex after the supervisor starts:

{:ok, pid} = Supervisor.start_link(children, opts)
MyWebApp.ObanTelemetryHandler.attach()
# start/2 still has to return the supervisor tuple
{:ok, pid}

Then handle the new event in the LiveView, separate from "post_updated":

def handle_info(%{event: "jobs_updated"}, socket) do
  worker = inspect(MyWebApp.Workers.ProcessExternalLinkWorker)

  num_processing =
    from(j in Job,
      where: j.state not in ["completed", "discarded", "cancelled"] and j.worker == ^worker
    )
    |> Repo.aggregate(:count, :id)

  {:noreply, assign(socket, :num_processing, num_processing)}
end

Why This Works

The "post_updated" broadcast still fires mid-job and the post list still refreshes correctly — that part was never broken. But the job count is now only refreshed in response to the telemetry event, which is guaranteed to fire after the state change has been committed. The LiveView queries at the right moment.

It also handles failure correctly. If the worker raises an exception, [:oban, :job, :exception] fires, the LiveView refreshes the count, and any retryable or discarded jobs show up accurately.
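To see that one handler covers both the success and failure paths, the events can be fired by hand. This is a sketch, not the app's handler: it assumes network access for Mix.install, the LifecycleSketch module is hypothetical, and it sends a plain message to a listener pid where the real handler broadcasts through Phoenix.PubSub. In the running app, Oban emits these events itself.

```elixir
# Assumption: run as a standalone .exs script with network access.
Mix.install([{:telemetry, "~> 1.0"}])

defmodule LifecycleSketch do
  # The listener pid arrives as the handler config (the 4th argument).
  # A plain send/2 stands in for the PubSub broadcast.
  def handle_event([:oban, :job, event], _measurements, _meta, listener) do
    send(listener, {:jobs_updated, event})
  end
end

:ok =
  :telemetry.attach_many(
    "lifecycle-sketch",
    [[:oban, :job, :stop], [:oban, :job, :exception]],
    &LifecycleSketch.handle_event/4,
    self()
  )

# Hand-fired stand-ins for the events Oban emits after recording the outcome.
:telemetry.execute([:oban, :job, :stop], %{duration: 0}, %{})
:telemetry.execute([:oban, :job, :exception], %{duration: 0}, %{})

# Collect the two notifications the "LiveView" (this process) received.
received =
  for _ <- 1..2 do
    receive do
      {:jobs_updated, event} -> event
    after
      100 -> :timeout
    end
  end

IO.inspect(received)
# prints [:stop, :exception]
```

Both paths end in the same notification, so the LiveView needs only the single "jobs_updated" clause to re-run its count.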

Takeaway

When displaying counts or status derived from job state, don't trigger the refresh from within the job itself. The job is still running at that point. Instead, hook into Oban's telemetry events, which fire at well-defined points in the lifecycle after state transitions have been committed.


🐥 Fixing a race condition in Oban job counting with telemetry

March 29, 2026
