Queue worker capacity inputs
Shown in exports so the capacity run stays traceable.
Use the sustained peak interval, not a full-day average, when traffic is bursty.
msg/s
Enter the active consumer count you want to compare against the target worker plan.
workers
Use measured completion rate per worker from logs or queue metrics.
msg/s
Enter visible queue depth, pending jobs, or delayed messages that must be drained.
messages
The worker recommendation must keep pace with arrivals and clear this backlog inside the target.
min
Common production targets sit around 70-85% when arrival bursts and retries are possible.
%
Leave 0 for the measured average; raise it when queue metrics show frequent short spikes.
%
Leave 0 when the per-worker throughput already includes normal retry and coordination overhead.
%
Leave 0 when current workers are already active or scale-out is immediate.
sec
Use 0 for pure math; add workers when downstream variance or rollout risk deserves a fixed buffer.
workers
Leave 0 for no cap; set a limit to flag when the plan cannot be reached.
workers
Introduction:

Queue worker capacity is the relationship between incoming work, service rate, and the backlog already waiting. A worker fleet can be stable under normal traffic and still miss a recovery target after an outage, deploy pause, retry storm, or downstream slowdown. The useful planning question is not only whether workers can keep up with new messages, but whether they can also clear old work before the target time.

Backlog math is easy to misread because averages hide short pressure windows. A queue receiving 12 messages per second may look comfortable with eight workers at 2.4 messages per second each, yet an 18,000 message backlog still needs catch-up capacity. If the net drain is only 7.2 messages per second, that backlog lasts more than 40 minutes even though steady utilization is below 70%.
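The arithmetic behind that example is short enough to check directly. This sketch only restates the numbers from the paragraph above; the variable names are illustrative.

```python
# Worked numbers from the example above: 12 msg/s in, 8 workers at 2.4 msg/s each.
arrival_rate = 12.0          # messages per second
workers = 8
per_worker = 2.4             # completed messages per second per worker
backlog = 18_000             # messages already waiting

capacity = workers * per_worker          # 19.2 msg/s total service rate
utilization = arrival_rate / capacity    # 0.625, comfortably under 70%
net_drain = capacity - arrival_rate      # 7.2 msg/s left for catch-up

drain_minutes = backlog / net_drain / 60
print(f"utilization {utilization:.1%}, drain time {drain_minutes:.1f} min")
# utilization 62.5%, drain time 41.7 min
```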

Queue capacity model with arrivals and backlog feeding workers that drain messages before a target time.

Worker sizing also has a quality-of-service limit. Running a queue at nearly 100% worker occupancy leaves little room for bursts, retries, cold starts, lock waits, or downstream throttling. A conservative capacity plan usually leaves headroom, then checks that the remaining drain rate still clears the starting backlog inside the recovery window.

The estimate is still a planning model. It assumes one queue, identical workers, constant arrival and service rates during the modeled interval, and no priority classes or abandonment. Real queues can behave differently when message groups limit parallelism, worker startup is slow, retry storms duplicate work, or the downstream system becomes the bottleneck.

Technical Details:

Queue drain capacity is governed by the gap between modeled arrivals and effective service capacity. Incoming messages add work continuously. Workers remove work at their completed-message rate after retry overhead, setup cost, lock contention, and downstream service time have been included in the measured per-worker throughput.

Stability starts with utilization. If modeled arrivals are equal to or greater than total worker capacity, the queue cannot drain while that traffic rate continues. If capacity is higher than arrivals, the surplus service rate becomes net drain. Dividing the starting backlog by that net drain gives a catch-up time, measured after any worker start delay, during which the current worker set shrinks or grows the backlog.

Wait estimates use the M/M/c Erlang C approximation when the worker count is small enough for the calculation. That approximation treats arrivals as Poisson, service times as exponential, and workers as identical parallel servers. It is useful for a directional wait signal, but it is not a substitute for production latency percentiles when jobs have uneven sizes or message groups restrict concurrency.
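The Erlang C formula itself is standard; this is one common way to compute it, shown here only to make the M/M/c assumptions concrete. The function names and the final mean-wait step (Wq = P(wait) / (c·μ − λ)) are a textbook sketch, not the tool's implementation.

```python
from math import factorial

def erlang_c_wait_probability(arrival_rate: float, service_rate: float,
                              servers: int) -> float:
    """Probability an arriving job must wait, under M/M/c assumptions."""
    a = arrival_rate / service_rate        # offered load in Erlangs
    rho = a / servers                      # per-server utilization
    if rho >= 1.0:
        return 1.0                         # unstable: every arrival waits
    # Erlang C: (a^c / c!) / (1 - rho), normalized by the sum over all states.
    top = (a ** servers) / factorial(servers) / (1 - rho)
    bottom = sum((a ** k) / factorial(k) for k in range(servers)) + top
    return top / bottom

def mean_wait_seconds(arrival_rate: float, service_rate: float,
                      servers: int) -> float:
    """Average time in queue (Wq) derived from the wait probability."""
    p_wait = erlang_c_wait_probability(arrival_rate, service_rate, servers)
    return p_wait / (servers * service_rate - arrival_rate)

# Default-style run: 12 msg/s arrivals, 8 workers completing 2.4 msg/s each.
print(mean_wait_seconds(12.0, 2.4, 8))
```

For uneven job sizes or restricted concurrency, this directional number should give way to production latency percentiles, as noted above.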

Formula Core:

The main calculation converts the raw inputs into modeled arrivals, effective per-worker throughput, service capacity, net drain, and required workers.

A_modeled = A_raw × (1 + burst reserve)
R_effective = R_worker × (1 − retry overhead)
service capacity = workers × R_effective
net drain = service capacity − A_modeled
drain time = backlog after start delay ÷ net drain, when net drain is greater than 0
recommended workers = max(drain workers, utilization workers) + safety workers
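Read literally, those formulas translate into a few lines of code. This is a minimal sketch with names of my own choosing, not the calculator's implementation; in particular, the assumption that the current workers shrink or grow the backlog during the start delay, and that drain time is measured against the remaining window, is my reading of the model.

```python
from math import ceil

def recommend_workers(arrival_raw, burst_reserve, per_worker_raw, retry_overhead,
                      backlog, target_seconds, start_delay, current_workers,
                      utilization_ceiling, safety_workers=0):
    # Modeled arrivals and effective throughput, as in the formula block.
    a_modeled = arrival_raw * (1 + burst_reserve)
    r_effective = per_worker_raw * (1 - retry_overhead)

    # Backlog after the start delay: current workers drain (or grow) it meanwhile.
    current_net = current_workers * r_effective - a_modeled
    backlog_after_delay = max(0.0, backlog - current_net * start_delay)
    remaining = max(1e-9, target_seconds - start_delay)

    # Workers needed to keep up with arrivals AND clear the backlog in time.
    drain_workers = ceil((a_modeled + backlog_after_delay / remaining) / r_effective)
    # Workers needed to stay at or below the utilization ceiling.
    utilization_workers = ceil(a_modeled / (utilization_ceiling * r_effective))

    return max(drain_workers, utilization_workers) + safety_workers
```

Under the article's default run (12 msg/s, 2.4 msg/s per worker, 18,000 messages, 30 minute target, 80% ceiling, no reserves or delay), this sketch lands on the same 10 worker target the worked example reports.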
Queue worker capacity quantities
  • Modeled arrival rate: incoming messages per second after any arrival burst reserve is added. Practical reading: use a sustained peak window when daily average traffic hides bursts.
  • Effective worker throughput: completed messages per worker per second after the retry or overhead reserve is applied. Practical reading: measure from successful completions, not just dequeue attempts.
  • Current service rate: current workers multiplied by effective worker throughput. Practical reading: if this is below arrivals, backlog grows while conditions stay the same.
  • Current utilization: modeled arrivals divided by current service rate. Practical reading: values above the ceiling mean the queue may drain but with too little headroom.
  • Workers for drain target: the worker count needed to keep up with arrivals and clear backlog inside the target. Practical reading: this can exceed the steady-state worker count after an incident or pause.
  • Workers for utilization ceiling: the worker count needed to keep modeled utilization at or below the selected ceiling. Practical reading: this protects burst, retry, and downstream headroom even when backlog is small.
  • Worker cap outcome: the modeled plan when an optional upper worker limit is lower than the recommendation. Practical reading: a cap can leave the displayed plan slower or above the utilization ceiling.

Several validation bounds protect the model from impossible or misleading input. Arrival rate, current workers, backlog, start delay, safety workers, and worker cap cannot be negative. Per-worker throughput must be greater than zero. Target drain time must be positive. Utilization ceiling must stay between 1% and 99%, while retry overhead must stay below 100% so effective throughput does not disappear.
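Those bounds amount to a handful of range checks. This sketch expresses them directly; the field names are illustrative, not the tool's own.

```python
def validate_inputs(i: dict) -> list:
    """Return human-readable problems for out-of-range capacity inputs."""
    problems = []
    # These quantities cannot be negative.
    for field in ("arrival_rate", "current_workers", "backlog",
                  "start_delay", "safety_workers", "worker_cap"):
        if i.get(field, 0) < 0:
            problems.append(f"{field} cannot be negative")
    if i.get("per_worker_throughput", 0) <= 0:
        problems.append("per-worker throughput must be greater than 0")
    if i.get("target_drain_time", 0) <= 0:
        problems.append("target drain time must be positive")
    if not 0.01 <= i.get("utilization_ceiling", 0) <= 0.99:
        problems.append("utilization ceiling must stay between 1% and 99%")
    if not 0 <= i.get("retry_overhead", 0) < 1.0:
        problems.append("retry overhead must stay below 100%")
    return problems
```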

A worker cap does not change the unconstrained recommendation. It changes the planned state used for charts and cap outcome rows. When the cap is below the recommended worker target, the summary reports that the cap blocks the target, and the planned drain time shows the constrained result.

Everyday Use & Decision Guide:

Start with one queue or worker group and use the same measurement window for arrivals and completions. Enter the queue name, average arrival rate in messages per second, current workers, measured per-worker throughput, current backlog, drain target, and utilization ceiling. The default run models 12 messages per second, eight workers, 2.4 messages per second per worker, 18,000 messages waiting, and a 30 minute drain target.

The first useful check is the summary. current plan passes means the current worker count clears the backlog inside the target and stays within the utilization ceiling. scale for target means current workers are stable but not enough for the selected deadline or headroom. backlog growing means arrivals exceed current capacity. cap blocks target means the optional worker cap prevents the recommended target from being used as the plan.

  • Use Arrival burst reserve when queue metrics show frequent short spikes above the average arrival rate.
  • Use Retry or overhead reserve when retries, duplicate work, lock waits, or downstream throttling reduce completed-message throughput.
  • Use Worker start delay when autoscaling, cold starts, deployment rollout, or image pull time delays added workers.
  • Use Fixed safety workers for a whole-worker buffer after the drain and utilization requirements are computed.
  • Use Worker cap when quotas, partitions, account concurrency, or downstream limits prevent unlimited scale-out.

Use the result tabs in a practical order. Queue Metrics gives the normalized rates, utilization, drain workers, utilization workers, and recommended target. Worker Plan compares current workers, drain target floor, utilization ceiling floor, recommended target, and any capped outcome. The burn-down chart is best for explaining when backlog reaches zero, and the sensitivity chart is best for seeing how many extra workers change drain minutes.

Do not treat the recommended worker target as a deployment command. Check partition count, maximum concurrency, downstream database or API capacity, retry policy, message visibility timeouts, and actual queue latency before changing production. The Capacity Brief is a good handoff summary because it keeps worker target, backlog drain, steady load, queueing estimate, model boundary, and capped outcome in one place.

Step-by-Step Guide:

Work from measured queue rates first, then add reserves and caps only when they match the operating case.

  1. Enter Queue name so copied rows, downloaded tables, and JSON identify the queue or worker group being sized.
  2. Enter Arrival rate from the sustained interval you want to model, in messages per second.
  3. Enter Current workers and Per-worker throughput from active consumers, pods, threads, function instances, or other worker units.
  4. Enter Current backlog and Target drain time. The summary should change to a worker target, a passing current plan, a growing-backlog warning, or an input check.
  5. Set Utilization ceiling to the maximum steady occupancy you are willing to model. A common production first pass is 70% to 85% when bursts and retries are possible.
  6. Open Advanced and add burst reserve, retry overhead, worker start delay, fixed safety workers, or a worker cap when the plain average would be too optimistic.
  7. If the red input alert appears, fix the named field before using the plan. For example, per-worker throughput must be greater than 0, and utilization ceiling must be between 1% and 99%.
  8. Read Queue Metrics, then Worker Plan, then the charts before copying JSON or table output into an incident note or scale review.

Interpreting Results:

The headline worker count is the unconstrained recommendation unless the current plan already passes. It combines the worker count needed to clear backlog by the drain target, the worker count needed to stay within the utilization ceiling, and any fixed safety workers. When a worker cap is set below that recommendation, the planned state uses the cap so the charts show the constrained outcome.

Queue worker capacity result signals
  • current plan passes: current workers clear the backlog inside the target and stay under the utilization ceiling. Check next: confirm observed queue latency and downstream capacity before reducing workers.
  • scale for target: current workers drain, but miss the target time or headroom requirement. Check next: compare Workers for drain target with Workers for utilization ceiling.
  • backlog growing: modeled arrivals are at or above current worker capacity. Check next: increase throughput, reduce arrivals, or pause noncritical producers before relying on wait estimates.
  • cap blocks target: the worker cap is lower than the recommended target. Check next: raise the cap, relax the drain target, reduce arrivals, or accept the slower planned drain.
  • queueing estimate not calculated: the Erlang C wait estimate was skipped for a very large worker count or invalid service capacity. Check next: use service rate, net drain, utilization, and observed wait metrics instead.

A low utilization number does not prove the backlog will clear in time. It only says modeled arrivals are a manageable share of current capacity. The drain target also requires enough surplus capacity to burn down existing messages while new messages keep arriving.

A passing model does not prove the queueing system is healthy. Compare the plan against actual visible backlog, oldest-message age, processing latency, retry count, dead-letter movement, and downstream saturation after scaling. If those signals disagree with the model, trust production evidence and adjust the measured arrival or throughput inputs.

Worked Examples:

Stable workers still miss the drain target

The default-style run uses email-jobs, 12 msg/s arrivals, eight current workers, 2.4 msg/s per worker, 18,000 messages waiting, a 30 minute target, and an 80% utilization ceiling. Current service rate is 19.2 msg/s, so utilization is 62.5%, but net drain is only 7.2 msg/s. Current drain time is 41 min 40 sec, so the summary points to a 10 worker target. At 10 workers, planned drain time is 25 min and utilization is 50%.

A worker cap blocks a real recovery target

A batch queue receives 32 msg/s, adds a 25% arrival burst reserve, has eight workers at 3.0 msg/s each, and applies a 10% retry overhead reserve. The modeled arrival rate becomes 40 msg/s and effective worker throughput becomes 2.7 msg/s. With 45,000 messages waiting, a 20 minute target, a 120 second worker start delay, two safety workers, and an 18 worker cap, the unconstrained recommendation is 32 workers. The summary reads cap blocks target, and the capped plan drains in about 1 hr 31 min with 82.3% utilization.
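The reserve and cap arithmetic in that run can be checked directly. The capped drain time depends on how the start delay is modeled, so this sketch stops at the capped rates and utilization; the variable names are mine.

```python
# Batch-queue run from the example above.
arrival_raw, burst_reserve = 32.0, 0.25
per_worker_raw, retry_overhead = 3.0, 0.10
worker_cap = 18

a_modeled = arrival_raw * (1 + burst_reserve)        # 40.0 msg/s modeled arrivals
r_effective = per_worker_raw * (1 - retry_overhead)  # 2.7 msg/s effective per worker

capped_capacity = worker_cap * r_effective           # 48.6 msg/s at the 18 worker cap
capped_utilization = a_modeled / capped_capacity     # ~82.3%, above an 80% ceiling
capped_net_drain = capped_capacity - a_modeled       # 8.6 msg/s left to clear 45,000 messages
```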

Current workers already pass

A small queue has 6 msg/s arrivals, six workers, 1.5 msg/s per worker, 500 messages waiting, a 20 minute target, and an 80% utilization ceiling. Current service rate is 9 msg/s, net drain is 3 msg/s, and Current drain time is about 2 min 47 sec. The recommendation may be lower than the current worker count, but that should start a review rather than an automatic scale-down because future bursts, retries, and downstream limits still matter.

Input check stops an unusable model

If Per-worker throughput is set to 0, the alert shows Per-worker throughput must be greater than 0 msg/s. The metrics table then lists the input issue instead of a worker plan. Fix the throughput value from measured completions before reading Queue Metrics, charts, or JSON.

FAQ:

Should I use average traffic or peak traffic?

Use the sustained peak interval that matters for the operation. A full-day average can understate the worker target when traffic arrives in bursts, and Arrival burst reserve is available when recent metrics show short spikes above the selected average.

Why does the worker target rise when utilization already looks safe?

Utilization checks steady headroom, while the drain target checks whether existing backlog clears by the deadline. Current workers can sit below the utilization ceiling and still miss Target drain time when the starting backlog is large.

What should count as one worker?

Use the unit that actually processes messages in parallel, such as a consumer process, pod, thread, function instance, or batch runner. Keep that same unit when entering Current workers, Per-worker throughput, Fixed safety workers, and Worker cap.

Why does a worker cap change the chart but not the recommendation?

The recommendation stays as the unconstrained worker target. When Worker cap is lower than that target, the planned state uses the cap so Backlog Burn-Down, Worker Sensitivity, and the capped outcome show what the limit does to drain time and utilization.

Why did the Erlang C estimate disappear?

The queueing estimate is skipped for very large worker counts or invalid service capacity. The rest of the capacity model can still show service rate, net drain, drain time, utilization, and worker target, which are usually the most important sizing numbers.

Where does the calculation run?

The capacity math runs in the browser from the values entered on the page. Treat queue names, shareable URLs, copied tables, downloaded charts, and JSON as operational records when they include private system names or incident details.

Glossary:

Arrival rate
The average incoming message, job, task, or event rate during the modeled interval.
Per-worker throughput
The completed-message rate for one worker after normal service time and downstream cost are included.
Net drain
The service capacity left after modeled arrivals are handled.
Utilization ceiling
The maximum modeled worker occupancy used to preserve headroom.
Worker cap
An optional upper limit from quotas, partitions, deployment policy, or downstream concurrency.
Erlang C
A multi-server queueing approximation that estimates the chance of waiting and average wait for identical workers under M/M/c assumptions.
