Load Balancer Pool Capacity Calculator
Estimate load balancer pool capacity from backend RPS, weights, health, utilization targets, growth, and N+ reserve with bottleneck cues.
Introduction
Traffic capacity behind a load balancer depends on the weakest serving path, not only on the number of servers in the pool. A backend group may mix old and new instances, active and draining targets, different request limits, and weights that were set for a previous traffic shape. The useful ceiling is reached when one serving backend hits the load level you are willing to plan around, even if other backends still have room.
Requests per second, often shortened to RPS, is a convenient planning unit because it connects traffic demand to measured backend throughput. It is still a steady-state shortcut. A write-heavy API route, a burst of cache misses, or a retry storm can consume far more server time per request than a simple health endpoint, so backend maximums should come from representative load tests or production telemetry rather than best-case benchmark numbers.
- Backend pool
- The set of servers, instances, targets, or upstreams that can receive traffic behind a load balancer.
- Routing weight
- A relative share of traffic. A backend with weight 2 receives twice the share of a backend with weight 1 when both are serving.
- Planning utilization
- The fraction of measured backend maximum used for planning. A 70% target leaves room for latency, burstiness, and measurement error.
- Failure reserve
- An N+ check that asks whether demand still fits after one or more serving backends are removed.
Weighted pools are where many capacity estimates go wrong. Equal weights only work cleanly when the backends have equal capacity and receive comparable request work. If a smaller backend receives the same share as a larger one, the smaller backend sets the pool ceiling. Adding another row may not help until weights, health state, and measured capacity line up with the traffic that each backend actually receives.
Health state is part of capacity, not a separate afterthought. A backend that is down, draining, disabled, or failing checks should not be counted as serving capacity for ordinary planning. Real platforms also add policy details such as health check thresholds, slow start, sticky sessions, zone balancing, least-connection routing, hashing, and fail-open behavior, so a weighted RPS model should be treated as a planning aid rather than a complete simulator.
A useful pool estimate answers three questions at once: does modeled demand fit, which backend reaches the planning limit first, and does enough capacity remain after a planned failure reserve. Those answers are more practical than a raw target count when deciding whether to add backends, change weights, lower traffic, spread requests by route, or run another load test.
How to Use This Tool:
Start with traffic and backend values you would be willing to defend in a production review. The calculator updates the summary and result tabs as the inputs change.
- Enter Target demand as steady requests per second before growth. The Modeled demand row will match this value unless a growth buffer is applied.
- Set Planning utilization to the sustained backend load you want to allow. Values such as 65% to 80% usually leave more room than a theoretical 100% ceiling.
- Paste Backend pool rows as name,max RPS,weight,health. Keep one backend per line, use positive max RPS and weight values, and use health text that clearly means serving, down, or draining.
- Open Advanced when the estimate needs a future traffic horizon, an N+ failure check, or more decimal places. Growth buffer raises demand, Failure reserve removes serving backends, and Display precision changes display rounding only.
- Fix any Review pool inputs message before trusting the output. A missing backend list, non-positive max RPS, non-positive weight, or unrecognized health flag changes which rows are counted.
- Read the summary first, then compare Pool Snapshot, Backend Allocation, and Capacity Guidance. Use Pool Capacity Envelope and Backend Saturation Map when you need to show the spare range or bottleneck pattern to another reviewer.
A good first pass is to enter current steady traffic with no growth buffer, confirm the bottleneck backend, then add the growth or N+ reserve scenario you actually need to plan for.
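The row format described above (name,max RPS,weight,health) can be sketched as a small parser. This is an illustration under stated assumptions, not the calculator's own code: the `Backend` class, the `parse_pool` function, and the wording of the review messages are hypothetical names chosen for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class Backend:
    name: str
    max_rps: float
    weight: float
    health: str  # raw health text, lowercased; interpreted elsewhere


def parse_pool(text: str) -> tuple[list[Backend], list[str]]:
    """Parse 'name,max RPS,weight,health' lines into rows plus review messages."""
    rows: list[Backend] = []
    problems: list[str] = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        parts = [part.strip() for part in line.split(",")]
        if len(parts) != 4:
            problems.append(f"line {line_no}: expected 4 comma-separated fields")
            continue
        name, max_rps_text, weight_text, health = parts
        try:
            max_rps, weight = float(max_rps_text), float(weight_text)
        except ValueError:
            problems.append(f"line {line_no}: max RPS and weight must be numbers")
            continue
        # Reject zero, negative, NaN, and infinite values.
        if not (math.isfinite(max_rps) and math.isfinite(weight)) or max_rps <= 0 or weight <= 0:
            problems.append(f"line {line_no}: max RPS and weight must be positive")
            continue
        rows.append(Backend(name, max_rps, weight, health.lower()))
    return rows, problems
```

Rows that fail a check are left out of the returned list, which mirrors the idea that invalid rows change which backends are counted.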
Interpreting Results:
Weight-limited ceiling is the main capacity result because it respects routing weights and serving health. Gross healthy ceiling is a useful comparison number, but it can overstate usable capacity when traffic share does not match backend size.
| Result cue | Boundary or signal | What to check next |
|---|---|---|
| Spare headroom | >= 0 RPS | Modeled demand fits the weighted serving pool at the selected planning utilization. |
| Spare headroom | < 0 RPS | Demand is above the planned ceiling. Add capacity, reduce demand, or rebalance weights toward stronger backends. |
| Weight gap | Greater than zero | Some healthy capacity is stranded by routing weights. Inspect the backend with the lowest Pool ceiling. |
| N+ reserve spare | < 0 RPS | Normal traffic may fit, but the selected failure reserve does not. |
| Peak backend utilization | Near or above Planning utilization | One serving backend is close to the planning limit. Check its weight and measured max RPS. |
| Serving backends | Less than total rows | Down, draining, invalid, or unrecognized rows are excluded from serving capacity. |
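The cues in the table can be read as a set of independent checks on one result. The sketch below is illustrative only: `result_cues` and its message strings are hypothetical, and `near_margin` is an assumed definition of "near", since the text does not say how close to planning utilization counts as near.

```python
def result_cues(spare_headroom, weight_gap, reserve_spare,
                peak_utilization, planning_utilization,
                serving_rows, total_rows, near_margin=0.05):
    """Return the cues that apply to one result.

    Utilization values are ratios (0.70 means 70%). near_margin is an
    assumption made for this sketch, not a documented threshold.
    """
    cues = []
    if spare_headroom >= 0:
        cues.append("fits: demand is within the weighted ceiling")
    else:
        cues.append("shortfall: demand is above the planned ceiling")
    if weight_gap > 0:
        cues.append("weight gap: capacity is stranded by routing weights")
    if reserve_spare < 0:
        cues.append("reserve: the failure reserve does not fit")
    if peak_utilization >= planning_utilization - near_margin:
        cues.append("peak: a backend is near or above planning utilization")
    if serving_rows < total_rows:
        cues.append("excluded: some rows are not counted as serving")
    return cues
```

Several cues can apply at once, for example a pool that fits normal demand but fails its failure reserve.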
A Capacity ok badge means the entered model fits the entered demand. It does not prove that the service will stay healthy under real traffic. Compare the result with latency, error rate, queue length, CPU, memory, connection count, database load, and retry behavior before treating the estimate as a production limit.
When the result looks surprisingly good, verify backend maximums first. When it looks surprisingly bad, check for a down row, a typo in routing weight, or a small backend receiving the same share as larger backends.
Technical Details:
The model uses only backend rows that are valid and marked as serving. Each serving backend receives a fraction of modeled demand equal to its routing weight divided by total serving weight. The backend-specific pool ceiling is the total pool RPS at which that backend reaches the selected planning utilization.
The final planned capacity is the smallest backend-specific ceiling. This makes the bottleneck explicit: the first backend to reach its planning limit caps the pool, even when the raw sum of healthy backend ratings is higher.
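A minimal sketch of the calculation described above, assuming the input list already contains only valid serving rows. The function names `pool_ceiling` and `allocation` and the tuple layout are hypothetical, chosen for this example.

```python
def pool_ceiling(backends, planning_utilization):
    """Return (ceiling_rps, bottleneck_name, per_backend_ceilings).

    backends: list of (name, max_rps, weight) tuples for serving rows only.
    planning_utilization: ratio such as 0.70.
    """
    if not backends:
        raise ValueError("at least one serving backend is required")
    total_weight = sum(weight for _, _, weight in backends)
    ceilings = {}
    for name, max_rps, weight in backends:
        share = weight / total_weight
        # Total pool RPS at which this backend reaches its planning limit.
        ceilings[name] = max_rps * planning_utilization / share
    bottleneck = min(ceilings, key=ceilings.get)
    return ceilings[bottleneck], bottleneck, ceilings


def allocation(backends, demand):
    """Return {name: (assigned_rps, utilization_ratio)} at a modeled demand."""
    total_weight = sum(weight for _, _, weight in backends)
    result = {}
    for name, max_rps, weight in backends:
        assigned = demand * weight / total_weight
        result[name] = (assigned, assigned / max_rps)
    return result
```

For a pool such as `[("app01", 900, 1), ("app02", 600, 1), ("app03", 900, 1)]` at 0.70, the smaller backend is returned as the bottleneck because it reaches its planning limit at the lowest total pool RPS.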
Formula Core:
Let demand, backend weights, backend ratings, and planning utilization use the same RPS basis.
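Read from the description above and the symbols in the table that follows, the relationships can be written as below. This is a reconstruction from the surrounding text rather than a verbatim formula set, and the sums and minimum run over serving backends only.

$$
\begin{aligned}
D &= D_{\text{target}}\,(1 + g) \\
s_i &= \frac{w_i}{\sum_j w_j} \\
C_i &= \frac{u\,R_i}{s_i} \\
C_{\text{pool}} &= \min_i C_i
\end{aligned}
$$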
| Symbol | Meaning | Unit or boundary |
|---|---|---|
| D | Modeled demand after growth buffer | RPS |
| D_target | Entered target demand | RPS, zero or greater |
| g | Growth buffer converted from percent to a ratio | 0 to 5 |
| w_i | Routing weight for backend i | Positive number |
| s_i | Traffic share for backend i | Ratio from 0 to 1 |
| R_i | Backend maximum RPS | Positive RPS |
| u | Planning utilization converted from percent to a ratio | 0.01 to 1 |
| C_i | Total pool RPS at which backend i reaches planning utilization | RPS |
| C_pool | Weight-limited pool ceiling | RPS |
For example, three serving backends with weights 1, 1, and 1 each receive one third of modeled demand. If one backend can handle 600 RPS and planning utilization is 70%, that backend reaches the planning line at 600 x 0.70 / (1/3) = 1260 RPS. Larger backends in the same equal-weight pool cannot raise the final ceiling unless the smaller backend receives less traffic or gains capacity.
| Rule area | Boundary | Effect on the calculation |
|---|---|---|
| Target demand | Zero or greater | Forms modeled demand before growth. |
| Planning utilization | 1% to 100% | Caps backend max RPS for planning capacity. |
| Growth buffer | 0% to 500% | Raises modeled demand by the selected percentage. |
| Failure reserve | N+0 to N+5 | Removes serving backends and recalculates the remaining weighted ceiling. |
| Max RPS | Positive number required | Invalid rows are excluded from serving capacity. |
| Routing weight | Positive number required | Controls the backend traffic share among serving rows. |
| Display precision | 0 to 3 decimal places | Changes displayed numbers and exports without changing the model. |
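The scalar boundaries in the table can be checked before any capacity math runs. The sketch below assumes out-of-range values are reported rather than silently clamped; the text does not say which the calculator does. The `LIMITS` keys and function names are hypothetical.

```python
# Documented ranges for the scalar inputs (percent values stay in percent).
LIMITS = {
    "target_demand": (0.0, float("inf")),
    "planning_utilization_pct": (1.0, 100.0),
    "growth_buffer_pct": (0.0, 500.0),
    "failure_reserve": (0, 5),
    "display_precision": (0, 3),
}


def check_settings(settings: dict) -> list[str]:
    """Return one message per setting outside its range.

    settings must contain every key in LIMITS.
    """
    problems = []
    for key, (low, high) in LIMITS.items():
        value = settings[key]
        if not low <= value <= high:
            problems.append(f"{key}={value} is outside {low} to {high}")
    return problems


def modeled_demand(target_demand: float, growth_buffer_pct: float) -> float:
    """Raise target demand by the growth buffer percentage."""
    return target_demand * (1 + growth_buffer_pct / 100)
```

With a 25% growth buffer, a target demand of 1000 RPS becomes a modeled demand of 1250 RPS.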
Gross healthy ceiling is calculated separately as the sum of serving backend max RPS values multiplied by planning utilization. The Weight gap is the amount by which gross healthy ceiling exceeds the weight-limited ceiling. A warning appears when the gap is more than 10 RPS and more than about 3% of the gross healthy ceiling.
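The gap calculation can be sketched as follows. The function name `weight_gap_report` is hypothetical, and the warning uses exactly 10 RPS and 3% as thresholds, which is an assumption because the text gives the percentage only approximately.

```python
def weight_gap_report(backends, planning_utilization):
    """Return (gross_ceiling, weighted_ceiling, gap, warn).

    backends: list of (max_rps, weight) tuples for serving rows only.
    """
    total_weight = sum(weight for _, weight in backends)
    # Sum of serving ratings scaled by planning utilization.
    gross = sum(max_rps for max_rps, _ in backends) * planning_utilization
    # Lowest total pool RPS at which any backend reaches its planning limit.
    weighted = min(
        max_rps * planning_utilization * total_weight / weight
        for max_rps, weight in backends
    )
    gap = gross - weighted
    # Assumed thresholds: more than 10 RPS and more than 3% of gross.
    warn = gap > 10 and gap > 0.03 * gross
    return gross, weighted, gap, warn
```

For ratings of 900, 600, and 900 RPS with equal weights at 0.70, the gap is about 420 RPS and the warning condition holds. For three identical backends the gap is effectively zero.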
Serving health flags include yes, y, true, up, healthy, enabled, active, and 1. Excluded flags include no, n, false, down, unhealthy, disabled, drain, draining, and 0. Unknown health text is reported and treated as down.
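The flag lists above map directly to two lookup sets. This sketch assumes matching ignores case and surrounding whitespace, which the text does not state; `classify_health` is a hypothetical name.

```python
SERVING = {"yes", "y", "true", "up", "healthy", "enabled", "active", "1"}
EXCLUDED = {"no", "n", "false", "down", "unhealthy", "disabled",
            "drain", "draining", "0"}


def classify_health(flag: str) -> tuple[bool, bool]:
    """Return (is_serving, is_recognized) for one health flag."""
    key = flag.strip().lower()  # assumption: case-insensitive matching
    if key in SERVING:
        return True, True
    if key in EXCLUDED:
        return False, True
    # Unknown text is reported to the user and treated as down.
    return False, False
```

Returning the recognized state separately lets a caller show a review message for unknown text while still excluding the row from serving capacity.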
Failure reserve checks remove serving backends and use the lowest remaining planned ceiling. When the number of possible removal combinations is too large to enumerate completely, the reserve result becomes a screening value that should be followed by checks for the specific zone, node, or target failures that matter most to the architecture.
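One way to sketch the reserve check is to enumerate which backends remain and keep the lowest ceiling. `reserve_ceiling` and the `max_cases` cutoff are hypothetical: the text does not say where the calculator stops enumerating or how it derives its screening value, so here a partial enumeration stands in for it.

```python
from itertools import combinations, islice
from math import comb


def reserve_ceiling(backends, planning_utilization, remove_count, max_cases=10_000):
    """Return (lowest_ceiling, exhaustive) after removing remove_count backends.

    backends: list of (max_rps, weight) tuples for serving rows only.
    exhaustive is False when only the first max_cases combinations were checked.
    """
    keep = len(backends) - remove_count
    if keep <= 0:
        return 0.0, True  # no serving backends remain

    def ceiling(rows):
        total_weight = sum(weight for _, weight in rows)
        return min(
            max_rps * planning_utilization * total_weight / weight
            for max_rps, weight in rows
        )

    total_cases = comb(len(backends), keep)
    # Assumption: beyond max_cases, stop early and report a screening value.
    cases = islice(combinations(backends, keep), max_cases)
    return min(ceiling(rows) for rows in cases), total_cases <= max_cases
```

When `exhaustive` comes back `False`, the returned ceiling is only a screening value and the specific zone, node, or target failures that matter should be checked directly.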
Accuracy Notes:
The estimate is only as reliable as the entered backend ratings and weights. It models weighted RPS capacity, not every behavior of a real load balancer, request router, or application stack.
- Use comparable max RPS measurements. Mixing best-case benchmark numbers with production traffic numbers can hide the real bottleneck.
- Check whether the load balancer uses round robin, least connections, random selection, hashing, sticky sessions, slow start, zone balancing, or another policy that changes traffic distribution.
- Do not treat a passing health check as the only capacity signal. A backend can pass health checks while still producing high latency, elevated errors, or database pressure.
- For critical launch or incident work, confirm the estimate with load tests and production telemetry before changing capacity commitments.
Worked Examples:
Equal backends near the limit. Three serving app nodes each handle 850 RPS, each has weight 1, and planning utilization is 70%. A target demand of 1800 RPS produces a Weight-limited ceiling of 1785 RPS, so Spare headroom is -15 RPS. Read this as a small capacity shortfall, not a comfortable pass.
One smaller backend with equal weight. A pool has serving max RPS values of 900, 600, and 900, all with weight 1 and planning utilization of 70%. The Gross healthy ceiling is 1680 RPS, but the smaller backend caps the Weight-limited ceiling at 1260 RPS. A positive Weight gap points to the equal weight on a lower-capacity backend.
N+1 reserve fails after the normal pool passes. Three identical serving 850 RPS backends at 70% utilization have a normal ceiling of 1785 RPS. With target demand at 1300 RPS, normal Spare headroom is 485 RPS. Set Failure reserve to 1 and the reserve ceiling falls to 1190 RPS, making N+1 reserve spare -110 RPS.
Health text blocks a row from serving. A row such as app03,650,1,maybe creates a Review pool inputs message because the health flag is not recognized. Change the health value to a supported serving or excluded flag, then recheck Serving backends and the bottleneck label before using the estimate.
FAQ:
Why can the weight-limited ceiling be lower than the gross healthy ceiling?
Gross healthy ceiling adds serving backend ratings after planning utilization. Weight-limited ceiling also accounts for traffic share, so a smaller backend with too much weight can cap the pool early.
Should planning utilization ever be 100%?
Use 100% only for a theoretical maximum. For service planning, a lower value leaves room for bursty routes, retries, measurement error, and latency growth near saturation.
What health values count as serving?
Use values such as yes, up, healthy, enabled, active, or 1. Values such as down, disabled, draining, or 0 are excluded, and unknown values trigger a review message.
Why does N+ reserve fail when normal capacity passes?
The reserve check removes serving backends and recalculates the weighted pool. The remaining pool can be too small even when the full serving pool fits modeled demand.
Can this replace a load test?
No. The estimate depends on the max RPS values you enter. Use load tests and production telemetry to confirm backend limits, request mix, connection behavior, and failure handling.
Glossary:
- Backend pool
- The group of servers, instances, targets, or upstreams behind the load balancer.
- Routing weight
- A relative value that controls each serving backend's share of traffic.
- Planning utilization
- The selected percentage of backend maximum RPS used as the sustained planning ceiling.
- Weight-limited ceiling
- The total pool RPS where the first serving backend reaches planning utilization.
- Gross healthy ceiling
- The serving backend max RPS sum after applying planning utilization.
- Spare headroom
- Weight-limited ceiling minus modeled demand.
- N+ reserve
- A capacity check that removes one or more serving backends and tests the remaining pool.
- Modeled demand
- Target demand after applying the optional growth buffer.