WebSocket connection capacity inputs
Choose the closest traffic shape, then tune the fields below with measured values.
Use the observed or forecasted open-socket peak, not daily active users.
sockets
Used to compare the current pool against the recommended plan.
nodes
Keep this below the point where event-loop latency, reconnects, or heartbeat misses rise.
sockets/node
Measure this from a staged load test when possible; values often range from a few KB for idle sockets to much more for stateful sessions.
KB/socket
Use the application memory budget, not total host RAM if other workloads run on the same machine.
GB/node
Use fanout traffic here; inbound-only workloads can use a low value.
msg/socket/min
Use compressed payload size only if compression is consistently enabled and included in the overhead field.
bytes/msg
Use measured sustained throughput after load balancer, TLS, and cross-zone limits.
Mbps/node
The target demand is peak sockets multiplied by this reserve.
%
Lower this when garbage collection, caches, or colocated workloads need more headroom.
%
Keep this below line rate to absorb retransmits, TLS overhead, cross-zone traffic, and noisy neighbors.
%
The model reserves 1,024 descriptors per node for logs, upstream sockets, and runtime overhead.
FDs/node
Applied to average outbound payload before bandwidth capacity is computed.
%
Use 1 for N+1 planning or 2 when a zone/node pair can disappear.
N+ nodes
Use 0 for whole-node planning; use 1-2 for closer diagnostic exports.
digits

WebSocket capacity is a long-lived connection planning problem. Unlike short HTTP requests, a WebSocket fleet keeps sockets open, holds file descriptors, retains per-connection memory, sends heartbeats, and often pushes messages in uneven bursts. The same connection count can be easy for a quiet presence service and unsafe for a market data stream if outbound payloads and fanout are not budgeted.

The useful capacity number is the smallest safe limit across several constraints. A node may pass a synthetic connection test but still run out of memory, hit a descriptor ceiling, or saturate network egress when messages increase. Production planning also has to reserve capacity for rolling restarts, failed nodes, autoscaling delay, and regional imbalance.

Connection capacity should therefore be treated as a pool design, not a single-server brag number. The plan starts with peak concurrent sockets, adds reserve, estimates per-node limits from measured and resource-derived ceilings, then converts the safe per-node capacity into a recommended node count with N+ reserve.

WebSocket capacity planning factors

| Capacity factor | What it limits | Common failure mode |
| --- | --- | --- |
| Tested connection cap | Measured stable sockets per node. | Lab tests exceed what production health checks and reconnect storms can sustain. |
| Memory per connection | Session state, buffers, TLS objects, and application bookkeeping. | Heap or resident memory climbs before CPU becomes the bottleneck. |
| Outbound bandwidth | Payload bytes, protocol overhead, and messages per minute. | Fanout traffic saturates egress while socket count still looks acceptable. |
| Descriptor limit | Operating system file descriptor budget after reserved handles. | New connections fail even though memory and bandwidth still have room. |

How to Use This Tool:

  1. Choose a Workload preset such as Chat / presence fanout, Market data tick stream, Multiplayer lobby, or IoT telemetry dashboard.
  2. Enter Peak concurrent sockets, Current nodes, and the Tested connection cap per node from load testing or production evidence.
  3. Add Memory per connection, Node RAM, outbound message rate, average payload size, and node egress bandwidth.
  4. Set Planning reserve for growth and reconnect bursts, then use N+ node loss reserve for failover posture.
  5. Open Advanced to tune memory utilization, network utilization, file descriptor limits, protocol overhead, and display precision.
  6. Review Capacity Snapshot first, then inspect Constraint Ledger, Scale-Out Plan, the chart tabs, Capacity Brief, and JSON for handoff detail.

Use measured production values when possible. If memory per connection or payload size is unknown, run a representative soak test before treating the recommended node count as a procurement or autoscaling target.
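
One rough way to turn a soak test into a Memory per connection input is to compare resident memory before and after the sockets are opened. The sketch below assumes two hypothetical RSS readings in KB and a known open-socket count; the function name and the sample numbers are illustrative, not taken from the tool.

```python
def kb_per_socket(baseline_rss_kb: float, loaded_rss_kb: float, open_sockets: int) -> float:
    """Estimate per-connection memory as RSS growth divided by open sockets."""
    if open_sockets <= 0:
        raise ValueError("open_sockets must be positive")
    return (loaded_rss_kb - baseline_rss_kb) / open_sockets


# Hypothetical soak-test readings, not measurements from a real service.
estimate = kb_per_socket(baseline_rss_kb=400_000, loaded_rss_kb=1_040_000, open_sockets=10_000)
print(f"{estimate:.1f} KB/socket")
```

A single pair of readings hides leaks and slow growth, so it is worth repeating the measurement at several points during a long run.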

Interpreting Results:

Recommended nodes includes both the nodes needed for the reserved peak and the requested N+ loss allowance. Safe sockets per node is the smallest usable constraint after memory, descriptor, tested, and bandwidth ceilings are compared.

  • Target sockets with reserve is peak concurrent sockets after planning reserve.
  • Bottleneck identifies the lowest per-node ceiling, which is the constraint that drives scale-out.
  • Current failover capacity removes the N+ reserve nodes before comparing capacity to demand.
  • Scale-Out Curve shows how the node requirement changes as peak sockets grow.
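
A scale-out curve like the one described above can be approximated by recomputing the node count for a range of peak values. This is a minimal sketch that assumes the node count is rounded up to whole nodes; the function names and the sample peaks are hypothetical.

```python
import math


def recommended_nodes(peak_sockets: int, reserve_pct: float,
                      safe_per_node: int, n_plus: int) -> int:
    """Base nodes for the reserved peak, rounded up, plus N+ reserve nodes."""
    target = peak_sockets * (1 + reserve_pct / 100)
    return math.ceil(target / safe_per_node) + n_plus


def scale_out_curve(peaks, reserve_pct, safe_per_node, n_plus):
    """Return (peak, recommended nodes) pairs for each peak value."""
    return [(p, recommended_nodes(p, reserve_pct, safe_per_node, n_plus)) for p in peaks]


# Sample peaks are illustrative; 25% reserve, 30,000 sockets/node, N+1.
curve = scale_out_curve([50_000, 75_000, 100_000, 150_000], 25, 30_000, 1)
for peak, nodes in curve:
    print(f"{peak:>7} peak sockets -> {nodes} nodes")
```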

A green capacity snapshot does not guarantee clean reconnect behavior. WebSocket systems should still test backoff, load balancer stickiness, health-check draining, TLS termination, and message broker fanout under failure conditions.

Technical Details:

The calculation converts peak socket demand into a reserved target, then evaluates independent per-node ceilings. The measured tested cap is compared with resource-derived limits for memory, descriptors, and outbound bandwidth. The lowest finite ceiling becomes the safe per-node connection capacity.

Bandwidth capacity uses average outbound payload size, message frequency, and protocol overhead. Memory capacity uses node RAM multiplied by the allowed memory utilization and divided by per-connection memory. Descriptor capacity subtracts a reserved operating system allowance before comparing it with socket demand.
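
The three resource-derived ceilings described above can be sketched as follows. The unit conversions are assumptions rather than confirmed tool behavior: 1 GB is treated as 1024 × 1024 KB, 1 Mbps as 1,000,000 bits per second, and protocol overhead as a percentage added to payload size. The function names are hypothetical.

```python
import math

RESERVED_FDS = 1024  # descriptors held back per node, per the input help text


def memory_cap(node_ram_gb: float, memory_util_pct: float, kb_per_socket: float) -> float:
    """Sockets that fit in the usable share of node RAM (assumes 1 GB = 1024 * 1024 KB)."""
    if kb_per_socket <= 0:
        return math.inf
    usable_kb = node_ram_gb * 1024 * 1024 * (memory_util_pct / 100)
    return math.floor(usable_kb / kb_per_socket)


def descriptor_cap(fd_limit: int, reserved: int = RESERVED_FDS) -> int:
    """Descriptors left for client sockets after the reserved allowance."""
    return max(fd_limit - reserved, 0)


def bandwidth_cap(egress_mbps: float, network_util_pct: float,
                  msgs_per_socket_min: float, bytes_per_msg: float,
                  overhead_pct: float) -> float:
    """Sockets whose average outbound traffic fits in the usable egress."""
    bits_per_socket_sec = (msgs_per_socket_min / 60) * bytes_per_msg * (1 + overhead_pct / 100) * 8
    if bits_per_socket_sec <= 0:
        return math.inf  # no outbound traffic, so bandwidth is not a finite limit
    usable_bits_per_sec = egress_mbps * 1_000_000 * (network_util_pct / 100)
    return math.floor(usable_bits_per_sec / bits_per_socket_sec)
```

Returning infinity for a zero message rate or payload size keeps the bandwidth ceiling out of the comparison instead of dividing by zero.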

Formula Core

The reserved demand is peak concurrency with growth and burst allowance.

$$\text{Target sockets} = \text{Peak sockets} \times (1 + \text{Planning reserve})$$

Safe per-node capacity is the tightest applicable node limit.

$$\text{Safe per node} = \min(\text{Tested cap},\ \text{Memory cap},\ \text{Descriptor cap},\ \text{Bandwidth cap})$$

The recommendation adds node-loss reserve after the base node count is known.

$$\text{Recommended nodes} = \left\lceil \frac{\text{Target sockets}}{\text{Safe per node}} \right\rceil + \text{N+ reserve nodes}$$

WebSocket capacity calculation fields

| Field | Calculation role | Boundary or caution |
| --- | --- | --- |
| Planning reserve | Raises demand before node count is calculated. | Low reserve can hide reconnect storms and traffic spikes. |
| Memory utilization | Limits how much RAM is available to connection state. | Leaving headroom helps garbage collection and application work. |
| Network utilization | Caps egress below line rate before payload traffic is divided. | High values can ignore retransmits, TLS overhead, and noisy neighbors. |
| N+ reserve | Adds spare nodes after the base fleet size is calculated. | Current capacity is judged after those reserve nodes are removed. |

With the default chat and presence workload, 75,000 peak sockets and 25% reserve become 93,750 target sockets. The tested cap of 30,000 sockets per node is the limiting constraint, so the base requirement is 4 nodes and the N+1 recommendation is 5 nodes.
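
The arithmetic in this example can be checked with a short sketch. Only the 75,000 peak, 25% reserve, 30,000 tested cap, and N+1 reserve come from the text; the memory, descriptor, and bandwidth ceilings below are placeholder values chosen to sit above the tested cap, and the function names are hypothetical.

```python
import math


def target_sockets(peak_sockets: int, reserve_pct: float) -> int:
    """Peak concurrency raised by the planning reserve."""
    return math.ceil(peak_sockets * (1 + reserve_pct / 100))


def safe_per_node(*caps: float) -> float:
    """Lowest finite ceiling; infinite ceilings are not constraints."""
    finite = [c for c in caps if math.isfinite(c)]
    if not finite:
        raise ValueError("at least one finite ceiling is required")
    return min(finite)


def recommended_nodes(target: int, safe: float, n_plus: int) -> int:
    """Base node count rounded up to whole nodes, plus N+ reserve nodes."""
    return math.ceil(target / safe) + n_plus


target = target_sockets(75_000, 25)
# Tested cap from the text; the other three ceilings are placeholders.
safe = safe_per_node(30_000, 150_000, 64_512, math.inf)
base = math.ceil(target / safe)
total = recommended_nodes(target, safe, n_plus=1)
print(target, safe, base, total)
```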

Accuracy Notes:

WebSocket capacity estimates depend on runtime behavior, kernel tuning, application memory layout, load balancer policy, and message fanout architecture. Treat the output as a sizing model that must be validated by soak tests and failure drills.

  • Measure per-connection memory with representative authentication state, subscriptions, compression settings, and TLS termination.
  • Include reconnect surges after deploys, regional failover, mobile network churn, and broker replay bursts in load tests.
  • Check operating system limits such as file descriptors, ephemeral ports, TCP keepalive, and accept queue settings.
  • Do not assume bandwidth capacity is symmetric. Outbound fanout often dominates inbound traffic for live update systems.
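
For the file descriptor check in the list above, the current process limit can be read with Python's standard `resource` module, which is available on Unix-like systems only. This sketch subtracts the 1,024-descriptor reserve that the input help text mentions; the helper name is hypothetical, and a real check should run inside the same service account and container as the WebSocket server.

```python
import resource

RESERVED_FDS = 1024  # per-node allowance for logs, upstream sockets, and runtime overhead


def descriptor_budget(reserved: int = RESERVED_FDS):
    """Return descriptors available for client sockets, or None if unlimited."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return None
    return max(soft - reserved, 0)


budget = descriptor_budget()
print("descriptor budget:", "unlimited" if budget is None else budget)
```

The soft limit is the one the process actually hits, so it is the value compared here; the hard limit only bounds how far the soft limit can be raised.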

Worked Examples:

A Chat / presence fanout service with 75,000 peak sockets, 3 current nodes, and 30,000 tested sockets per node needs 5 recommended nodes under N+1 reserve. After the failover reserve is removed, the current 3-node fleet has only 2 nodes of usable capacity, or 60,000 sockets against a 93,750-socket target at the default 25% reserve, so the scale-out gap is 2 nodes.
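
The failover check in this example can be sketched as below, assuming the tested cap of 30,000 is the limiting ceiling and the target is 93,750 sockets (75,000 peak with 25% reserve). The function name is hypothetical.

```python
def failover_capacity(current_nodes: int, n_plus: int, safe_per_node: int) -> int:
    """Sockets the pool can hold after the N+ reserve nodes are removed."""
    surviving = max(current_nodes - n_plus, 0)
    return surviving * safe_per_node


target = 93_750          # 75,000 peak sockets with 25% planning reserve
recommended = 5          # node count from the example
current = 3

capacity = failover_capacity(current, n_plus=1, safe_per_node=30_000)
headroom = capacity - target      # negative means the pool is short
gap_nodes = max(recommended - current, 0)
print(capacity, headroom, gap_nodes)
```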

A Market data tick stream may have fewer users but larger outbound message rates. In that case the bottleneck can move from tested connection cap to node egress, and adding RAM alone will not increase safe socket count.

An IoT telemetry dashboard with small payloads but many idle devices may be driven by file descriptors or memory. Raising descriptor limits without measuring per-connection memory can simply move the failure point.
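
The bottleneck shifts in the market data and IoT examples above can be illustrated with a small model. Every input value here is hypothetical, and the unit conversions (1 GB = 1024 × 1024 KB, 1 Mbps = 1,000,000 bits per second, overhead as a percentage of payload) are assumptions rather than confirmed tool behavior.

```python
import math
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Node:
    tested_cap: int
    ram_gb: float
    mem_util_pct: float
    kb_per_socket: float
    fd_limit: int
    egress_mbps: float
    net_util_pct: float
    msgs_per_socket_min: float
    bytes_per_msg: float
    overhead_pct: float = 0.0
    reserved_fds: int = 1024


def ceilings(n: Node) -> dict:
    """Per-node socket ceilings for each constraint."""
    bits = n.msgs_per_socket_min / 60 * n.bytes_per_msg * (1 + n.overhead_pct / 100) * 8
    usable_bits = n.egress_mbps * 1_000_000 * n.net_util_pct / 100
    return {
        "tested": n.tested_cap,
        "memory": math.floor(n.ram_gb * 1024 * 1024 * n.mem_util_pct / 100 / n.kb_per_socket),
        "descriptors": max(n.fd_limit - n.reserved_fds, 0),
        "bandwidth": math.inf if bits == 0 else math.floor(usable_bits / bits),
    }


def bottleneck(n: Node):
    """Return (constraint name, ceiling) for the tightest constraint."""
    caps = ceilings(n)
    name = min(caps, key=caps.get)
    return name, caps[name]


# Market data: high message rate, so egress is the tightest ceiling.
market = Node(30_000, 16, 50, 64, 65_536, 1000, 50, 600, 400, 25)
market_before = bottleneck(market)
market_more_ram = bottleneck(replace(market, ram_gb=32))  # unchanged by extra RAM

# IoT: quiet sockets, so descriptors bind first, then memory once they are raised.
iot = Node(200_000, 8, 50, 64, 32_768, 1000, 50, 1, 100, 25)
iot_before = bottleneck(iot)
iot_raised_fds = bottleneck(replace(iot, fd_limit=1_048_576))

print(market_before, market_more_ram, iot_before, iot_raised_fds)
```

In this sketch, doubling RAM leaves the market data node capped by bandwidth, while raising the descriptor limit on the IoT node simply hands the bottleneck to memory.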

FAQ:

Why is the tested cap still needed if memory and bandwidth are entered?

The tested cap captures real runtime limits such as event loop behavior, garbage collection, framework overhead, and load balancer behavior that simple resource math may miss.

Should N+ reserve be included before or after node sizing?

The model sizes the active fleet first, then adds reserve nodes. Current failover capacity is evaluated after removing the reserve node count from the active pool.

What if bandwidth capacity shows infinity?

That means the entered outbound message rate or payload size is zero, so bandwidth is not a finite constraint in the model. Use realistic traffic values for push-heavy systems.

Can this size a globally distributed WebSocket service?

Use it per region or per shard. Global designs also need traffic steering, regional failover assumptions, session affinity, and message broker replication capacity.
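
Per-region sizing can be sketched by running the same calculation once per region and summing the results. The region names and peaks below are hypothetical, and the sketch assumes each region carries its own N+1 reserve and uses the same 30,000 safe sockets per node.

```python
import math


def recommended_nodes(peak_sockets: int, reserve_pct: float,
                      safe_per_node: int, n_plus: int) -> int:
    """Base nodes for the reserved peak, rounded up, plus N+ reserve nodes."""
    target = peak_sockets * (1 + reserve_pct / 100)
    return math.ceil(target / safe_per_node) + n_plus


# Hypothetical regional split of a 75,000-socket global peak.
regional_peaks = {"region-a": 40_000, "region-b": 25_000, "region-c": 10_000}

plan = {region: recommended_nodes(peak, 25, 30_000, 1)
        for region, peak in regional_peaks.items()}
total_nodes = sum(plan.values())
single_pool_nodes = recommended_nodes(sum(regional_peaks.values()), 25, 30_000, 1)
print(plan, total_nodes, single_pool_nodes)
```

Because each region rounds up separately and holds its own reserve, the regional total is higher than a single-pool estimate for the same global peak.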

Glossary:

WebSocket
A persistent, full-duplex connection that starts with an HTTP upgrade and then carries framed messages.
Fanout
Sending one event to many connected clients, often the main outbound bandwidth driver.
Descriptor
An operating system handle used for files, sockets, and related resources.
Soak test
A long-running load test that exposes leaks, churn, garbage collection, and slow resource growth.
N+ reserve
Extra node capacity held aside so the service can survive one or more node losses.