Virtualization Host Capacity Calculator
Estimate safe VM capacity after failover, maintenance, CPU and memory reserves, growth headroom, and large-VM placement checks.
Introduction:
A virtualized cluster is sized for the moment when capacity is missing, not for the calm hour when every host is healthy. A useful plan asks how many virtual machines can keep running after one or more hosts are lost while another host may be out for patching, firmware work, evacuation, or storage maintenance. Raw host totals hide that policy choice because they count machines that the failure plan is supposed to hold back.
Capacity also depends on the shape of the guest estate. A hundred small web servers, a pool of virtual desktops, and a handful of memory-heavy database guests can consume the same headline number of VMs while stressing very different parts of the cluster. CPU is commonly planned with some vCPU-to-pCPU overcommit because not every guest wants full CPU at the same instant. Memory needs stricter judgment because pressure can turn into ballooning, host swap, guest swap, or failed placement during a restart.
| Planning term | What it means | Common mistake |
|---|---|---|
| Active hosts | Hosts left for guests after failed and maintenance hosts are removed. | Sizing from total hosts and forgetting the N+ policy window. |
| Overcommit ratio | Virtual CPU or memory assigned for each physical unit that the policy allows. | Using one ratio for every workload class without checking demand patterns. |
| Utilization cap | The steady-state ceiling held below 100 percent to preserve burst and recovery room. | Treating average utilization as if peak, boot, backup, and failover periods behave the same. |
| Large-VM fit | Whether an outsized guest fits one surviving host, and possibly one NUMA node. | Trusting aggregate cluster totals even when a single guest cannot be placed cleanly. |
Reserves and overcommit are policy decisions, not universal best practices. A bursty development cluster may run acceptably with a higher CPU overcommit ratio than a latency-sensitive database estate. A memory overcommit plan that looks harmless during normal hours can become risky during boot storms, host evacuation, backup scans, or failover restart when many guests touch more memory at once.
The other trap is average density. Average VM size is useful for a fleet estimate, but one large guest still has to fit on one surviving host. NUMA adds another practical limit because a host with enough total CPU and memory may still split that capacity across sockets or nodes in a way that changes performance for a large guest.
Capacity arithmetic should be treated as a planning model. CPU ready time, storage latency, network contention, live migration cost, reservation rules, anti-affinity rules, and hypervisor-specific admission control can all make the usable ceiling lower than the arithmetic result. Hardware limits from the chosen platform also matter because host and cluster maximums change across hypervisors and releases.
How to Use This Tool:
Start with the cluster policy you need to survive, then read the result tabs in the order that narrows the first constraint.
- Choose a Preset that resembles the VM estate, or choose Custom (no auto-apply) when you already have measured averages. The preset changes the average VM shape and overcommit assumptions, so switch to Custom before fine-tuning.
- Enter Hosts in cluster, Failure tolerance target, Physical cores per host, and Memory per host. These fields define the raw cluster and the N+ policy window before any density math is applied.
- Set Average VM vCPU, Average VM memory, CPU overcommit ratio, and Memory overcommit ratio. If the VM fleet is mixed, use the average for the ordinary guests and reserve large outliers in Advanced.
- Enter Projected VM target so Growth Path and Host Ladder can compare today's footprint with the next planned milestone.
- Open Advanced for Current VM count, host CPU and memory reserves, per-VM memory overhead, utilization caps, scheduler/NUMA penalty, ballooning safety buffer, growth reserve, and maintenance reserve.
- Use Largest VM vCPU, Largest VM memory, Largest VM count, and NUMA nodes per host when a few outsized guests can break placement even though the average VM count looks safe.
- Check Capacity Brief first. If Validation adjustments or Input guardrail notes appear, review Normalized Inputs before trusting the number.
Use Policy Metrics to identify the limiting resource, Placement Mix for large-VM packing, Failure Envelope for failed-host sensitivity, and Host Capacity Ladder Chart when you need to explain the next host-count decision.
Interpreting Results:
Policy Window Safe VM Ceiling is the headline count. It is the smaller of the CPU-derived and memory-derived VM limits after failed hosts, maintenance reserve, host reserves, utilization caps, overcommit ratios, scheduler penalty, ballooning buffer, and growth reserve are applied. It is a planning ceiling for the modeled average VM shape, not a guarantee that every real VM will start during a failure.
A positive Runway at policy means the current VM count fits the selected policy window. A negative runway means the current fleet is already above the modeled ceiling. Do not treat a positive runway as approval by itself: check Primary limiter at policy, Risk score, and Largest VM placement fit before using the count as a quota.
| Output cue | How to read it | What to verify next |
|---|---|---|
| Primary limiter at policy | CPU or memory produced the lower VM count. | Tune the limiting resource first. Adding the non-limiting resource will not raise the safe ceiling much. |
| Risk score | A posture score from overcommit, utilization caps, growth reserve, and runway pressure. | Treat Aggressive and Stressed as prompts for telemetry review before using the ceiling operationally. |
| Largest VM placement fit | Whether the largest modeled guest fits one surviving host and, when possible, one NUMA node. | If it reports cross-NUMA or broken placement, review host shape before trusting aggregate density. |
| Mixed-estate safe ceiling | Safe count after reserving slots for a declared cohort of large VMs. | Use it instead of the aggregate ceiling when Largest VM count is above 0. |
| Minimum hosts for projected target | Smallest host count found within the 512-host planning range for the projected VM target. | Compare it with Failure Envelope before buying, decommissioning, or changing the N+ target. |
Technical Details:
The model starts by reducing total cluster size to active hosts. A host counted for failure tolerance is not counted as guest capacity in the policy row, and a host reserved for maintenance is also removed. CPU and memory are then handled separately because they fail in different ways: CPU pressure usually appears as scheduling delay, while memory pressure can lead to reclaim, swap, failed placement, or poor restart behavior.
Both resource pools are reduced before they become a VM count. CPU reserve is removed per active host before CPU overcommit is applied. Memory reserve is removed per active host before memory overcommit, target utilization, scheduler efficiency, and the ballooning buffer are applied. The final count is floored because partial VMs cannot be scheduled as capacity.
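The memory side of that reduction can be sketched as follows. This is a minimal illustration, not the calculator's own code: the function name, the argument names, and every number in the example call (32 GiB host reserve, 80 percent cap, 10 percent ballooning buffer, 0.5 GiB overhead) are assumptions chosen only to show the order of operations described above.

```python
from fractions import Fraction
from math import floor


def pct(value):
    """Turn a whole-number percentage into an exact fraction."""
    return Fraction(value, 100)


def memory_bound_vms(active_hosts, mem_per_host_gib, host_reserve_gib,
                     overcommit, mem_cap_pct, penalty_pct, ballooning_pct,
                     vm_mem_gib, overhead_gib):
    """Return (effective memory pool in GiB, memory-limited VM count).

    The reserve is removed per active host first; overcommit, the
    utilization cap, scheduler efficiency, and the ballooning buffer
    are applied afterwards, and the VM count is floored.
    """
    usable_per_host = mem_per_host_gib - host_reserve_gib
    pool = (active_hosts * usable_per_host * overcommit
            * pct(mem_cap_pct)
            * (1 - pct(penalty_pct))
            * (1 - pct(ballooning_pct)))
    vms = floor(pool / (vm_mem_gib + overhead_gib))
    return pool, vms


# Hypothetical inputs: 8 active hosts, 512 GiB each, 32 GiB reserved,
# no memory overcommit, 8 GiB guests with 0.5 GiB overhead each.
pool, vms = memory_bound_vms(8, 512, 32, 1, 80, 5, 10, 8, Fraction(1, 2))
print(float(pool), vms)
```

Exact fractions are used so the final floor is not affected by floating-point rounding near a whole number.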
Formula Core:
The main calculation compares a CPU-derived VM count with a memory-derived VM count, then keeps the lower value and removes growth headroom.
| Symbol | Meaning | Source in the model |
|---|---|---|
| A | Active hosts left after failure and maintenance reserves. | Hosts in cluster minus Failure tolerance target minus Maintenance reserve hosts. |
| C | CPU policy pool in schedulable vCPU. | Physical cores, CPU reserve, CPU overcommit, CPU utilization cap, and scheduler/NUMA penalty. |
| R | Effective memory pool in GiB. | Memory per host, host system reserve, memory overcommit, memory utilization cap, scheduler/NUMA penalty, and ballooning safety buffer. |
| v, g, o | Average VM vCPU, average VM memory, and per-VM memory overhead. | Visible VM-shape fields plus the Advanced overhead field. |
| E, B, Q | Scheduler efficiency, ballooning memory factor, and growth factor. | 1 - scheduler penalty, 1 - ballooning buffer, and 1 - growth reserve. |
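Read together, the symbols suggest a relation along the following lines. This is a reconstruction from the descriptions on this page, not a quoted formula: the per-host symbols (n_c and r_c for physical and reserved cores, m and r_m for installed and reserved memory, k_c and k_m for the overcommit ratios, u_c and u_m for the utilization caps) and the result names N_cpu, N_ram, and N_safe are labels introduced here, and the exact points at which the model floors are an assumption.

```latex
\begin{aligned}
C &= A \,(n_c - r_c)\, k_c\, u_c\, E \\
R &= A \,(m - r_m)\, k_m\, u_m\, E\, B \\
N_{\mathrm{cpu}} &= \left\lfloor \frac{C}{v} \right\rfloor, \qquad
N_{\mathrm{ram}} = \left\lfloor \frac{R}{g + o} \right\rfloor \\
N_{\mathrm{safe}} &= \left\lfloor \min\!\left(N_{\mathrm{cpu}},\, N_{\mathrm{ram}}\right) Q \right\rfloor
\end{aligned}
```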
For the default general-server assumptions, a 10-host cluster with N+1 and one maintenance reserve leaves 8 active hosts. With 32 cores per host, 2 reserved cores per host, 4x CPU overcommit, a 75 percent CPU cap, and a 5 percent scheduler penalty, the CPU policy pool is 684 vCPU. Dividing by 2 vCPU per average VM gives 342 CPU-limited VMs. A 15 percent growth reserve lowers that CPU-bound planning ceiling to 290 VMs when memory is not lower.
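The arithmetic in that example can be checked with a short sketch. The function and parameter names are hypothetical, only the CPU side is modeled, and flooring the VM count before applying the growth factor is an assumption about the order of operations.

```python
from fractions import Fraction
from math import floor


def pct(value):
    """Turn a whole-number percentage into an exact fraction."""
    return Fraction(value, 100)


def cpu_bound_ceiling(hosts, failed, maintenance, cores, reserved_cores,
                      overcommit, cpu_cap_pct, penalty_pct, vm_vcpu,
                      growth_pct):
    """Return (active hosts, CPU pool in vCPU, CPU-limited VMs, ceiling)."""
    active = hosts - failed - maintenance
    pool = (active * (cores - reserved_cores) * overcommit
            * pct(cpu_cap_pct) * (1 - pct(penalty_pct)))
    cpu_vms = floor(pool / vm_vcpu)
    ceiling = floor(cpu_vms * (1 - pct(growth_pct)))
    return active, pool, cpu_vms, ceiling


# 10 hosts, N+1, one maintenance host, 32 cores with 2 reserved,
# 4x overcommit, 75% cap, 5% penalty, 2 vCPU guests, 15% growth reserve.
active, pool, cpu_vms, ceiling = cpu_bound_ceiling(
    10, 1, 1, 32, 2, 4, 75, 5, 2, 15)
print(active, int(pool), cpu_vms, ceiling)  # 8 684 342 290
```

The ceiling only equals the CPU-bound figure when the memory-derived count is not lower, as the paragraph above notes.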
Placement and Risk Rules:
Large-VM placement uses per-host and per-NUMA slices rather than only aggregate cluster totals. A guest can pass the aggregate CPU and memory math while still reporting Does not fit a surviving host or Fits host only with cross-NUMA placement. When a declared large-VM cohort is enabled, host-local slots are reserved first, and only the remaining CPU and memory are converted into average-VM capacity.
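A per-host and per-NUMA check of this kind might look like the sketch below. It assumes a host's policy capacity is split evenly across its NUMA nodes, which real hardware does not guarantee. The function names, the third status string, and the 480 GiB usable-memory figure are illustrative assumptions; only the first two status strings come from the text above.

```python
from math import floor


def placement_fit(host_vcpu, host_mem_gib, numa_nodes, vm_vcpu, vm_mem_gib):
    """Classify whether one large guest fits a surviving host and a NUMA node.

    host_vcpu and host_mem_gib are the per-host policy capacity after
    reserves, overcommit, caps, and penalties have been applied.
    """
    if vm_vcpu > host_vcpu or vm_mem_gib > host_mem_gib:
        return "Does not fit a surviving host"
    # Assumption: capacity is divided evenly between NUMA nodes.
    node_vcpu = host_vcpu / numa_nodes
    node_mem = host_mem_gib / numa_nodes
    if vm_vcpu > node_vcpu or vm_mem_gib > node_mem:
        return "Fits host only with cross-NUMA placement"
    return "Fits one NUMA node"  # hypothetical label


def policy_large_vm_slots(active_hosts, host_vcpu, host_mem_gib,
                          vm_vcpu, vm_mem_gib):
    """Count host-local slots for the large guest shape across active hosts."""
    per_host = min(floor(host_vcpu / vm_vcpu),
                   floor(host_mem_gib / vm_mem_gib))
    return active_hosts * per_host


# 85.5 vCPU per host = 30 usable cores x 4 x 0.75 x 0.95; 480 GiB is assumed.
print(placement_fit(85.5, 480, 2, 64, 384))
print(policy_large_vm_slots(8, 85.5, 480, 64, 384))
```

Slots are counted per host rather than from cluster totals, which is why a cohort can fail to pack even when the aggregate math looks comfortable.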
| Risk band | Score range | Signals that raise the score |
|---|---|---|
| Conservative | 0 to 24 | Policy runway remains healthy and overcommit, utilization, and growth reserve stay moderate. |
| Balanced | 25 to 49 | Some density pressure exists, but the policy row still has usable runway. |
| Aggressive | 50 to 74 | High CPU or memory overcommit, high utilization caps, low growth reserve, or thin runway. |
| Stressed | 75 to 100 | Negative runway or several aggressive settings at once. |
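The band boundaries can be expressed as a small lookup. How the score itself is weighted is not stated here, so this sketch only maps an already-computed score to its band; the function name is hypothetical.

```python
def risk_band(score):
    """Map a 0-100 risk score to the band names used in the table."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score < 25:
        return "Conservative"
    if score < 50:
        return "Balanced"
    if score < 75:
        return "Aggressive"
    return "Stressed"


for sample in (10, 25, 74, 75):
    print(sample, risk_band(sample))
```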
| Input area | Modeled bounds | Why the bound exists |
|---|---|---|
| Cluster size and failure policy | 1 to 512 hosts; failure and maintenance reserves are reduced when needed so at least one active host remains. | The planner must avoid impossible policy rows and keep the host ladder inside its search range. |
| Overcommit ratios | CPU from 1x to 12x; memory from 1x to 3x. | Extreme ratios are clamped before they can create misleading VM counts. |
| Utilization caps and buffers | CPU and memory utilization caps from 40 percent to 95 percent; ballooning and growth reserves from 0 percent to 40 percent. | The result keeps steady utilization, memory reclaim risk, and future growth visible. |
| Large-VM cohort | A cohort count is ignored unless largest-VM vCPU and memory are both above 0. | A slot model needs both CPU and memory shape to reserve host-local capacity. |
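As a rough sketch of how such bounds could be enforced, the snippet below clamps each input into its modeled range and records what changed. The dictionary keys, function names, and note format are assumptions for illustration; only the numeric ranges come from the table.

```python
# Modeled bounds from the table; the key names are hypothetical.
BOUNDS = {
    "hosts": (1, 512),
    "cpu_overcommit": (1, 12),
    "mem_overcommit": (1, 3),
    "cpu_cap_pct": (40, 95),
    "mem_cap_pct": (40, 95),
    "ballooning_pct": (0, 40),
    "growth_pct": (0, 40),
}


def normalize(inputs):
    """Clamp known inputs into range and list every adjustment made."""
    normalized = {}
    adjustments = []
    for name, value in inputs.items():
        if name in BOUNDS:
            low, high = BOUNDS[name]
            fixed = max(low, min(high, value))
            if fixed != value:
                adjustments.append(f"{name}: {value} -> {fixed}")
            normalized[name] = fixed
        else:
            normalized[name] = value
    return normalized, adjustments


def effective_cohort(count, largest_vcpu, largest_mem_gib):
    """Ignore the cohort count unless both large-VM dimensions are set."""
    return count if largest_vcpu > 0 and largest_mem_gib > 0 else 0


normalized, adjustments = normalize(
    {"hosts": 600, "cpu_overcommit": 16, "mem_overcommit": 1.5, "growth_pct": 15})
print(normalized)
print(adjustments)
```

Returning the adjustment notes alongside the clamped values mirrors the idea of surfacing validation adjustments rather than silently changing an input.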
Accuracy Notes:
This calculator is a capacity planning aid, not a live hypervisor admission-control engine. Use the output to structure hardware sizing, maintenance reviews, and growth discussions, then compare it with real cluster telemetry, reservation policy, and platform limits before making production commitments.
- CPU ready time, co-stop, storage latency, network contention, backup windows, and migration traffic are not measured here.
- Hypervisors use their own admission-control, reservation, affinity, and restart-priority rules. Match those rules before treating Policy Window Safe VM Ceiling as an approval limit.
- Memory overcommit deserves extra caution for device-pinned, database, boot-storm, or swap-sensitive workloads.
- Different host generations can have different per-core performance even when the core count is the same.
Worked Examples:
A general server cluster with spare runway
With General Server Cluster, 10 hosts, N+1, one maintenance reserve, 32 physical cores per host, 512 GiB per host, 2 vCPU and 8 GiB per average VM, the policy row has 8 active hosts. The CPU side produces 342 average VM slots before growth reserve, memory is higher, and Safe VM ceiling at policy becomes 290 VMs. If Current VM count is 180, Runway at policy is +110 VMs and Primary limiter at policy is CPU.
A small cluster that misses the current fleet
A six-host cluster with the same VM shape, N+1, and one maintenance reserve has only four active hosts at the policy row. The same CPU assumptions yield a Safe VM ceiling at policy of about 145 VMs. If Current VM count is 160, Runway at policy reads -15 VMs, and Minimum hosts for current fleet points to 7 hosts under the same policy.
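The minimum-host figure can be reproduced with a simple upward search over host counts. The sketch below models the CPU side only, on the assumption that memory is not the limiter for this VM shape; the function names are hypothetical, and returning 0 when no active host remains is a simplification rather than the tool's guardrail behavior.

```python
from fractions import Fraction
from math import floor


def cpu_ceiling(hosts, failed=1, maintenance=1, usable_cores=30,
                overcommit=4, cap=Fraction(75, 100),
                efficiency=Fraction(95, 100), vm_vcpu=2,
                growth=Fraction(15, 100)):
    """CPU-bound safe VM ceiling for a given total host count."""
    active = hosts - failed - maintenance
    if active < 1:
        return 0  # simplification: no guest capacity without an active host
    pool = active * usable_cores * overcommit * cap * efficiency
    return floor(floor(pool / vm_vcpu) * (1 - growth))


def minimum_hosts(target_vms, max_hosts=512):
    """Smallest host count whose ceiling covers the target, or None."""
    for hosts in range(1, max_hosts + 1):
        if cpu_ceiling(hosts) >= target_vms:
            return hosts
    return None


print(cpu_ceiling(6))             # 145
print(cpu_ceiling(6) - 160)       # -15
print(minimum_hosts(160))         # 7
```

A linear scan is enough here because the search range is capped at 512 hosts and the ceiling never decreases as hosts are added.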
A large-VM cohort that breaks placement
In the 10-host default cluster, a single 64 vCPU and 384 GiB guest can fit a surviving host but not a single NUMA node on a two-node host, so Largest VM placement fit reports cross-NUMA placement. If Largest VM count is raised to 10 for that same guest shape, Policy large-VM slots is only 8 at the policy row, one per active host. Placement Mix then warns that the declared cohort does not pack after failover, even though the aggregate VM ceiling still looks positive.
A policy input that gets corrected
Three hosts with Failure tolerance target set to N+2 and Maintenance reserve hosts set to 2 cannot leave an active host for guests. The model reduces the failure target after the maintenance reserve is applied, adds an Input guardrail note, and lists Validation adjustments in Policy Metrics. Review Normalized Inputs before using the adjusted result in a design review.
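One way such a correction could work is sketched below. The order of operations (cap the maintenance reserve first, then reduce the failure target to whatever still leaves one active host) is an assumption based on the description above, and the function name and note wording are hypothetical.

```python
def normalize_policy(hosts, failure_target, maintenance_hosts):
    """Reduce reserves so at least one active host remains.

    Returns (failure_target, maintenance_hosts, active_hosts, notes).
    """
    if hosts < 1:
        raise ValueError("a cluster needs at least one host")
    notes = []
    # Assumption: the maintenance reserve is honored first, up to hosts - 1.
    maintenance = min(maintenance_hosts, hosts - 1)
    if maintenance != maintenance_hosts:
        notes.append(
            f"Maintenance reserve reduced from {maintenance_hosts} to {maintenance}")
    # The failure target then takes whatever still leaves one active host.
    failure = min(failure_target, hosts - 1 - maintenance)
    if failure != failure_target:
        notes.append(
            f"Failure target reduced from N+{failure_target} to N+{failure}")
    active = hosts - failure - maintenance
    return failure, maintenance, active, notes


print(normalize_policy(3, 2, 2))
```

With three hosts, N+2, and two maintenance hosts, this ordering keeps the maintenance reserve and drops the failure target to N+0, leaving a single active host.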
FAQ:
Why is the safe ceiling lower than the CPU or memory limit?
The CPU and memory limits are hard resource counts before the final growth reserve. Policy Window Safe VM Ceiling keeps the smaller of those counts and then applies Growth reserve, so it is intentionally lower than the raw limiting side.
What CPU overcommit ratio should I use?
Use a ratio that matches measured workload behavior. Bursty, low-duty-cycle guests can tolerate more CPU overcommit than latency-sensitive or constantly busy guests. Check Risk score, Primary limiter at policy, and real CPU ready or scheduling metrics before raising the ratio.
Why model memory overcommit separately?
Memory pressure can lead to reclaim, ballooning, host swap, guest swap, or failed restart placement. That is why Memory overcommit ratio, Memory utilization cap, and Ballooning safety buffer affect the RAM-derived VM count separately from CPU.
Why can a large VM fail when the cluster has spare aggregate capacity?
Aggregate capacity can be spread across many hosts. Largest VM placement fit checks whether the large guest fits one surviving host and one NUMA slice, which is a different question from whether the whole cluster has enough total CPU and memory.
What should I do when Validation adjustments appear?
Open Normalized Inputs and check which field was clamped or rounded down. Common causes are too many failure or maintenance hosts for the cluster size, unsupported overcommit values, and a large-VM cohort count that exceeds the modeled fleet.
Are my cluster numbers uploaded for calculation?
The capacity model runs in the browser after the page loads. Copied rows, downloaded tables, DOCX reports, and JSON exports include the values you entered, so handle those outputs as internal capacity data.
Glossary:
- Active hosts: Hosts left for guest workload after failed hosts and maintenance reserve hosts are removed.
- N+1: A failure policy where the cluster should keep running after one host is unavailable.
- Overcommit ratio: The planned amount of virtual CPU or memory assigned for each physical unit.
- Utilization cap: The maximum planned steady resource use before the model treats the pool as full.
- Growth reserve: Final headroom held back from the hard VM count for normal growth and procurement delay.
- Ballooning: A memory reclamation technique where the hypervisor asks guests to release unused memory back to the host.
- NUMA: A host architecture where CPU and memory locality is split across nodes, often aligned with sockets.
- Placement slot: A host-local position where one declared large VM can fit after the selected failure policy is applied.