Projected Usable Capacity

[Interactive calculator: enter per-node capacity and node count (or a manual list for mixed nodes), choose a protection scheme, and set nearfull, overhead, skew, and failure tolerance. The results panel reports nodes, raw storage, overhead, raw post-overhead, effective and recommended nearfull, protection, protected efficiency, efficiency @ min_size, usable capacity, redundancy overhead, and reserved/free raw.]

Introduction:

Ceph stores objects across many nodes using replication or erasure coding to survive failures while distributing load. Usable capacity differs from raw capacity because space is reserved for redundancy, metadata, and rebalancing. Planning therefore requires converting node-level figures into protected efficiency, then applying operational headroom so recovery can complete within your fault-tolerance goals.

This tool models that process from first principles. You enter homogeneous or heterogeneous node capacities, choose a protection scheme, and set operational factors such as nearfull safety, service overhead, placement skew, and failure tolerance. The reactive engine calculates raw post-overhead capacity, recommends a safe nearfull threshold, and derives usable capacity plus a concise breakdown.

Use it when designing a new cluster, validating an expansion, or explaining trade-offs to stakeholders. For example, five 8 TB nodes at 3× replication often yield roughly 10 TB usable after headroom. Caution: it estimates space, not performance; leave margin for background repairs, snapshots, and upgrades.

Technical Details:

Concept Overview

Replication stores r full copies of each object; erasure coding splits data into k data and m parity chunks, reconstructable from any k. Protected efficiency expresses payload as a fraction of consumed raw space. Operational factors include filesystem/OSD overhead, a nearfull ratio limiting usable raw to preserve recovery headroom, and placement skew reflecting CRUSH imbalance. A failure domain defines the blast radius considered in headroom.
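
In code form, the two efficiency terms are one-liners. A minimal TypeScript sketch (function names are illustrative, not the tool's actual API):

```ts
// Protected efficiency: payload stored per byte of raw space consumed.
function replicationEfficiency(r: number): number {
  return 1 / r; // e.g. r = 3 -> 0.3333
}

function erasureCodingEfficiency(k: number, m: number): number {
  return k / (k + m); // e.g. k = 4, m = 2 -> 0.6667
}
```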

Core Equations

Raw post-overhead:
R = R_raw × (1 − o)

Protected efficiency:
E = 1 / r (replication)
E = k / (k + m) (erasure coding)

Recommended nearfull:
n = min(0.95, max(0.50, 1 − min(1, (f × D) / R)))

Usable capacity:
U = R × n × E × (1 − s)

  • R_raw: sum of node capacities; o: overhead fraction; R: raw post-overhead.
  • r: replicas; k: data chunks; m: parity chunks; E: protected efficiency.
  • n: effective nearfull; f: tolerated failure domains; D: capacity of one domain.
  • s: skew fraction modeling placement imbalance; U: usable capacity.
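
These equations translate directly into code. The sketch below mirrors them one-for-one under the symbol definitions above; the `CapacityInputs` interface and function names are illustrative assumptions, not the calculator's internal API.

```ts
// Illustrative capacity pipeline mirroring the Core Equations.
interface CapacityInputs {
  rawTB: number;             // R_raw: sum of node capacities
  overhead: number;          // o: filesystem/metadata overhead fraction
  efficiency: number;        // E: 1/r or k/(k+m)
  skew: number;              // s: CRUSH placement-skew fraction
  failDomains: number;       // f: tolerated failure-domain losses
  domainTB: number;          // D: capacity of one (largest) domain
  requestedNearfull: number; // user-requested nearfull ratio
}

function usableCapacity(p: CapacityInputs) {
  const R = p.rawTB * (1 - p.overhead); // raw post-overhead
  const recommended = Math.min(
    0.95,
    Math.max(0.50, 1 - Math.min(1, (p.failDomains * p.domainTB) / R)),
  );
  const n = Math.min(p.requestedNearfull, recommended); // effective nearfull
  const U = R * n * p.efficiency * (1 - p.skew);        // usable capacity
  return { R, recommended, n, U };
}
```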

Interpretation & Output Semantics

| Metric | Meaning | Implication |
|---|---|---|
| Usable | Estimated payload space after protection and headroom. | Primary figure for planning dataset growth. |
| Redundancy | Raw consumed to achieve durability. | Grows with higher replication or parity. |
| Reserved | Raw left unused due to nearfull headroom. | Enables rebalance and repair after failures. |
| Protected efficiency | Payload divided by consumed raw. | Equals 1/replicas or k/(k+m). |
| Efficiency @ min_size | Payload efficiency when operating at the minimum healthy size. | Repairs may temporarily reduce efficiency. |
| Recommended nearfull | Ceiling ensuring enough free space to heal tolerated failures. | Lower when domains are large or many. |

Use the protected efficiency and recommended nearfull together; raising one typically lowers the other. A higher headroom percentage trades capacity for faster, safer recovery.
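
The redundancy and reserved rows follow from U, E, and R by simple accounting. A sketch, assuming skew losses are folded into the reserved figure (the tool may bucket them differently):

```ts
// Derive breakdown rows from usable capacity U, protected efficiency E,
// and raw post-overhead R. Illustrative accounting, not the tool's code.
function breakdown(U: number, E: number, R: number) {
  const consumedRaw = U / E;            // raw needed to hold U with protection
  const redundantRaw = consumedRaw - U; // copies/parity beyond the payload
  const reservedRaw = R - consumedRaw;  // headroom left free by nearfull (and skew)
  return { redundantRaw, reservedRaw };
}
// With U = 10, E = 1/3, R = 38: redundantRaw = 20, reservedRaw = 8.
```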

Variables & Parameters

| Parameter | Meaning | Unit/Datatype | Typical Range | Notes |
|---|---|---|---|---|
| Capacity per node | Homogeneous node size. | MB/GB/TB/PB or MiB/GiB/TiB/PiB | 1–100 TB | Use the manual list for mixed nodes. |
| Total nodes | Node count when homogeneous. | Integer | 3–1000 | Minimum depends on chosen protection. |
| Node capacities | Comma/space list for heterogeneous clusters. | Numbers | Varies | Parsed as a set of node sizes. |
| Protection mode | Replication or erasure coding. | Enum | | Choose to match pools. |
| Replicas (r) | Full copies stored. | Integer ≥ 1 | 2–4 | Efficiency = 1/r. |
| Rep min_size | Minimum healthy copies. | Integer | 2–3 | Lower implies degraded risk. |
| EC k, m | Data and parity chunks. | Integers | k = 4–12, m = 2–6 | Efficiency = k/(k+m). |
| EC min_size | Minimum chunks to serve IO. | Integer ≥ k | k to k+m | Lower increases risk under failures. |
| Nearfull | Allowed fullness before backoff. | Fraction | 0.50–0.95 | Clamped to recommendation unless overridden. |
| OSD/metadata overhead | Non-payload storage. | Fraction | 0–0.15 | Accounts for filesystems and metadata. |
| CRUSH skew | Imbalance across devices. | Fraction | 0–0.15 | Reduces usable to reflect hot/overfull nodes. |
| Failure domain | Unit used for headroom. | Host, OSD, or custom | | Domain capacity multiplies tolerated failures. |
| Domains to tolerate | Failures healed without reweighting. | Integer ≥ 0 | 0–2 | Higher values reduce nearfull. |
| Accept degraded | Allow operating above recommendation. | Boolean | Off/On | Risky under multiple failures. |
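
A sketch of input sanitization matching the typical ranges above; the clamping rules here are assumptions, and the tool's actual validation may differ:

```ts
// Clamp user inputs to the documented typical ranges (illustrative).
const clamp = (x: number, lo: number, hi: number) =>
  Math.min(hi, Math.max(lo, x));

function sanitizeInputs(raw: {
  nearfull: number; overhead: number; skew: number; replicas: number;
}) {
  return {
    nearfull: clamp(raw.nearfull, 0.50, 0.95), // fraction, 0.50–0.95
    overhead: clamp(raw.overhead, 0, 0.15),    // fraction, 0–0.15
    skew: clamp(raw.skew, 0, 0.15),            // fraction, 0–0.15
    replicas: Math.max(1, Math.round(raw.replicas)), // integer >= 1
  };
}
```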

Worked Example

Example (5 nodes × 8 TB, replication r=3, overhead o=0.05, nearfull request 0.80, tolerate f=1 host, domain D=8 TB, skew s=0):

R = 40 × (1 − 0.05) = 38
n_rec = 1 − 8/38 = 0.7895
n = min(0.80, 0.7895) = 0.7895
E = 1/3 = 0.3333
U = 38 × 0.7895 × 0.3333 × (1 − 0) ≈ 10.0 TB

Redundancy ≈ 20 TB; reserved headroom ≈ 8 TB.
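
Feeding the same inputs into the `usableCapacity` sketch from Core Equations reproduces these figures:

```ts
// Reproduces the worked example using the illustrative sketch above.
const r = usableCapacity({
  rawTB: 40, overhead: 0.05, efficiency: 1 / 3, skew: 0,
  failDomains: 1, domainTB: 8, requestedNearfull: 0.80,
});
// r.R === 38, r.recommended ≈ 0.7895, r.n ≈ 0.7895, r.U ≈ 10.0 (TB)
```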

Assumptions & Limitations

  • Headroom is based on the largest domain and tolerated failures, not rack-level power or network constraints.
  • Skew models placement imbalance, not transient spikes or IO hotspots.
  • Nearfull override accepts degraded service and may prolong recovery under additional failures.
  • Overhead parameter aggregates filesystem, small-object effects, and metadata; it is not a precise filesystem model.

Edge Cases & Error Sources

  • Highly heterogeneous nodes lower usable space because headroom is sized by the largest domain.
  • Very small clusters may violate placement rules, making the efficiency at min_size optimistic.
  • Incorrect units (TB vs TiB) lead to consistent scaling errors across all outputs.
  • Setting min_size equal to k+m allows service but eliminates parity tolerance, increasing risk during recovery (a guard sketch follows this list).
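
The min_size pitfall in the last point can be checked mechanically. A hypothetical guard (not the tool's actual validation):

```ts
// Warn about risky erasure-coding min_size settings (illustrative).
// Valid range is [k, k+m]; min_size = k+m means any single chunk loss
// blocks IO until recovery completes.
function checkEcMinSize(k: number, m: number, minSize: number): string[] {
  const warnings: string[] = [];
  if (minSize < k || minSize > k + m) {
    warnings.push(`min_size must be between ${k} and ${k + m}`);
  } else if (minSize === k + m) {
    warnings.push("min_size = k+m leaves no chunk-loss tolerance while serving IO");
  }
  return warnings;
}
```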

Scientific Validity & References

Equations align with standard Ceph concepts for replica count, erasure coding parameters, placement via CRUSH, nearfull thresholds, and pool min_size. See the Ceph documentation on replication, erasure coding profiles, and cluster fullness thresholds, and the “CRUSH: Controlled, Scalable, Decentralized Placement of Replicated Data” paper by Weil et al. for placement theory.

Privacy & Compliance

This calculator processes non-personal infrastructure data and performs all computations client-side; organizational policies and data-handling rules may still apply.

Step-by-Step Guide:

Follow these steps to estimate usable space and export results for planning.

  1. Enter capacity per node and total nodes, or toggle heterogeneous input and paste a list of node sizes.
  2. Select protection mode and set replicas or k+m with an appropriate min_size.
  3. Adjust nearfull, overhead, and skew to match operational practice; prefer the recommended nearfull for safe recovery.
  4. Choose the failure domain, set domains to tolerate, and refine OSDs per node or custom capacity if applicable.
  5. Review the table, the layer-breakdown chart, and the JSON output, then copy or download CSV/JSON for records or peer review (a minimal export sketch follows below).
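
For the export step, a minimal JSON serialization sketch (the payload shape is an assumption; the tool's schema may differ):

```ts
// Bundle assumptions with results so exports are self-describing.
function exportJson(
  results: Record<string, number>,
  assumptions: Record<string, unknown>,
): string {
  return JSON.stringify(
    { assumptions, results, generatedAt: new Date().toISOString() },
    null,
    2,
  );
}
```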

Warning: Aggressively raising nearfull or lowering min_size increases operational risk during drive or host failures.

FAQ:

Is my data stored?

No. Calculations run in your browser, and nothing is sent to a server.

What is nearfull?

A safety cap on raw usage that preserves enough free space to heal the number of failures you plan to tolerate at your chosen domain size.

Why does efficiency drop at min_size?

During repair or degraded states, fewer replicas or chunks serve IO; payload per raw temporarily aligns with the min_size efficiency.

Can I mix node sizes?

Yes. Paste capacities as a list. Headroom is sized by the largest domain, so heterogeneity can reduce usable space.
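
Parsing such a list is straightforward; a sketch of a tolerant parser (illustrative, not the tool's implementation):

```ts
// Accept comma- and/or whitespace-separated capacities; drop invalid entries.
function parseCapacities(input: string): number[] {
  return input
    .split(/[,\s]+/)
    .filter(Boolean)
    .map(Number)
    .filter((x) => Number.isFinite(x) && x > 0);
}
// parseCapacities("8, 8 12 16") -> [8, 8, 12, 16]
```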

Does it model racks or power zones?

It models a generic failure domain by capacity. For rack-aware planning, use a domain that reflects the largest independent blast radius.

Are binary units supported?

Yes. You can use decimal (MB, GB, TB, PB) or binary (MiB, GiB, TiB, PiB) units consistently across inputs and outputs.
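
The decimal/binary distinction matters because the factors differ by roughly 10% at this scale:

```ts
// Exact unit factors in bytes.
const TB = 1e12;     // terabyte (decimal)
const TiB = 2 ** 40; // tebibyte (binary), ~1.0995e12 bytes
// An 8 TB drive expressed in binary units:
const eightTbInTib = (8 * TB) / TiB; // ≈ 7.28 TiB
```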

Troubleshooting:

  • Usable shows 0 — check that node count and capacities are positive and protection parameters are valid.
  • Nearfull recommendation seems low — reduce tolerated failures or failure-domain size, or add nodes to increase total raw.
  • Chart appears empty — ensure results are ready and that at least one node capacity is non-zero.
  • CSV/JSON lacks values — confirm browser clipboard/download permissions and that calculations completed first.
  • Unexpected units — verify that inputs and expectations both use decimal or binary consistently.

Advanced Tips:

  • Tip: Model maintenance by temporarily increasing tolerated failures; this lowers nearfull and reveals conservative headroom.
  • Tip: Compare replication and erasure coding at equal durability to visualize the efficiency benefits versus CPU and latency trade-offs.
  • Tip: Use custom domain capacity for rack-sized failure planning when hosts vary or OSD counts differ widely.
  • Tip: Keep skew modest; large values suggest topology or weight tuning rather than capacity headroom alone.
  • Tip: Export JSON to capture assumptions alongside results for peer review and reproducibility.

Glossary:

Usable capacity
Payload space after protection and headroom.
Replication
Storing multiple full copies of an object.
Erasure coding
Splitting data into chunks with parity for recovery.
min_size
Minimum replicas or chunks required to serve IO.
Nearfull
Fullness limit to preserve recovery space.
CRUSH skew
Imbalance in data placement across devices.
Failure domain
Capacity at risk from a single fault.

Calculations run entirely in your browser; no data is transmitted or stored server-side.