Prometheus Retention Storage Calculator
Calculate Prometheus retention storage from active series, scrape interval, sample size, retention window, and disk budget for TSDB capacity planning.
Introduction:
Prometheus retention storage is the local disk space needed to keep scraped time series for a chosen number of days. It matters because a Prometheus server keeps ingesting samples while old blocks are compacted and deleted in the background, so disk pressure can appear before a dashboard or alert tells a clean story. A retention change from 15 days to 60 days is not just a setting change; it multiplies the local TSDB footprint unless sample volume or bytes per sample falls at the same time.
The main quantity is ingest rate. Active series, scrape interval, and measured samples per second are different views of the same flow of samples into the time series database. High-cardinality labels increase active series, shorter scrape intervals increase samples per second, and longer retention keeps more compacted blocks on disk.
Disk planning should also leave room for the write-ahead log (WAL), current head chunks, checkpoints, filesystem overhead, and short-term peaks during compaction. Size-based retention can remove old persistent blocks, but it does not make the active WAL/head footprint disappear. A storage plan that exactly equals the block estimate can still run out of space during normal operation.
The estimate is not a durability plan. Prometheus local storage is a single-node database unless the wider architecture provides replication, backup, remote storage, or another long-term tier. The storage number helps decide whether a local retention window is realistic before changing command-line flags, persistent volumes, or shard topology.
Technical Details:
Prometheus stores samples as time series identified by metric name and labels. A new label value creates another active series, so cardinality growth can increase storage faster than target count suggests. The storage estimate starts with how many samples enter one Prometheus shard each second, then multiplies that sustained ingest rate by the retention window and compressed bytes per stored sample.
Prometheus planning guidance commonly uses 1 to 2 bytes per sample for compacted local TSDB blocks. That value is a useful first pass, not a fixed property of every workload. Real bytes per sample vary with series churn, scrape regularity, label cardinality, histogram use, and compression behavior after compaction.
Formula Core:
The core estimate first derives a sustained ingest rate, Rsamples. In series mode, Rsamples = scrape targets × series per target ÷ scrape interval. That rate is converted into persistent block bytes (Rsamples × bytes per sample × retention seconds), and the plan then adds the larger of the percentage WAL/head reserve or the configured WAL segment minimum.
When measured samples per second is used, that measured value replaces Rsamples directly. The scrape interval is still kept as context because it helps connect the measured rate back to an approximate active-series count.
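The formula above can be sketched in a few lines of Python. This is an illustrative reimplementation of the estimate as described, not the calculator's actual code; the function and field names are invented for the example.

```python
# Prometheus WAL segments are 128 MiB each.
WAL_SEGMENT_BYTES = 128 * 1024**2

def derived_rate(targets: int, series_per_target: int, interval_s: float) -> float:
    """Series mode: Rsamples = targets x series per target / scrape interval."""
    return targets * series_per_target / interval_s

def planned_local_disk(samples_per_sec: float,
                       bytes_per_sample: float,
                       retention_days: int,
                       wal_head_allowance_pct: float = 15.0,
                       min_wal_segments: int = 3) -> dict:
    """Persistent block bytes plus the larger of the percentage reserve
    or the WAL segment floor."""
    block_bytes = samples_per_sec * bytes_per_sample * retention_days * 86_400
    reserve = max(block_bytes * wal_head_allowance_pct / 100,
                  min_wal_segments * WAL_SEGMENT_BYTES)
    return {"blocks": block_bytes,
            "wal_head": reserve,
            "planned": block_bytes + reserve}
```

Note that for a very small workload the `max()` keeps the reserve at the WAL segment floor even when the percentage reserve rounds to almost nothing.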
Inputs and Bounds:
| Input | Accepted rule | Effect on the estimate |
|---|---|---|
| Sizing input | Choose targets and active series, or measured samples/sec. | Controls whether ingest rate is derived from scrape math or entered from Prometheus telemetry. |
| Scrape targets and Series per target | Whole numbers greater than zero in series mode. | Multiplied together to estimate active series for one shard. |
| Samples per second | Positive sustained rate in measured mode. | Used directly as the ingest rate for all storage math. |
| Scrape interval | At least 1 second. | Shorter intervals raise samples per second when active-series mode is used. |
| Bytes per sample | Greater than zero; 1 to 2 is the first-pass Prometheus planning range. | Scales persistent block storage linearly. |
| Retention period | At least 1 day, rounded down to whole days. | Multiplies the persistent block estimate by retained days. |
| WAL/head allowance | 0% to 300%. | Adds reserve for WAL, current head chunks, churn, and compaction peaks. |
| Minimum WAL segments | At least 3 segments of 128 MiB each. | Sets a floor under the WAL/head reserve even when the percentage reserve is small. |
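The accepted-rule column above maps to simple bounds checks. The sketch below mirrors those bounds; the dictionary keys and message wording are illustrative, and the real calculator's validation messages may differ.

```python
def validate(inputs: dict) -> list[str]:
    """Return a list of validation messages; empty means the inputs pass."""
    errors = []
    if inputs.get("mode") == "series":
        # Series mode: targets and series per target must be whole numbers > 0.
        if inputs.get("targets", 0) <= 0 or inputs.get("series_per_target", 0) <= 0:
            errors.append("targets and series per target must be greater than zero")
    else:
        # Measured mode: a positive sustained append rate is required.
        if inputs.get("samples_per_sec", 0) <= 0:
            errors.append("samples per second must be greater than zero")
    if inputs.get("scrape_interval_s", 0) < 1:
        errors.append("scrape interval must be at least 1 second")
    if inputs.get("bytes_per_sample", 0) <= 0:
        errors.append("bytes per sample must be greater than zero")
    if inputs.get("retention_days", 0) < 1:
        errors.append("retention must be at least 1 day")
    if not 0 <= inputs.get("wal_head_allowance_pct", 0) <= 300:
        errors.append("WAL/head allowance must be between 0% and 300%")
    if inputs.get("min_wal_segments", 0) < 3:
        errors.append("minimum WAL segments must be at least 3")
    return errors
```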
Guardrail Rules:
| Condition | Status or warning | Meaning |
|---|---|---|
| Any required numeric input is zero, negative, or out of range. | Prometheus storage inputs need review | Fix the validation messages before using the capacity result. |
| Planned local disk is greater than the optional disk budget. | over disk | The modeled retention window does not fit the configured volume or disk allocation. |
| Planned local disk is greater than the retention size target but not greater than raw disk. | over target | The plan may fit raw disk but exceeds the requested operating buffer. |
| Budget target use is >= 80%. | headroom watch | The plan is close enough to the target that churn or growth deserves review. |
| Bytes per sample is less than 1 or greater than 2. | Bytes/sample warning | Replace the assumption with measured compaction data before committing storage. |
| Retention period is greater than 90 days. | Long-retention warning | Compare local retention with remote storage or downsampled long-term options. |
| Active series estimate is at least 1,000,000. | Cardinality warning | High-cardinality labels should be reviewed before extending retention. |
Everyday Use & Decision Guide:
Start with one Prometheus server or shard. Use Targets and active series when you know how many targets the shard scrapes and the average active series after relabeling and drops. Use Measured samples/sec when you already have a stable rate such as rate(prometheus_tsdb_head_samples_appended_total[5m]) for the shard being sized.
Keep the first pass conservative. Leave Bytes per sample in the 1 to 2 range unless you have measured compacted chunk data, and leave room in WAL/head allowance for current chunks, WAL growth, and short compaction peaks. A disk budget is most useful when the result needs to map to a persistent volume, VM disk, or filesystem quota.
- Set Scrape targets and Series per target for one shard, not an entire fleet spread across many Prometheus servers.
- Use active series, not metric-name count. Label combinations decide how many time series are stored.
- Open Advanced when you need a Disk budget, a safer Retention size target, or a larger Minimum WAL segments reserve.
- Read TSDB Footprint before the charts. It shows the modeled ingest rate, persistent blocks, WAL/head allowance, planned disk, budget use, and suggested retention.size.
- Use Retention Size Guardrails when a result needs an operator action, such as reducing cardinality, increasing scrape interval, lowering retention, or adding disk.
The estimate is a good fit for planning local TSDB retention, comparing shard options, and checking whether a retention-size cap leaves enough space. It is not a query-performance forecast, backup design, proof of HA durability, or guarantee that a remote storage system will retain the same data.
After the summary shows capacity planned, compare Planned local disk with the real filesystem and then check Suggested retention.size if a disk budget is set. If the status shows over disk or over target, change the retention window, sample rate, bytes-per-sample assumption, or storage allocation before rollout.
Step-by-Step Guide:
Choose the ingest-rate path first, then add retention and disk guardrails.
- Choose Sizing input. Pick Targets and active series for target-count planning, or Measured samples/sec when Prometheus telemetry already gives a sustained append rate.
- For target-count planning, enter Scrape targets, Series per target, and Scrape interval. The summary badge should show the derived samples/sec rate.
- For measured-rate planning, enter Samples per second and keep Scrape interval as handoff context. Active series estimate becomes the measured rate multiplied by the interval.
- Set Bytes per sample, Retention period, and WAL/head allowance. Watch Planned local disk and WAL/head allowance update in TSDB Footprint.
- Open Advanced when a fixed disk must be checked. Enter Disk budget, choose GiB, TiB, or PiB, then set Retention size target as the usable share of that budget.
- If a validation banner says Prometheus storage inputs need review, fix the listed field before copying any result. Common fixes are positive targets, positive samples/sec, retention of at least one day, and at least three WAL segments.
- Use Retention Runway Curve to compare planned disk at different retention windows, and use Storage Layer Mix to see how much of the estimate is persistent blocks versus WAL/head reserve.
- Use the JSON or table outputs only after the status, Disk budget use, and Suggested retention.size match the rollout plan you intend to share.
Interpreting Results:
Planned local disk is the headline number. It combines Persistent TSDB blocks with WAL/head allowance, so it is safer for capacity planning than reading the persistent block estimate alone. Ingest rate explains why the number changes, and Persistent block growth per retained day shows the cost of adding one more day at the current sample rate.
Do not treat Suggested retention.size as the full disk size. It is a block-retention cap derived from the optional disk budget and target percentage. Prometheus size retention deletes persistent blocks, while WAL and head chunks still need separate space during normal ingestion.
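One plausible derivation of Suggested retention.size, consistent with the worked examples later on this page, is to cap blocks at the share of the budget target that is not reserved for WAL/head. This is an assumption about the calculator's internals, not documented behavior; the function names are invented for the example.

```python
def suggested_retention_size(disk_budget_bytes: float,
                             target_pct: float,
                             allowance_pct: float) -> float:
    """Block-retention cap: budget target minus the WAL/head share (assumed)."""
    target = disk_budget_bytes * target_pct / 100
    return target / (1 + allowance_pct / 100)

def runway_days(retention_size_bytes: float,
                samples_per_sec: float,
                bytes_per_sample: float) -> float:
    """How many days of blocks fit under the cap at a given ingest rate."""
    return retention_size_bytes / (samples_per_sec * bytes_per_sample * 86_400)
```

With a 3 TiB budget, an 85% target, and a 20% allowance, this reproduces the roughly 2.13 TiB cap and 540.8-day runway quoted in the measured-ingest example below.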
| Output cue | Meaning | Useful follow-up |
|---|---|---|
| capacity planned | Inputs are valid and the modeled disk is within any configured budget target. | Compare the result with available filesystem space and leave extra filesystem margin. |
| headroom watch | The plan uses at least 80% of the retention size target. | Check cardinality growth and WAL/head peaks before raising retention. |
| over target | The plan exceeds the budget target percentage while still fitting raw disk. | Reduce retention, lower sample volume, or raise the target only after reviewing operating buffer. |
| over disk | The planned total is greater than the configured disk budget. | Do not apply that retention window without more disk or lower ingest. |
| Bytes/sample warning | The assumption is outside the common 1 to 2 bytes/sample planning range. | Use measured compaction bytes per sample before treating the result as final. |
A green status does not prove the Prometheus server has durable long-term storage. Verify the real data directory, backup plan, filesystem type, and remote storage architecture separately when retention needs to survive node or disk failure.
Worked Examples:
Typical shard extension:
A shard scraping 250 targets with about 1,400 active series per target every 30 seconds has about 350,000 active series and 11,666.67 samples/sec. With 1.8 bytes/sample, 30 retained days, and a 15% WAL/head allowance, TSDB Footprint shows about 50.69 GiB for Persistent TSDB blocks, 7.60 GiB for WAL/head allowance, and 58.30 GiB for Planned local disk. With no disk budget set, Suggested retention.size stays not set.
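The arithmetic in this example can be checked directly; the snippet below only restates the numbers from the text.

```python
# Typical shard extension: 250 targets x 1,400 series, 30 s scrape interval.
rate = 250 * 1_400 / 30                 # ~11,666.67 samples/sec
blocks = rate * 1.8 * 30 * 86_400       # 1.8 bytes/sample, 30 retained days
allowance = blocks * 0.15               # 15% WAL/head allowance
GiB = 2**30
print(f"{blocks / GiB:.2f}")                  # 50.69
print(f"{allowance / GiB:.2f}")               # 7.60
print(f"{(blocks + allowance) / GiB:.2f}")    # 58.30
```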
Measured ingest with a large disk:
A busy shard measured at 25,000 samples/sec with 2 bytes/sample, 45 retained days, and a 20% allowance needs about 217.26 GiB of planned local disk. If Disk budget is 3 TiB and Retention size target is 85%, Disk budget use is about 7.1% and Suggested retention.size is about 2.13 TiB. The modeled cap supports roughly 540.8 days at that same ingest rate, so the requested 45 days has ample budget margin.
Cardinality and retention collide:
A shard with 1,200 targets, 1,600 series per target, and a 15 second scrape interval reaches 1,920,000 active series and 128,000 samples/sec. At 1.8 bytes/sample and 120 retained days, Planned local disk is about 2.50 TiB. A 2 TiB disk budget produces over disk, and the guardrail table warns about high active-series count and long local retention. The corrective path is to reduce cardinality, increase scrape interval, lower local retention, shard differently, or add storage before applying the plan.
Input review before handoff:
If measured mode is selected with Samples per second set to 0, the summary changes to Prometheus storage inputs need review and the validation banner asks for samples per second greater than zero. Fixing that field restores the footprint rows, runway chart, and JSON totals. This prevents a copied report from carrying a meaningless zero-ingest capacity plan.
FAQ:
Should I enter active series or metric names?
Enter active series. Prometheus stores time series by metric name plus label set, so one metric name can create many stored series when labels such as instance, pod, route, status, or customer vary.
What bytes-per-sample value should I use first?
Use 1 to 2 bytes per sample for a first pass, then replace it with measured compacted storage data when you have it. The calculator warns when the value is outside that common planning range.
Why is there a WAL/head allowance if Prometheus has size retention?
Size retention removes persistent blocks, but WAL and current head chunks still consume disk while samples are being ingested. The allowance keeps that active storage separate from the compacted block estimate.
Does the calculator query my Prometheus server?
No. It works from the numbers you enter, such as samples/sec or target and series counts. If you want measured input, copy the sustained append rate from your own Prometheus query and enter it manually.
What should I do when the result says over disk?
Treat over disk as a stop signal for that input set. Lower Retention period, reduce active series, increase Scrape interval, use a lower measured bytes/sample only if evidence supports it, or allocate more disk before using the plan.
Glossary:
- Active series
- Time series currently contributing samples to a Prometheus shard.
- Scrape interval
- The time between scrapes for targets in the modeled shard.
- Samples per second
- The sustained append rate used to calculate retained TSDB storage.
- TSDB blocks
- Compacted local storage blocks that hold retained samples.
- WAL/head allowance
- Reserve for write-ahead log files, current head chunks, and related peaks.
- retention.size
- A Prometheus size cap for retained persistent blocks, separate from raw disk capacity.
References:
- Storage, Prometheus Authors.
- Data model, Prometheus Authors.
- Query functions: rate(), Prometheus Authors.