
Introduction:

A Prometheus bill or capacity alert often points to a number that is easy to underestimate: active time series. The expensive part is rarely the metric name by itself. A series is the metric name plus one exact set of label values, so a counter split by method, status, route, tenant, pod, and bucket can turn one useful measurement into thousands or millions of stored streams.

Cardinality becomes a cost problem when those label combinations stay active long enough to be billed, held in memory, shipped through remote write, or retained on disk. Short scrape intervals add more samples to every active series. Longer retention keeps those samples around. Managed metrics systems may charge by active series, samples, storage, or negotiated units, while self-hosted Prometheus still pays through memory, disk, compute, and operational risk.

The hard judgment is not simply "fewer labels is better." Labels carry ownership, routing, incident context, and customer boundaries. Removing tenant from every metric may hide a noisy customer; keeping raw request paths or user IDs on a hot histogram may make the monitoring system more expensive than the service being monitored. Good cardinality work separates labels that help people debug from labels that only create fan-out.

[Diagram: metric labels, series fan-out, TSDB retention, cost (storage and head memory)]

Prometheus cardinality cost factors

| Factor | What changes | Typical mistake |
| --- | --- | --- |
| Active series | Billable series count, head memory, and index pressure. | Using a fleet total against a shard-level budget. |
| Scrape interval | Samples per second and retained sample volume. | Assuming every job uses the default interval. |
| Retention | How long samples keep consuming storage or billable history. | Comparing a short local window with a longer remote bill. |
| Label fan-out | How one metric family multiplies across label values. | Counting metric names instead of full label combinations. |

Histograms deserve special care because each bucket is another series for every label combination. Short-lived labels also create churn: a series appears, consumes memory and remote-write bookkeeping, then disappears before anyone gains much diagnostic value from it. In both cases, the cost estimate should be tied to the same tenant, shard, account, or remote-write stream that the engineering team can actually change.
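The bucket multiplication can be sketched in a few lines of Python. This is a minimal illustration, assuming a classic (non-native) Prometheus histogram, which exposes one _bucket series per finite bucket, one for le="+Inf", and the _sum and _count series for every label combination. The function name and the example shape are hypothetical.

```python
def classic_histogram_series(label_combinations: int, finite_buckets: int) -> int:
    """Series exposed by one classic histogram across all label combinations.

    Per label combination: finite buckets + the le="+Inf" bucket + _sum + _count.
    """
    if label_combinations < 0 or finite_buckets < 0:
        raise ValueError("counts must be non-negative")
    return label_combinations * (finite_buckets + 1 + 2)


# Hypothetical shape: 9 statuses x 5 methods x 90 route templates, 12 finite buckets.
combos = 9 * 5 * 90
print(classic_histogram_series(combos, 12))
```

Under these assumptions, adding a single bucket adds one series for every existing label combination, which is why bucket layouts are worth reviewing together with labels.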

Cardinality review is therefore part arithmetic and part instrumentation hygiene. The arithmetic turns active series, scrape interval, retention, replication, and pricing into a cost estimate. The hygiene work asks whether the top offending metric families can be aggregated, relabeled, templated, sampled, or moved to logs or traces without weakening alerts and incident response.

Active series
A series that is currently receiving samples or is counted by the metrics backend as active for billing or capacity purposes.
Cardinality
The number of unique metric-name and label-value combinations.
Series churn
The creation and disappearance of series over time, often caused by short-lived pods, request IDs, workers, or other unstable labels.

How to Use This Tool:

Use one consistent scope for the whole estimate: one Prometheus shard, one managed metrics tenant, one remote-write stream, or one internal chargeback boundary.

  1. Choose a Workload preset to load realistic starting assumptions, then replace Active series with a measured value from prometheus_tsdb_head_series, a managed metrics dashboard, or a billing export.
  2. Set Scrape interval, Retention period, and Bytes per sample to match the data being costed. The storage estimate changes with these fields even when active-series billing does not.
  3. Select a Pricing profile. Use managed active-series rates for vendor bills, self-hosted resource rates for Prometheus shard planning, or blended chargeback when both apply.
  4. Paste High-cardinality metric rows in the visible format: metric, base series, label value counts, owner, and action note separated by vertical bars.
  5. Use Advanced for Billing days per month, Head memory per series, Replication factor, fixed monthly cost, reduction target, growth rate, and forecast horizon.
  6. If row warnings appear or modeled cost is more than 3x the budget cap, fix the metric-row format, active-series scope, or rates before treating Cost Ledger as a budget estimate.
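The row format in step 4 can be checked before pasting with a short script. The sketch below is one possible parser for the described layout, not the tool's own implementation; the class name, function name, and warning wording are hypothetical, and treating owner and action note as optional is an assumption.

```python
from dataclasses import dataclass


@dataclass
class OffenderRow:
    metric: str
    base_series: int
    label_counts: dict[str, int]
    owner: str = ""
    note: str = ""


def parse_offender_rows(text: str) -> tuple[list[OffenderRow], list[str]]:
    """Parse 'metric | base series | label=count,... | owner | note' rows."""
    rows: list[OffenderRow] = []
    warnings: list[str] = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        fields = [field.strip() for field in line.split("|")]
        if len(fields) < 3 or not fields[0]:
            warnings.append(f"row {number}: need metric | base series | label counts")
            continue
        fields += [""] * (5 - len(fields))  # assumption: owner and note are optional
        metric, base, labels, owner, note = fields[:5]
        try:
            base_series = int(base)
            label_counts = {}
            for pair in labels.split(","):
                name, _, count = pair.partition("=")
                label_counts[name.strip()] = int(count)  # ValueError without a numeric count
        except ValueError:
            warnings.append(f"row {number}: base series and label counts must be integers")
            continue
        if base_series < 1 or min(label_counts.values()) < 1:
            warnings.append(f"row {number}: counts must be at least 1")
            continue
        rows.append(OffenderRow(metric, base_series, label_counts, owner, note))
    return rows, warnings


sample = (
    "api_request_total | 1 | tenant_id=120,route=90,status=9,method=5 | edge-api | normalize route templates\n"
    "bad_metric | 1 | route\n"
)
rows, warnings = parse_offender_rows(sample)
print(len(rows), warnings)
```

The second sample row has a label without a numeric count, so it lands in the warnings list instead of the parsed rows.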

Interpreting Results:

Start with Total monthly estimate and Budget posture, then move to Label Offenders before deciding what to change. The cost total says whether the current scope is financially interesting. The offender rows point to the metric families and labels most likely to reduce spend or resource pressure.

A large Avoidable cost value is a review signal, not permission to delete a label. Check the Owner, the action note, dashboards, alert rules, incident workflows, and the PromQL Checks output before dropping, hashing, capping, or aggregating labels that carry operational meaning.

  • Budget posture is meaningful only when the monthly budget cap is entered for the same tenant or shard as the active-series count.
  • Parsed row coverage can exceed active series when rows overlap or base series are overestimated. Treat that warning as a scope check.
  • Cardinality Cost Curve models active-series growth or reduction under the current rates. It does not predict contract changes, new histogram buckets, or a different retention policy.
  • PromQL Checks are query patterns to run in your own Prometheus-compatible system; no live metrics backend is queried.
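As an illustration of the kind of query pattern meant here, the Python sketch below builds a few common cardinality checks as PromQL strings. These are widely used patterns, not necessarily the exact text the PromQL Checks output produces; the function name and dictionary keys are hypothetical.

```python
def cardinality_check_queries(metric: str, label: str, top_n: int = 10) -> dict[str, str]:
    """Build PromQL strings for common cardinality checks (run them yourself)."""
    return {
        # Series currently held in the head block of this Prometheus server.
        "head_series": "prometheus_tsdb_head_series",
        # Metric names with the most series; can be expensive on large servers.
        "top_metric_names": f'topk({top_n}, count by (__name__) ({{__name__=~".+"}}))',
        # Series count for one metric family.
        "series_for_metric": f"count({metric})",
        # Number of distinct values of one label on that metric.
        "distinct_label_values": f"count(count by ({label}) ({metric}))",
    }


queries = cardinality_check_queries("api_request_total", "route")
for name, query in queries.items():
    print(f"{name}: {query}")
```

Run the resulting queries against the same tenant or shard as the active-series count so the numbers stay comparable.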

Technical Details:

Two quantities drive most Prometheus cardinality cost estimates. Active series controls billable series units and head memory. Sample rate controls retained storage because each active series contributes one sample at each scrape. A 15 second scrape interval produces four times as many samples as a 60 second scrape interval for the same active-series count.
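The scrape-interval effect is simple division, sketched here in Python. The function name is hypothetical, and the 850,000 series figure is the default used elsewhere on this page.

```python
def samples_per_second(active_series: int, scrape_interval_s: float) -> float:
    """Each active series contributes one sample per scrape."""
    if scrape_interval_s <= 0:
        raise ValueError("scrape interval must be positive")
    return active_series / scrape_interval_s


fast = samples_per_second(850_000, 15)
slow = samples_per_second(850_000, 60)
print(f"15s: {fast:,.0f} samples/s, 60s: {slow:,.0f} samples/s, ratio {fast / slow:.1f}x")
```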

The estimate supports managed active-series pricing, self-hosted resource pricing, and blended internal chargeback. Managed pricing applies a monthly rate per 1,000 billable series after any included amount. Self-hosted pricing applies retained storage and head RAM rates plus a fixed monthly floor. Blended chargeback combines those pressures when a platform team wants one showback number for vendor-like usage and infrastructure cost.

Formula Core:

\[
S = \frac{N}{I}, \qquad
G = \frac{S \times D \times 86400 \times B \times F}{2^{30}}, \qquad
C = \frac{\max(0,\, N - Q)}{1000} \times R + G \times P + H \times M + L
\]

N is active series, I is scrape interval in seconds, S is samples per second, D is retention days, B is bytes per sample, F is replication factor, G is retained GiB, Q is included series, R is active-series rate per 1,000 series-month, P is storage rate per GiB-month, H is head memory in GiB, M is RAM rate per GiB-month, L is fixed monthly cost, and C is total monthly cost. Cost components that do not belong to the selected pricing profile are treated as zero.
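The formula can be written as a small Python sketch. It follows the symbol definitions above; the class, field, and function names are hypothetical, and leaving a rate at zero is how this sketch models a component that does not belong to the selected pricing profile.

```python
from dataclasses import dataclass

GIB = 2**30
SECONDS_PER_DAY = 86_400


@dataclass
class CostInputs:
    active_series: int                 # N
    scrape_interval_s: float           # I
    retention_days: float              # D
    bytes_per_sample: float            # B
    replication_factor: float = 1.0    # F
    included_series: int = 0           # Q
    series_rate_per_1k: float = 0.0    # R, $ per 1,000 series-month
    storage_rate_per_gib: float = 0.0  # P, $ per GiB-month
    head_memory_gib: float = 0.0       # H
    ram_rate_per_gib: float = 0.0      # M, $ per GiB-month
    fixed_monthly_cost: float = 0.0    # L


def retained_gib(c: CostInputs) -> float:
    """G: retained storage in GiB across all replicas."""
    samples_per_second = c.active_series / c.scrape_interval_s  # S = N / I
    total_bytes = (samples_per_second * c.retention_days * SECONDS_PER_DAY
                   * c.bytes_per_sample * c.replication_factor)
    return total_bytes / GIB


def monthly_cost(c: CostInputs) -> float:
    """C: series charge + storage charge + head RAM charge + fixed cost."""
    billable_series = max(0, c.active_series - c.included_series)
    return (billable_series / 1000 * c.series_rate_per_1k
            + retained_gib(c) * c.storage_rate_per_gib
            + c.head_memory_gib * c.ram_rate_per_gib
            + c.fixed_monthly_cost)


# Managed active-series profile: storage and RAM rates stay at zero.
managed = CostInputs(850_000, 30, 30, 1.8, included_series=10_000, series_rate_per_1k=8.0)
print(f"{retained_gib(managed):.1f} GiB retained, ${monthly_cost(managed):,.2f} per month")
```

With these inputs the sketch gives roughly 123 GiB of retained storage and a $6,720 active-series charge.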

| Input or parsed field | How it changes the estimate | Common failure mode |
| --- | --- | --- |
| Active series | Raises active-series cost, head memory, sample rate, and marginal savings. | Mixing a shard count with a tenant-level budget. |
| Scrape interval | Shorter intervals increase samples per second and retained bytes. | Using the default interval when hot jobs scrape faster. |
| Bytes per sample | Controls retained storage for self-hosted or blended models. | Ignoring native histograms, exemplars, or unusually large label sets. |
| Metric row label counts | Base series are multiplied by each label's value count to estimate fan-out. | Counting overlapping rows as if they were independent. |
| Reduction target | Applies the current marginal cost per series-month to a target series reduction. | Treating savings as guaranteed before validating queries and ownership. |
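The reduction-target row can be illustrated with a rough Python sketch. It assumes the target is entered as a percentage of active series and that the marginal cost is the managed rate divided by 1,000; both are assumptions rather than confirmed tool behavior, and the 20% target is hypothetical.

```python
def reduction_savings(active_series: int, reduction_pct: float,
                      marginal_cost_per_series: float) -> float:
    """Estimated monthly savings from removing a share of active series.

    Assumes every removed series is billed at the same marginal rate, which
    overstates savings if the reduction dips below an included-series allowance.
    """
    if not 0 <= reduction_pct <= 100:
        raise ValueError("reduction_pct must be between 0 and 100")
    removed_series = active_series * reduction_pct / 100
    return removed_series * marginal_cost_per_series


# $8 per 1,000 series-month is $0.008 per series-month.
savings = reduction_savings(850_000, 20, 8 / 1000)
print(f"${savings:,.2f} per month")
```

The result is an upper bound to validate against dashboards, alerts, and owners, not a committed saving.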

Default substitution: 850,000 active series at a 30 second scrape interval produce about 28,333 samples per second. With 30 day retention, 1.8 bytes per sample, and one replica, retained storage is about 123 GiB. Managed active-series pricing at $8 per 1,000 series-month with 10,000 included series produces an active-series charge of about $6,720 per month before any resource or fixed platform costs.

Accuracy Notes:

Cardinality planning is sensitive to scope and measurement windows. Use real billing exports, active-series metrics, and top-series reports from the same tenant or shard whenever possible.

  • Managed metrics products may bill by active series, samples, datapoints, storage, or a custom contract unit. Match the editable rates to the bill you are reconciling.
  • Bytes per sample is a storage-planning assumption. Label size, chunk compression, histograms, exemplars, retention, and remote storage behavior can shift the real footprint.
  • Head-memory estimates are not a substitute for measuring process memory, shard pressure, and remote-write queue behavior.
  • Do not remove high-cardinality labels that are required for alerts, routing, customer isolation, compliance, or incident response without owner review.

Worked Examples:

Kubernetes API service under a cap

A Kubernetes API preset with 850,000 Active series, a 30 second Scrape interval, 30 day Retention period, and managed active-series billing produces an estimate close to the active-series charge. With a $7,500 Monthly budget cap, Budget posture can still be within budget while Label Offenders shows route or pod labels that deserve cleanup before the next release.

Multi-tenant route label growth

A row such as

api_request_total | 1 | tenant_id=120,route=90,status=9,method=5 | edge-api | normalize route templates

models 486,000 series (1 × 120 × 90 × 9 × 5) before overlap checks. Highest label points to the largest multiplier, while Avoidable cost estimates savings if that label were capped or removed. Route template normalization may be the right action when tenant visibility still matters.
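The arithmetic behind this example can be reproduced with a short Python sketch. The function name is hypothetical, and the cap of 30 route templates is an illustrative assumption, not a value from the tool.

```python
from math import prod


def modeled_series(base_series: int, label_counts: dict[str, int]) -> int:
    """Base series multiplied by each label's value count."""
    return base_series * prod(label_counts.values())


labels = {"tenant_id": 120, "route": 90, "status": 9, "method": 5}
total = modeled_series(1, labels)
highest = max(labels, key=labels.get)  # label with the largest multiplier

# Hypothetical cleanup: normalize raw routes down to 30 templates, keep tenant_id.
capped = modeled_series(1, {**labels, "route": 30})

print(total, highest, capped)  # 486000 tenant_id 162000
```

In this sketch, normalizing routes removes two thirds of the modeled series while leaving tenant visibility intact.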

Row warning before cleanup planning

If a pasted row omits the label-count field or uses labels without numeric counts, the alert names the row that cannot be parsed and Label Offenders may show fewer rows than expected. Correct the row to the metric, base series, and label=value_count pattern before using Reduction target savings or Parsed row coverage.

FAQ:

What counts as high cardinality in Prometheus?

A label becomes high-cardinality when it has many possible values and combines with other labels to create many series. User IDs, request IDs, raw paths, pod names, worker IDs, span names, and tenant IDs are common examples to review.

Why can active-series pricing ignore scrape interval?

Some managed bills charge primarily by active series, so changing Scrape interval may not change that line. The same change still affects Sampling pressure, retained storage, remote-write load, and self-hosted resource cost.

Does it connect to my Prometheus server?

No. The estimate uses the active-series, pricing, and metric-row values you enter. PromQL Checks gives query text you can run in your own environment.

What should I do when offender rows exceed active series?

Treat the warning as a scope problem. Check for overlapping metric rows, repeated labels, inflated base series, or an Active series count from a narrower shard than the offender list.
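A quick coverage check can flag this situation before the rows are used for planning. The Python sketch below is a minimal version with hypothetical names and sample values; it treats the rows as independent, which is exactly the assumption that overlapping rows break.

```python
def parsed_row_coverage(offender_series: list[int], active_series: int) -> float:
    """Share of active series explained by the offender rows (1.0 means 100%)."""
    if active_series <= 0:
        raise ValueError("active_series must be positive")
    return sum(offender_series) / active_series


# Hypothetical modeled series for three offender rows against the 850,000 default.
coverage = parsed_row_coverage([486_000, 300_000, 120_000], 850_000)
if coverage > 1:
    print(f"coverage {coverage:.0%}: check for overlapping rows or a narrower active-series scope")
```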

Glossary:

Active series
A currently active metric-name and label-value combination counted by the TSDB or metrics backend.
Cardinality
The number of distinct series produced by metric names and labels.
Head memory
Memory used for recently active series, samples, labels, and indexes before older data is compacted.
Retention
The period for which samples remain stored in the TSDB or remote metrics system.
Series churn
Frequent creation and disappearance of series, often caused by short-lived or unbounded label values.
