Introduction:

Reliability targets become practical only after the percentage is turned into a limited allowance. A service-level objective, or SLO, states the success level a service is expected to meet during a defined period. The error budget is the share of eligible events that can go wrong before that promise is missed. For a request-based SLO, a 99.9% target leaves 0.1% of eligible requests for failed, slow, dropped, or otherwise bad events.

That small allowance changes with traffic volume and with the part of the window already observed. A checkout API with 1,000,000 eligible requests in a month gets about 1,000 bad requests of room at 99.9%. If the same service handles 40,000,000 requests, the percentage is unchanged but the count is 40,000. If only ten days of a thirty-day window have elapsed, the team also needs to know whether the current pace would stay inside the full-period allowance.
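The conversion from a percentage target to an event count can be sketched in a few lines. This is a minimal illustration, not the calculator's own code; the function name `allowed_bad_events` is hypothetical.

```python
def allowed_bad_events(eligible_events: int, slo_percent: float) -> float:
    """Return how many bad events the SLO tolerates for a given event count."""
    # The allowed failure ratio is the complement of the SLO target.
    failure_ratio = (100.0 - slo_percent) / 100.0
    return eligible_events * failure_ratio


# Same 99.9% target, different traffic volumes.
print(round(allowed_bad_events(1_000_000, 99.9)))   # 1000
print(round(allowed_bad_events(40_000_000, 99.9)))  # 40000
```

The percentage stays fixed while the absolute allowance scales with traffic, which is why the event count has to be part of any budget discussion.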

SLO conversations often fail because the terms sound interchangeable. The SLI, or service-level indicator, defines what is measured. Eligible events are the requests or jobs that belong in the SLI denominator. Budget-consuming events are the eligible events that the SLO policy counts as bad. The reporting window is the period being judged, such as 7, 28, 30, or 90 days.

Common error budget decisions and what must be checked

| Situation | Budget question | Common mistake |
| --- | --- | --- |
| Release readiness | Is enough budget left to accept extra deployment risk? | Looking only at the current success rate and ignoring projected period use. |
| Incident review | How much of the period budget did one failure class consume? | Counting all failures without matching them to the SLI's eligible-event rule. |
| Low-traffic service | Is the sample large enough for a stable burn-rate readout? | Treating a few bad events as a precise long-term trend. |
| Traffic spike or seasonality | Should the full-period budget use observed pace or a known traffic forecast? | Projecting from an unusual sample without checking whether the traffic mix is representative. |

Budget used and burn rate answer related but different questions. Budget used shows how much of the period allowance has already been spent. Burn rate compares the observed failure ratio with the SLO allowance. A burn rate of 1.00x is exactly sustainable, 2.00x spends budget twice as fast as the SLO permits, and 10.00x can turn a monthly allowance into only a few days of runway.
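As a rough illustration of how burn rate maps to runway, the sketch below divides the reporting window by the burn rate. It assumes the burn rate and the traffic pace stay constant from the start of the window; the function name `runway_days` is hypothetical.

```python
def runway_days(reporting_days: float, burn_rate: float) -> float:
    """Days until the full-window budget is spent at a constant burn rate."""
    if burn_rate <= 0:
        # No bad events means the budget is never exhausted.
        return float("inf")
    return reporting_days / burn_rate


for rate in (1.0, 2.0, 10.0):
    print(f"{rate:.2f}x -> {runway_days(30, rate):.1f} days")
# 1.00x -> 30.0 days
# 2.00x -> 15.0 days
# 10.00x -> 3.0 days
```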

Error budget math does not explain the cause of failures. It does not know whether the bad events came from a deployment, dependency, region, retry storm, or customer segment. Its job is narrower: show how much reliability allowance has been spent, how fast the current pace is burning through the rest, and whether the numbers are trustworthy enough to support a policy decision.

How to Use This Tool:

Start with the SLO definition, then enter one consistent slice of observed traffic. The result is only useful when the elapsed window, eligible events, and budget-consuming events all describe the same service and SLI.

  1. Enter Service or SLO name with the label you want to carry into the result tables and evidence exports.
  2. Set SLO target to the success objective, then set Reporting window to the full compliance period.
  3. Set Elapsed window to the amount of time covered by the observed counts. If it exceeds the reporting window, the validation message blocks the calculation.
  4. Enter Eligible events observed and Budget-consuming events from the same SLI slice. Budget-consuming events cannot be negative and cannot be greater than eligible events observed.
  5. Use Expected period events when you already have a full-window traffic forecast. Leave it at 0 when the observed eligible-event pace is the best projection.
  6. Adjust Policy watch gate, Fast-burn gate, Slow-burn gate, and Incident share gate only when your team has written thresholds that differ from the defaults.
  7. Read Budget Ledger first, then compare the same result against Burn Policy Gates, Budget Runway Curve, and Runway Points.

If the summary says the error budget needs input, fix the validation issue before using burn rate, projected exhaustion, or policy-gate output in a review.

Interpreting Results:

Current burn rate is the fastest read on sustainability. Values at or below 1.00x mean the observed bad-event ratio is within the SLO allowance. Values above 1.00x mean the current mix spends budget faster than the reporting window can sustain.

Budget used now and Projected period use answer different questions. Budget used now compares the bad events already seen with the full-window budget. Projected period use extends the current bad-event pace through the whole reporting window. A service can look acceptable in the elapsed slice while still projecting close to, or beyond, the full budget.

  • Burn Policy Gates maps the same calculation to review actions such as release watch, slow-burn follow-up, fast-burn incident review, and sample-size caution.
  • Projected exhaustion is a pace estimate, not a promise. It assumes the current bad-event rate continues and traffic projection remains representative.
  • Observed success rate should match the SLO definition you intended to test. A high success rate can still be risky when the SLO target is very strict.
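The difference between Budget used now and Projected period use can be made concrete with a small sketch. It assumes the full-window budget is already known (37,500 events here, matching a 99.9% SLO and 37,500,000 projected events); both function names are hypothetical.

```python
def budget_used_now_percent(bad_events: float, full_window_budget: float) -> float:
    """Share of the full-window budget already spent by observed bad events."""
    return bad_events / full_window_budget * 100.0


def projected_period_use_percent(
    bad_events: float,
    elapsed_days: float,
    reporting_days: float,
    full_window_budget: float,
) -> float:
    """Share of the budget spent by period end if the current pace continues."""
    bad_per_day = bad_events / elapsed_days
    return bad_per_day * reporting_days / full_window_budget * 100.0


print(f"{budget_used_now_percent(8_200, 37_500):.1f}%")               # 21.9%
print(f"{projected_period_use_percent(8_200, 10, 30, 37_500):.1f}%")  # 65.6%
```

The same counts read as roughly a fifth of the budget spent so far, yet about two thirds spent by period end, which is why both figures belong in a review.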

The main false-confidence trap is mixing data slices. Before acting on a green or yellow result, verify that the event counts, SLO target, elapsed window, and traffic forecast came from the same measurement policy.

Technical Details:

The calculation treats the SLO target as an allowed failure ratio, then applies that ratio to projected eligible events for the full reporting window. When a full-window event forecast is not supplied, the observed eligible-event rate is extrapolated across the reporting window. When a forecast is supplied, it must be at least as large as the eligible events already observed.

Burn rate is normalized against the SLO allowance. This matters because the same raw error rate has different meaning at different targets. A 0.2% bad-event ratio is mild for a 99% SLO, but it is a 2.00x burn rate for a 99.9% SLO because the allowed failure ratio is only 0.1%.
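A short sketch of that normalization, using the same 0.2% bad-event ratio against two targets. The function name `burn_rate` and its percent-based parameters are illustrative assumptions, not the tool's interface.

```python
def burn_rate(bad_ratio_percent: float, slo_percent: float) -> float:
    """Observed bad-event ratio divided by the failure ratio the SLO allows."""
    allowed_percent = 100.0 - slo_percent
    return bad_ratio_percent / allowed_percent


print(f"{burn_rate(0.2, 99.0):.2f}x")  # 0.20x
print(f"{burn_rate(0.2, 99.9):.2f}x")  # 2.00x
```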

[Figure: budget runway chart. Labels: 100% budget, watch gate, elapsed window, projected miss, reporting window, budget used, projected burn, sustainable line.]

Formula Core:

The core equation is the ratio of observed bad events to eligible events, divided by the failure ratio allowed by the SLO. Counts are then projected to the selected reporting window.

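\[
\begin{aligned}
F &= \frac{100 - \text{SLO}}{100} \\
R &= \frac{\text{budget-consuming events}}{\text{eligible events observed}} \\
\text{Burn rate} &= \frac{R}{F} \\
N_{\text{period}} &= \text{forecast events} \quad \text{or} \quad N_{\text{observed}} \times \frac{\text{reporting days}}{\text{elapsed days}} \\
\text{Full-window budget} &= N_{\text{period}} \times F \\
\text{Projected period use} &= \frac{\text{bad events per day} \times \text{reporting days}}{\text{full-window budget}} \times 100 \\
\text{Exhaustion day} &= \frac{\text{full-window budget}}{\text{bad events per day}}
\end{aligned}
\]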

Using the default example, a 99.9% SLO gives F = 0.001. With 12,500,000 eligible events and 8,200 budget-consuming events over 10 elapsed days, R = 0.000656 and burn rate is 0.656x. With no manual forecast, the 30-day projected traffic is 37,500,000 events, the full-window budget is 37,500 events, and projected period use is 65.6%.
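The formulas can be reproduced end to end in a short script. This is a sketch under the definitions given in this section, not the calculator's actual source; `BudgetLedger`, `compute_ledger`, and the parameter names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BudgetLedger:
    burn_rate: float
    period_events: float
    full_window_budget: float
    projected_use_percent: float
    exhaustion_day: float


def compute_ledger(
    slo_percent: float,
    reporting_days: float,
    elapsed_days: float,
    eligible: float,
    bad: float,
    forecast_events: float = 0,
) -> BudgetLedger:
    f = (100.0 - slo_percent) / 100.0  # allowed failure ratio F
    r = bad / eligible                 # observed bad-event ratio R
    # A forecast of 0 means: extrapolate the observed pace across the window.
    if forecast_events > 0:
        period_events = forecast_events
    else:
        period_events = eligible * reporting_days / elapsed_days
    budget = period_events * f
    bad_per_day = bad / elapsed_days
    projected_use = bad_per_day * reporting_days / budget * 100.0
    exhaustion = budget / bad_per_day if bad_per_day > 0 else float("inf")
    return BudgetLedger(r / f, period_events, budget, projected_use, exhaustion)


ledger = compute_ledger(99.9, 30, 10, 12_500_000, 8_200)
print(f"{ledger.burn_rate:.3f}x")              # 0.656x
print(f"{ledger.period_events:,.0f}")          # 37,500,000
print(f"{ledger.full_window_budget:,.0f}")     # 37,500
print(f"{ledger.projected_use_percent:.1f}%")  # 65.6%
```

Values are formatted at print time because the floating-point result of `100.0 - 99.9` is not exactly 0.1.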

Input bounds used by the error budget calculator

| Input | Calculation role | Accepted boundary |
| --- | --- | --- |
| SLO target | Sets the allowed failure ratio. | Greater than 0% and less than 100%. |
| Reporting window | Sets the full compliance period. | Greater than 0 days. |
| Elapsed window | Sets the observed time slice. | Greater than 0 days and not longer than the reporting window. |
| Eligible events observed | Provides the denominator for the observed SLI slice. | Greater than 0. |
| Budget-consuming events | Provides the numerator for burn rate and projection. | At least 0 and not greater than eligible events observed. |
| Expected period events | Overrides observed-pace traffic projection. | 0 for automatic projection, or at least the eligible events already observed. |
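
The bounds above can be expressed as a simple validation pass. The sketch below is one possible reading of the table; the function name `validate_inputs` and the message wording are hypothetical and do not reproduce the tool's actual validation text.

```python
def validate_inputs(
    slo_percent: float,
    reporting_days: float,
    elapsed_days: float,
    eligible: float,
    bad: float,
    forecast_events: float = 0,
) -> list[str]:
    """Return a list of boundary violations; an empty list means the inputs pass."""
    errors = []
    if not 0 < slo_percent < 100:
        errors.append("SLO target must be greater than 0% and less than 100%.")
    if reporting_days <= 0:
        errors.append("Reporting window must be greater than 0 days.")
    if elapsed_days <= 0 or elapsed_days > reporting_days:
        errors.append("Elapsed window must be above 0 and within the reporting window.")
    if eligible <= 0:
        errors.append("Eligible events observed must be greater than 0.")
    if bad < 0 or bad > eligible:
        errors.append("Budget-consuming events must be between 0 and eligible events.")
    # 0 means automatic projection; any other value must cover observed traffic.
    if forecast_events != 0 and forecast_events < eligible:
        errors.append("Expected period events cannot be less than eligible events observed.")
    return errors


# A forecast below the observed count is rejected; 0 falls back to observed pace.
print(validate_inputs(99.9, 30, 10, 2_000_000, 100, forecast_events=1_500_000))
print(validate_inputs(99.9, 30, 10, 2_000_000, 100))  # []
```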

Policy gate meanings and default thresholds

| Gate | Default boundary | Meaning |
| --- | --- | --- |
| Sustainable burn | burn rate <= 1.00x | The observed bad-event ratio is within the long-period allowance. |
| Slow-burn watch | burn rate < 2.0x | Crossing this line suggests a longer-window reliability follow-up. |
| Fast-burn incident | burn rate < 10.0x | Crossing this line is a short-window incident candidate if live traffic confirms the same pace. |
| Release-risk watch | projected use < 70% | Crossing this line asks for release-risk review before the budget is exhausted. |
| Incident share | current bad events < 20% of full-window budget | A single incident or failure class above this line deserves budget-policy documentation. |
| Sample confidence | observed allowed failures >= 10 | Smaller samples can be valid but noisy, especially for strict SLOs and low traffic. |
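
The default gates can be read as a set of pass/fail checks. The sketch below is an illustration under stated assumptions: the function name `evaluate_gates` is hypothetical, and "observed allowed failures" is interpreted here as eligible events observed multiplied by the allowed failure ratio, which the source does not spell out.

```python
def evaluate_gates(
    burn_rate: float,
    projected_use_percent: float,
    bad_events: float,
    full_window_budget: float,
    observed_allowed_failures: float,
    *,
    watch_percent: float = 70.0,
    slow_burn: float = 2.0,
    fast_burn: float = 10.0,
    incident_share_percent: float = 20.0,
    min_allowed_failures: float = 10.0,
) -> dict[str, bool]:
    """Map each gate name to True when the result stays inside the boundary."""
    incident_limit = incident_share_percent / 100.0 * full_window_budget
    return {
        "Sustainable burn": burn_rate <= 1.0,
        "Slow-burn watch": burn_rate < slow_burn,
        "Fast-burn incident": burn_rate < fast_burn,
        "Release-risk watch": projected_use_percent < watch_percent,
        "Incident share": bad_events < incident_limit,
        # Assumption: allowed failures in the observed slice = eligible * F.
        "Sample confidence": observed_allowed_failures >= min_allowed_failures,
    }


# One-day slice: 1,000,000 eligible events, 12,000 bad, 99.9% SLO, 30-day window.
gates = evaluate_gates(12.0, 1200.0, 12_000, 30_000, 1_000)
for name, passed in gates.items():
    print(f"{name}: {'within gate' if passed else 'crossed'}")
```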

Privacy and Accuracy Notes:

The calculator works from the values entered on the page and does not query live SLO dashboards, alert policies, logs, or service telemetry. Accuracy therefore depends on the numbers and definitions you bring to the calculation.

  • Do not paste sensitive customer data into the service name or exported evidence if those files will be shared outside the team.
  • Use the same SLI definition for eligible and budget-consuming events. Mixing request classes, time zones, or windows can make burn rate misleading.
  • Treat projected exhaustion as a planning estimate. Traffic changes, mitigations, retries, and incident recovery can move the actual budget path.

Worked Examples:

Checkout API still below the watch gate

A checkout service with a 99.9% SLO has a 30-day reporting window, 10 elapsed days, 12,500,000 eligible events, and 8,200 budget-consuming events. Current burn rate is 0.656x. Full-window error budget is 37,500 events when traffic is projected from the observed pace, and Projected period use is 65.6%. The result is under the default 70% release-risk watch gate, but close enough to mention in release notes.

One-day incident slice crosses the fast-burn gate

A service with the same 99.9% SLO records 1,000,000 eligible events and 12,000 bad events in one day of a 30-day window. Current burn rate is 12.00x, above the default Fast-burn incident gate of 10.0x. Projected exhaustion lands around day 2.5 if that pace continues, so the output supports incident review rather than normal release approval.

Traffic forecast entered too low

An analyst enters 2,000,000 Eligible events observed, then sets Expected period events to 1,500,000. The validation message says expected period events cannot be less than already observed eligible events. Raising the forecast above the observed count, or returning it to 0 for observed-pace projection, restores Budget Ledger and the runway outputs.

FAQ:

What counts as a budget-consuming event?

The SLO definition decides that before the calculation starts. It may be failed requests, slow responses, unsuccessful jobs, dropped messages, or another event class that the service owner counts against the SLI.

Why can the observed success rate look good while projected period use is high?

The observed success rate only describes the elapsed sample. Projected period use stretches the current bad-event pace across the full reporting window, so a short sample can still point to budget exhaustion later.

When should I enter expected period events?

Enter a forecast when the full-period traffic volume is known or when the observed traffic rate is not representative. Leave Expected period events at 0 when observed pace is the best available estimate.

What does burn rate above 1.00x mean?

It means the observed bad-event ratio is higher than the SLO allows over a sustained period. If the same mix continued, the service would spend the full error budget before the reporting window ended.

Does this verify my monitoring configuration?

No. The tool calculates from the values you enter. It does not inspect live service traffic, dashboard queries, alert rules, or whether the SLI was implemented correctly.

Glossary:

Service-level objective (SLO)
The target success level a service is expected to meet during a reporting window.
Service-level indicator (SLI)
The measurement definition used to decide which events count as good, bad, and eligible.
Eligible events
The requests, jobs, or other events included in the SLI denominator.
Budget-consuming events
The eligible events that count as failures under the SLO policy.
Error budget
The allowed number of budget-consuming events before the SLO target is missed.
Burn rate
The observed bad-event ratio divided by the failure ratio allowed by the SLO.
Projected period use
The share of the full-window budget expected to be spent by period end if the current bad-event pace continues.
Exhaustion day
The reporting day when the full-window error budget would reach 100% at the current bad-event pace.
