Incident MTTx Metrics Calculator
Calculate incident MTTD, MTTA, MTTC, and MTTR from lifecycle CSV rows, with target breaches, P85 tails, warnings, and stage charts.
A response review built from one recovery average can hide where the time was actually lost. An incident may sit unnoticed before an alert fires, wait for ownership after detection, stabilize quickly after mitigation, and still take hours to validate before closure. MTTx metrics split that timeline into named durations so the review can focus on the response stage that needs work.
The "x" in MTTx stands for the specific lifecycle event being measured. MTTD is mean time to detect, MTTA is mean time to acknowledge, MTTC is mean time to contain, and MTTR is mean time to resolve. These terms sound similar, but they answer different operational questions. Detection checks whether monitoring and reporting see impact early. Acknowledgement checks whether responders pick up the incident quickly. Containment checks how fast immediate customer impact is reduced. Resolution checks the broader path to recovery or closure.
- Detected: The first alert, ticket, report, or opened timestamp that made the incident visible.
- Acknowledged: The time a responder accepted ownership, assignment, or active response.
- Contained: The point where impact was mitigated, stabilized, or otherwise kept from getting worse.
- Resolved: The recovery, closure, fixed, or ended timestamp used by the team as the final incident milestone.
The timestamp definitions have to stay consistent. Counting MTTR from first customer impact is not the same as counting it from alert creation, even if both results are called recovery time. Mixing definitions can make a trend look better or worse without any real change in response work.
Means are useful because they give one comparable number, but incident data often has a long tail. One severe row can pull the mean upward, while several short rows can hide the slowest cases. A percentile such as P85 helps show whether the upper end of the incident set is still missing the target even when the mean looks acceptable.
MTTx metrics are review signals, not verdicts on individual responders. Severity, dependency ownership, staffing hours, customer communication, rollback risk, and data repair can all change the durations. The useful comparison is a consistent incident population with the same timestamp policy, target values, and severity rules.
How to Use This Tool:
Use one coherent incident population, such as a service, queue, team, severity class, or review window. Mixing major outages with low-priority noise can produce a tidy average that is hard to act on.
- Enter Service or queue so the metric slice is named in the summary, JSON, and downloaded reports.
- Set MTTA target, MTTC target, and MTTR target. Enter 0 for a target when you want the metric calculated without breach labels for that stage. MTTR target is entered in hours, while MTTA target and MTTC target are entered in minutes.
- Paste rows into Incident timestamps CSV, drag in a CSV or TXT file, or choose Browse CSV. Files larger than 1 MiB are rejected before browser-side parsing.
- Use Load sample to see the expected lifecycle shape, or use Normalize rows after copying rows from a spreadsheet or ticket export.
- Review "Review the incident rows" and "Incident data needs attention" before trusting the averages. Missing contained or resolved timestamps exclude only the affected metrics, while rows without a usable detected or opened timestamp are skipped. If many rows show n/a, or you see skipped rows or timestamp warnings, fix headers, time zones, and out-of-order dates before comparing teams or periods.
- Read MTTx Brief for count, mean, median, P85, target, and status across MTTD, MTTA, MTTC, and MTTR.
- Use Incident Ledger, Severity Breach Queue, and Stage Latency Chart to find the rows, severity groups, and lifecycle stages driving the slowest results.
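Because the three targets use different units, it can help to normalize them before comparing them with durations. The following is a minimal sketch, not the tool's own code; the function name `targets_in_minutes` and the use of `None` for a disabled target are assumptions made for illustration.

```python
import math

def targets_in_minutes(mtta_min, mttc_min, mttr_hours):
    """Return each target in minutes, or None when the target is disabled."""
    def normalize(value, minutes_per_unit):
        # A target of 0 (or a non-finite value) means "no target" for that stage.
        if not math.isfinite(value) or value == 0:
            return None
        return value * minutes_per_unit

    return {
        "MTTA": normalize(mtta_min, 1),
        "MTTC": normalize(mttc_min, 1),
        "MTTR": normalize(mttr_hours, 60),  # hours -> minutes
    }

targets = targets_in_minutes(15, 120, 4)
print(targets)  # {'MTTA': 15, 'MTTC': 120, 'MTTR': 240}
```

With every target in minutes, a 4 hr MTTR target compares directly against a resolution duration such as 343 min.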
Interpreting Results:
Read MTTR with MTTA and MTTC instead of treating recovery time as one undivided number. A good MTTR with slow MTTA can still point to paging or ownership friction. A good acknowledgement time with slow containment can point to missing rollback steps, unclear mitigation choices, or dependency waits.
| Result tab | Use it for | Do not overread |
|---|---|---|
| MTTx Brief | Mean, median, P85, target, and status for each metric. | A passing mean can still hide a P85 tail above target. |
| Incident Ledger | Per-incident durations, breach labels, and next action text. | One row can support MTTA while missing MTTC or MTTR. |
| Severity Breach Queue | Severity groups ordered by severity rank, breach count, and average MTTR. | Local severity policy still controls what each label means. |
| Stage Latency Chart | The slowest MTTR rows split into detect, acknowledge, contain, and resolve segments. | The chart identifies timing concentration, not root cause by itself. |
| JSON | Structured metrics, incidents, warnings, severity groups, and target settings. | Identifiers and service names from pasted data can appear in the output. |
Target status is a prompt for review, not proof of a bad decision. A row can exceed MTTR because recovery validation was deliberately slow, because a dependency owner was unavailable, or because customer communication extended the closure time after service was stable.
Verify the slice before comparing results across periods. Keep timestamp meanings, severity labels, inclusion rules, and target values unchanged; otherwise a better-looking number may reflect different data handling rather than improved incident response.
Technical Details:
Incident MTTx calculations reduce each CSV row to timestamp pairs. Detection lag uses impact start and detected/opened time. Acknowledgement, containment, and resolution normally use detected/opened time as the response base, with started time as a fallback when no separate detection timestamp exists.
Missing later stages do not invalidate the whole row. An incident with opened, acknowledged, and resolved timestamps can contribute to MTTA and MTTR even when it has no contained timestamp. Negative durations are excluded because an acknowledged, contained, or resolved time before the response base usually means the row needs repair.
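The row-to-duration rules above can be sketched as follows. This is an illustrative Python version under stated assumptions, not the tool's browser code: the dictionary keys, the helper names `parse_ts` and `row_durations`, treating zone-less timestamps as UTC, and excluding a negative MTTD in the same way as the other metrics are all choices made for this example.

```python
from datetime import datetime, timezone

def parse_ts(text):
    """Parse an ISO 8601 timestamp; return None when absent or unparseable."""
    if not text:
        return None
    try:
        ts = datetime.fromisoformat(text.replace("Z", "+00:00"))
    except ValueError:
        return None
    # Assumption: values without an offset are read as UTC in this sketch.
    return ts if ts.tzinfo else ts.replace(tzinfo=timezone.utc)

def row_durations(row):
    """Return minutes per metric, or None where the pair is missing or negative."""
    started = parse_ts(row.get("started"))
    detected = parse_ts(row.get("detected"))
    # Response base: detected/opened first, started as the fallback.
    base = detected if detected is not None else started

    def minutes(start, end):
        if start is None or end is None:
            return None
        delta = (end - start).total_seconds() / 60
        return delta if delta >= 0 else None  # out-of-order pairs are excluded

    return {
        "MTTD": minutes(started, detected),
        "MTTA": minutes(base, parse_ts(row.get("acknowledged"))),
        "MTTC": minutes(base, parse_ts(row.get("contained"))),
        "MTTR": minutes(base, parse_ts(row.get("resolved"))),
    }

# Hypothetical row with no started or contained timestamp.
example = row_durations({
    "detected": "2026-01-05T10:00:00Z",
    "acknowledged": "2026-01-05T10:09:00Z",
    "resolved": "2026-01-05T12:30:00Z",
})
print(example)  # {'MTTD': None, 'MTTA': 9.0, 'MTTC': None, 'MTTR': 150.0}
```

The row still contributes to MTTA and MTTR even though MTTD and MTTC are unavailable.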
Formula Core:
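Durations are measured in minutes, then summarized from the finite values available for each metric. The mean is the sum of those values divided by their count n. Percentiles use the nearest-rank method: sort the n values in ascending order and take the value at rank ceil(p / 100 × n).
For P85, p is 85. The displayed median uses the same nearest-rank method with p = 50, so an even row count selects the lower central sorted value rather than averaging the two middle values.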
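A minimal sketch of the nearest-rank method, assuming the standard definition in which the selected rank is ceil(p / 100 × n) over the sorted finite values. The function name and the sample durations are hypothetical.

```python
import math

def nearest_rank(values, p):
    """Nearest-rank percentile of the finite values, or None if there are none."""
    finite = sorted(v for v in values if math.isfinite(v))
    if not finite:
        return None
    # Rank is 1-based; p * n / 100 keeps whole-number cases exact.
    rank = max(1, math.ceil(p * len(finite) / 100))
    return finite[rank - 1]

durations = [4, 6, 9, 17]  # hypothetical MTTA values in minutes

print(nearest_rank(durations, 50))  # 6  (lower central value, not 7.5)
print(nearest_rank(durations, 85))  # 17
```

With four rows the median is the second sorted value, which is why it can differ from a median that averages the two middle values.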
| Metric | Duration rule | What it measures |
|---|---|---|
| MTTD | detected/opened - started | How long impact existed before it was visible. |
| MTTA | acknowledged - base | Responder pickup time from detected/opened, or from started when detection is absent. |
| MTTC | contained - base | Time from response base to containment, mitigation, or stabilization. |
| MTTR | resolved - base | Time from response base to recovery, closure, or resolution. |
CSV headers can use common incident-management names rather than one exact schema. Headerless rows are accepted when they follow the common lifecycle order. Date-only values are treated as midnight UTC, and date-time strings are parsed by the browser, so explicit time zones such as Z or an offset are safer than local-time text.
| Field | Accepted meaning | Metric impact |
|---|---|---|
| id | Incident key, ticket, or row identifier. | Labels ledger rows, chart bars, and copied row details. |
| started | Impact start, outage start, begin, or similar field. | Enables MTTD and can be the response base if detection is absent. |
| detected or opened | Alert, report, created, triggered, opened, or reported time. | Normal base timestamp for MTTA, MTTC, and MTTR. |
| acknowledged | Acknowledged, assigned, responded, or response-started time. | Enables MTTA and the acknowledgement segment. |
| contained | Contained, containment, mitigated, mitigation, or stabilized time. | Enables MTTC and separates mitigation from final recovery. |
| resolved | Resolved, closed, recovered, fixed, ended, or end time. | Enables MTTR and final recovery reporting. |
| severity | Severity, SEV, priority, impact, or level. | Groups and sorts the severity breach queue. |
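One way to picture the flexible header handling is an alias lookup that maps each CSV header to a canonical field. The sketch below is an assumption about how such matching could work: the alias sets are drawn loosely from the table above, and the exact-match rule, `ALIASES`, `canonical`, and `read_rows` are hypothetical names, not the tool's actual matching logic.

```python
import csv
import io
import re

# Hypothetical alias sets; the tool's real list may differ.
ALIASES = {
    "id": {"id", "incident", "ticket", "key"},
    "started": {"started", "start", "begin", "impactstart", "outagestart"},
    "detected": {"detected", "opened", "alert", "created", "triggered", "reported"},
    "acknowledged": {"acknowledged", "assigned", "responded", "responsestarted"},
    "contained": {"contained", "containment", "mitigated", "mitigation", "stabilized"},
    "resolved": {"resolved", "closed", "recovered", "fixed", "ended", "end"},
    "severity": {"severity", "sev", "priority", "impact", "level"},
}

def canonical(header):
    """Map a header to a canonical field name, or None if unrecognized."""
    key = re.sub(r"[^a-z0-9]", "", header.lower())
    for field, names in ALIASES.items():
        if key in names:
            return field
    return None

def read_rows(text):
    """Read headered CSV text into dicts keyed by canonical field names."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        row = {}
        for header, value in raw.items():
            field = canonical(header or "")
            if field and value:
                row.setdefault(field, value.strip())  # first matching column wins
        rows.append(row)
    return rows

sample = (
    "Ticket,Severity,Opened,Assigned,Closed\n"
    "INC-1,SEV2,2026-01-05T10:00:00Z,2026-01-05T10:09:00Z,2026-01-05T12:30:00Z\n"
)
rows = read_rows(sample)
print(rows[0]["detected"])  # 2026-01-05T10:00:00Z
```

Exact matching on a normalized header avoids ambiguous cases such as "impact start", which would otherwise overlap with the severity alias "impact".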
| Rule | Boundary | Result effect |
|---|---|---|
| Target breach | Strictly greater than the configured target. | A 15 min MTTA with a 15 min target is not a breach; 15.1 min is. |
| No target | Target value is 0 or not finite. | The metric is summarized but not marked as target-passing or target-breaching. |
| Missing stage | Required end timestamp is absent or unparseable. | Only that metric is excluded for the row when other timestamp pairs remain valid. |
| Out-of-order time | End time is before the base time. | The negative duration is excluded and a warning names the affected metric. |
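The boundary rules in this table can be expressed as a small status function. This is a sketch under assumptions: the status strings and the name `target_status` are illustrative, and both the duration and the target are taken to be in minutes.

```python
import math

def target_status(duration, target):
    """Classify one duration against a target, both in minutes."""
    if duration is None:
        return "n/a"  # missing stage or excluded negative duration
    if not math.isfinite(target) or target == 0:
        return "no target"  # summarized, but never marked pass or breach
    # A breach requires strictly greater than the target.
    return "breach" if duration > target else "within target"

print(target_status(15, 15))    # within target
print(target_status(15.1, 15))  # breach
print(target_status(15, 0))     # no target
print(target_status(None, 15))  # n/a
```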
Severity labels are normalized for common forms such as SEV1, severity-2, and P3. Ranking treats critical, blocker, SEV0, and P0 as the highest urgency, then works down through SEV5, P5, low, and informational labels.
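A rough sketch of severity ranking for the label forms mentioned above. The numeric ranks, the placement of low and informational at the same level as SEV5 and P5, and the name `severity_rank` are assumptions for illustration; the tool's actual ordering of word labels may differ.

```python
import re

# Assumed ranks for word labels: 0 is the most urgent.
WORD_RANKS = {"critical": 0, "blocker": 0, "low": 5, "informational": 5}

def severity_rank(label):
    """Return a sortable rank for a severity label, or None if unrecognized."""
    text = label.strip().lower()
    if text in WORD_RANKS:
        return WORD_RANKS[text]
    # Matches forms such as SEV1, severity-2, P3, or "priority 4".
    match = re.fullmatch(r"(?:severity|sev|priority|p)[\s_-]*(\d)", text)
    return int(match.group(1)) if match else None

def sort_key(label):
    rank = severity_rank(label)
    return rank if rank is not None else 99  # unrecognized labels sort last

labels = ["P3", "severity-2", "low", "SEV1", "critical", "unknown"]
print(sorted(labels, key=sort_key))
# ['critical', 'SEV1', 'severity-2', 'P3', 'low', 'unknown']
```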
Privacy Notes:
CSV parsing, duration calculation, chart preparation, and JSON generation run in the browser. The main privacy risk is the incident content you paste, copy, download, or place in a shareable URL.
- Incident IDs, service names, severity labels, and timestamps can appear in visible tables, JSON, chart exports, and downloaded reports.
- Changed inputs can be reflected in the page URL, so do not share a URL that contains sensitive incident details.
- Use anonymized service names and incident identifiers before sending results outside the response team.
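If rows need to be shared outside the response team, identifiers can be replaced with stable pseudonyms before pasting them in. The sketch below is one possible approach, not a feature of the tool; the function name, the salt value, and the 8-character truncation are assumptions. Keep the salt private, since short or guessable identifiers could otherwise be recovered by hashing candidates.

```python
import hashlib

def pseudonym(value, salt, prefix):
    """Return a stable, non-reversible label for an identifier."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return f"{prefix}-{digest[:8]}"

SALT = "replace-with-a-private-random-string"  # hypothetical placeholder

alias_a = pseudonym("INC-2026-0429", SALT, "INC")
alias_b = pseudonym("INC-2026-0429", SALT, "INC")
service = pseudonym("checkout-api", SALT, "SVC")

# The same input always maps to the same alias, so rows stay groupable.
print(alias_a == alias_b)  # True
```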
Worked Examples:
The default sample data uses five checkout incidents with targets of 15 min MTTA, 120 min MTTC, and 4 hr MTTR.
Sample review slice
Loading the sample for checkout-api produces five usable rows. MTTA has a mean of 9 min and P85 of 17 min, so the mean is within target while the upper tail exceeds the 15 min acknowledgement target.
Breach queue focus
In the sample, INC-2026-0429 has 17 min MTTA, 202 min MTTC, and 343 min MTTR, so it breaches all three configured targets. The Severity Breach Queue groups that row under SEV2 and points the follow-up toward the response stage that breached most often.
Exact target boundary
If a row has an acknowledgement duration of exactly 15 min and MTTA target is 15, it is not marked as a breach. The breach label appears only when the duration is greater than the target.
Missing resolution timestamp
A row with detected and acknowledged timestamps but no resolved timestamp still contributes to MTTA. The ledger shows n/a for MTTR, the warning count notes the missing recovery timestamp, and the next action tells you to add a resolved timestamp before using that row for recovery reporting.
FAQ:
What timestamp starts MTTA, MTTC, and MTTR?
The normal base is the detected or opened timestamp. If that value is missing but a started timestamp exists, the started timestamp is used as the fallback base.
Why are some results marked n/a?
n/a means the row does not have a usable timestamp pair for that metric, or the duration would be negative. Other valid metrics from the same row can still be counted.
Can I paste rows without headers?
Yes, if the columns follow a common lifecycle order. Headered CSV is safer because recognized names such as detected, opened, acknowledged, contained, and resolved remove ambiguity.
Why does P85 exceed target when the mean passes?
A few slow incidents can sit in the upper tail while the average remains acceptable. The brief status calls this out so slow rows are not hidden by several short incidents.
Should these metrics be used to grade responders?
Use them as review prompts, not individual scorecards. Compare similar incidents with consistent timestamp rules, then inspect the ledger and chart before deciding what process, alerting, ownership, or runbook change is needed.
Glossary:
- MTTD: Mean time to detect, measured from started to detected or opened when both timestamps exist.
- MTTA: Mean time to acknowledge, measured from the response base to acknowledgement.
- MTTC: Mean time to contain, measured from the response base to containment, mitigation, or stabilization.
- MTTR: Mean time to resolve, measured from the response base to resolution, recovery, closure, or end time.
- P85: The nearest-rank 85th percentile of valid durations for a metric.
- Severity breach queue: A grouped view that summarizes target breaches by normalized severity label.
References:
- Incident Metrics in SRE, Google Site Reliability Engineering.
- Incident Management Guide, Google Site Reliability Engineering.
- How to choose incident management KPIs and metrics, Atlassian.
- Definitions, PagerDuty Operational Reviews.
- MTTA and MTTR reports, Splunk Documentation, 2025-07-16.