DORA Metrics Analyzer
Analyze DORA deployment metrics from pasted CSV rows to compare deployment frequency, lead time, change fail rate, recovery time, and rework for service reviews.
Introduction
DORA metrics describe how quickly and reliably a team gets software changes into production. They are most useful when the same application or service is measured over a consistent window, because deployment cadence, lead time, failures, and recovery patterns can look very different across products with different release models.
The current five-metric view combines delivery speed with the cost of instability. Deployment frequency and change lead time show whether changes move through the delivery system smoothly. Change fail rate, failed deployment recovery time, and deployment rework rate show how often production deployments need intervention, how quickly service is restored after a failed deployment, and how much unplanned deployment work is being spent on fixing production problems.
Good DORA reporting is not a scoreboard for individual engineers. The metrics are better at showing system constraints: release batch size, review and build delay, production-change quality, rollback readiness, and the amount of delivery capacity being pulled into incident work.
The strongest use is trend review for one service at a time. A single high or low number can mislead when the deployment source is incomplete, the window is too short, or planned delivery and emergency rework are mixed together. The useful question is whether the same service is becoming faster to change without creating more failed deployments or unplanned repair work.
Technical Details:
DORA measurement starts from production deployment events. Each event needs a deployment time, and it becomes more useful when it also has the commit time that entered the delivery process, the deployment outcome, the time service was restored after a failed deployment, and whether the work was planned delivery or unplanned repair.
Change lead time is measured from commit to production deployment. Deployment frequency normalizes the count of production deployments over the selected number of calendar days. Change fail rate and deployment rework rate are percentages of the valid deployment rows. Failed deployment recovery time is measured only for failed deployments with a valid recovery timestamp.
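The validity rules above can be pictured as a small pre-processing pass over each row. The sketch below is illustrative only: it assumes the default column names described later in this page and ISO-8601 timestamps, and it is not the analyzer's actual parsing code.

```python
from datetime import datetime
from typing import Optional

def parse_ts(value: str) -> Optional[datetime]:
    """Parse an ISO-8601 timestamp; return None for blank or malformed fields."""
    try:
        return datetime.fromisoformat(value.strip())
    except (ValueError, AttributeError):
        return None

def check_row(raw: dict) -> tuple[Optional[dict], list[str]]:
    """Apply the validity rules described above: deploy_time is required,
    commit_time must not be after deploy_time, and recovered_time must not
    be before deploy_time. Returns the usable fields plus data quality notes."""
    notes: list[str] = []
    deploy = parse_ts(raw.get("deploy_time", ""))
    if deploy is None:
        return None, [f"row {raw.get('id', '?')}: skipped, no valid deploy_time"]

    commit = parse_ts(raw.get("commit_time", ""))
    if commit is not None and commit > deploy:
        notes.append(f"row {raw.get('id', '?')}: commit_time after deploy_time, lead time excluded")
        commit = None

    recovered = parse_ts(raw.get("recovered_time", ""))
    if recovered is not None and recovered < deploy:
        notes.append(f"row {raw.get('id', '?')}: recovered_time before deploy_time, recovery excluded")
        recovered = None

    return {"deploy_time": deploy, "commit_time": commit, "recovered_time": recovered}, notes
```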
Formula Core
The core calculations use counts, elapsed hours, and medians. Percent results are calculated from the valid deployment rows in the current measurement window.
| Field or setting | Accepted meaning | Metric effect |
|---|---|---|
| `deploy_time` | Required timestamp for a production deployment row. | Rows without a valid deployment time are skipped and reported in data quality notes. |
| `commit_time` | Timestamp for the code change entering the delivery flow. | Lead time is counted only when commit time is valid and not after deployment time. |
| `status` and `change_type` | Words matched against failure and rework token lists. | Matched failure tokens count toward change fail rate; matched rework tokens count toward deployment rework rate. |
| `recovered_time` | Timestamp when a failed deployment was restored. | Failed deployment recovery time is calculated only when the recovery timestamp is valid and not before deployment time. |
| Measurement window | Calendar days represented by the pasted deployment rows. | Deployment frequency is normalized to deployments per week using this value. |
| Benchmark profile | Frequency target only: elite 14+/week, high 1+/week, medium 0.25/week, or baseline trend only. | The raw frequency stays visible while the scorecard signal changes against the selected profile. |
Median values reduce the effect of one unusually slow change or one unusually long recovery. The 85th-percentile lead time is also retained in the result snapshot because a team can have an acceptable median while a tail of slow changes still deserves review.
Failed deployment recovery time is narrower than broad mean time to recovery. It is tied to deployments that fail and require immediate intervention, so missing recovery timestamps on failed rows should be fixed before a recovery target is treated as met.
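As a rough illustration of the roll-up, the following sketch computes the headline values from rows that have already passed the validity checks and been flagged as failed or rework. The helper names and the nearest-rank percentile method are assumptions for the example, not the tool's implementation.

```python
from statistics import median

def hours(delta) -> float:
    """Convert a timedelta to elapsed hours."""
    return delta.total_seconds() / 3600

def percentile(values: list[float], pct: float) -> float:
    """Nearest-rank percentile; adequate for a small deployment window."""
    ordered = sorted(values)
    rank = max(0, min(len(ordered) - 1, round(pct / 100 * (len(ordered) - 1))))
    return ordered[rank]

def summarize(rows: list[dict], window_days: int) -> dict:
    """Roll validated, classified rows up into the metrics described in this section."""
    lead = [hours(r["deploy_time"] - r["commit_time"]) for r in rows if r.get("commit_time")]
    recovery = [hours(r["recovered_time"] - r["deploy_time"])
                for r in rows if r.get("failed") and r.get("recovered_time")]
    failed = sum(1 for r in rows if r.get("failed"))
    rework = sum(1 for r in rows if r.get("rework"))
    return {
        "deploys_per_week": len(rows) / window_days * 7,        # normalized to calendar weeks
        "median_lead_hours": median(lead) if lead else None,
        "p85_lead_hours": percentile(lead, 85) if lead else None,
        "change_fail_rate_pct": failed / len(rows) * 100 if rows else None,
        "rework_rate_pct": rework / len(rows) * 100 if rows else None,
        "median_recovery_hours": median(recovery) if recovery else None,
    }
```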
Everyday Use & Decision Guide:
Choose one service boundary first. Use Service or team for the application, stream, or team name that should appear in the scorecard and exports, then set Measurement window to match the deployment rows you are reviewing. A 30-day or 90-day service-level window is usually easier to discuss than a mixed organization-wide extract.
Start with the default CSV shape when possible: id, commit_time, deploy_time, status, recovered_time, and change_type. The last two columns are optional, but recovery and rework interpretation becomes weaker when they are missing. Browse CSV can load a CSV or TXT file up to 1 MB, and Normalize rewrites the current valid rows into the expected column order.
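For illustration, a pasted extract in the default column order might look like the following; the ids, timestamps, and statuses are invented for the example.

```csv
id,commit_time,deploy_time,status,recovered_time,change_type
101,2026-01-05T09:10:00,2026-01-05T17:40:00,success,,planned
102,2026-01-07T11:25:00,2026-01-08T08:05:00,failed,2026-01-08T10:35:00,planned
103,2026-01-09T14:00:00,2026-01-09T15:20:00,success,,hotfix
```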
Set targets to match the review conversation. Benchmark profile controls the deployment-frequency target, while Lead time target, Change failure target, Recovery target, and Rework target set the other guardrails. Use Baseline / trend only when the first goal is to establish a starting point without judging frequency yet.
- DORA Scorecard is the first place to compare current values, targets, signal labels, and recommended actions.
- Deployment Ledger shows how each row contributed to frequency, lead time, failure, recovery, or rework metrics.
- Improvement Brief turns missed targets, missing recovery timestamps, parser notes, and rework pressure into prioritized next actions.
- Throughput Cadence Chart compares weekly deployment counts and median lead hours against frequency and lead-time targets.
- Stability Recovery Chart compares weekly change fail rate, rework rate, and median recovery hours.
- JSON keeps the normalized metrics, weekly rows, deployment details, targets, and warnings together for downstream review.
Open Advanced before trusting change fail rate or rework rate when your source system uses custom deployment outcomes. Failure tokens are matched case-insensitively against status and change type, as are rework tokens. Add words such as outage, rollback, emergency, or your release-system status names so failed deployments and incident repair rows are counted consistently.
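The matching behavior can be pictured as a case-insensitive substring check over both fields. The sketch below uses default token lists assembled from the FAQ and glossary entries on this page, and it is an assumption about the mechanism rather than the exact implementation.

```python
# Assumed default token lists; the actual lists are configurable under Advanced.
FAILURE_TOKENS = {"failed", "rollback", "incident", "hotfix", "revert", "outage"}
REWORK_TOKENS = {"hotfix", "emergency", "patch", "unplanned"}

def classify(status: str, change_type: str) -> dict:
    """Case-insensitive token match against both the status and change_type fields."""
    text = f"{status} {change_type}".lower()
    return {
        "failed": any(token in text for token in FAILURE_TOKENS),
        "rework": any(token in text for token in REWORK_TOKENS),
    }

# e.g. classify("Rollback", "planned") -> {"failed": True, "rework": False}
```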
Do not treat an on-target scorecard as proof that delivery health is settled. Check Data quality notes, missing recovery rows, and the ledger before sharing the result, especially when the deployment source was pasted from several systems or cleaned by hand.
Step-by-Step Guide:
Work from service boundary to targets, then use the ledger and brief to verify the result before exporting.
- Enter the application or team in Service or team. The summary heading should update to that name, such as checkout-api DORA Snapshot.
- Set Measurement window to the number of calendar days covered by the deployment rows. The top summary and DORA Scorecard will recalculate deployment frequency per week.
- Choose Benchmark profile and set the lead time, change failure, recovery, and rework targets. Confirm the target column in DORA Scorecard shows the guardrails you intended.
- Paste rows into Deployment rows, use Browse CSV, or press Sample to load the sample window. Each valid row needs a parseable deploy_time.
- If Review DORA input appears, add at least one row with a valid deployment time or fix the measurement window. If Data quality notes appears, fix skipped deployment times, missing commit times, commit times after deployment, or missing recovery timestamps on failed rows.
- Use Advanced to adjust Failure status tokens and Rework tokens when your source system uses different status names.
- Read DORA Scorecard, then confirm row-level evidence in Deployment Ledger. The ledger is where failed, rework, missing lead time, and missing recovery classifications become visible.
- Use Improvement Brief, the two charts, and JSON only after the scorecard and ledger agree with the source records.
Interpreting Results:
Read the scorecard as a comparison against the targets you entered, not as an external certification. On target means the current value satisfies the selected guardrail. Needs review means the current value missed that guardrail or the required data is incomplete. Trend only appears for deployment frequency when the baseline profile has no frequency target.
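One way to picture the three labels is a small guardrail check like the following; the logic is inferred from the description above rather than taken from the tool's code.

```python
from typing import Optional

def scorecard_signal(value: Optional[float], target: Optional[float],
                     lower_is_better: bool = True) -> str:
    """Map a metric value and its guardrail onto the three scorecard labels described above."""
    if target is None:
        return "Trend only"      # e.g. deployment frequency under the baseline profile
    if value is None:
        return "Needs review"    # required data is incomplete
    meets = value <= target if lower_is_better else value >= target
    return "On target" if meets else "Needs review"
```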
The false-confidence risk is clean speed with weak stability. A service can show healthy deployment frequency and median lead time while change fail rate or deployment rework rate is above target. In that case, use Improvement Brief and Deployment Ledger to find the failed or unplanned rows before pushing for more frequent releases.
| Result cue | What it means | What to verify |
|---|---|---|
| throughput aligned | Deployment frequency and median lead time are inside the selected guardrails. | Check whether Change fail rate, Recovery time, and Deployment rework rate are also controlled. |
| stability risk | Change fail rate or failed deployment recovery time needs review. | Use the ledger to identify failed rows, missing recovery timestamps, rollback outcomes, and incident-like status tokens. |
| rework pressure | The share of rows matched by rework tokens is above the selected target. | Confirm the source distinguishes planned releases from unplanned hotfix, emergency, patch, or incident deployment work. |
| Needs data | The metric cannot be judged from the available rows. | Fill missing commit or recovery fields, or narrow the measurement window to rows with complete deployment evidence. |
Use the charts for trend shape, not for final diagnosis. The table rows carry the exact values, while the charts help spot weeks where deployment count, lead time, failure rate, rework rate, or recovery time changed together.
Worked Examples:
Reviewing the sample deployment window:
With the sample data, a 30-day window, the high benchmark profile, 24-hour lead target, 15% change failure target, 4-hour recovery target, and 10% rework target, the summary shows about 1.6/week. DORA Scorecard shows deployment frequency and median lead time on target, with median lead time at about 9 hr. Change fail rate is about 42.9%, and deployment rework rate is about 14.3%, so the brief pushes failure causes and incident rework ahead of cadence changes.
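If the sample window contains seven valid deployment rows, with three matching failure tokens and one matching rework tokens (an assumption inferred from the percentages, not a statement about the actual sample file), the headline numbers can be reproduced with simple arithmetic:

```python
# Back-of-envelope check, assuming 7 valid rows, 3 failure matches, 1 rework match.
deploys, window_days = 7, 30
print(round(deploys / window_days * 7, 1))   # 1.6 deployments per week
print(round(3 / deploys * 100, 1))           # 42.9 % change fail rate
print(round(1 / deploys * 100, 1))           # 14.3 % deployment rework rate
```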
Using baseline mode for a first measurement:
A team with four production deployments in 28 days can choose Baseline / trend only while it cleans up source data. The deployment-frequency row then reports the current value without a frequency pass/fail target. Lead time, change fail rate, recovery time, and rework rate still compare against their entered targets, so reliability problems remain visible during the baseline review.
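In terms of the hypothetical scorecard_signal helper sketched in the interpretation section, the baseline profile corresponds to passing no frequency target, so the frequency value is reported without a pass/fail label:

```python
# Four deployments in 28 days is about 1.0/week; with no frequency target
# the scorecard label stays informational.
print(round(4 / 28 * 7, 1))            # 1.0 deployments per week
print(scorecard_signal(1.0, None))     # "Trend only"
```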
Recovering from missing deployment evidence:
A CSV row with deploy_time blank is skipped, and Data quality notes names the line that could not be parsed. A failed row with no valid recovered_time is kept for change fail rate, but recovery time becomes incomplete and the brief asks for restore timestamps. Fixing those fields and pressing Normalize produces a cleaner ledger for follow-up.
Changing tokens for a custom release system:
If the source uses degraded and repair instead of failed or hotfix, those rows will be undercounted until the advanced token lists include the source terms. Adding degraded to Failure status tokens and repair to Rework tokens makes the scorecard match the release-system language.
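Continuing the hypothetical classify sketch from the decision-guide section, extending the token sets for that release-system language would look like this:

```python
# Extend the assumed token sets so the source terms are matched.
FAILURE_TOKENS.add("degraded")
REWORK_TOKENS.add("repair")
print(classify("degraded", "repair"))   # {"failed": True, "rework": True}
```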
FAQ:
Which CSV columns should I paste?
Use id, commit_time, deploy_time, status, recovered_time, and change_type. The last two are optional, but they are needed for stronger recovery and rework interpretation.
How does the analyzer decide that a deployment failed?
It matches the lowercased status and change_type text against Failure status tokens. Defaults include words such as failed, rollback, incident, hotfix, revert, and outage.
Why is lead time showing as missing or n/a?
Lead time needs a valid commit_time and a valid deploy_time. Rows with no commit timestamp, an unparseable timestamp, or a commit time after deployment time are excluded from median lead time and reported as data quality notes.
Why can recovery be off target when the failures recovered quickly?
Every failed deployment needs a valid recovered_time at or after deploy_time. If any failed row is missing that timestamp, the scorecard treats recovery evidence as incomplete even when the rows that do have recovery times are fast.
What should I do when the page says to review DORA input?
Add at least one deployment row with a valid deploy_time and keep Measurement window greater than zero. If the warning came from a pasted file, remove extra notes or malformed rows and try the normalized CSV shape.
Are deployment rows sent to an analysis service?
No external analysis service is used for the entered deployment rows. The parsing, scorecard, charts, and JSON are generated in the browser from the values you enter, so handle shared URLs and exported files as deployment records.
Glossary:
- DORA: DevOps Research and Assessment, the research program associated with these software delivery performance measures.
- Deployment frequency: the number of production deployments normalized over the selected measurement window.
- Change lead time: the elapsed time from commit to production deployment for rows with valid commit and deployment timestamps.
- Change fail rate: the percentage of valid deployment rows matched by failure status tokens.
- Failed deployment recovery time: the elapsed time from a failed deployment to a valid recovery timestamp.
- Deployment rework rate: the percentage of valid deployment rows matched by rework tokens such as hotfix, emergency, patch, or unplanned.
- Measurement window: the number of calendar days represented by the pasted deployment rows.
References:
- DORA's software delivery performance metrics, DORA and Google Cloud, last updated January 5, 2026.
- A history of DORA's software delivery metrics, DORA and Google Cloud, January 2, 2026.
- Accelerate State of DevOps Report 2024, DORA and Google Cloud, last updated April 13, 2026.