Text Diff Comparator

Introduction

Text comparison is a line-by-line way to see what changed between an original version and a revised version. It is useful for release notes, configuration snippets, data extracts, policy text, logs, and short code-like fragments where a small insertion or deletion can change the meaning of the whole document.

A good diff does more than say that two blocks are different. It separates replacement lines from added lines, removed lines, unchanged context, and lines that only match because comparison rules ignored case or spacing. That distinction helps a reviewer decide whether a change is substantive, cosmetic, or a sign that the comparison settings are hiding something important.

Figure: Original and revised line blocks, with removed, added, and equal lines marked for a unified diff hunk.

Line comparison is not the same as understanding intent. A moved paragraph may appear as one removal and one addition. A case-only change may disappear when case is ignored. A blank line can be meaningful in a patch, a configuration file, or a Markdown document, but it can also be noise in copied prose. The useful review question is whether the current comparison rules match the way the text will be used.

Unified diff notation gives reviewers a compact, familiar record of changes. Removed lines are prefixed with a minus sign, added lines with a plus sign, and unchanged context lines keep the surrounding text visible so the change can be read in place.

Technical Details:

A line-oriented diff first splits each text block into ordered line records. The comparison then looks for the shortest sequence of insertions and deletions that transforms the original sequence into the revised sequence while preserving matching lines as anchors.

The matching pass works on normalized line keys, not always the visible line text. When comparison options are off, the key is the exact line content after newline handling. When options are on, leading and trailing spaces can be trimmed, runs of whitespace can be collapsed, uppercase and lowercase can be treated as equal, and blank-only lines can be removed before matching.
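
The key-building rules above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual code: the names `line_key` and `split_lines`, the order in which the options are applied, and the choice to keep original line numbers after blank-line removal are all assumptions made for the example.

```python
import re

def line_key(text: str, trim: bool = False, collapse: bool = False,
             ignore_case: bool = False) -> str:
    """Return the comparison key for one line under the active options."""
    key = text
    if trim:
        key = key.strip(" \t")             # drop leading and trailing spaces/tabs
    if collapse:
        key = re.sub(r"[ \t]+", " ", key)  # runs of spaces and tabs become one space
    if ignore_case:
        key = key.lower()                  # compare case-insensitively
    return key

def split_lines(block: str, ignore_blank: bool = False) -> list:
    """Split a block into (line_number, text) records.

    Carriage-return endings are normalized first. Assumption: records keep
    their original line numbers even when blank-only lines are dropped.
    """
    normalized = block.replace("\r\n", "\n").replace("\r", "\n")
    records = list(enumerate(normalized.split("\n"), start=1))
    if ignore_blank:
        records = [(n, t) for n, t in records if t.strip() != ""]
    return records
```

With every option off, `line_key` returns the line unchanged, which corresponds to exact matching. The visible text is never modified; only the key used for matching changes.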

Transformation Core:

Text diff transformation stages
Stage | Rule | Result to Check
Line records | Carriage-return line endings are normalized to newline breaks, then each side is split into numbered lines. | Original lines compared and Revised lines compared show how many lines remain after blank-line removal, if used.
Comparison keys | Each line receives a key after the active trim, whitespace, and case options are applied. | Ignored option matches counts rows that matched by key while their visible text still differed.
Edit path | A Myers-style shortest edit path finds equal, inserted, and deleted line positions between the two key sequences. | Material changes counts changed, added, and removed rows after the active options.
Change pairing | Adjacent deletions and insertions are paired as replacements where possible; leftover deletions become removed rows and leftover insertions become added rows. | Change Ledger shows changed, added, removed, and ignored rows with original and revised line numbers.
Unified diff | Changed pairs render as one removed line and one added line. Context lines are grouped around each material change. | Unified diff hunks reports the number of grouped change regions.
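
To make the edit-path stage concrete, the sketch below computes a shortest edit script between two key sequences. It uses a plain longest-common-subsequence table rather than the Myers algorithm itself. That swap is deliberate: both approaches yield a minimal number of insertions and deletions, but the table version is easier to read and slower on large inputs. The function name `edit_ops` and the tuple layout are hypothetical.

```python
from typing import List, Optional, Tuple

Op = Tuple[str, Optional[int], Optional[int]]

def edit_ops(a: List[str], b: List[str]) -> List[Op]:
    """Return (kind, index_in_a, index_in_b) steps that turn a into b.

    Stand-in for a Myers-style search: an LCS table gives the same
    minimal count of insertions and deletions.
    """
    n, m = len(a), len(b)
    # lcs[i][j] holds the LCS length of a[i:] and b[j:].
    lcs = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                lcs[i][j] = lcs[i + 1][j + 1] + 1
            else:
                lcs[i][j] = max(lcs[i + 1][j], lcs[i][j + 1])

    ops: List[Op] = []
    i = j = 0
    while i < n and j < m:
        if a[i] == b[j]:
            ops.append(("equal", i, j))
            i, j = i + 1, j + 1
        elif lcs[i + 1][j] >= lcs[i][j + 1]:
            ops.append(("delete", i, None))  # prefer deletes first so pairs stay adjacent
            i += 1
        else:
            ops.append(("insert", None, j))
            j += 1
    ops.extend(("delete", k, None) for k in range(i, n))
    ops.extend(("insert", None, k) for k in range(j, m))
    return ops
```

Emitting deletions before insertions at each tie keeps a delete run directly ahead of its matching insert run, which is what lets a later pass pair them as replacements.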

Changed line pairs also receive a token-level review note. Whitespace and non-whitespace runs are compared as tokens, so a replacement can report how many tokens were removed or added. Very long line pairs skip that token summary after 600 combined tokens and instead show a character-count note, which keeps large rows from turning a quick review into a slow token audit.
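
A token-level note of this kind might be produced as follows. The sketch assumes the cutoff applies when the combined count is strictly greater than 600, and it uses Python's `difflib.SequenceMatcher` for token alignment, which is not necessarily how the tool aligns tokens. The wording of the returned notes is also invented for illustration.

```python
import difflib
import re

TOKEN_LIMIT = 600  # combined-token threshold; ">" vs ">=" is an assumption

def tokenize(line: str) -> list:
    """Split a line into alternating whitespace and non-whitespace runs."""
    return re.findall(r"\s+|\S+", line)

def token_note(old: str, new: str) -> str:
    """Summarize a changed line pair by tokens, or by characters when too long."""
    old_tokens, new_tokens = tokenize(old), tokenize(new)
    if len(old_tokens) + len(new_tokens) > TOKEN_LIMIT:
        # Long pairs skip token alignment and report sizes only.
        return f"{len(old)} -> {len(new)} characters"
    matcher = difflib.SequenceMatcher(None, old_tokens, new_tokens, autojunk=False)
    removed = added = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            removed += i2 - i1
        if tag in ("replace", "insert"):
            added += j2 - j1
    return f"{removed} tokens removed, {added} tokens added"
```

Because whitespace runs count as tokens, inserting one word between two existing words adds two tokens: the word and the space that separates it.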

Comparison options and their effect on diff interpretation
Control Comparison Effect Interpretation Caution
Context lines Uses 0 to 20 unchanged lines around each material change in the unified diff. Lower values make compact hunks; higher values help reviewers see surrounding text.
Trim line edges Ignores leading and trailing whitespace while matching lines. Indented code, list nesting, or fixed-width text may treat those spaces as meaningful.
Collapse whitespace Treats runs of spaces and tabs as a single space before matching. Useful for prose cleanup, risky when alignment or spacing carries meaning.
Ignore case Compares lowercased line keys while keeping original casing in the result rows and exports. Case changes can matter in identifiers, environment variables, product names, and code.
Ignore blank lines Removes blank-only lines before matching and omits them from aligned review rows. Blank-line changes will not appear as additions or removals under this option.

The unified diff output begins with original and revised headers. Each hunk header uses old and new line ranges, then the body marks removed lines with -, added lines with +, and context or ignored-match lines with a leading space.
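
For readers who want to reproduce this output shape outside the page, Python's standard `difflib.unified_diff` emits the same notation. The sketch below is an approximation under stated assumptions: the header labels `original` and `revised` are placeholders, the clamp to 0 through 20 mirrors the context control described here, and `difflib` may group or align lines differently from the tool.

```python
import difflib

def unified(original: str, revised: str, context: int = 3) -> str:
    """Render a unified diff with a clamped number of context lines."""
    context = max(0, min(20, context))  # mirror the 0-20 context range
    old_lines = original.splitlines()
    new_lines = revised.splitlines()
    diff = difflib.unified_diff(
        old_lines, new_lines,
        fromfile="original", tofile="revised",  # placeholder header labels
        n=context, lineterm="",
    )
    return "\n".join(diff)

if __name__ == "__main__":
    print(unified("alpha\nbeta\ngamma", "alpha\nBETA\ngamma\ndelta"))
```

Each hunk header has the form `@@ -old_start,old_count +new_start,new_count @@`, so a reviewer can map any hunk back to line positions in both inputs.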

Worked transformation path:
original lines + revised lines
  -> normalize line keys from active options
  -> find equal, insert, and delete operations
  -> pair adjacent delete/insert runs as changed rows
  -> count changed, added, removed, equal, and ignored rows
  -> build unified diff hunks, review tables, warnings, and JSON summary
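
The pairing and counting steps in this path can be sketched as below. The example leans on `difflib.SequenceMatcher` for alignment, so it is a stand-in rather than the Myers-style search named earlier, and the function name `classify` and the row layout are hypothetical. A row is labeled ignored when its keys match but its visible text does not.

```python
import difflib
from collections import Counter

def classify(old, new, key=lambda s: s):
    """Return (kind, old_line_no, new_line_no) rows for two line lists."""
    old_keys = [key(s) for s in old]
    new_keys = [key(s) for s in new]
    matcher = difflib.SequenceMatcher(None, old_keys, new_keys, autojunk=False)
    rows = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            for i, j in zip(range(i1, i2), range(j1, j2)):
                # Keys matched; check whether the visible text also matched.
                kind = "equal" if old[i] == new[j] else "ignored"
                rows.append((kind, i + 1, j + 1))
            continue
        # Pair adjacent deletions and insertions as replacements.
        paired = min(i2 - i1, j2 - j1)
        for k in range(paired):
            rows.append(("changed", i1 + k + 1, j1 + k + 1))
        for i in range(i1 + paired, i2):
            rows.append(("removed", i + 1, None))
        for j in range(j1 + paired, j2):
            rows.append(("added", None, j + 1))
    return rows

def material_changes(rows) -> int:
    """Sum of changed, added, and removed rows."""
    counts = Counter(kind for kind, _, _ in rows)
    return counts["changed"] + counts["added"] + counts["removed"]
```

Passing `key=str.lower` approximates the Ignore case option: rows that differ only by case move from changed to ignored and drop out of the material-change total.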

Warnings are review signals, not failures. Empty inputs, very large line counts, large aligned tables, blank-line removal, and active ignore options all deserve attention because they change how much of the visible text is being compared or how easy the result is to inspect.

Everyday Use & Decision Guide:

Start with exact line matching and 3 context lines. Paste the baseline into Original text and the candidate version into Revised text, or load a text-like file into each textarea. That first pass is the safest review because spacing, case, and blank lines remain visible as changes.

Use ignore options only for a specific reason. Trim line edges is useful when copied text picked up stray spaces. Collapse whitespace can quiet formatting churn in prose. Ignore case helps when capitalization is not meaningful. Ignore blank lines is best reserved for documents where vertical spacing is not part of the meaning.

  • Use Unified Diff when you need a compact change artifact with hunk context.
  • Use Change Ledger when you only want rows that changed, were added, were removed, or matched because options ignored a difference.
  • Use Pair Review when you need the full aligned view, including exact equal rows.
  • Use Match Audit before sharing results; it names active options, counts compared lines, reports hunks, and lists warnings.
  • Use JSON when a handoff needs the current summary, options, counts, unified diff text, changes, and aligned rows together.

Stop and verify when Ignored option matches is greater than zero. Those rows are treated as equal only after normalization, so the visible text still differs. That can be exactly right for copied prose, but it can hide a real change in source-like text.

Local files are limited to TXT, MD, CSV, JSON, LOG, DIFF, or PATCH-style text and must be under 2 MB. If a file is rejected, paste a smaller text extract or save the source as a supported text format before comparing again.
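
A pre-check along these lines could decide whether a dropped file is accepted. This is an assumption-heavy sketch: it judges the type by file extension alone, treats 2 MB as 2 × 1024 × 1024 bytes, and invents the function name and messages. The tool may apply different rules, such as checking the reported MIME type.

```python
from pathlib import PurePath
from typing import Optional

SUPPORTED = {".txt", ".md", ".csv", ".json", ".log", ".diff", ".patch"}
MAX_BYTES = 2 * 1024 * 1024  # assumption: "2 MB" read as 2 MiB

def rejection_reason(name: str, size_bytes: int) -> Optional[str]:
    """Return None when the file is acceptable, else a short reason."""
    suffix = PurePath(name).suffix.lower()
    if suffix not in SUPPORTED:
        return "Unsupported type: use TXT, MD, CSV, JSON, LOG, DIFF, or PATCH text."
    if size_bytes >= MAX_BYTES:  # the limit is "under 2 MB", so equal is rejected
        return "File too large: use a text file smaller than 2 MB."
    return None
```

Returning a reason instead of raising an error matches the behavior described here, where a rejected file produces a status message and leaves the comparison text unchanged.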

Step-by-Step Guide:

Use the controls in a review-friendly order so each result tab has a clear source.

  1. Paste the baseline into Original text, or choose Browse original and load a supported text file under 2 MB.
  2. Paste the candidate version into Revised text, or choose Browse revised. The summary should change from Add text to a line-count comparison.
  3. Leave the Advanced options off for the first pass unless the review brief explicitly says to ignore spacing, case, or blank lines.
  4. Set Context lines between 0 and 20. Use 0 for compact hunks or keep 3 when reviewers need nearby unchanged lines.
  5. Read the summary badges for changed, added, removed, equal, and ignored counts. If both inputs are empty, Match Audit reports the empty-input warning.
  6. Open Unified Diff for the hunk view, then open Change Ledger to inspect each non-equal row with original and revised line numbers.
  7. Open Match Audit before copying results. Resolve warnings about active ignore options, blank-line removal, or very large tables if they would affect the review.
  8. Copy or download the diff, table, or JSON result only after the line counts and warning cues match the comparison you intended to run.

Interpreting Results:

Material changes is the main count for review work. It is the sum of changed, added, and removed rows under the active options. A result of No material changes means no changed, added, or removed rows remain under the current comparison rules; it does not mean the visible text is identical when ignored matches are present.

Changed means an original line and revised line were paired as a replacement. Added means a line exists only in the revised text. Removed means a line exists only in the original text. Equal means the visible line text matched exactly. Ignored means the row matched only after trim, whitespace, case, or blank-line settings changed the comparison.

Use Match Audit as the confidence check. If Comparison options says exact line matching and Ignored option matches is 0, the visible text and the comparison keys agree. If warnings list active ignore options, reread the relevant rows before sharing the diff as evidence.

Worked Examples:

Release checklist update. The sample original has 6 lines and the sample revised version has 7. Exact matching produces 3 changes: two changed line pairs for the smoke-test and notification text, plus one added line for launch metrics. Unified diff hunks reports one grouped change region with nearby checklist context.

Whitespace and case cleanup. If "Status: Ready" becomes "status: ready" with lowercase letters and extra spacing, exact matching reports a changed row. Turning on Trim line edges, Collapse whitespace, and Ignore case can make the summary read No material changes, but Ignored option matches still shows that visible text differed before normalization.

Rejected file during review. Dropping a PDF, image, or large text export into a textarea does not create comparison rows. The status message asks for TXT, MD, CSV, JSON, LOG, DIFF, or PATCH text, or for a text file smaller than 2 MB. Use a supported text extract, then confirm that Original lines compared and Revised lines compared are no longer zero.

FAQ:

Why does a changed paragraph appear as removed and added lines?

Unified diff notation represents a replacement as the old line removed and the revised line added. Change Ledger pairs those rows as Changed so you can review the replacement with both line numbers together.

Why do I see no material changes when the text looks different?

One or more ignore options probably normalized the difference. Check Ignored option matches and Comparison options in Match Audit, then turn options off if casing, spacing, or blank lines should count.

How much context should I use?

Use 3 context lines for ordinary review, 0 for compact output, and a higher value when nearby unchanged lines are needed to understand a change. The control accepts values from 0 to 20.

What file types can I load?

Use Browse original, Browse revised, or drag and drop for TXT, MD, CSV, JSON, LOG, DIFF, or PATCH-style text files under 2 MB. Unsupported types show a message instead of changing the comparison text.

Does pasted text leave the browser for comparison?

The comparison work shown here runs in the browser session after the page loads. Pasted text and supported local file content are read there to build the summary, unified diff, tables, and JSON view.

Glossary:

Unified diff
A compact diff format that marks removed lines with -, added lines with +, and surrounding context with a leading space.
Hunk
A grouped region of nearby line changes plus the context lines around them.
Edit path
The sequence of equal, inserted, and deleted positions used to transform the original line sequence into the revised sequence.
Normalization
The comparison-time adjustment that can trim edge spaces, collapse whitespace, or ignore case before matching lines.
Ignored match
A row that matched after active comparison options even though the visible original and revised text differed.
