Text Sorter

Introduction:

Text lists become easier to reuse when order, spacing, and separators are consistent. Sorting sounds simple until numbers, accents, articles, duplicate lines, and pasted whitespace start changing what "correct order" actually means. This tool is built for that practical cleanup work.

It is useful when you have a list of filenames, titles, tags, codes, or notes and need a predictable output string rather than another round of manual editing. The package can reorder items alphabetically or naturally, reverse the current sequence, rank by length, or shuffle the list, while also trimming, filtering, numbering, and rejoining the results for whatever comes next.

A familiar example is a copied file list such as `File10`, `File2`, and `File1`. Plain alphabetical comparison often puts `File10` before `File2`, because it compares characters one by one. Natural comparison treats the digit runs as numbers instead, which produces the order most people expect when scanning filenames, versions, or item labels.

The tool also helps when the list is messy rather than merely unsorted. Blank lines, repeated entries, ordered-list prefixes, uneven spacing, and leading articles in titles can all be cleaned before the final output string is assembled. That makes it useful for quick editorial cleanup, data-prep passes, or turning a rough note dump into something safe to paste into a document or script.

What it does not do is act like a full record parser. A `.csv` file is read as plain text and then split by the separator rules you choose, so quoted commas and multi-column table structure are outside the scope of this package. It works best on single-column lists or token streams where each item can be treated as standalone text.

Everyday Use & Decision Guide:

For most lists, the first useful pass is conservative: paste the text, match the `Input separator` to the source, choose `Alphabetical (A-Z)` or `Natural (A-Z)`, and enable `Ignore blank lines` plus `Trim each line` if the source came from email, chat, or a spreadsheet export. That usually removes the noise before you make harder judgment calls about duplicates or locale-specific ordering.

The tool is a strong fit for one-item-per-line cleanup, ad hoc token lists, and filename or title sorting. It is a poor fit for structured CSV records with quoted separators, because the split stage is intentionally simple text splitting rather than full CSV parsing.

  • Leave `Locale` blank unless you need a specific language rule. A fixed locale is useful for reproducible team output, but it can change the order of accented letters.
  • Use `Natural (A-Z)` for names with embedded numbers such as `File2`, `photo12`, or `item100`. Plain alphabetical mode will compare those digit runs as text instead.
  • `Unique` removes duplicates after cleanup and sorting. If `Case sensitive` is off, lines that differ only by case collapse to one item.
  • `Ignore leading articles` changes the comparison key, not the displayed text. Titles still come out with `A`, `An`, or `The` intact.
  • If the red `Invalid regex` message appears, the list is still processed without the regex filter until the pattern is fixed.
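The interaction between `Unique` and `Case sensitive` in the list above can be sketched as follows. This is a minimal illustration, not the tool's code: the helper name `uniqueItems`, the use of `toLowerCase` as the case-folding key, and the choice to keep the first occurrence are all assumptions.

```typescript
// Hypothetical helper: drop duplicates from an already-sorted list.
// Assumption: the first occurrence wins, and case folding uses toLowerCase().
function uniqueItems(sorted: string[], caseSensitive: boolean): string[] {
  const seen = new Set<string>();
  const kept: string[] = [];
  for (const item of sorted) {
    const key = caseSensitive ? item : item.toLowerCase();
    if (!seen.has(key)) {
      seen.add(key);
      kept.push(item);
    }
  }
  return kept;
}

const sample = ["apple", "Apple", "pear"];
console.log(uniqueItems(sample, true));  // ["apple", "Apple", "pear"]
console.log(uniqueItems(sample, false)); // ["apple", "pear"]
```

With case sensitivity off, `apple` and `Apple` share one key, so only one of them survives.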

A good final check is to compare `Sorted Output` with the summary badges. If the item count, unique count, sort mode, and locale badge match what you intended, you are usually ready to copy or export.

Technical Details:

Text sorting is really a question of collation: the rule set used to compare strings. In this package, alphabetical modes use language-sensitive string comparison, and natural modes enable numeric collation so digit runs are compared by value instead of one character at a time. That is why `File2` can sort before `File10` in natural mode, even though a plain lexical pass compares the fifth characters, sees `1` before `2`, and places `File10` first.
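The difference between the two comparison styles can be reproduced with the standard `Intl.Collator` API. This is a minimal sketch that assumes an `en` locale and uses the `numeric` option to stand in for the natural modes; the variable names are illustrative only.

```typescript
// Plain locale-aware comparison: digit characters are compared as text.
const alphabetical = new Intl.Collator("en", { numeric: false });
// Numeric collation: digit runs are compared by value.
const natural = new Intl.Collator("en", { numeric: true });

const files = ["File10", "File2", "File1"];

const alphaOrder = files.slice().sort(alphabetical.compare);
const naturalOrder = files.slice().sort(natural.compare);

console.log(alphaOrder);   // ["File1", "File10", "File2"]
console.log(naturalOrder); // ["File1", "File2", "File10"]
```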

The `Locale` field is intended for BCP 47 language tags such as `en-US`, `fr`, or `de`. If the entered tag cannot be used, the code falls back to the browser default. `Case sensitive` changes three things: the collator sensitivity used for ordering, the plain substring checks used by include and exclude filtering, and the comparison used for duplicate removal. The same toggle therefore affects more than one stage of the pipeline.
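A fallback of this kind might look like the sketch below. The helper name `buildCollator` is hypothetical, and the mapping of `Case sensitive` to the `variant` and `accent` sensitivity levels is an assumption; the text only says that the toggle changes collator sensitivity.

```typescript
// Hypothetical helper: build a collator for a user-entered tag, falling back
// to the runtime default when the tag is blank or rejected.
function buildCollator(localeTag: string, caseSensitive: boolean): Intl.Collator {
  const options: Intl.CollatorOptions = {
    // Assumption: "variant" distinguishes case; "accent" ignores case but keeps accents.
    sensitivity: caseSensitive ? "variant" : "accent",
  };
  const tag = localeTag.trim();
  if (tag === "") return new Intl.Collator(undefined, options);
  try {
    return new Intl.Collator(tag, options);
  } catch {
    // Intl constructors throw a RangeError for structurally invalid tags.
    return new Intl.Collator(undefined, options);
  }
}

// An unusable tag does not stop sorting; the default locale is used instead.
const fallback = buildCollator("not a tag!", false);
console.log(fallback.compare("a", "A")); // 0, because case is ignored
```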

Order of operations matters here. The source text is split first, then each item can be trimmed, stripped of leading list numbers, and normalized for internal whitespace. Blank-line removal and plain-text include or exclude filters happen before sorting. Regex include filtering, when enabled, runs after that cleanup stage. Sorting, reversing, or shuffling comes next, and duplicate removal happens afterward. Only then are optional line numbers, prefixes, suffixes, and the chosen output joiner applied.
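The stage order described above can be sketched as one function. Everything here is illustrative: the option names, the list-number pattern, and the fixed natural sort are assumptions, and decoration (numbers, affixes, joining) is left out because it runs afterward.

```typescript
// Hypothetical option names; the tool's real settings object is not shown in the text.
interface PipelineOptions {
  separator: string;
  trim: boolean;
  stripListNumbers: boolean;
  collapseWhitespace: boolean;
  ignoreBlank: boolean;
  include: string; // plain-text include filter, "" to skip
  exclude: string; // plain-text exclude filter, "" to skip
  pattern: string; // regex include filter, "" to skip
  unique: boolean;
}

function runPipeline(source: string, o: PipelineOptions): string[] {
  // 1. Split the source text first.
  let items = source.split(o.separator);

  // 2. Per-item cleanup: trim, strip list numbers, normalize inner whitespace.
  if (o.trim) items = items.map((s) => s.trim());
  if (o.stripListNumbers) items = items.map((s) => s.replace(/^\d+[.)]\s+/, ""));
  if (o.collapseWhitespace) items = items.map((s) => s.replace(/\s+/g, " "));

  // 3. Blank removal and plain-text filters run before sorting.
  if (o.ignoreBlank) items = items.filter((s) => s.trim() !== "");
  if (o.include !== "") items = items.filter((s) => s.includes(o.include));
  if (o.exclude !== "") items = items.filter((s) => !s.includes(o.exclude));

  // 4. The regex include filter runs after the cleanup stage.
  if (o.pattern !== "") {
    const re = new RegExp(o.pattern);
    items = items.filter((s) => re.test(s));
  }

  // 5. Sort (natural order stands in for whichever mode is chosen).
  const collator = new Intl.Collator("en", { numeric: true });
  const sorted = items.slice().sort(collator.compare);

  // 6. Duplicate removal happens after sorting.
  return o.unique ? sorted.filter((s, i) => sorted.indexOf(s) === i) : sorted;
}

const cleaned = runPipeline("2. pear\n1. apple\n\n3. pear\n  green   fig  ", {
  separator: "\n", trim: true, stripListNumbers: true, collapseWhitespace: true,
  ignoreBlank: true, include: "", exclude: "", pattern: "", unique: true,
});
console.log(cleaned); // ["apple", "green fig", "pear"]
```

Because list numbers are stripped before sorting and duplicates are removed after it, `2. pear` and `3. pear` collapse into a single `pear`.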

Sort modes and comparison behavior in the text sorter

| Mode | Rule | Important Detail |
| --- | --- | --- |
| Alphabetical (A-Z/Z-A) | Locale-sensitive string comparison | Compares digit characters as text, not as whole numbers. |
| Natural (A-Z/Z-A) | Locale-sensitive comparison with numeric collation | Treats embedded digit runs as numbers, so `2` comes before `10`. |
| Character length | Shortest item first, then alphabetical tie-break | Useful for trimming labels or spotting unusually long entries. |
| Reverse | Flips the current processed sequence | It is not a descending alphabetical pass; it simply reverses item order. |
| Shuffle | Randomized order | A seed triggers deterministic shuffling; no seed means the order is not reproducible. |
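The `Character length` and `Reverse` rows can be illustrated with a short sketch. The function names are hypothetical, and measuring length with `.length` (UTF-16 code units) is an assumption about how characters are counted.

```typescript
const byText = new Intl.Collator("en");

// Character length: shortest item first, alphabetical tie-break.
// Assumption: length is measured in UTF-16 code units via .length.
function sortByLength(items: string[]): string[] {
  return items.slice().sort((a, b) => a.length - b.length || byText.compare(a, b));
}

// Reverse: flips whatever order is current, without comparing items at all.
function reverseOrder(items: string[]): string[] {
  return items.slice().reverse();
}

const labels = ["pear", "fig", "banana", "kiwi"];
console.log(sortByLength(labels)); // ["fig", "kiwi", "pear", "banana"]
console.log(reverseOrder(labels)); // ["kiwi", "banana", "fig", "pear"]
```

Note that the reversed list is not in descending alphabetical order; it is only the input sequence read backward.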

Seeded shuffle is implemented as a deterministic shuffle pass driven by a 32-bit seed derived from the supplied text. That makes it useful for repeatable editorial or workflow ordering, because the same input and the same seed recreate the same sequence. It should not be treated as a cryptographic draw or an audit-grade randomization method.
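One way to build a shuffle of this shape is sketched below. The specific pieces are assumptions chosen for illustration, not the tool's actual algorithm: an FNV-1a-style hash folds the seed text into 32 bits, the small `mulberry32` generator produces the random stream, and a Fisher-Yates pass does the shuffling. The resulting order will therefore not necessarily match the tool's output for the same seed.

```typescript
// FNV-1a-style fold of the seed text into an unsigned 32-bit integer (assumed hash).
function seedFromText(text: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0;
}

// mulberry32: a small deterministic PRNG returning floats in [0, 1) (assumed generator).
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Fisher-Yates shuffle driven by the seeded generator; the input is not mutated.
function seededShuffle<T>(items: T[], seedText: string): T[] {
  const rand = mulberry32(seedFromText(seedText));
  const out = items.slice();
  for (let i = out.length - 1; i > 0; i--) {
    const j = Math.floor(rand() * (i + 1));
    const tmp = out[i];
    out[i] = out[j];
    out[j] = tmp;
  }
  return out;
}

const names = ["Alice", "Bob", "Carol", "Dave"];
const firstRun = seededShuffle(names, "draft-1");
const secondRun = seededShuffle(names, "draft-1");
console.log(firstRun, secondRun); // the same order on both runs
```

None of these components is cryptographically secure, which matches the caution above about not treating the result as an audit-grade draw.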

Decoration happens after ordering. When `Line numbers` is enabled, numbers are zero-padded to the width of the final item count. Prefixes and suffixes are then attached to each item, and the list is joined with the selected `Output separator`. `Newline style` matters only when the joiner itself is newline; if you switch the output separator to comma, semicolon, space, or tab, the newline setting has no effect on the exported text.
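The decoration stage can be sketched as follows. The option names, the `. ` placed after each number, and the choice to wrap the numbered item with the prefix and suffix are assumptions; the text only fixes the zero-padding rule, the order (numbers first, affixes second), and the newline behavior.

```typescript
// Hypothetical option names; ". " after the number is also an assumption.
interface DecorateOptions {
  lineNumbers: boolean;
  prefix: string;
  suffix: string;
  joiner: string;          // "\n", ",", ";", " ", or "\t"
  newline: "\n" | "\r\n";  // only consulted when joiner is "\n"
}

function decorateItems(items: string[], o: DecorateOptions): string {
  // Pad to the number of digits in the final item count (10 items gives width 2).
  const width = String(items.length).length;
  const lines = items.map((item, i) => {
    const numbered = o.lineNumbers
      ? String(i + 1).padStart(width, "0") + ". " + item
      : item;
    return o.prefix + numbered + o.suffix;
  });
  // The newline style applies only when the joiner itself is a newline.
  const sep = o.joiner === "\n" ? o.newline : o.joiner;
  return lines.join(sep);
}

const decorated = decorateItems(["File1", "File2", "File10"], {
  lineNumbers: true, prefix: "- ", suffix: ";", joiner: "\n", newline: "\n",
});
console.log(decorated);
// - 1. File1;
// - 2. File2;
// - 3. File10;
```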

The four output surfaces describe different stages of the same run, which is why their numbers do not always match exactly. `Sorted Output` is the final decorated string. `Text Stats` combines raw-input counts with processed-output counts. `Character Counts` measures the sorted item list before numbering, prefixes, suffixes, and final joining. `JSON` records the settings and the processed `output` array, which means it is good for auditing the run but not identical to the final pasted text when decorations are enabled.

Output surfaces in the text sorter

| Surface | What It Represents | Best Use |
| --- | --- | --- |
| Sorted Output | The final joined text after numbering and affixes | Copying into documents, scripts, prompts, or forms |
| Text Stats | Source and processed item counts | Checking how much cleanup changed the list |
| Character Counts | Length and word metrics for the sorted item array | Estimating size, density, or verbosity |
| JSON | Inputs, counts, and processed `output` array | Auditing settings or passing a clean array downstream |

Files are read locally in the browser and newlines are normalized on load. No network request is required for sorting, statistics, counts, copying, or downloads. A dropped or uploaded `.txt` or `.csv` file is simply another source of plain text for the same pipeline.
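Newline normalization on load could be as small as the sketch below. The helper name is hypothetical, and the assumption is that Windows (`\r\n`) and old Mac (`\r`) line endings are both folded to `\n`; in a browser the text itself might come from something like the standard `Blob.text()` method.

```typescript
// Hypothetical helper: fold CRLF and lone CR line endings into LF.
function normalizeNewlines(text: string): string {
  return text.replace(/\r\n?/g, "\n");
}

const loaded = normalizeNewlines("File10\r\nFile2\rFile1\n");
console.log(loaded.split("\n")); // ["File10", "File2", "File1", ""]
```

The trailing empty string is the kind of item that `Ignore blank lines` removes later in the pipeline.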

Step-by-Step Guide:

This flow gives you a clean output without guessing which setting changed the list.

  1. Paste the source into `Text`, or use `Upload file` to load a `.txt` or `.csv` file. Dropped files are read locally and newlines are normalized automatically.
  2. Set `Input separator` to match the source. Use `New Line` for one-item-per-line lists and switch to comma, semicolon, space, or tab only when the source is truly token-based.
  3. Choose `Sort mode`. `Natural (A-Z)` is the safest first pass for filenames and version-like labels, `Alphabetical` is best for plain words, and `Shuffle` plus `Shuffle seed` is the path for repeatable randomized order.
  4. Open `Advanced` and apply cleanup toggles such as `Trim each line`, `Ignore blank lines`, `Unique`, `Ignore leading articles`, `Line numbers`, `Prefix to add`, and `Suffix to add`.
  5. If you need filtering, use `Include only items containing`, `Exclude items containing`, or turn on `Regex filter`. If `Invalid regex` appears under `Pattern`, fix the pattern or `Flags` before trusting the filtered set.
  6. Read `Sorted Output` first. It shows the exact text that `Copy Sorted` and `Download Sorted` will export, including numbering, affixes, and the selected joiner.
  7. Open `Text Stats`, `Character Counts`, or `JSON` when you need counts, audit detail, or a machine-friendly array, then export from the tab that matches the job.

When the final text looks right in `Sorted Output` and the badges reflect the sort mode and locale you meant to use, the result is ready to copy or export.

Interpreting Results:

Treat `Sorted Output` as the authoritative final text. It is the only surface that includes zero-padded line numbers, prefixes, suffixes, and the chosen output separator in one place.

  • `Text Stats` combines raw-source counts with processed counts, so it tells you how much cleanup changed the list rather than how the final pasted string is decorated.
  • `Character Counts` measures the sorted item list before numbering, prefixes, suffixes, and final joining. It is useful for size checks, but it is not a byte-for-byte description of `Sorted Output`.
  • `JSON` is an audit of settings and processed items, not a mirror of the decorated output text. If you need the exact pasted string, use `Sorted Output` instead.
  • Locale-sensitive ordering can change the placement of accented letters, and unseeded shuffle cannot be reproduced exactly. If those details matter, verify a few edge items before copying the result.

When reproducibility matters, save the output together with the chosen `Locale`, the `Case sensitive` setting, and any `Shuffle seed` you used.

Worked Examples:

Sorting filenames the way people read them

Paste `File10`, `File2`, and `File1` on separate lines, choose `Natural (A-Z)`, and leave the default newline output. `Sorted Output` becomes `File1`, `File2`, `File10`, while `Text Stats` shows `Output items` `3`. That is the right choice when numbers inside labels should behave like numbers instead of plain characters.

Recreating the same shuffled list

Paste `Alice`, `Bob`, `Carol`, and `Dave`, switch `Sort mode` to `Shuffle`, and enter `draft-1` as `Shuffle seed`. The processed order becomes `Bob`, `Alice`, `Carol`, `Dave`, and that same sequence reappears in `Sorted Output` and the `output` array in `JSON` every time you rerun the same source with the same seed. This is useful for review rotation, turn-taking lists, or any case where you want a shuffled order without losing repeatability.

Catching a broken regex filter

Paste `ERR42` and `OK17`, enable `Regex filter`, and enter `[` as the `Pattern`. The red `Invalid regex` message appears, and the list stays on the pre-regex output because no valid regex could be applied. Change the pattern to `^ERR` and the filtered result immediately narrows `Sorted Output` to `ERR42`, which confirms the problem was the pattern syntax, not the sorting stage.
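The fallback behavior in this example can be sketched as below. The function names and the returned shape are hypothetical; the sketch only assumes that an uncompilable pattern leaves the list unfiltered and surfaces an error message.

```typescript
// Hypothetical helper: compile a user pattern, or return null if it is invalid.
function tryCompile(pattern: string, flags: string): RegExp | null {
  try {
    return new RegExp(pattern, flags);
  } catch {
    return null; // e.g. "[" throws a SyntaxError
  }
}

// Apply the regex include filter, or pass the list through untouched on error.
function regexFilter(
  items: string[],
  pattern: string,
  flags: string,
): { items: string[]; error: string } {
  const re = tryCompile(pattern, flags);
  if (re === null) return { items, error: "Invalid regex" };
  const kept = items.filter((s) => {
    re.lastIndex = 0; // keep the test stateless if the "g" or "y" flag is set
    return re.test(s);
  });
  return { items: kept, error: "" };
}

console.log(regexFilter(["ERR42", "OK17"], "[", ""));
// { items: ["ERR42", "OK17"], error: "Invalid regex" }
console.log(regexFilter(["ERR42", "OK17"], "^ERR", ""));
// { items: ["ERR42"], error: "" }
```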

FAQ:

How do I make File2 sort before File10?

Use `Natural (A-Z)` or `Natural (Z-A)`. Those modes compare digit runs by value instead of comparing each digit as plain text.

What should I put in the Locale field?

Use a BCP 47 language tag such as `en-US`, `fr`, or `de` when you need language-specific ordering. Leaving it blank uses the browser default locale.

What does Ignore leading articles really change?

It changes the comparison key used for sorting, so a title like `The Trial` is ordered by `Trial`. The displayed output text still includes the original article.
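A comparison key of this kind can be sketched as follows. The article pattern (English `A`, `An`, `The`, matched case-insensitively) and the names used are assumptions for illustration.

```typescript
// Assumed pattern: a leading English article followed by whitespace.
const LEADING_ARTICLE = /^(?:a|an|the)\s+/i;

// The key is used only for comparison; the original title is what gets output.
function articleKey(title: string): string {
  return title.replace(LEADING_ARTICLE, "");
}

const titleCollator = new Intl.Collator("en");
const titles = ["The Trial", "A Room with a View", "Ulysses", "An Ideal Husband"];

const orderedTitles = titles
  .slice()
  .sort((a, b) => titleCollator.compare(articleKey(a), articleKey(b)));

console.log(orderedTitles);
// ["An Ideal Husband", "A Room with a View", "The Trial", "Ulysses"]
```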

Why did my CSV rows break apart?

This package splits plain text by the separator you choose. It does not preserve quoted CSV cells or multi-column record structure, so it works best on single-column or token-based lists.
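A short sketch shows why quoted cells break apart under plain separator splitting. The sample row is made up for illustration.

```typescript
// A CSV row with two quoted cells that contain commas (illustrative data).
const csvRow = '"Smith, John",42,"Paris, France"';

// Plain text splitting ignores the quotes and cuts at every comma.
const csvTokens = csvRow.split(",");

console.log(csvTokens.length); // 5, not the 3 cells a CSV parser would return
console.log(csvTokens);
// ['"Smith', ' John"', '42', '"Paris', ' France"']
```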

Why does JSON not always match Sorted Output exactly?

The JSON export stores the processed `output` array before line numbers, prefixes, suffixes, and final joining are applied. `Sorted Output` is the decorated paste-ready string.

Are uploaded files sent anywhere?

No. The file reader runs in the browser, the text is processed locally, and downloads are generated without a server-side step.

Glossary:

  • Collation: The rule set used to compare strings for ordering.
  • Natural sort: String ordering that treats embedded digit runs as numeric values.
  • BCP 47 tag: A locale identifier such as `en-US` or `de`.
  • Seeded shuffle: A reproducible randomized order driven by the same seed text.