Tab-Separated Values (TSV) Converter
Convert TSV data online into JSON, SQL, XML, Markdown, CSV, DOCX, and XLSX with header checks, typed profiling, and safer audit work before import.
Introduction
Tab-separated values, usually shortened to TSV, store one record per line and split fields with tab characters. That plain-text shape makes TSV easy to move between spreadsheets, scripts, and import jobs, but it also makes the structure easy to damage. One missing tab, one extra line break, or one wrong header decision can shift every field that follows.
The narrow TSV definition is stricter than many files people receive in practice. A baseline TSV file is expected to keep the same field count on every row, treat the first line as field names, and avoid literal tabs inside field values. Real exports often drift away from that ideal. A file may still arrive with a TSV label even when it contains blank lines, duplicated headings, quoted cells, or identifier columns that should stay text.
This converter is most useful in that inspection stage before a bad parse turns into a bad handoff. Paste the text or drop a .tsv or .txt file and the page immediately shows row count, column count, header status, delimiter choice, dominant column types, preview limits, and structural warnings. That gives you a quick read on whether the table is trustworthy enough to export.
If the structure looks right, the same parsed grid can then be copied or downloaded as JSON, JSON Lines, SQL, XML, HTML, Markdown, CSV, DOCX, or XLSX. You can also separate visible column labels from machine-facing keys, which helps when the source headings are blank, duplicated, or awkward to reuse in automation and database work.
Parsing, previewing, and text-based exports stay in the browser. The DOCX and XLSX paths load browser-side libraries only when you request those formats, then generate files from the current parsed rows. For sensitive data, the main boundary to think about is your browser session plus any file you choose to download or share afterward.
Technical Details
The page starts by normalizing line endings so pasted Windows, Unix, and mixed-text exports behave the same way. Delimiter handling is explicit when you choose comma, semicolon, tab, pipe, space, or a short custom separator. In automatic mode, the parser samples the opening non-empty lines and scores common delimiter candidates by how consistently they produce multi-field rows. That means the tool can recover from files that are called TSV but were actually exported with another separator.
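The automatic delimiter step can be approximated in a few lines. The following is a minimal sketch, not the page's actual scoring code: the candidate list, the sample size, and the names `normalize_newlines` and `guess_delimiter` are assumptions made for illustration. It scores each candidate by the share of sampled lines that agree on the same multi-field width.

```python
from collections import Counter

# Hypothetical candidate list; the page's real candidates may differ.
CANDIDATES = ["\t", ",", ";", "|"]


def normalize_newlines(text: str) -> str:
    """Map Windows (CRLF) and old Mac (CR) line endings to LF."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


def guess_delimiter(text: str, sample_size: int = 10) -> str:
    """Pick the candidate that most consistently yields multi-field rows."""
    lines = [ln for ln in normalize_newlines(text).split("\n") if ln.strip()]
    lines = lines[:sample_size]
    best, best_score = "\t", 0.0  # fall back to tab when nothing scores
    for delim in CANDIDATES:
        widths = [ln.count(delim) + 1 for ln in lines]
        multi = [w for w in widths if w > 1]
        if not multi:
            continue
        # Share of sampled lines that agree on the most common width.
        _, hits = Counter(multi).most_common(1)[0]
        score = hits / len(lines)
        if score > best_score:
            best, best_score = delim, score
    return best


mislabeled = "a;b;c\r\n1;2;3\r\n4;5;6\r\n"
print(repr(guess_delimiter(mislabeled)))
```

This simple count ignores quoting, so a delimiter that appears inside quoted text can skew the score. Treat the guess as a starting point and confirm it against the preview.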
Parsing itself is character-based rather than a naive split on tabs. You can choose double quotes, single quotes, or no quote handling at all, and you can decide whether embedded quote characters are escaped by doubling them or by using backslashes. That matters when a cell contains a delimiter-like character or an embedded newline. If a quoted field never closes, the run stops with a specific end-of-file quote error instead of guessing where the row should end.
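To make the difference from a naive split concrete, here is a small character-by-character parser. It is a sketch under stated assumptions: line endings are already normalized to LF, embedded quotes are escaped by doubling, and the function name `parse_delimited` and its error message are hypothetical rather than taken from the page.

```python
def parse_delimited(text, delimiter="\t", quote='"'):
    """Parse delimited text one character at a time.

    Delimiters and newlines inside a quoted field are kept as content,
    and a doubled quote inside a quoted field becomes one literal quote.
    """
    rows, row, field = [], [], []
    in_quotes = False
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if in_quotes:
            if ch == quote:
                if i + 1 < n and text[i + 1] == quote:
                    field.append(quote)  # doubled quote -> literal quote
                    i += 1
                else:
                    in_quotes = False  # closing quote
            else:
                field.append(ch)
        elif ch == quote and not field:
            in_quotes = True  # a quote only opens at the start of a field
        elif ch == delimiter:
            row.append("".join(field))
            field = []
        elif ch == "\n":
            row.append("".join(field))
            field = []
            rows.append(row)
            row = []
        else:
            field.append(ch)
        i += 1
    if in_quotes:
        # Stop instead of guessing where the row should have ended.
        raise ValueError("unterminated quoted field at end of input")
    if field or row:
        row.append("".join(field))
        rows.append(row)
    return rows


sample = 'id\tnote\n1\t"line one\nline two"\n2\t"say ""hi"""\n'
for parsed_row in parse_delimited(sample):
    print(parsed_row)
```

Splitting the same sample on newlines and tabs would report four data lines instead of three rows, which is the broken row count that the quote settings are meant to prevent.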
After parsing, the tool measures the widest row and pads shorter rows with empty strings so the preview and export grid stay rectangular. It also keeps warning messages for the rows that did not match the widest width. Skipping empty rows, trimming fields, and preview-row limits all change how easy the table is to inspect, but the preview limit affects only the on-page table. The full parsed dataset is still used when you copy or download an export.
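The padding step is simple enough to sketch directly. The function name `rectangularize` and the warning wording below are illustrative assumptions; only the behavior (pad short rows with empty strings, record which rows were short) comes from the description above.

```python
def rectangularize(rows):
    """Pad short rows with empty strings and report which rows were short."""
    width = max((len(r) for r in rows), default=0)
    warnings = [
        f"Row {number}: {len(r)} of {width} fields"
        for number, r in enumerate(rows, 1)
        if len(r) < width
    ]
    padded = [r + [""] * (width - len(r)) for r in rows]
    return padded, warnings


grid, notes = rectangularize([["id", "name", "team"], ["1", "Ada"], ["2", "Bob", "Ops"]])
print(grid)
print(notes)
```

Note that padding always fills on the right. If a tab went missing in the middle of a row, the values after it are still shifted left, which is why the warning deserves a look before export.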
Header detection is heuristic when you leave Header row on automatic. The first row is compared with up to twelve later non-empty rows using shared field-name keywords, uniqueness, value-shape differences, proper-name penalties, numeric and date-like patterns, and broad type-distribution changes. That gives the page a sensible first guess, but it is still a guess. If the first row is real data, not labels, switch the header mode before trusting JSON, SQL, or any other keyed output.
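A heavily reduced version of this kind of heuristic shows why it can only ever be a guess. The sketch below keeps just two of the signals mentioned above, uniqueness and a numeric value-shape difference, and drops keywords, proper-name penalties, and date patterns. The function name `looks_like_header` and the 50% threshold are assumptions, not the page's actual rules.

```python
import re

NUMERIC_RE = re.compile(r"^-?\d+(\.\d+)?$")


def looks_like_header(rows, max_compare=12):
    """Guess whether the first row holds labels rather than data."""
    body = [r for r in rows[1:] if any(cell != "" for cell in r)][:max_compare]
    if not rows or not body:
        return False
    first = rows[0]
    # Labels are rarely bare numbers and are usually unique.
    if any(NUMERIC_RE.match(cell) for cell in first):
        return False
    if len(set(first)) != len(first):
        return False
    # Look for a column whose later values are mostly numeric
    # while the first-row cell is not.
    for col in range(len(first)):
        values = [r[col] for r in body if col < len(r) and r[col] != ""]
        numeric = sum(1 for v in values if NUMERIC_RE.match(v))
        if values and numeric * 2 >= len(values):
            return True
    return False


print(looks_like_header([["id", "name"], ["1", "Ada"], ["2", "Bob"]]))
print(looks_like_header([["name", "city"], ["Ada", "London"]]))
```

The second call illustrates the limit: an all-text table gives this reduced heuristic nothing to compare, so it answers "no header" even though a person would read the first row as labels. That is the kind of case where setting the header mode by hand is the safer choice.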
Typed conversion is intentionally narrow. With Detect data types enabled, the page can turn numeric tokens into numbers, true/false and yes/no into booleans, and literal null or undefined markers into nulls. Blank fields become null only when Empty as null is enabled. Date normalization is stricter still: the page only converts ISO-like values when Detect dates is turned on, which helps avoid accidental coercion of regional date strings and identifier-like values.
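The rules in this paragraph map onto a short coercion function. This is a sketch, not the page's code: the function name `coerce`, the exact regular expressions, and the choice to treat the date flag independently of the type flag are all assumptions. It also shows the leading-zero risk directly.

```python
import re
from datetime import datetime

NUMBER_RE = re.compile(r"^-?\d+(\.\d+)?$")
ISO_PREFIX_RE = re.compile(r"^\d{4}-\d{2}-\d{2}")
TRUE_TOKENS = {"true", "yes"}
FALSE_TOKENS = {"false", "no"}
NULL_TOKENS = {"null", "undefined"}


def coerce(value, detect_types=True, detect_dates=False, empty_as_null=False):
    """Convert one text cell according to the enabled options."""
    token = value.strip()
    if token == "":
        return None if empty_as_null else ""
    if detect_dates and ISO_PREFIX_RE.match(token):
        try:
            # Only ISO-like strings are normalized; anything else stays text.
            return datetime.fromisoformat(token).isoformat()
        except ValueError:
            return value
    if not detect_types:
        return value
    lowered = token.lower()
    if lowered in NULL_TOKENS:
        return None
    if lowered in TRUE_TOKENS:
        return True
    if lowered in FALSE_TOKENS:
        return False
    if NUMBER_RE.match(token):
        return float(token) if "." in token else int(token)
    return value


print(coerce("00127"))                      # numeric coercion drops the zeros
print(coerce("00127", detect_types=False))  # stays text
print(coerce("03/05/2024", detect_dates=True))
```

The regional date string passes through unchanged because it never matches the ISO-like prefix, which is the conservative behavior the paragraph describes.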
Column overrides split human-readable labeling from machine-friendly naming. Display labels control the preview table and document-style exports. Key overrides influence object-style JSON more directly, then feed sanitized identifiers for XML tags, profile keys, and SQL columns. JSON Arrays are always row-based, while JSON, JSON Lines, and SQL depend on an active header interpretation. XML, HTML, Markdown, CSV, DOCX, and XLSX can still be produced without a real header row, although they may fall back to generated column names.
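As an illustration of what "sanitized identifiers" can mean in practice, the sketch below turns display labels into machine-facing keys. The naming scheme (`column_N` fallbacks, numeric suffixes for duplicates, a leading underscore before digits) and the function name `make_keys` are assumptions; the page's own rules may differ.

```python
import re


def make_keys(headers):
    """Derive machine-friendly keys from display labels."""
    keys, seen = [], {}
    for index, label in enumerate(headers, 1):
        key = re.sub(r"[^0-9A-Za-z_]+", "_", label.strip()).strip("_").lower()
        if not key:
            key = f"column_{index}"  # blank heading -> generated name
        elif key[0].isdigit():
            key = f"_{key}"  # many identifier rules forbid a leading digit
        seen[key] = seen.get(key, 0) + 1
        if seen[key] > 1:
            key = f"{key}_{seen[key]}"  # duplicate heading -> numbered suffix
        keys.append(key)
    return keys


print(make_keys(["Order ID", "", "Order ID", "2024 total"]))
# ['order_id', 'column_2', 'order_id_2', '_2024_total']
```

This simple suffixing does not guard against a generated key colliding with a heading that already ends in the same suffix, so a real implementation would need one more uniqueness pass.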
| Choice | What it changes | Why it matters |
|---|---|---|
| Delimiter | How the page splits one field from the next. | A wrong delimiter shifts columns immediately and makes every later export unreliable. |
| Quote character and escape style | Whether delimiters and line breaks inside quoted cells are treated as content. | This is the difference between a clean multi-line note and a broken row count. |
| Header row | Whether the first row becomes labels or remains ordinary data. | Keyed outputs such as JSON records, JSON Lines, and SQL depend on that decision. |
| Detect data types and Detect dates | Whether values stay text or become numbers, booleans, nulls, and ISO timestamps. | Useful for analytics handoff, risky for IDs, codes, and ambiguous date strings. |
| Empty as null | Whether blank cells remain empty strings or become null values. | This changes JSON, XML, SQL, and profile counts in ways downstream systems may care about. |
| Display labels and key overrides | How columns are shown to people and how they are named for keyed exports. | Helps clean up duplicate, blank, or awkward headings without changing the source text itself. |
| Output | Needs a header | Uses typed values | Best fit |
|---|---|---|---|
| JSON Arrays | No | Yes | Row-based pipelines where position matters more than labels. |
| JSON and JSON Lines | Yes | Yes | APIs, scripts, and record-oriented interchange. |
| SQL | Yes | Yes | Loading rows into a scratch table with generated identifiers. |
| XML | No | Yes | Systems that expect row tags and optional row indexes. |
| HTML, Markdown, CSV, DOCX, and XLSX | No | No | Human review, documentation, spreadsheet checking, and manual handoff. |
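The header dependency in the table is easiest to see side by side. In this sketch, with hypothetical function names, the row-based output needs only the grid, while the record-based output cannot be built without a list of column names.

```python
import json


def to_json_arrays(rows):
    """Row-based output: position carries the meaning, no header needed."""
    return json.dumps(rows)


def to_json_lines(header, rows):
    """Record-based output: one JSON object per line, keyed by header."""
    return "\n".join(json.dumps(dict(zip(header, row))) for row in rows)


data = [[1, "Ada"], [2, "Bob"]]
print(to_json_arrays(data))
print(to_json_lines(["id", "name"], data))
```

Because `zip` stops at the shorter sequence, a header that is narrower than the data would silently drop trailing fields here. That is one more reason to settle the column structure before choosing a keyed export.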
Everyday Use & Decision Guide
The safest default is conservative. Start with the source text, leave the delimiter on tab or automatic, keep header mode on automatic, and read the summary badges before you touch an export tab. If row count, column count, or the delimiter badge looks implausible, stop there. A clean-looking JSON or spreadsheet download is not helpful if the parser already split the rows incorrectly.
The Data tab answers one question: did the page reconstruct the table shape you expected? The Profile tab answers a different question: do the columns behave the way you expected after typing rules were applied? For casual review, the first question may be enough. For any keyed or typed export, you need both answers to be right.
| Situation | First thing to check | Reason |
|---|---|---|
| Columns look shifted or too wide | Delimiter plus quote settings | The file may not really be tab-delimited, or delimiters may be appearing inside quoted text. |
| The first real record disappeared | Header row | Automatic header detection may have treated data as labels. |
| Codes such as 00127 lose their shape | Detect data types | Numeric coercion is useful for measures, but destructive for identifiers with leading zeros. |
| Blank-looking cells should count as missing values | Empty as null | This changes JSON, XML, SQL, and the profile counts from empty strings to nulls. |
| Downstream code needs cleaner field names | Column naming overrides | You can keep readable labels for people while improving keys for object-style exports and generated identifiers. |
Choose the export by downstream job, not by habit. JSON Arrays work well when row position matters. JSON and JSON Lines are better when each field needs a stable name. SQL is useful when you want a quick insert script for a scratch table. HTML, Markdown, DOCX, and XLSX are the better choices when a person still needs to review, edit, or approve the content. The table-level CSV export is a clean copy of the current preview grid, not a verbatim reproduction of the original TSV text.
This page is a strong fit for structural triage and format conversion. It is not a semantic validator. A column that profiles as numeric is not automatically a valid account number, and a timestamp that parses cleanly is not automatically in the right timezone or business meaning for your system.
Step-by-Step Guide
- Paste the source text or drop a .tsv or .txt file into the input box.
- Read the summary badges first. Confirm the row count, column count, header status, and delimiter label all look plausible.
- If the table shape is wrong, fix Delimiter, Quote character, or Escape style before doing anything else.
- Set Header row explicitly if the automatic guess is wrong, then use the column naming panel if you need clearer display labels or better export keys.
- Review Detect data types, Detect dates, and Empty as null with the Profile tab open so you can see what changed.
- Check any warning badges for irregular row lengths, duplicate headings, missing headings, or preview trimming.
- Open the target export tab and compare one sample row with the Data tab before copying or downloading the final output.
Interpreting Results
Think of the summary strip as a structural health check. Header detected and the delimiter badge are the two signals that most strongly affect everything downstream. If either one is wrong, the rest of the page will still look busy and useful, but it will be describing the wrong table.
| Signal | What it means | What to do next |
|---|---|---|
| Header detected | The first row is being treated as labels, not data. | Make sure row one really names the columns before trusting keyed exports. |
| No header | All rows are being treated as data rows. | Expect JSON, JSON Lines, and SQL to stay unavailable until you activate a header. |
| Irregular row warning | At least one row had fewer fields than the widest row and was padded with blanks. | Inspect the affected rows before export, especially if column alignment matters. |
| Preview trimmed | Only the on-page preview was shortened. | Remember that exports still use the full parsed dataset. |
| Type summary badges | A quick count of dominant column kinds such as text, numeric, boolean, or date. | Treat this as an orientation cue, then confirm sensitive columns in the profile table. |
The Profile tab is where most hidden mistakes show up. Non-empty counts tell you how many rows carry actual content after trimming and null handling. Null counts include explicit null-like markers and blanks only when Empty as null is turned on. Empty counts reflect blank strings that stayed blank. The sample value is the first meaningful cell the profiler found in that column, which makes it a fast spot-check when a dominant type alone feels too abstract.
Unique counts are useful, but they are intentionally capped. The profiler only tracks up to 256 distinct values per column for speed, so a reported uniqueness value near that ceiling should be read as a quick diversity indicator, not a full-cardinality audit. If a column matters for keys, billing, or identity, check the raw data directly rather than relying on the capped uniqueness figure alone.
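The counts described in this section, including the capped uniqueness figure, can be sketched as a single pass over one column. The function name `profile_column` and the returned dictionary keys are illustrative assumptions; only the 256-value cap and the meaning of each count come from the text above.

```python
def profile_column(values, unique_cap=256):
    """Summarize one column of already-converted values."""
    distinct = set()
    non_empty = nulls = empties = 0
    sample = None
    for value in values:
        if value is None:
            nulls += 1
            continue
        if value == "":
            empties += 1
            continue
        non_empty += 1
        if sample is None:
            sample = value  # first meaningful cell
        if len(distinct) < unique_cap:
            distinct.add(value)  # stop tracking new values at the cap
    return {
        "non_empty": non_empty,
        "null": nulls,
        "empty": empties,
        "unique": len(distinct),
        "sample": sample,
    }


print(profile_column([None, "", "a", "a", "b"]))
print(profile_column([str(n) for n in range(1000)])["unique"])  # 256
```

The second call has 1,000 distinct values but reports 256, which is why a figure at the ceiling should be read as "at least this many" rather than as the true cardinality.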
A successful parse is not the same as a safe downstream import. Spreadsheet software can still evaluate formula-like cells, SQL imports can still fail on business rules or schema mismatches, and XML tag cleanup can still rename awkward headings. The right habit is to compare one source row, one profile row, and one exported row before you treat the conversion as finished.
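One check you can run yourself before opening an export in spreadsheet software is a scan for formula-like cells. The sketch below is not a feature of the page. It only flags text cells that start with the characters OWASP lists as common formula triggers, and the function name `find_formula_like` is hypothetical.

```python
# Leading characters commonly treated as formula triggers by spreadsheets.
FORMULA_PREFIXES = ("=", "+", "-", "@")


def find_formula_like(rows):
    """Return (row number, column number, cell) for suspicious text cells."""
    return [
        (r, c, cell)
        for r, row in enumerate(rows, 1)
        for c, cell in enumerate(row, 1)
        if isinstance(cell, str) and cell.startswith(FORMULA_PREFIXES)
    ]


table = [["name", "note"], ["Ada", "=HYPERLINK(A1)"], ["Bob", "ok"]]
print(find_formula_like(table))
# [(2, 2, '=HYPERLINK(A1)')]
```

The scan is deliberately blunt: a negative number stored as text, such as "-5", is flagged too. It is meant to point a reviewer at cells worth a second look, not to decide which ones are dangerous.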
Worked Examples
Auditing a project-hours extract
A team paste includes columns such as id, name, team, active, score, and updated_at. The page detects the first row as a header, keeps the row count stable, and profiles active as boolean plus score as numeric. If you also enable date detection, ISO timestamps can be normalized for record-style exports. At that point JSON, JSON Lines, or SQL are reasonable choices because the column structure is already settled.
Fixing a mislabeled vendor file
A supplier sends a file called TSV, but the preview shows one giant column. Changing the delimiter to semicolon reveals the real table immediately. A second pass with quote handling keeps an address or notes field intact even though it contains punctuation and line breaks. The warning badges then tell you whether a few rows are still short or whether the file is structurally clean enough to export.
Preserving codes with leading zeros
A parts list contains item codes such as 00127 and 00045. With type detection enabled, those values can become plain numbers in typed outputs. The safer move is to disable Detect data types, recheck the profile, and confirm that the codes now remain text before you copy JSON or build a SQL script.
Preparing a documentation table
Sometimes the job is not machine import at all. You may only need a readable table for a report or knowledge base. In that case, keep the parsing rules correct, adjust display labels for readability, and export HTML, Markdown, DOCX, or XLSX. That path keeps the original rows visible to humans without forcing you into record-style naming that the document does not need.
FAQ:
Does pasted TSV data leave the page?
Parsing, previewing, clipboard actions, and text-based downloads are handled in the browser, and there is no dedicated conversion endpoint for the pasted table. DOCX and XLSX exports do fetch browser-side libraries from external CDNs when you request those formats, then generate the file locally from the current parsed rows.
Why are JSON, JSON Lines, or SQL unavailable?
Those outputs depend on named columns. If the page is treating every row as data, switch Header row to use the first row as labels, or provide a file that really has a header.
Why did a row still appear even though it was too short?
The parser keeps short rows instead of discarding them. Missing cells are padded with blanks so the grid stays rectangular, and the warning badges record where the mismatch happened.
Why were some dates normalized and others left alone?
Date conversion only happens when Detect dates is enabled, and even then it is limited to ISO-like values. Mixed regional formats are left as text so the page does not guess incorrectly.
Is this the same as strict TSV validation?
Not exactly. The page is intentionally more forgiving than the narrowest TSV definition because real exports often contain quote handling, alternate delimiters, or header ambiguity. It is better for inspection and conversion than for certifying that a file matches a strict TSV profile.
Why does the unique count stop growing on very diverse columns?
The profiler caps stored distinct values at 256 so the summary stays quick. That is enough for a useful signal, but it is not meant to replace a full uniqueness audit on large or highly varied datasets.
Glossary:
- Delimiter: The character or short token used to separate one field from the next.
- Header row: The first row when it is interpreted as column labels instead of ordinary data.
- Keyed record: A row expressed as named fields, such as a JSON object or a SQL insert tuple tied to column names.
- JSON Lines: A newline-delimited stream where each row is one JSON object on its own line.
- Null: An explicit missing value, different from an empty string.
- Profile: The per-column summary showing type, counts, uniqueness signal, and a sample value.
References:
- Media Types: text/tab-separated-values, IANA.
- Tab Separated Values (TSV) Family, Library of Congress.
- csv - CSV File Reading and Writing, Python Software Foundation.
- CSV Injection, OWASP.