Table Data Converter
Turn pasted tables or local text files into clean Markdown, JSON, HTML, or delimited output with header repair, previews, and formula guards.
Introduction:
Table data rarely stays in the format where it was first created. A spreadsheet range may be pasted into a README, a database export may become a JSON feed, and a copied web table may need to land in a script, ticket, import job, or documentation page. The hard part is not the filename extension. The hard part is keeping the same records, cells, labels, blank values, and special characters after the format changes.
A reliable conversion starts with a rectangular grid. Every row should resolve to the same column positions, even when the original text uses commas, tabs, pipes, JSON objects, JSON arrays, Markdown separators, or HTML cells. Delimiters mark boundaries, quotes protect separators that belong inside a value, and headers decide whether a column is called name, Column 1, or something safe for code. When any of those assumptions is wrong, output can look tidy while later rows have silently shifted.
Common table formats carry different strengths. Delimited text is compact and easy to paste, but it needs consistent quoting when values contain separators or line breaks. Markdown is readable in documentation, but pipe characters inside values can be mistaken for new columns. JSON objects keep field names close to values, while JSON row arrays rely on positions. HTML tables already separate rows and cells, yet copied web fragments can include markup that is not part of the table itself.
| Data shape | Cell boundary | Mistake to catch |
|---|---|---|
| Delimited rows | Comma, tab, semicolon, pipe, or a custom token, with quotes protecting embedded separators. | A separator inside quoted text is read as a real column break. |
| Markdown table | Pipe-separated cells plus a header separator row. | A literal pipe in a value is not escaped before publishing. |
| JSON records | Object keys or array positions. | Records with different keys create sparse or surprising columns. |
| HTML table | Rows and cells in the actual table markup. | Surrounding page markup is copied as if it were table content. |
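
The first row of the table above is the mistake that is easiest to reproduce. The sketch below uses Python's standard csv module as a stand-in parser (an assumption; it is not the tool's own code) and an invented sample row to contrast naive splitting with quote-aware parsing.

```python
import csv
import io

# Illustrative sample: the note contains a comma protected by quotes.
source = 'name,city,note\nAda,London,"likes tea, not coffee"\n'

# Naive splitting treats the comma inside the quoted note as a column break.
naive_rows = [line.split(",") for line in source.splitlines()]

# csv.reader keeps the quoted value together as one cell.
parsed_rows = list(csv.reader(io.StringIO(source)))

print(len(naive_rows[1]))   # 4 cells: the note was split in two
print(len(parsed_rows[1]))  # 3 cells: the quoted comma stayed inside the note
```

Both results look like tidy lists, which is why the column count is worth checking against the source.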
Headers often cause the largest downstream change because they stop being plain labels after conversion. A heading may become a JSON key, a Markdown column name, an import field, or a database-oriented identifier. Blank labels, duplicate labels, and labels that begin with numbers may be readable in a spreadsheet but awkward in code. Preserving the original wording is best for human-facing tables; normalized keys are safer when the result feeds another system.
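A minimal sketch of what normalized keys can look like, assuming snake case output with a generated label for blanks, a prefix for number-leading labels, and a numeric suffix for duplicates. The function name to_output_keys and the exact prefixes are hypothetical; the tool's real repair rules may differ.

```python
import re

def to_output_keys(labels):
    """Turn human labels into unique snake_case keys (hypothetical rules)."""
    keys, seen = [], {}
    for position, label in enumerate(labels, start=1):
        # Collapse every run of non-alphanumeric characters into one underscore.
        key = re.sub(r"[^0-9a-zA-Z]+", "_", label.strip()).strip("_").lower()
        if not key:
            key = f"column_{position}"   # blank label: generate one
        elif key[0].isdigit():
            key = f"col_{key}"           # number-leading label: add a prefix
        count = seen.get(key, 0) + 1
        seen[key] = count
        # Duplicate label: append a numeric suffix to later occurrences.
        keys.append(key if count == 1 else f"{key}_{count}")
    return keys

print(to_output_keys(["Full Name", "", "2024 Total", "Full Name"]))
# ['full_name', 'column_2', 'col_2024_total', 'full_name_2']
```

This sketch does not handle a source label that already equals a generated suffix such as full_name_2; a production version would need to re-check uniqueness after suffixing.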
Spreadsheet-bound text has a separate safety issue. Some spreadsheet apps treat values that begin with characters such as =, +, -, or @ as formulas after import. Guarding those values reduces a common CSV and spreadsheet injection risk, but it also changes the exported text by adding a protective prefix. Untrusted table data should be reviewed before it is opened, shared, or imported into a spreadsheet workflow.
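The guard itself is a small transformation. The sketch below assumes the starter set listed later in this page (=, +, -, @, tab, and carriage return) and an apostrophe prefix; the name guard_cell is hypothetical.

```python
# Characters that some spreadsheet apps treat as the start of a formula.
FORMULA_STARTERS = ("=", "+", "-", "@", "\t", "\r")

def guard_cell(value: str) -> str:
    """Prefix formula-looking text with an apostrophe; leave other text alone."""
    if value.startswith(FORMULA_STARTERS):
        return "'" + value
    return value

print(guard_cell("=IMPORTXML(...)"))  # '=IMPORTXML(...)
print(guard_cell("-5"))               # '-5
print(guard_cell("Ada"))              # Ada
```

The second call shows the trade-off: a plain negative number is also prefixed, so the exported text no longer matches the source character for character.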
Format conversion can prove that the structure was read consistently. It cannot prove that the source data is correct, complete, safe to publish, or semantically meaningful. The practical check is to compare row count, column count, headers, blanks, and warnings before using the converted result downstream.
How to Use This Tool:
Start by making the parsed grid believable, then choose the output shape for the next app, document, import, or review step.
- Paste text into Source table, drop text onto the textarea, browse for one local CSV, TSV, TXT, JSON, JSON Lines, Markdown, or HTML file, or load the sample. Local files over 5 MB are rejected.
- Keep Source format on Auto detect for ordinary pasted data. Pin CSV comma, TSV tab, semicolon CSV, pipe delimited, Markdown table, JSON object array, JSON row matrix, JSON Lines, HTML table, or custom delimiter when the same text could be read more than one way.
- Use Source delimiter only for custom parsing. One to four character separators such as ^, ::, or | are supported.
- Set Target format to the artifact you need. Outputs include CSV comma, TSV tab, semicolon CSV, pipe delimited, Markdown table, JSON object array, JSON row matrix, JSON Lines, HTML table, and custom delimited text.
- Choose Header handling. Auto promotes a likely label row, Use first row as headers forces row 1 into labels, and Generate Column 1 headers keeps row 1 as data.
- Leave Output header row enabled for spreadsheets and documentation. JSON object and JSON Lines outputs still use header keys even when row-style outputs hide a visible header row.
- Open Advanced when you need to preserve header names, convert them to snake case or camel case, or regenerate them, or when you need to trim cell edges, change delimited quoting, guard spreadsheet formulas, set a missing-cell token, or limit preview rows.
- Check Converted Data, Grid Preview, Column Profile, Column Completeness, Conversion Audit, and JSON. The result is ready only when the summary is not asking for repair and the audit notes match your source assumptions.
Interpreting Results:
The row count and column count are the first sanity check. A wrong delimiter can still produce copyable output, so compare those counts against the source before trusting a conversion. A sudden extra column often points to an unquoted separator inside a value. A lower row count may mean the first row was promoted to headers or empty rows were removed.
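The same sanity check can be run outside the tool by counting how many cells each row produces. This is a sketch using Python's csv module with invented sample data; the helper name width_counts is hypothetical.

```python
import csv
import io
from collections import Counter

def width_counts(text: str, delimiter: str = ",") -> Counter:
    """Count how many rows have each number of cells, skipping empty rows."""
    rows = csv.reader(io.StringIO(text), delimiter=delimiter)
    return Counter(len(row) for row in rows if row)

# The third line has an unquoted comma inside the city value.
source = "name,city,score\nAda,London,98\nGrace,New York, NY,91\n"
print(width_counts(source))  # Counter({3: 2, 4: 1})
```

A single outlier width, like the one four-cell row here, usually points to one value that needed quotes rather than to a wrong delimiter for the whole file.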
Grid Preview shows the normalized rows after parsing, header handling, and padding. Column Profile reports each column's source label, output key, present cells, blank cells, fill percentage, detected type mix, sample value, and maximum length. Column Completeness charts the first 24 columns for quick scanning; full tables and exports still include every column.
Conversion Audit is the best place to look when the result feels suspicious. It records the requested and detected source format, active delimiters, applied header mode, duplicate or blank header repair, padded cells, quote policy, formula guard activity, and parser warnings. Use JSON when you need a structured report of the current conversion settings, rows, headers, warnings, and profile data.
- If row 1 became labels by mistake, switch Header handling to generated headers and compare the preview again.
- If JSON keys look awkward or duplicated, change Header names and check the repaired keys in Column Profile.
- If rows were padded, decide whether a blank value, NULL, or another token matches the receiving system.
- If Delimited quoting is disabled, inspect cells that contain delimiters, quotes, or line breaks before importing the output elsewhere.
- If formula-looking cells were guarded, confirm that a leading apostrophe is acceptable for the destination.
Technical Details:
Table conversion is a structured transformation from one record representation into another. The source is first reduced to rows and cells, then widened into a rectangular grid, then labeled with output keys, and finally serialized into the selected target format. That middle grid matters because it keeps previews, profiles, charts, audits, downloads, and copied output aligned.
Format rules decide where values begin and end before any output exists. CSV-like text depends on delimiter and quote behavior. Markdown tables depend on pipe rows and a separator line. JSON separates structure from presentation, so object keys and array positions need different handling. HTML table content already has cell elements, but only actual table cells should become data cells.
Transformation Core:
| Stage | Rule | Resulting check |
|---|---|---|
| Detect or pin source shape | Auto detection checks HTML, JSON Lines, JSON, Markdown, and scored common delimiters; a pinned source format bypasses detection. | The audit names the resolved source format and delimiter. |
| Parse cells | Delimited parsing respects quoted fields and doubled quote escapes, Markdown drops separator rows, JSON values become rows, and HTML table cells become row cells. | Parser notes and errors explain structure decisions before output is generated. |
| Select headers | A likely label row can be promoted, the first row can be forced into headers, or generated Column labels can be used. | Headers become preview labels, JSON keys, Markdown headings, profile labels, and audit entries. |
| Normalize width | Rows shorter than the widest row are padded with the selected missing-cell token. | Later columns do not slide left because one row has fewer cells. |
| Serialize target | The reviewed grid is written as delimited text, Markdown, JSON object array, JSON row matrix, JSON Lines, or HTML table. | Each output format represents the same row and column model. |
Format Rules:
| Format | Important behavior | Review before use |
|---|---|---|
| CSV, TSV, semicolon, pipe, and custom delimited text | Fields split on the active delimiter only when the parser is outside quoted text; doubled quotes are treated as escaped quotes. | Cells containing delimiters, quotes, or line breaks should stay quoted for reliable import. |
| Markdown table | Pipe rows are parsed as cells and the separator row is ignored; output includes a Markdown header separator. | Pipes inside cell text are escaped so they do not create extra columns. |
| JSON object array | Object keys form column labels; nested objects and arrays are kept as compact JSON text inside cells. | Records with uneven keys can create sparse columns. |
| JSON row matrix | Arrays become rows, and scalar values become single-cell rows. | Generated headers are often safer when the first row is data rather than labels. |
| JSON Lines | Each nonblank line must parse as one JSON value before rows can be built. | A malformed line stops conversion and reports the line number. |
| HTML table | The first table is read through its row and cell elements, with header cells treated as evidence for a header row. | Only actual table cells are data; surrounding page text is not a table. |
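
As an illustration of the Markdown row above, the sketch below serializes a grid and escapes literal pipes with a backslash, which is the escape the GitHub Flavored Markdown tables extension recognizes. The function name to_markdown is hypothetical, and flattening line breaks to spaces is an added assumption, not a documented behavior of the tool.

```python
def to_markdown(headers, rows):
    """Serialize headers and rows as a Markdown table with escaped pipes."""
    def escape(cell):
        # Escape literal pipes; flatten line breaks so a cell stays on one line.
        return str(cell).replace("|", "\\|").replace("\n", " ")

    lines = [
        "| " + " | ".join(escape(h) for h in headers) + " |",
        "|" + "|".join("---" for _ in headers) + "|",
    ]
    for row in rows:
        lines.append("| " + " | ".join(escape(c) for c in row) + " |")
    return "\n".join(lines)

print(to_markdown(["command", "meaning"], [["a | b", "pipe a into b"]]))
# | command | meaning |
# |---|---|
# | a \| b | pipe a into b |
```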
Safety and Normalization Rules:
| Condition | Behavior | Practical effect |
|---|---|---|
| Blank, duplicate, or number-leading header | A generated label, prefix, or numeric suffix keeps output keys unique. | JSON objects and profiles can point to one clear column key. |
| Snake case or camel case header option | Header labels are normalized into machine-friendly keys without changing cell values. | Useful when the output feeds scripts, APIs, or database imports. |
| Short row | Missing cells are padded to the widest row using the selected missing-cell token. | Column positions remain stable across uneven records. |
| Cell starts with =, +, -, @, tab, or carriage return | Formula guard adds a leading apostrophe when enabled. | Spreadsheet apps are less likely to treat the value as an active formula. |
| More than 24 columns | The completeness chart shows the first 24 columns. | The visual stays readable while tables and exports keep every column. |
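
The short-row rule can be sketched in a few lines. The helper name pad_rows and its return shape are hypothetical; the point is that every row is widened to the widest row and the number of padded cells is counted so it can be reported.

```python
def pad_rows(rows, missing=""):
    """Pad every row to the widest row and count how many cells were added."""
    width = max((len(row) for row in rows), default=0)
    padded_cells = 0
    result = []
    for row in rows:
        gap = width - len(row)
        padded_cells += gap
        result.append(list(row) + [missing] * gap)
    return result, padded_cells

grid, padded = pad_rows([["Ada", "Engineer", "UK"], ["Grace", "Analyst"]], missing="NULL")
print(grid[1])  # ['Grace', 'Analyst', 'NULL']
print(padded)   # 1
```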
Worked Transformation Path:
For name,city,score followed by three data rows, auto detection reads comma-delimited text, promotes row 1 as headers, builds the keys name, city, and score, then serializes the same three-row grid into the selected target. If one row contains only Ada,London, the score cell is padded, Column Profile lowers the fill percentage for that column, and Conversion Audit records the padded cell warning.
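The path above can be traced with a short script. This is a sketch, not the tool's implementation: it uses Python's csv module, an empty string as the missing-cell token, JSON objects as the target, and two invented rows (Grace and Linus) alongside the short Ada,London row.

```python
import csv
import io
import json

source = "name,city,score\nAda,London\nGrace,Paris,91\nLinus,Helsinki,88\n"

rows = list(csv.reader(io.StringIO(source)))
headers, data = rows[0], rows[1:]   # row 1 is promoted to headers

# Normalize width: pad short rows to the widest row and count padded cells.
width = max(len(row) for row in rows)
padded = 0
for row in data:
    gap = width - len(row)
    padded += gap
    row.extend([""] * gap)

# Fill ratio per column: the share of rows with a non-blank cell.
fill = {
    header: sum(1 for row in data if row[index] != "") / len(data)
    for index, header in enumerate(headers)
}

# Serialize the same grid as JSON objects keyed by header.
records = [dict(zip(headers, row)) for row in data]

print(padded)                     # 1
print(round(fill["score"], 2))    # 0.67
print(json.dumps(records[0]))     # {"name": "Ada", "city": "London", "score": ""}
```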
Privacy Notes:
Pasted text, dropped text, and selected local files are read in the browser for conversion. Copied output, downloaded files, table exports, chart exports, and JSON reports are generated from the current browser session, so review artifacts before sharing when source data contains personal, private, confidential, or untrusted values.
Worked Examples:
Spreadsheet CSV to Markdown. A source that starts with name,role,score and three employee rows should show a ready conversion with 3 rows x 3 columns. With Target format set to Markdown table, Converted Data includes a header row, a separator row, and the three records.
JSON Lines import with one bad record. Three lines such as {"name":"Ada","score":98} can become a JSON object array or Markdown table. If line 2 is malformed, the summary changes to Needs repair and reports a JSON Lines parse error with the line number.
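A sketch of the stop-on-first-error behavior described in this example, using Python's json module. The function name parse_json_lines and the wording of the error message are hypothetical.

```python
import json

def parse_json_lines(text):
    """Parse each nonblank line as one JSON value; fail with the line number."""
    values = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # blank lines are skipped, not treated as errors
        try:
            values.append(json.loads(line))
        except json.JSONDecodeError as error:
            raise ValueError(
                f"JSON Lines parse error on line {line_number}: {error.msg}"
            ) from error
    return values

bad = '{"name":"Ada","score":98}\n{"name":"Grace","score":\n{"name":"Linus","score":88}\n'
try:
    parse_json_lines(bad)
except ValueError as error:
    print(error)  # the message names line 2
```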
Uneven pipe-delimited rows. A row like Ada|Engineer|UK beside Grace|Analyst produces a padded missing cell. Grid Preview shows the blank or selected token, Column Profile lowers the fill percentage, and Conversion Audit records how many cells were padded.
Spreadsheet-bound safety check. A cell containing =IMPORTXML(...) is prefixed with an apostrophe when Spreadsheet formula guard is on. Conversion Audit reports how many formula-looking cells were guarded before the output is copied or downloaded.
FAQ:
Why did Auto detect choose the wrong source format?
Auto detection favors recognizable HTML, JSON Lines, JSON, Markdown, and consistent common delimiters. If your text uses a rare separator or contains many separator-like characters inside values, choose the source format or custom delimiter manually.
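The page does not document how delimiters are scored, so the following is only one plausible heuristic: prefer the candidate that splits rows into the most consistent width greater than one. The candidate list, the scoring formula, and the name guess_delimiter are all assumptions.

```python
import csv
import io
from collections import Counter

CANDIDATES = [",", "\t", ";", "|"]

def guess_delimiter(text):
    """Return the candidate with the most consistent multi-column split, or None."""
    best, best_score = None, 0.0
    for delimiter in CANDIDATES:
        rows = [r for r in csv.reader(io.StringIO(text), delimiter=delimiter) if r]
        if not rows:
            continue
        # Most common row width and how many rows share it.
        width, count = Counter(len(r) for r in rows).most_common(1)[0]
        if width < 2:
            continue  # this delimiter does not split most rows at all
        score = (count / len(rows)) * width
        if score > best_score:
            best, best_score = delimiter, score
    return best

print(repr(guess_delimiter("id;note\n1;a, b\n2;c\n")))  # ';'
```

In the sample, the comma appears inside one value only, so it never produces a consistent width and the semicolon wins. Text with many separator-like characters inside values can still fool a heuristic like this, which is when pinning the format manually is the safer choice.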
Does Preview rows limit copied or downloaded output?
No. Preview rows limits only the on-page Grid Preview. Copy actions, downloads, table exports, and JSON reports use the full converted grid.
Why did my first row disappear from the data?
It was probably used as headers. Set Header handling to Generate Column 1 headers when the first row is a real data row.
Why are my JSON keys different from the original headers?
Header names may be set to snake case, camel case, or Column number labels. Choose Preserve labels when human-readable headers should remain unchanged.
What does spreadsheet formula guard change?
When enabled, cells that begin with common spreadsheet formula starters are prefixed with an apostrophe during output generation. This reduces formula-injection risk for spreadsheet workflows, but it also changes the exported text.
Why does Markdown output still include a header?
Markdown tables need a header and separator row. When output headers are disabled, generated Column labels are used for the Markdown header so the table remains valid Markdown.
Glossary:
- Delimited text: Rows where a delimiter such as a comma, tab, semicolon, pipe, or custom token separates cells.
- Header row: The row used to label columns before those labels become preview headings, output keys, or table headers.
- JSON Lines: A line-oriented JSON format where each nonblank line is one complete JSON value.
- Normalized grid: The rectangular row-and-column model after parsing, header handling, trimming, and missing-cell padding.
- Column Profile: A per-column report of source label, output key, present cells, blank cells, fill percentage, type mix, sample value, and maximum length.
- Conversion Audit: The result table that records selected formats, delimiters, header choices, parser notes, warnings, padded cells, and formula guard activity.