Extensible Markup Language (XML) Converter
Convert XML into JSON, JSON Lines, tables, SQL inserts, and formatted XML with local parsing, record-path checks, and schema review.
Introduction
XML conversion usually becomes important at the handoff point: an older feed, partner export, SOAP response, RSS-style document, or configuration file has to become something easier to inspect, import, or compare. XML carries meaning through a tree of elements, attributes, and text. That makes it more expressive than a plain row file, but it also means the useful record is often buried under wrappers, repeated branches, namespace prefixes, and mixed content.
The main decision is the shape of the destination. A full-tree conversion keeps the hierarchy, which is useful when the document structure is still part of the answer. A row conversion chooses one repeated element as the record path, then turns each matching branch into one record. The same XML can be valid for both purposes, but the right path depends on whether the destination needs orders, order lines, books, products, entries, or another repeated business object.
| XML shape | What to decide | Risk to check |
|---|---|---|
| Repeated child elements | Which repeated branch should become one record? | A wrapper branch can produce too few rows. |
| Attributes on start tags | Should attributes become fields with the element values? | IDs, codes, and labels may be missed if only child elements are reviewed. |
| Text mixed with child tags | Should text fragments be kept, trimmed, or collapsed? | Notes and descriptions can disappear when text is treated as empty structure. |
| Numeric-looking text | Is the value a quantity or an identifier? | Codes such as 0017 can lose leading zeros if converted to numbers. |
Well-formed XML has one document element and properly nested markup. That only proves a parser can read the document. It does not prove the selected record path matches the business object, that each row has enough parent context, or that a namespace-prefixed name carries the same meaning as a similar unprefixed name. A clean parse is the first check, not the final proof.
A dependable XML conversion therefore starts with structure review. Identify the repeated branch, decide whether attributes and text nodes are data fields, keep whitespace rules consistent with the destination, and check that row outputs still carry the identifiers and context that made the original document useful.
How to Use This Tool:
Use the first pass to parse the XML and inspect its shape. Then refine the record path and conversion options before copying or downloading the output.
- Choose Workflow target for the output you need first. Row-based targets such as Data table, JSONL records, SQL inserts, and HTML/Markdown table need a matching record path, while JSON tree, Pretty XML, and Integration handoff can still help with structure review.
- Paste the source into XML document, drop one XML or TXT file onto the text area, or select Browse file. Use Load sample when you want to confirm the result tabs before working with private or partner data.
- Leave Record path on auto detect for the first run, then compare Records, Record Fields, Tag Schema, and Node Tree. If the rows represent a wrapper or the wrong repeated branch, choose a suggested path or enter a Custom record path such as catalog>book.
- Open Advanced when attributes, text nodes, whitespace, type coercion, indentation, preview size, tree depth, or the SQL table name changes the result. Turn off Coerce booleans and numbers when codes with leading zeros must remain text.
- A parse error means the XML is not ready for conversion. Fix broken nesting, unclosed tags, invalid characters, missing root elements, or incorrectly quoted attributes before trusting any result tab.
- Read warnings before exporting. Record preview rows caps the visible record table at a setting between 25 and 1000 rows, and Tree depth can trim the displayed node tree. The warning text tells you whether a view is capped or a required record path is missing.
- Finish in the result tab that matches the destination. Use Record Fields, CSV, TSV, SQL Inserts, HTML Table, and Markdown Table for flattened records; use Document JSON, Pretty XML, Node Tree, Tag Schema, and Payload JSON for structure review and handoff checks.
Interpreting Results:
Start with the whole-document metrics. Elements, Attributes, Unique tags, and Depth describe the parsed XML tree. Records describes only the selected record path. When the document metrics look reasonable but Records is 0, the XML parsed correctly but the record path matched nothing, so the row selection still needs correction.
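The gap between document metrics and Records can be reproduced outside the tool. The sketch below uses Python's standard xml.etree.ElementTree as a stand-in parser; the select_records helper and its handling of the catalog>book path syntax are assumptions for illustration, not the converter's own code.

```python
import xml.etree.ElementTree as ET

XML_TEXT = """<catalog>
  <book id="bk101"><title>Analytical Notes</title></book>
  <book id="bk102"><title>Field Guide</title></book>
</catalog>"""

def select_records(root, record_path):
    """Return elements matching a path such as 'catalog>book' (hypothetical helper)."""
    parts = [p.strip() for p in record_path.split(">") if p.strip()]
    if not parts or parts[0] != root.tag:
        return []
    current = [root]
    for name in parts[1:]:
        # Descend one level, keeping only children with the requested tag
        current = [child for elem in current for child in elem if child.tag == name]
    return current

root = ET.fromstring(XML_TEXT)
element_count = sum(1 for _ in root.iter())      # whole-document metric
matched = select_records(root, "catalog>book")   # records for a matching path
missed = select_records(root, "catalog>item")    # parses fine, matches nothing

print(element_count, len(matched), len(missed))  # prints: 5 2 0
```

The element count stays at 5 in both cases; only the path decides whether any records exist.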
Tag Schema is the best quick check for path quality. High counts, child tags, attributes, and text flags show which branches carry data. Node Tree helps when the same local tag appears under different parents, because two identical-looking names can mean different things in different parts of the tree.
Preview limits should not be mistaken for missing data. A record preview warning means the visible table is capped by Record preview rows. A tree warning means the displayed node outline reached the selected depth or line cap. Check the warning wording before assuming the conversion dropped records.
The false-confidence risk is treating a valid parse as a correct data model. XML can be well formed while the selected rows lose parent IDs, attributes, mixed text, namespace meaning, or leading zeros. Verify Record Fields against Tag Schema before importing converted output into another system.
Technical Details:
XML 1.0 describes a document as one well-formed tree with a single root element and properly nested markup. A fatal well-formedness error stops normal processing because the later element structure cannot be trusted. That is why one mismatched closing tag prevents record, table, and JSON views from being treated as reliable.
Elements form the tree, attributes attach values to element start tags, and character data supplies text content. CDATA sections still become text for conversion purposes. Namespace-qualified names can distinguish vocabularies that share similar local names, so row review should include parent path and attribute context rather than only the visible tag text.
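A short sketch makes the namespace and CDATA points concrete. It assumes Python's xml.etree.ElementTree, which reports namespace-qualified names in {uri}local form; the urn:example:a and urn:example:b namespaces are invented for the example.

```python
import xml.etree.ElementTree as ET

# Two vocabularies share the local name "id"; one value arrives in a CDATA section
XML_TEXT = """<feed xmlns:a="urn:example:a" xmlns:b="urn:example:b">
  <a:id>0017</a:id>
  <b:id><![CDATA[X<1>]]></b:id>
</feed>"""

root = ET.fromstring(XML_TEXT)
tags = [child.tag for child in root]    # qualified names keep the two ids apart
texts = [child.text for child in root]  # CDATA content is exposed as ordinary text

for tag, text in zip(tags, texts):
    print(tag, text)
```

Comparing only the local name id would merge two different fields, which is why the qualified name or the parent path belongs in the review.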
Flattening is a projection from tree data into rows. It chooses a repeated branch, converts each matching element into a nested value, and then converts child paths into shared columns. Full-tree outputs avoid that projection, but row outputs depend on record-path choice, attribute policy, text policy, whitespace handling, and scalar coercion.
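As a rough illustration of that projection, the following Python sketch flattens nested record values into column paths and builds one shared header list. The dotted path separator and the function names are assumptions; the converter may name its columns differently.

```python
def flatten(value, prefix=""):
    """Flatten nested dicts and lists into one dict keyed by dotted column paths."""
    if isinstance(value, dict):
        items = list(value.items())
    elif isinstance(value, list):
        items = [(str(i), v) for i, v in enumerate(value)]
    else:
        return {prefix: value}
    flat = {}
    for key, child in items:
        path = f"{prefix}.{key}" if prefix else key
        flat.update(flatten(child, path))
    return flat

def shared_headers(rows):
    """Collect column names across all rows, in first-seen order."""
    headers = []
    for row in rows:
        for key in row:
            if key not in headers:
                headers.append(key)
    return headers

records = [
    {"@id": "1", "title": "A", "author": {"name": "X"}},
    {"@id": "2", "title": "B", "tags": ["x", "y"]},
]
rows = [flatten(r) for r in records]
headers = shared_headers(rows)
print(headers)
```

Rows that lack a column simply leave it empty, so a sparse header set is a hint that the record path mixes differently shaped elements.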
Transformation Core
Once the XML parses, the conversion follows a fixed sequence of structural steps.
- Parse the source as one XML document and stop when the parser reports a fatal syntax problem.
- Walk the element tree to count elements, attributes, unique tag names, maximum depth, text-bearing nodes, and repeated paths.
- Resolve the record path from auto detection, a selected suggestion, or a custom path. Auto detection favors repeated, structured elements over shallow wrappers.
- Convert each matching record element into a nested value, applying the active attribute, text, whitespace, and scalar policies.
- Flatten nested record values into shared column paths for row outputs while preserving full-document views for tree-oriented outputs.
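The tree walk and path resolution steps above can be sketched in Python. The profile_paths and guess_record_path functions are hypothetical, and the heuristic shown (most frequent repeated path that has children or attributes) is only one plausible reading of "favors repeated, structured elements over shallow wrappers"; the tool's actual detection rule is not documented here.

```python
import xml.etree.ElementTree as ET
from collections import Counter

def profile_paths(root):
    """Count occurrences of each element path and track the maximum depth."""
    counts = Counter()
    structured = set()  # paths whose elements carry children or attributes
    max_depth = 0
    stack = [(root, root.tag, 1)]
    while stack:
        elem, path, depth = stack.pop()
        counts[path] += 1
        max_depth = max(max_depth, depth)
        if len(elem) or elem.attrib:
            structured.add(path)
        for child in elem:
            stack.append((child, f"{path}>{child.tag}", depth + 1))
    return counts, structured, max_depth

def guess_record_path(counts, structured):
    """Assumed heuristic: highest-count repeated structured path, shallower on ties."""
    candidates = [(n, p) for p, n in counts.items() if n > 1 and p in structured]
    if not candidates:
        return None
    return max(candidates, key=lambda c: (c[0], -c[1].count(">")))[1]

root = ET.fromstring(
    '<catalog><book id="1"><title>A</title></book>'
    '<book id="2"><title>B</title></book></catalog>'
)
counts, structured, max_depth = profile_paths(root)
record_path = guess_record_path(counts, structured)
print(record_path, counts[record_path], max_depth)  # prints: catalog>book 2 3
```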
| Policy | Technical effect | What to verify |
|---|---|---|
| Record path | Every matching element becomes one structured record before flattening. | The path represents the intended entity, not a wrapper or incidental repeated tag. |
| Include attributes | Attribute values become fields with the configured prefix. | Identifiers, codes, dates, and labels in start tags are still present. |
| Include text nodes | Element text is kept as scalar content or under the configured text key when mixed with children or attributes. | Descriptions, notes, labels, and mixed-content fragments are not silently dropped. |
| Whitespace handling | Text can be trimmed and repeated whitespace can be collapsed before values are emitted. | Preformatted text, addresses, and code-like content still read correctly. |
| Type coercion | Plain numeric text and true/false text can become JSON numbers and booleans. | Identifier strings such as 00123 stay as strings when leading zeros matter. |
| Preview limits | Visible record and tree views can be capped without changing the full parsed source. | Warnings are read as display limits or missing-path warnings, not automatic data loss. |
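The whitespace policy is the easiest one to test in isolation. This Python sketch assumes trimming and collapsing are two independent switches, as the Advanced options suggest; normalize_text is a hypothetical name.

```python
import re

def normalize_text(text, trim=True, collapse=True):
    """Apply optional whitespace collapsing and trimming to element text."""
    if text is None:
        return ""
    if collapse:
        # Any run of spaces, tabs, or newlines becomes a single space
        text = re.sub(r"\s+", " ", text)
    if trim:
        text = text.strip()
    return text

address = "  12 Main St\n   Apt 4 "
collapsed = normalize_text(address)
preserved = normalize_text(address, collapse=False)
print(repr(collapsed))
print(repr(preserved))
```

With collapsing on, the line break inside the address is lost, which is the case the "What to verify" column warns about for preformatted text.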
A compact record example shows the intermediate structure. With the path catalog > book, <book id="bk101"><title>Analytical Notes</title><price>12.50</price></book> becomes one record. Keeping attributes adds @id; child elements add title and price; flattening turns those names into row headers. If scalar coercion is on, 12.50 can become a number. If it is off, the value stays text.
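The same book element can be walked through a minimal Python sketch. The @ attribute prefix follows the example above; the #text key, the function names, and the coercion rules are assumptions rather than the converter's exact behavior.

```python
import xml.etree.ElementTree as ET

def coerce(text, enabled):
    """Optionally turn plain numeric or true/false text into typed values."""
    if not enabled:
        return text
    if text in ("true", "false"):
        return text == "true"
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text

def element_to_record(elem, attr_prefix="@", text_key="#text", coerce_scalars=True):
    """Convert one element into a scalar or a nested dict (hypothetical helper)."""
    children = list(elem)
    text = (elem.text or "").strip()
    if not children and not elem.attrib:
        return coerce(text, coerce_scalars)
    record = {attr_prefix + name: value for name, value in elem.attrib.items()}
    for child in children:
        value = element_to_record(child, attr_prefix, text_key, coerce_scalars)
        if child.tag in record:
            # Repeated child tags are collected into a list
            existing = record[child.tag]
            if isinstance(existing, list):
                existing.append(value)
            else:
                record[child.tag] = [existing, value]
        else:
            record[child.tag] = value
    if text:
        record[text_key] = coerce(text, coerce_scalars)
    return record

book = ET.fromstring(
    '<book id="bk101"><title>Analytical Notes</title><price>12.50</price></book>'
)
typed = element_to_record(book)
as_text = element_to_record(book, coerce_scalars=False)
print(typed)
print(as_text)
```

In this sketch coerce("0017", True) returns 17, which is the leading-zero loss that turning coercion off avoids.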
| Output view | Source scope | Review purpose |
|---|---|---|
| Document JSON and Pretty XML | Full parsed document. | Inspecting the whole hierarchy after parsing and formatting. |
| Record JSON and JSON Lines | Elements that match the selected record path. | Passing one selected record object at a time to ingestion, logs, or queue-style workflows. |
| Record Fields, CSV, TSV, HTML, and Markdown | Flattened records with one shared header set. | Spreadsheet review, documentation tables, and simple handoffs. |
| SQL Inserts | Flattened rows serialized as insert values with sanitized identifiers. | Loading sample records into a scratch table for review or comparison. |
| Node Tree and Tag Schema | Full document structure profile. | Finding repeated paths, checking depth, and explaining unfamiliar feeds. |
| Payload JSON | Metrics, warnings, records, flattened records, schema rows, and generated text outputs. | Keeping one review payload for downstream comparison or teammate handoff. |
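Two of the row-oriented views are simple enough to sketch with Python's standard json and csv modules. The to_jsonl and to_csv names and the sample rows are illustrative assumptions; quoting and line endings in the tool's own output may differ.

```python
import csv
import io
import json

def to_jsonl(records):
    """Serialize each record as one JSON value per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)

def to_csv(rows, headers):
    """Write flattened rows under one shared header set; missing cells stay empty."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=headers, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

rows = [
    {"@id": "bk101", "title": "Notes, Vol. 1"},
    {"@id": "bk102"},  # no title: the CSV cell is left empty
]
jsonl_text = to_jsonl(rows)
csv_text = to_csv(rows, ["@id", "title"])
print(jsonl_text)
print(csv_text)
```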
Advanced Tips:
- Use Tag Schema before changing Custom record path; high-count paths with child tags or attributes are better row candidates than wrapper paths with a count of 1.
- Keep Include attributes on for partner feeds unless you have confirmed that identifiers are stored only as child elements.
- Disable Collapse whitespace for XML that carries formatted prose, addresses, signatures, or code snippets in text nodes.
- Set SQL table name before using SQL Inserts; the generated identifier is sanitized to safer letters, numbers, and underscores.
- Increase Tree depth when Node Tree stops before the branch you need, but use Tag Schema for a faster path list on large documents.
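For the SQL tip above, a rough Python sketch shows what identifier sanitizing and value quoting might look like. The exact rules the tool applies are not stated in this page, so the leading-underscore choice, the xml_records fallback, and all function names here are assumptions.

```python
import re

def sanitize_identifier(name, fallback="xml_records"):
    """Reduce a name to letters, digits, and underscores (assumed rule)."""
    cleaned = re.sub(r"[^A-Za-z0-9_]", "_", name.strip())
    if not cleaned:
        return fallback
    if cleaned[0].isdigit():
        cleaned = "_" + cleaned  # identifiers should not start with a digit
    return cleaned

def sql_literal(value):
    """Render a Python value as a SQL literal, doubling single quotes in text."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):
        return "TRUE" if value else "FALSE"
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"

def insert_statement(table, row):
    """Build one INSERT statement from a flattened row."""
    cols = ", ".join(sanitize_identifier(c) for c in row)
    vals = ", ".join(sql_literal(v) for v in row.values())
    return f"INSERT INTO {sanitize_identifier(table)} ({cols}) VALUES ({vals});"

statement = insert_statement("book list", {"@id": "bk101", "title": "O'Neil", "price": 12.5})
print(statement)
```

Generated statements like this suit a scratch table; for anything beyond review, parameterized queries in the destination system are the safer route.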
Privacy Notes:
Pasted XML and selected XML/TXT files are parsed in the current browser session, and normal conversion outputs are generated from that local data. The page can reflect entered source text and settings in shareable state, so avoid using confidential feeds, credentials, or customer data on shared devices, in screen shares, or in links you might copy elsewhere.
Worked Examples:
Catalog rows from repeated books
A catalog contains 800 catalog > book elements. Each book has an id attribute plus title, author, and price children. Auto detection should choose the book path, Records should show 800, and Record Fields should include columns such as @id, title, author, and price. Keeping Include attributes on preserves the book IDs.
Orders compared with line items
An export contains 50 orders > order elements and 640 orders > order > line elements. Choosing the order path gives 50 records with line details expanded into columns. Choosing the line path gives 640 records, which is useful only if each line still includes an order ID or enough parent context for import.
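A scaled-down version of this export shows the trade-off in Python. The sample data, the order.@id column name, and the line_count field are invented for the sketch; the point is only that line rows need the parent ID copied in explicitly.

```python
import xml.etree.ElementTree as ET

XML_TEXT = """<orders>
  <order id="A1"><line sku="0017" qty="2"/><line sku="0042" qty="1"/></order>
  <order id="A2"><line sku="0017" qty="5"/></order>
</orders>"""

root = ET.fromstring(XML_TEXT)

# Order path: one row per order, with line details summarized
order_rows = [
    {"@id": order.get("id"), "line_count": len(order.findall("line"))}
    for order in root.findall("order")
]

# Line path: one row per line, carrying the parent order ID as context
line_rows = [
    {"order.@id": order.get("id"), "@sku": line.get("sku"), "@qty": line.get("qty")}
    for order in root.findall("order")
    for line in order.findall("line")
]

print(len(order_rows), len(line_rows))  # prints: 2 3
```

Without the order.@id column, the two rows for sku 0017 could not be traced back to different orders.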
Large feed with a trimmed preview
A supplier feed has 12,000 item records and Record preview rows is set to 200. Records still reports 12,000, while the visible table stops at the preview limit and shows a warning. Treat that warning as a display cap; row downloads are built from the converted records when records exist.
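The separation between the full record set and the visible preview can be sketched in a few lines of Python. The preview function and its warning wording are hypothetical; the 25 to 1000 clamp mirrors the range given for Record preview rows.

```python
def preview(records, limit):
    """Return the visible slice of records plus a warning when the view is capped."""
    limit = max(25, min(1000, limit))  # keep the setting inside the documented range
    shown = records[:limit]
    warning = None
    if len(records) > len(shown):
        warning = f"Showing {len(shown)} of {len(records)} records"
    return shown, warning

records = [{"item": n} for n in range(12000)]
shown, warning = preview(records, 200)

print(len(records), len(shown))  # prints: 12000 200
print(warning)
```

The records list is never shortened, so exports built from it are unaffected by the preview setting.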
Broken nesting blocks conversion
A fragment such as <item><name>Ada</item> fails because name was not closed before item. Fix the nesting first, then check Tag Schema, Node Tree, and the row outputs again.
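The failure is easy to reproduce with any conforming parser. This sketch uses Python's xml.etree.ElementTree; the try_parse wrapper is a hypothetical helper, and the exact error wording varies between parsers.

```python
import xml.etree.ElementTree as ET

def try_parse(xml_text):
    """Return (root, None) on success or (None, message) on a fatal parse error."""
    try:
        return ET.fromstring(xml_text), None
    except ET.ParseError as exc:
        return None, str(exc)

bad_root, bad_error = try_parse("<item><name>Ada</item>")
good_root, good_error = try_parse("<item><name>Ada</name></item>")

print(bad_error)      # parser message pointing at the mismatched tag
print(good_root.tag)  # prints: item
```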
FAQ:
Why did the XML parse fail?
The source must be well formed. Common causes are a missing root element, an unclosed tag, broken nesting, unescaped markup characters such as < or & in text, or an attribute value that is not quoted correctly.
Why are there no rows when the XML is valid?
The selected Record path did not match a useful repeated element. Check Tag Schema for high-count paths and use Node Tree to confirm the parent branch before choosing another path.
What is the difference between Document JSON and Record JSON?
Document JSON represents the full parsed document under its root element. Record JSON contains only the elements selected by the record path, represented as an array.
When should I use JSON Lines?
JSON Lines is useful when each selected record should be one separate JSON value on its own line, such as for ingestion jobs, logs, queue-style review, or batch comparison.
Does a trimmed preview mean the download is incomplete?
No. Preview trimming limits what is shown in the browser table or tree. When records exist, row outputs are built from the converted records rather than only from the visible preview rows.
Why did the SQL table name change?
SQL table name is sanitized to safer identifier characters. If the name starts with a number or contains punctuation, SQL Inserts adjusts it before generating the statement.
Does the converter send my XML to a server?
The XML text or selected file is parsed in the browser session. No server lookup is needed to build the normal conversion outputs, but copied links, browser history, and shared devices can still expose entered source text.
Glossary:
- Well-formed XML
- XML that follows core syntax rules, including one root element and properly nested tags.
- Record path
- The element path that decides which repeated XML nodes become records.
- Attribute
- A name-value pair stored inside an element start tag rather than as a child element.
- Text node
- Text content inside an element, separate from child elements and attributes.
- Namespace prefix
- A prefix on an element or attribute name that identifies the XML vocabulary for that name.
- Flattened row
- A nested record rewritten so child values become column paths in a table.
- JSON Lines
- A text format where each line is a separate valid JSON value, often one selected record per line.
References:
- Extensible Markup Language (XML) 1.0 (Fifth Edition), W3C, 26 November 2008.
- Namespaces in XML 1.0 (Third Edition), W3C, 8 December 2009.
- The JavaScript Object Notation (JSON) Data Interchange Format, RFC 8259, IETF, December 2017.
- JSON Lines documentation, JSON Lines.
- Common Format and MIME Type for Comma-Separated Values (CSV) Files, RFC 4180, RFC Editor, October 2005.