
Introduction:

Moving a page from HTML into Markdown is less like copying a design and more like translating a document. HTML can carry layout regions, attributes, scripts, styles, embedded media, tables, and browser-only behavior. Markdown is deliberately smaller: it stores the main writing structure in plain text so headings, lists, links, quotes, code, and simple tables remain readable before any renderer touches them.

That difference matters during migrations from CMS exports, email templates, help-center articles, README drafts, static-site pages, and generated documentation. The useful result is not a perfect recreation of a browser page. It is a clean document that preserves the reading order, important links, meaningful images, code samples, and table content well enough for editing or publishing in a Markdown-aware system.

HTML and Markdown document differences

| HTML Feature | Markdown Equivalent | Common Review Point |
| --- | --- | --- |
| Document regions | Usually one chosen reading body | Navigation, sidebars, and footers may need to be excluded. |
| Nested headings and paragraphs | Plain heading markers and blank-line-separated text | Duplicate titles can appear when a page title and article H1 both exist. |
| Links and images | Inline links, reference links, or readable URL text | Relative URLs, empty link labels, and missing alt text still need human review. |
| Tables | Pipe tables, raw HTML, or readable text rows | Merged cells do not have a faithful pipe-table form. |
| Scripts, styles, and templates | Usually removed for content migration | Removal cleans the article text but is not a complete publishing sanitizer. |

Markdown dialects are the other major decision. CommonMark describes a portable core with well-specified block and inline rules. GitHub Flavored Markdown, usually shortened to GFM, builds on that core with extensions such as tables, task list items, strikethrough, and autolinks. Documentation systems, wikis, and static-site generators may add their own features, so the same Markdown can render differently after import.

Figure: HTML to Markdown conversion review path. HTML source is narrowed to document content, converted to Markdown blocks, then checked for links, tables, code, and warnings. Stages: HTML page (article, nav, media, tables, code, styles) → Readable body (chosen content scope) → Markdown draft (blocks and inline text) → Review evidence (warnings and element map).

Tables, code fences, and inline formatting cause most surprises. A simple two-column HTML table maps cleanly to a GFM pipe table, while a pricing matrix with rowspan or colspan needs HTML preservation or a plain-text fallback. A code sample that already contains triple backticks may need tilde fences. A link that looks fine on the page may depend on a relative URL that becomes confusing outside the original site.
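
The fence-marker choice can be sketched in a few lines of Python. This is a minimal illustration rather than the converter's own logic, and the names choose_fence and fence_block are hypothetical. It assumes a conservative rule: any run of three fence characters anywhere in the code counts as a conflict.

```python
def choose_fence(code: str, preferred: str = "`") -> str:
    """Return a fence string that does not collide with the code body."""
    alternate = "~" if preferred == "`" else "`"
    for marker in (preferred, alternate):
        # Conservative check: a triple run anywhere in the code is a conflict.
        if marker * 3 not in code:
            return marker * 3
    # Both triple markers occur, so lengthen the preferred fence until it is
    # longer than the longest run of that character in the code.
    longest = run = 0
    for ch in code:
        run = run + 1 if ch == preferred else 0
        longest = max(longest, run)
    return preferred * (longest + 1)


def fence_block(code: str, language: str = "", preferred: str = "`") -> str:
    """Wrap code in a fenced block, reusing a language hint when given."""
    fence = choose_fence(code, preferred)
    body = code.rstrip("\n")
    return f"{fence}{language}\n{body}\n{fence}"
```

With backticks preferred, a sample that already contains a triple-backtick line is wrapped in tildes, and a sample containing both markers gets a longer backtick fence.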

A good conversion ends with a review pass. Check the first heading, reading order, important URLs, image alt text, list nesting, table shape, code fences, line breaks, and any removed script or style content before sending the Markdown to a publisher. Markdown makes document text easier to inspect, but it does not carry every browser behavior, style rule, security policy, or CMS-specific shortcode.

How to Use This Tool:

Choose the destination style first. A GitHub README, a CommonMark-only renderer, a CMS migration note, and a plain audit copy can all need different table, link, heading, and line-break choices.

  1. Paste markup into HTML source, drop text into the field, or use Browse HTML for one .html, .htm, or .txt file under 5 MB. Trim obvious page chrome first when the navigation or footer is not part of the document you need.
  2. Select Markdown profile. Use GitHub-friendly Markdown for GFM tables, task lists, strikethrough, and fenced code. Use CommonMark core for stricter portability, or Plain readable Markdown when visible text and URLs matter more than Markdown syntax.
  3. Set HTML scope. Auto content root looks for an article or main region before falling back to the body. Body or fragment converts the parsed body. Document title plus body prepends the document's title element text as a top heading when one is present.
  4. Choose link and table handling. Inline links keep URLs near their labels, reference links collect URLs near the end, and plain labels with URLs are useful for review copies. Tables can be emitted as GFM pipe tables, kept as raw HTML, or flattened into readable rows.
  5. Open Advanced when the target has house style requirements. Adjust the heading style, bullet marker, fenced or indented code blocks, merged-cell fallback, fence marker, emphasis delimiter, hard or soft line breaks, and removal of script, style, template, and noscript content.
  6. Read Markdown Document first, then check Conversion Ledger and Element Map. The ledger summarizes scope, profile, table policy, output size, unsafe element handling, and warnings. The element map shows how detected headings, links, images, lists, tables, code, quotes, and styling were represented.
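
The three link policies from step 4 differ only in where the URL ends up. The Python sketch below shows one plausible shape for each; the function names, the numeric reference labels, and the "label (URL)" plain form are assumptions for illustration, not the tool's documented output.

```python
def render_link(label, url, style, refs):
    """Render one link. `refs` is a list that collects reference URLs."""
    if style == "inline":
        return f"[{label}]({url})"
    if style == "reference":
        if url not in refs:
            refs.append(url)              # reuse one number per distinct URL
        return f"[{label}][{refs.index(url) + 1}]"
    return f"{label} ({url})"             # plain label with a readable URL


def reference_definitions(refs):
    """Build the definition lines that go near the end of the document."""
    return "\n".join(f"[{i}]: {url}" for i, url in enumerate(refs, start=1))


refs = []
print(render_link("Docs", "https://example.com/docs", "reference", refs))
print(reference_definitions(refs))
```

Reference style keeps paragraphs short at the cost of a lookup, which is why it suits long articles more than review copies.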

When the warning list is clear or the remaining warnings are understood, copy or download the Markdown. For review handoffs, export the ledger or element map so another person can see the conversion choices without repeating the work.

Interpreting Results:

Markdown Document is the converted output. Treat it as a draft for the destination renderer, not as proof that the final page will match the original. The summary gives output lines, character count, source scope, table policy, block count, link count, and local-processing status, but the destination renderer still decides the final appearance.

  • Conversion Ledger is the audit trail for the run. It records the selected profile, source scope, structural coverage, link policy, table handling, merged-cell behavior, code handling, list handling, unsafe element handling, and review notes.
  • Element Map is the fastest way to find mismatches. It lists each detected construct family and how it became Markdown, including headings, paragraphs, links, images, lists, tables, code, block quotes, strikethrough, and removed script or style content.
  • JSON stores the Markdown, metrics, warnings, ledger rows, and element rows as a structured record for QA notes or repeatable migration work.

A short output is not always wrong, but it should be explainable. If a full page produces only a few paragraphs, the selected content region may have skipped the intended body. If the output includes menus, cookie notices, or repeated footer links, use a narrower source fragment or a different scope. If a document title is added above an article that already starts with the same H1, remove the duplicate before publishing.
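
The duplicate-title case is easy to check mechanically. The sketch below is a hypothetical helper, not part of the tool: it prepends a title heading only when the Markdown does not already open with the same ATX H1. It ignores Setext headings for brevity.

```python
def prepend_title(markdown: str, title: str) -> str:
    """Add '# title' on top unless the first heading already matches it."""
    title = " ".join(title.split())
    if not title:
        return markdown
    for line in markdown.splitlines():
        first = line.strip()
        if not first:
            continue                      # skip leading blank lines
        if first.startswith("# ") and first[2:].strip().casefold() == title.casefold():
            return markdown               # same H1 already present
        break                             # first content line is something else
    return f"# {title}\n\n{markdown}"
```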

Warnings deserve the same attention as visible Markdown. Choosing CommonMark core together with pipe tables is a portability risk, because pipe tables are a GFM extension rather than part of CommonMark. Merged table cells may be preserved as HTML, flattened to rows, or forced into a pipe table that needs manual repair. Plain readable mode turns link and image URLs into text. Removed script, style, template, and noscript elements may be expected for migration, but their absence should still be intentional.
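
These warning conditions amount to a handful of checks on the chosen options and the detected content. The Python sketch below is an assumption-laden illustration: the Options fields, the string values, and the warning wording are hypothetical and are not taken from the tool.

```python
from dataclasses import dataclass


@dataclass
class Options:
    # Hypothetical option names and values, chosen for this sketch only.
    profile: str = "gfm"          # "gfm", "commonmark", or "plain"
    table_mode: str = "pipe"      # "pipe", "html", or "text"
    strip_unsafe: bool = True


def review_warnings(opts, table_count=0, merged_tables=0, removed_unsafe=0):
    """Collect review notes for one conversion run."""
    warnings = []
    if opts.profile == "commonmark" and opts.table_mode == "pipe" and table_count:
        warnings.append("Pipe tables are a GFM extension and may not render in a CommonMark-only target.")
    if merged_tables:
        warnings.append(f"{merged_tables} table(s) contain merged cells; check the fallback output.")
    if opts.profile == "plain":
        warnings.append("Plain readable mode writes link and image URLs as text.")
    if opts.strip_unsafe and removed_unsafe:
        warnings.append(f"Removed {removed_unsafe} script, style, template, or noscript element(s).")
    return warnings
```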

Preview the converted text in the real target system before treating it as final. Check headings, links, images, line breaks, task lists, strikethrough, table alignment, code fences, and any extension-specific syntax that the target may not support.

Technical Details:

HTML-to-Markdown conversion begins by parsing markup into a document tree, then reducing that tree to the content that should become plain text. A full page often contains non-article regions and invisible or executable material. Selecting the article, main region, body, or title-plus-body path changes the text that enters the conversion stage before any Markdown syntax is emitted.
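
Scope selection can be illustrated with Python's standard html.parser module. The sketch below only reports which root would be chosen and what the document title is; it does not extract the subtree. Preferring article over main is an assumption, as are the names RootFinder and pick_root.

```python
from html.parser import HTMLParser


class RootFinder(HTMLParser):
    """Record which candidate root elements exist and capture the title."""

    def __init__(self):
        super().__init__()
        self.seen = set()
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag in ("article", "main", "body"):
            self.seen.add(tag)
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def pick_root(html, scope="auto"):
    """Return (root label, title). Assumed order for auto: article, main, body."""
    finder = RootFinder()
    finder.feed(html)
    finder.close()
    title = finder.title.strip()
    if scope == "auto":
        for tag in ("article", "main"):
            if tag in finder.seen:
                return tag, title
    # A pasted fragment has no body tag of its own.
    return ("body" if "body" in finder.seen else "fragment"), title
```

Reporting the chosen root alongside the output is what makes an unexpectedly short conversion explainable later.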

The conversion then maps block structure before inline structure. Block-level elements such as headings, paragraphs, lists, block quotes, horizontal rules, tables, and code blocks set the outer Markdown shape. Inline elements such as emphasis, strong text, code spans, links, images, line breaks, strikethrough, subscript, superscript, and marked text are handled inside those blocks when the chosen profile can represent them.
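
The block-then-inline order can be seen in a deliberately small converter. This Python sketch handles only headings, paragraphs, emphasis, strong text, code spans, and links; it does not escape Markdown characters and it collapses whitespace everywhere, so treat it as an illustration of the ordering rather than a usable converter. All names are hypothetical.

```python
from html.parser import HTMLParser

INLINE = {"em": "*", "i": "*", "strong": "**", "b": "**", "code": "`"}
BLOCKS = {"p"} | {f"h{n}" for n in range(1, 7)}


class MiniConverter(HTMLParser):
    def __init__(self):
        super().__init__()
        self.blocks = []      # finished Markdown blocks
        self.parts = []       # inline pieces of the block being built
        self.prefix = ""      # block marker such as "## "
        self.href = None

    def handle_starttag(self, tag, attrs):
        if tag in BLOCKS:
            self._flush()     # a new block closes the previous one
            self.prefix = "#" * int(tag[1]) + " " if tag != "p" else ""
        elif tag in INLINE:
            self.parts.append(INLINE[tag])
        elif tag == "a":
            self.href = dict(attrs).get("href") or ""
            self.parts.append("[")

    def handle_endtag(self, tag):
        if tag in BLOCKS:
            self._flush()
        elif tag in INLINE:
            self.parts.append(INLINE[tag])
        elif tag == "a" and self.href is not None:
            self.parts.append(f"]({self.href})")
            self.href = None

    def handle_data(self, data):
        self.parts.append(data)

    def _flush(self):
        text = " ".join("".join(self.parts).split())
        if text:
            self.blocks.append(self.prefix + text)
        self.parts, self.prefix = [], ""


def convert(html):
    parser = MiniConverter()
    parser.feed(html)
    parser.close()
    parser._flush()           # emit any trailing text outside a block
    return "\n\n".join(parser.blocks)
```

The block tag decides the prefix and the blank-line separation, while inline tags only add delimiters inside the current block.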

Transformation Core:

HTML to Markdown transformation stages

| Stage | Mechanism | Evidence to Review |
| --- | --- | --- |
| Parse | Fragments and full documents are read as HTML, so tags, text nodes, attributes, and document title can be inspected consistently. | File status, source size, and empty-output warnings. |
| Select | The selected scope chooses an article element, main element, body, fragment body, or document title plus body. | Source scope and root label in the ledger. |
| Map blocks | Headings, paragraphs, lists, quotes, horizontal rules, tables, and code blocks become Markdown blocks or preserved HTML. | Structural coverage, table handling, and code handling rows. |
| Map inline content | Links, images, emphasis, strong text, code spans, line breaks, and GFM-only styling are represented according to the selected profile and link policy. | Element map rows for links, images, code, styling, and text output. |
| Clean | Outer blank lines, repeated whitespace, reference definitions, and warnings are assembled into a readable final document. | Markdown line count, warning list, ledger exports, and JSON record. |

Element Mapping:

HTML constructs and Markdown conversion behavior

| HTML Construct | Markdown Handling | Boundary to Check |
| --- | --- | --- |
| h1-h6 | Heading levels are preserved with ATX markers, or H1 and H2 can use Setext underlines. | A document title can add another top heading. |
| p, br, hr | Paragraphs use blank-line separation, line breaks follow the hard or soft setting, and horizontal rules become ---. | Hard breaks are important for addresses, signatures, poems, and pasted email text. |
| a and img | Links and images can become inline Markdown, reference-style Markdown, or readable text with URLs. | Relative paths, missing alt text, private URLs, and empty labels still need review. |
| ul, ol, li | Lists are rebuilt with the selected bullet marker, ordered lists are renumbered, and checkbox items become task list markers in GFM output. | Deep nesting may render differently in systems with unusual indentation rules. |
| table, tr, th, td | Tables become pipe tables, raw HTML tables, or readable text rows. | Pipe tables cannot faithfully carry merged cells, captions, or layout-table intent. |
| pre and code | Code blocks can be fenced or indented, and language hints are reused when available. | Source code containing the selected fence marker may require the alternate fence marker. |
| script, style, template, noscript | These elements are removed by default. If included, their text is treated as content, not executable behavior. | Removal is content cleanup, not a complete sanitizer for later HTML rendering. |
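
For the simple-grid case, emitting a GFM pipe table is mostly a matter of escaping and padding. The Python sketch below assumes the cells have already been extracted as plain strings; escape_cell and pipe_table are hypothetical names, and column alignment is left out.

```python
def escape_cell(text: str) -> str:
    """Collapse whitespace, including newlines, and escape literal pipes."""
    return " ".join(text.split()).replace("|", "\\|")


def pipe_table(header, rows):
    """Render a header row, a delimiter row, and body rows as a pipe table."""
    width = len(header)
    lines = [
        "| " + " | ".join(escape_cell(c) for c in header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    for row in rows:
        # Trim extra cells and pad short rows to the header width.
        cells = list(row)[:width] + [""] * (width - len(row))
        lines.append("| " + " | ".join(escape_cell(c) for c in cells) + " |")
    return "\n".join(lines)


print(pipe_table(["Plan", "Price"], [["Basic", "$5"], ["Pro | Team", "$20"]]))
```

Nothing in this shape can express a cell that spans rows or columns, which is why merged cells need a separate fallback.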

GFM output can represent more of a modern documentation page than CommonMark alone, but portability drops when the destination lacks those extensions. Plain readable output takes the opposite path: it reduces clickable and styled syntax into text that is easier to audit, but less useful as a ready-to-render Markdown document.

After the page loads, conversion runs in the browser. Pasted source and supported files are read locally, and the resulting Markdown, CSV ledgers, DOCX ledgers, and JSON records are created from that local conversion. The receiving editor or publishing system still controls final rendering and sanitization.

Privacy Notes:

Pasted HTML and supported source files are converted in the browser after the page has loaded. The conversion does not upload the source document or file contents for server-side processing.

  • Be careful with unreleased drafts, credentials, customer data, private links, and internal screenshots. Copied output, downloads, and shared URLs can still expose them, and local processing does not remove those everyday exposure paths.
  • Removing script and style elements helps content migration, but it does not make later HTML rendering safe by itself. Apply the destination system's sanitizer and publishing policy before release.
  • The file picker accepts one HTML, HTM, or TXT-style file under 5 MB. Larger exports should be trimmed to the article, main region, or fragment that actually needs conversion.

Worked Examples:

Repository release notes. Paste an article with a top heading, a documentation link, a task list, a small compatibility table, and a code block. With GitHub-friendly Markdown, inline links, pipe tables, and fenced code, the result should keep the heading, produce task list markers, emit a pipe table, and include a fenced code block with any detected language hint.
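
The task list part of this example follows the GFM convention of a bracketed box after the list marker. A minimal Python sketch, assuming list items arrive as (text, checked) pairs where checked is None for ordinary items; render_list is a hypothetical name.

```python
def render_list(items, ordered=False, bullet="-"):
    """Render flat list items; checked=None means no task box."""
    lines = []
    for number, (text, checked) in enumerate(items, start=1):
        marker = f"{number}." if ordered else bullet   # ordered lists are renumbered
        box = "" if checked is None else ("[x] " if checked else "[ ] ")
        lines.append(f"{marker} {box}{text}")
    return "\n".join(lines)


print(render_list([("Update changelog", True), ("Tag release", False), ("Announce", None)]))
```

A renderer without the task list extension shows the bracketed box as literal text, so this is one of the items to preview in the destination.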

CMS article export. A full page export may contain a title, header, menu, main content, sidebar, and footer. Start with Auto content root when the article or main region is the intended text. If menus or cookie copy appear in the Markdown, paste only the article fragment. If Document title plus body creates a duplicate H1, remove one heading before publishing.

Merged comparison table. A pricing table with rowspan or colspan cannot be represented exactly as a pipe table. Preserve affected tables as HTML when structure matters, flatten them to text rows for editorial review, or force a pipe table only when manual cleanup is acceptable.
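
Detecting merged cells before choosing a table form can be sketched with Python's standard html.parser. The mode names ("pipe", "html", "text") and the function table_strategy are hypothetical labels for the three choices described above.

```python
from html.parser import HTMLParser


class SpanDetector(HTMLParser):
    """Flag tables whose cells span more than one row or column."""

    def __init__(self):
        super().__init__()
        self.has_merged_cells = False

    def handle_starttag(self, tag, attrs):
        if tag not in ("td", "th"):
            return
        for name, value in attrs:
            if name not in ("colspan", "rowspan"):
                continue
            value = (value or "").strip()
            if not value.isdecimal():
                continue
            span = int(value)
            # rowspan="0" means the cell extends to the end of its row group.
            if span > 1 or (name == "rowspan" and span == 0):
                self.has_merged_cells = True


def table_strategy(table_html, table_mode="pipe", merged_fallback="html"):
    """Return the form to emit: the fallback applies only to pipe mode."""
    detector = SpanDetector()
    detector.feed(table_html)
    detector.close()
    if table_mode == "pipe" and detector.has_merged_cells:
        return merged_fallback    # "html", "text", or "pipe" when forced
    return table_mode
```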

Email template cleanup. Email HTML often mixes visible copy with style blocks, template tags, tracking scripts, and layout tables. Keep script and style removal enabled for content migration. If you intentionally include that text for an archive, keep the result out of a publishing workflow until it has been reviewed.
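
Removing these elements before conversion can be illustrated with a small text extractor built on Python's standard html.parser. This is a content-cleanup sketch under simple assumptions (well-nested tags, text only), not a sanitizer, and the names are hypothetical.

```python
from html.parser import HTMLParser

UNSAFE = {"script", "style", "template", "noscript"}


class TextWithoutUnsafe(HTMLParser):
    def __init__(self):
        super().__init__()
        self.depth = 0        # how many unsafe elements are currently open
        self.removed = 0      # count of unsafe elements encountered
        self.text = []

    def handle_starttag(self, tag, attrs):
        if tag in UNSAFE:
            self.depth += 1
            self.removed += 1

    def handle_endtag(self, tag):
        if tag in UNSAFE and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth == 0:   # keep text only outside unsafe elements
            self.text.append(data)


def visible_text(html):
    """Return (text outside unsafe elements, number of elements removed)."""
    parser = TextWithoutUnsafe()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.text).split()), parser.removed
```

Keeping the removed count makes the cleanup visible in a review record, so the missing content is intentional rather than silent.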

FAQ:

Does pasted HTML leave the browser?

The conversion runs in the browser after the page loads, and supported files are read locally. Downloads, copied text, screenshots, and shared page state can still expose private content, so handle sensitive source carefully.

Which Markdown profile should I use?

Use GitHub-friendly Markdown for GFM tables, task lists, strikethrough, and fenced code. Use CommonMark core when the destination favors strict portability. Use Plain readable Markdown when the output is mainly an audit copy or migration note.

Why are some tables preserved as HTML?

Pipe tables work for simple row and column grids. They cannot fully describe merged cells, captions, alignment nuance, or layout-only tables. Preserving the HTML can be more faithful than forcing a simplified pipe table.

Why did link or image syntax turn into plain text?

Plain readable mode writes labels and URLs as text instead of Markdown link or image syntax. That is useful for review copies, but it is not the right choice when clickable Markdown output is required.

Can the Markdown be published immediately?

Preview it in the destination first. Check headings, links, images, tables, code fences, task lists, line breaks, and warnings because Markdown support varies across renderers.