Introduction:

HTML describes a document with elements, attributes, and nested structure. Markdown describes a smaller set of document patterns with plain-text markers, so conversion works best when the source is article-like content: headings, paragraphs, links, lists, quotes, tables, images, and code.

HTML to Markdown conversion is useful when moving CMS pages, release notes, docs snippets, email bodies, or help articles into a Markdown-first workflow. The goal is not to preserve every browser layout detail. The useful result is a readable text document that keeps the main structure and makes the places that need review easy to find.

Figure: An HTML tree converted through a selected content root into Markdown blocks and review notes.

Markdown dialects also differ. A table that works in GitHub Flavored Markdown may need raw HTML in a stricter CommonMark target. A task list checkbox, a strikethrough span, or a highlighted word may look natural in one renderer and appear as plain text or unsupported punctuation in another.

The safest conversion habit is to treat Markdown as a cleaned handoff, not as a perfect clone of the original page. Review important links, code fences, tables, image text, and removed script or style content before publishing the result somewhere else.

How to Use This Tool:

Choose the Markdown target first, then adjust scope and structure handling until the result matches the place where the Markdown will be used.

  1. Paste HTML into HTML source, drop text onto the editor, or use Browse HTML for one .html, .htm, or .txt file under 5 MB. If the file message rejects the input, choose a supported text file or paste the fragment directly.
  2. Set Markdown profile. Use GitHub-friendly Markdown for tables, task lists, strikethrough, and fenced code; use CommonMark core when the target expects the smaller core syntax; use Plain readable Markdown when readability matters more than link or image syntax.
  3. Set HTML scope. Auto content root prefers an article or main content region when one exists, Body or fragment converts the parsed body, and Document title plus body adds the document title as a top heading.
  4. Choose Link style, Table handling, and Heading style. Reference links are useful when long URLs clutter paragraphs, raw HTML tables are safer for strict Markdown targets, and Setext headings affect only the first two heading levels.
  5. Open Advanced when the target has house style rules. Set the bullet marker, fenced or indented code blocks, merged-cell fallback for tables with colspan or rowspan, backtick or tilde fences, emphasis delimiter, hard or soft line breaks, and whether script, style, template, and noscript content should be removed.
  6. Review Markdown Document first, then check Conversion Ledger and Element Map for counts, warnings, and elements that may need manual cleanup. If a warning says the selected scope produced no visible content, switch scope or paste a smaller content fragment.

A good result has the expected content root, readable Markdown blocks, no surprise warnings, and element counts that roughly match the original HTML you intended to convert.

Interpreting Results:

Markdown Document is the artifact to inspect first. The summary line gives a fast sanity check with output length, line count, selected content root, and table policy, but the rendered meaning still depends on your destination Markdown renderer.

  • Conversion Ledger records the selected scope, profile, output footprint, structural coverage, link policy, table handling, code handling, list handling, unsafe-element setting, and review notes.
  • Element Map shows which HTML constructs were detected and how each was represented, such as headings, links, images, lists, tables, code, block quotes, and script or style content.
  • JSON keeps the same result details in a structured form for comparison or handoff.

False confidence comes from readable output that silently loses page-specific meaning. Merged table cells, CSS layout, interactive widgets, hidden metadata, embedded scripts, and unsupported Markdown extensions can all be flattened or omitted while the document still looks tidy. Use the element counts and warning text to decide what needs manual review.

Before publishing, preview the Markdown in the target system. Check the first heading, every critical link, code blocks that contain fence characters, complex tables, image alt text, and any note about removed script or style content.

Technical Details:

Markdown is a plain-text syntax for structured documents, not a full representation of the HTML document model. CommonMark defines the core block and inline rules, including headings, paragraphs, lists, block quotes, code spans, code blocks, links, images, hard line breaks, and raw HTML. GitHub Flavored Markdown builds on CommonMark with extensions such as tables, task list items, and strikethrough.

HTML conversion starts by parsing the input into a document tree and choosing a content root. That root matters because a full page can contain navigation, footers, metadata, scripts, and styling that do not belong in the Markdown article. An article or main region usually gives a cleaner document than converting every part of a full page body.
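The root-selection idea can be illustrated with a minimal Python sketch built on the standard library's `html.parser`. The names `RootFinder` and `pick_content_root` are hypothetical, and the preference order (article, then main, then body) is an assumption for illustration; the tool's actual selection logic is not documented here.

```python
from html.parser import HTMLParser


class RootFinder(HTMLParser):
    """Records which candidate content-root tags appear in the input."""

    def __init__(self):
        super().__init__()
        self.seen = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this method.
        if tag in ("article", "main", "body"):
            self.seen.add(tag)


def pick_content_root(html: str) -> str:
    """Return the preferred root name, or "fragment" when no candidate exists."""
    finder = RootFinder()
    finder.feed(html)
    finder.close()
    # Assumed preference order: the narrowest content region wins.
    for candidate in ("article", "main", "body"):
        if candidate in finder.seen:
            return candidate
    return "fragment"
```

A full page with a `main` region would report `main`, while a bare `<p>` snippet falls through to `fragment`, which mirrors the difference between the auto and body-or-fragment scopes.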

Transformation Core:

HTML to Markdown transformation stages and review outputs

| Stage | Rule | What to Check |
| --- | --- | --- |
| Parse | The text is read as HTML, so fragments, full documents, and plain text files can be handled through the same document parser. | File status, source length, and any empty-output warning. |
| Select root | The selected scope chooses an article, main region, body, fragment body, or title plus body. | Source scope in Conversion Ledger. |
| Map blocks | Headings, paragraphs, lists, block quotes, code blocks, tables, and horizontal rules become Markdown block syntax or preserved table HTML. | Structural coverage, Table handling, and Code handling. |
| Map inlines | Emphasis, strong text, inline code, links, images, line breaks, strikethrough, subscript, superscript, and marked text are converted to the selected readable form. | Element Map rows for links, images, code, and text styling. |
| Clean and warn | Extra blank lines and trailing spaces are trimmed, reference definitions are appended when selected, and profile or safety notes are collected. | Review notes and the warning alert above the results. |

Element Mapping:

HTML elements and their Markdown conversion behavior

| HTML Construct | Markdown Handling | Review Note |
| --- | --- | --- |
| h1-h6 | Heading levels are preserved as ATX headings, or H1 and H2 can use Setext underlines. | Check that a document title did not create a duplicate H1. |
| p, br, hr | Paragraphs use blank-line separation, line breaks follow the selected hard or soft policy, and horizontal rules become ---. | Hard breaks can matter in poetry, addresses, signatures, and compact release notes. |
| a and img | Links and images can be inline Markdown, reference-style Markdown, or plain labels with URLs. | Check empty link text, missing image alt text, and long URLs before publishing. |
| ul, ol, li | Lists are rebuilt with the selected bullet marker, ordered lists are renumbered, and checkbox items become task list markers in the GitHub-friendly profile. | Deeply nested lists should be previewed in the target renderer. |
| table, tr, th, td | Tables become GitHub-style pipe tables, raw HTML tables, or readable text rows. | Pipe tables use the first row as the header and cannot fully represent merged cells or layout-only tables. |
| pre and code | Code blocks can be fenced or indented, and language hints are reused when class names contain language markers. Inline code uses backticks, with doubled backticks when needed. | If source code contains the chosen fence marker, switch to tilde fences or review the block manually. |
| script, style, template, noscript | These elements are removed by default. If included, their text is treated as text rather than executable page behavior. | Removing them improves content migration, but it is not a complete HTML sanitizer for every publishing system. |
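To make the heading row concrete, here is a small Python sketch of the ATX and Setext choice. The function name `render_heading` and the `style` values are hypothetical; the only behavior taken from the text is that Setext underlines apply to the first two heading levels and deeper levels stay ATX.

```python
def render_heading(text: str, level: int, style: str = "atx") -> str:
    """Render one heading; Setext is only defined for levels 1 and 2."""
    if style == "setext" and level in (1, 2):
        # "=" underlines a level-1 heading, "-" a level-2 heading.
        underline = ("=" if level == 1 else "-") * max(len(text), 3)
        return f"{text}\n{underline}"
    # Every other case falls back to ATX: one "#" per heading level.
    return f"{'#' * level} {text}"
```

With `style="setext"`, a level-3 heading still comes out as `### Details`, which is why a document converted with Setext headings usually mixes both syntaxes.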

Text nodes are normalized for inline readability and Markdown metacharacters are escaped in ordinary text. That escape step helps prevent accidental emphasis, code spans, or links from being created by source punctuation, while code spans and code blocks keep their code-like characters more literally.
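The escape step might look roughly like the following Python sketch. The function name `escape_text` and the exact character set are assumptions chosen for illustration; a real converter would likely also handle characters that are only special at the start of a line, such as `#` or `-`.

```python
import re

# Assumed, conservative set of characters that can start inline Markdown syntax.
_INLINE_SPECIALS = re.compile(r"([\\`*_\[\]<>])")


def escape_text(text: str) -> str:
    """Collapse whitespace runs, then backslash-escape inline metacharacters."""
    collapsed = re.sub(r"\s+", " ", text)
    return _INLINE_SPECIALS.sub(r"\\\1", collapsed)
```

For example, `escape_text("2 * 3 * 4")` keeps the asterisks literal instead of letting a renderer treat them as emphasis markers. Text taken from code spans and code blocks would bypass this function entirely.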

Reference-style links append definitions at the end of the Markdown document. Plain link mode keeps the label and URL visible without Markdown link syntax, which is useful for review copies, policy text, or systems that do not allow clickable links.
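A minimal Python sketch of the reference-style approach, assuming numeric reference labels and one shared definition per distinct URL (both are illustrative choices, and `to_reference_links` is a hypothetical name):

```python
def to_reference_links(links):
    """Turn (label, url) pairs into inline markers plus trailing definitions."""
    numbers = {}  # url -> reference number, so repeated URLs share one definition
    markers = []
    for label, url in links:
        number = numbers.setdefault(url, len(numbers) + 1)
        markers.append(f"[{label}][{number}]")
    # Dicts preserve insertion order, so definitions follow first appearance.
    definitions = [f"[{number}]: {url}" for url, number in numbers.items()]
    return markers, definitions
```

The markers stay short inside paragraphs, and the definitions list is what would be appended at the end of the Markdown document.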

Privacy Notes:

Pasted text and supported files are converted in the browser after the page loads. The conversion workflow does not upload the HTML or selected file contents for server processing.

  • Use caution with unreleased content, credentials, customer data, and confidential links. Browser history, shared page links, screenshots, downloads, and clipboard contents can still expose sensitive material.
  • Local conversion does not make untrusted HTML safe to publish. If the Markdown will be rendered back to HTML, review links, raw HTML tables, and any included script or style text under the target system's security rules.
  • File input accepts one text-like HTML, HTM, or TXT file under 5 MB. Larger pages should be trimmed to the article or fragment that actually needs conversion.

Worked Examples:

Release notes for a docs repo. Paste an article containing <h1>Release notes</h1>, a paragraph with a documentation link, a task list checkbox, a small table, and a language-js code block. With GitHub-friendly Markdown, Auto content root, inline links, pipe tables, and fenced code, Markdown Document begins with # Release notes, keeps the link as Markdown, emits checkbox task markers, creates a pipe table, and includes a fenced JavaScript block. Conversion Ledger should show a GitHub-friendly profile, detected links, a table count, and code handling.

Full page with a useful main region. A CMS export may include a title, header, navigation, main article, and footer. Use Auto content root when the article or main region is the only content you want. Switch to Document title plus body when the browser title should become the top heading. Check Source scope before trusting the output, because that row tells you whether the converted root was an article, main region, body, or title-plus-body selection.

CommonMark table warning. If a source table is converted with CommonMark core selected and GFM pipe tables still active, the warning area notes that pipe tables were requested while the profile is CommonMark core. For a strict renderer, change Table handling to Keep HTML tables or Readable text rows. For a GitHub README, switch the profile to GitHub-friendly Markdown and preview the table there.
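The kind of check behind that warning can be sketched in a few lines of Python. The feature names, the profile identifiers, and the function `profile_warnings` are all hypothetical; the only fact relied on is that tables, task list items, and strikethrough are GitHub Flavored Markdown extensions rather than CommonMark core syntax.

```python
# Extensions that GitHub Flavored Markdown adds on top of CommonMark.
GFM_ONLY = {"pipe_tables", "task_lists", "strikethrough"}


def profile_warnings(profile: str, requested: set) -> list:
    """List requested features that the chosen profile does not define."""
    if profile == "gfm":
        return []
    return [
        f"{feature} requested while the profile is {profile}"
        for feature in sorted(requested & GFM_ONLY)
    ]
```

Calling `profile_warnings("commonmark", {"pipe_tables", "fenced_code"})` yields a single note about pipe tables, since fenced code is part of the core syntax.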

Script and style cleanup. A pasted email template or page export may contain CSS and tracking script blocks around the visible article. With Script and style tags set to Removed, Unsafe elements in the ledger reports the excluded count and the warning text explains that those elements were removed before conversion. If you need to archive the text inside those elements, switch the setting deliberately and keep the result out of a publishing workflow until it has been reviewed.
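As an illustration of the removal step, the following Python sketch drops text inside the four element types and counts how many were excluded. It uses the standard library's `html.parser`; the class and function names are hypothetical, and this is a content filter for migration, not a security sanitizer.

```python
from html.parser import HTMLParser

UNSAFE = {"script", "style", "template", "noscript"}


class UnsafeStripper(HTMLParser):
    """Collects text outside unsafe elements and counts the removed elements."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # how many unsafe elements are currently open
        self.removed = 0    # total unsafe elements encountered
        self.text = []

    def handle_starttag(self, tag, attrs):
        if tag in UNSAFE:
            self.depth += 1
            self.removed += 1

    def handle_endtag(self, tag):
        if tag in UNSAFE and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth == 0:
            self.text.append(data)


def visible_text(html: str):
    """Return (text outside unsafe elements, number of unsafe elements)."""
    parser = UnsafeStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.text), parser.removed
```

The returned count plays the role of the excluded count reported in the ledger, so a reviewer can tell that something was dropped even when the visible text looks complete.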

FAQ:

Does my pasted HTML leave the browser?

The conversion runs in the browser after the page loads, and supported files are read locally by the browser. Do not share a page link, screenshot, downloaded file, or clipboard result until sensitive content has been removed.

Which Markdown profile should I choose?

Use GitHub-friendly Markdown for GitHub-style tables, task lists, strikethrough, and fenced code. Use CommonMark core for stricter portability, and use Plain readable Markdown when visible text and URLs matter more than Markdown link syntax.

Why did my table not render correctly?

Pipe tables are a GitHub Flavored Markdown extension and do not capture merged cells or complex layout. Try Keep HTML tables when table structure matters, or Readable text rows when the table only needs to be reviewed as text.
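A short Python sketch shows why merged cells cannot survive: a pipe table is just a flat list of cells per row, with nowhere to record a span. The function name `rows_to_pipe_table` is hypothetical, and it assumes at least one row, with the first row treated as the header.

```python
def rows_to_pipe_table(rows):
    """Render rows (lists of cell strings) as a pipe table; first row is the header."""
    width = max(len(row) for row in rows)

    def line(cells):
        # Pad short rows so every line has the same number of columns.
        padded = list(cells) + [""] * (width - len(cells))
        # A literal "|" inside a cell must be escaped or it splits the cell.
        return "| " + " | ".join(cell.replace("|", "\\|") for cell in padded) + " |"

    header, *body = rows
    separator = "| " + " | ".join(["---"] * width) + " |"
    return "\n".join([line(header), separator] + [line(row) for row in body])
```

A source row whose first cell spanned two columns arrives here as a shorter list, so the sketch can only pad it with an empty cell. That is the information loss the raw HTML table option avoids.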

Why are script or style contents missing?

Script and style tags are removed by default, along with template and noscript content. Switch the setting to Included as text only when you need that text for an archive or review, and do not treat the converted result as sanitized publishing content.

What should I check when code fences look broken?

Open Advanced and switch Fence marker from backticks to tildes when the source code already contains triple backticks. Then preview Markdown Document in the target renderer before copying the final version.
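Switching markers is one fix. CommonMark also allows a fence longer than three characters, so another way to avoid collisions is to make the fence longer than any run of the marker inside the code. The Python sketch below shows that alternative technique; it is not a description of what this tool does, and `fence_code` is a hypothetical name.

```python
import re


def fence_code(code: str, language: str = "", marker: str = "`") -> str:
    """Wrap code in a fence longer than any run of the marker inside it."""
    runs = re.findall(re.escape(marker) + "+", code)
    longest = max((len(run) for run in runs), default=0)
    # Fences need at least three characters; grow past the longest inner run.
    fence = marker * max(3, longest + 1)
    return f"{fence}{language}\n{code}\n{fence}"
```

Source code that already contains a three-backtick line would be wrapped in a four-backtick fence, while passing `marker="~"` reproduces the tilde option described above.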

Glossary:

Markdown profile
The selected target dialect, such as GitHub-friendly Markdown, CommonMark core, or plain readable Markdown.
CommonMark
A formal core Markdown specification for common block and inline syntax.
GitHub Flavored Markdown
A CommonMark-based dialect that adds extensions such as tables, task list items, and strikethrough.
Content root
The part of the parsed HTML document selected for conversion, such as an article, main region, body, or document title plus body.
Reference-style link
A Markdown link form that keeps link labels in the text and places URL definitions at the end of the document.
Fenced code block
A Markdown code block opened and closed with matching backtick or tilde fences.
Pipe table
A Markdown table form that uses vertical bars and a separator row, commonly supported by GitHub-style renderers.
