How to Convert HTML to Markdown (3 Ways)

Search "convert HTML to Markdown" and you'll get three kinds of answers: an online converter, a Pandoc command, and a code snippet. What you won't get is much help deciding which one you actually need. They're not interchangeable. The right choice comes down to what you're starting from. Do you have raw HTML source? A folder full of files? A live page open in front of you right now? Each one points somewhere different. We build a Markdown converter for a living, so here are the three ways that genuinely work, when each one wins, and the gotchas that quietly mangle your output.

The 3 ways to convert HTML to Markdown at a glance

There's no single "best" method. There's a best method for your situation, and here's the short version before we dig into each one:

Method	Best for	Install?	Handles many files	Handles a live page
Online converter	One-off HTML snippets	No	No	No
Pandoc (CLI)	Batches, automation, docs pipelines	Yes	Yes	No (needs the HTML)
Browser extension	A rendered page you're looking at	One click	No	Yes

The deciding question is what you're starting from. Have a chunk of HTML on your clipboard? Use the online converter. Have a directory of .html files to migrate? Use Pandoc. Looking at a page in your browser and want it as Markdown? Use the extension. We'll take them in that order.

Figure: Three starting points (a snippet, a folder, a live page) and the method that fits each.

Paste it into an online HTML to Markdown converter

If you already have the HTML (copied from "View Source," an API response, an email export, a CMS field), the fastest path is an online HTML to Markdown converter. Paste the markup, get clean Markdown back, copy it out. No install, no command line. Done in seconds.

This is the right tool for one-off conversions: a single snippet, a table you need in a README, a fragment you're pasting into a doc. Our own online editor does this with a live preview. Paste HTML on one side, watch the Markdown render on the other, tweak it before you copy. And because it runs the conversion client-side in your browser (on the same Turndown engine our extension uses), the HTML you paste never leaves your machine. That last part matters if the markup contains anything private.

What "online converter" really means: the good ones convert in your browser; the sloppy ones POST your HTML to a server. If your content is sensitive, prefer a client-side tool and check that nothing uploads.

Where online converters fall down: volume and live pages. Pasting 200 files one at a time is misery, and you can't paste a page you don't have the source for. That's what the next two methods are for.

Figure: Paste HTML on the left, get clean Markdown on the right, converted in the browser with nothing uploaded.

Have HTML files on disk? Convert them with Pandoc

When you have HTML on disk, whether one big file or a whole folder, Pandoc is the workhorse. It's a free, open-source document converter that handles HTML to Markdown (and dozens of other formats) reliably, and it's built for scripting.

Install it once (brew install pandoc on macOS, choco install pandoc or the official installer on Windows, apt install pandoc on Debian and Ubuntu, or your distro's own package manager elsewhere on Linux), then convert a single file:

pandoc -f html -t gfm input.html -o output.md

A few flags do most of the heavy lifting:

-t gfm outputs GitHub-Flavored Markdown, so tables and task lists survive. Use -t commonmark if you need plain CommonMark instead, or -t markdown_strict for the original, pre-CommonMark Markdown syntax.
--wrap=none stops Pandoc from hard-wrapping lines at 72 characters, which otherwise litters your output with mid-sentence line breaks.

The reason to reach for Pandoc is batch conversion. Migrating a documentation site or a folder of exported articles? One line turns the whole directory into Markdown:

for f in *.html; do pandoc -f html -t gfm --wrap=none "$f" -o "${f%.html}.md"; done
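That one-liner only sees the current folder. If the HTML sits in nested directories, one option is a small Python wrapper around the same Pandoc command. Treat this as a sketch under assumptions: the function names are ours, the source and output folders are whatever you pass in, and pandoc is expected to be on your PATH.

```python
import subprocess
from pathlib import Path


def pandoc_command(src: Path, dest: Path) -> list[str]:
    """Build the same Pandoc call as the shell loop, for a single file."""
    return ["pandoc", "-f", "html", "-t", "gfm", "--wrap=none", str(src), "-o", str(dest)]


def plan_conversions(src_root: Path, out_root: Path) -> list[tuple[Path, Path]]:
    """Pair every .html file under src_root with a mirrored .md path under out_root."""
    pairs = []
    for src in sorted(src_root.rglob("*.html")):
        dest = out_root / src.relative_to(src_root).with_suffix(".md")
        pairs.append((src, dest))
    return pairs


def convert_tree(src_root: Path, out_root: Path) -> None:
    """Run Pandoc once per file, creating output folders as needed."""
    for src, dest in plan_conversions(src_root, out_root):
        dest.parent.mkdir(parents=True, exist_ok=True)
        # check=True stops the run on the first file Pandoc fails to convert.
        subprocess.run(pandoc_command(src, dest), check=True)
```

Calling convert_tree(Path("site_html"), Path("site_md")) would mirror a hypothetical site_html folder into site_md, keeping the subfolder layout. Splitting the planning step from the Pandoc call makes it easy to print the plan as a dry run first.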

You can even pull a remote page through it by piping curl output into Pandoc:

curl -sL https://example.com/article | pandoc -f html -t gfm -o article.md

Pandoc has real trade-offs. It needs installing, it has a learning curve, and like any source-level converter it converts everything you give it. Feed it a full page and you'll get the navigation, the footer, and the cookie banner right alongside the article, because it has no idea which part is the "main content." For raw files on disk that's fine. For a live page, the extension solves exactly that.
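If you're stuck with full-page HTML and still want Pandoc, you can trim the obvious chrome before converting. Here's a rough sketch using only Python's standard library. The list of tags to drop is an assumption (real sites mark up their chrome in many ways), and this is far cruder than a real main-content extractor.

```python
from html import escape
from html.parser import HTMLParser

# Assumption: page chrome lives inside these elements. Adjust per site.
CHROME_TAGS = {"nav", "footer", "aside", "script", "style"}


class ChromeStripper(HTMLParser):
    """Re-emit HTML while dropping everything inside CHROME_TAGS."""

    def __init__(self) -> None:
        super().__init__()
        self.out: list[str] = []
        self.skip_depth = 0  # greater than 0 while inside a chrome element

    def handle_starttag(self, tag, attrs):
        if tag in CHROME_TAGS:
            self.skip_depth += 1
        elif self.skip_depth == 0:
            self.out.append(self.get_starttag_text())  # copy the tag as written

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied once, not as an open/close pair.
        if self.skip_depth == 0 and tag not in CHROME_TAGS:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in CHROME_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth == 0:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0:
            # The parser hands over unescaped text, so escape it again on the way out.
            self.out.append(escape(data, quote=False))


def strip_chrome(html: str) -> str:
    parser = ChromeStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Write the result of strip_chrome to a file and hand that to Pandoc instead of the original page. Comments and the doctype are dropped along the way, and an unclosed chrome tag will swallow everything after it, so check the output on a sample page first.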

Figure: One Pandoc loop converts an entire folder of HTML into a parallel set of Markdown files.

Capture a live page with a browser extension

The case the first two methods can't handle well: you're looking at a page and you want it as Markdown, but you don't have clean source. It's JavaScript-rendered, buried in <div> soup, or wrapped in navigation and ads. That's where a browser extension wins.

Instead of asking you to find the HTML, the extension reads the page you're already on (the fully rendered DOM) and converts it in one click. Three things make this different from pasting source into a converter:

It captures JavaScript-rendered content. Because it works on the live DOM, single-page apps and lazy-loaded sections come through intact, where "View Source" would show you an empty shell.
It strips the page chrome. It extracts the main content (headings, paragraphs, lists, code, tables, images) and drops the nav, sidebars, and footers before converting, so you don't spend ten minutes deleting menus.
It fixes links automatically. Relative URLs like /docs/intro are resolved to absolute links so they don't break the moment the Markdown leaves the site.
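For Markdown that came out of another tool with its relative links intact, the same fix can be approximated after the fact. The sketch below is not the extension's code. It's a simple regex pass that assumes inline links and images only, and it will miss reference-style links, URLs containing parentheses, and anything inside code blocks. The base URL is whatever page the HTML originally came from.

```python
import re
from urllib.parse import urljoin

# Matches the "](target" part of inline links and images.
# Anything after the URL (such as a quoted title) is left untouched.
LINK_TARGET = re.compile(r"(\]\()([^)\s]+)")


def absolutize_links(markdown: str, base_url: str) -> str:
    """Resolve relative link and image targets against base_url."""

    def fix(match: re.Match) -> str:
        # urljoin leaves already-absolute URLs (https:, mailto:) as they are.
        return match.group(1) + urljoin(base_url, match.group(2))

    return LINK_TARGET.sub(fix, markdown)
```

With a base of https://example.com/docs/guide/, a target of /docs/intro becomes https://example.com/docs/intro, and absolute URLs pass through unchanged.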

There's also a select-area tool for when you only want one section. Draw a box around a single table or code sample and it converts just that. We covered the whole live-page workflow in depth in Convert Any Webpage to Markdown in One Click.

Which method should you use?

Map it to what you're starting from and the choice makes itself:

A snippet of HTML you already have → the online converter. Fastest for one-off jobs, nothing to install, conversion stays in your browser.
A folder of HTML files, or a repeatable job → Pandoc. It's scriptable, handles batches, and slots into a docs pipeline or a Makefile.
A page you're reading in the browser → the extension. It's the only one that captures rendered content and strips the page chrome for you.
A conversion step inside your own app or script → a library (see the FAQ): Turndown in Node, markdownify or html2text in Python.

Figure: A 10-second decision tree. Pick the branch that matches what you're starting from.

Common mistakes (and how to avoid them)

Whichever method you pick, the same handful of issues trip people up:

Broken relative links. A link like /pricing becomes a dead link once the Markdown leaves the original site. Resolve relative URLs to absolute ones. The extension does this automatically; Pandoc leaves relative links as written, so rewrite them against the page's real base URL after converting.
Lost code-fence languages. Weaker converters turn a highlighted js block into an unlabeled fence, so your renderer can't color it. Check that the language hint survived the conversion.
Converting the chrome along with the content. Source-level tools (online converters and Pandoc) convert everything you hand them, nav and ads included. Trim the HTML down to the article first, or use the extension's main-content extraction.
Hard-wrapped lines from Pandoc. Forget --wrap=none and you'll get line breaks mid-sentence that render fine but make diffs noisy and hand-editing awkward.
Expecting pixel-perfect layout. Markdown is structural, not visual. Multi-column layouts and custom styling flatten to one readable column. That's the format working as intended.

Whichever method you use, run the output past the 9 rules for clean Markdown before you commit it. Converted Markdown is exactly where unlabeled code fences and broken links creep in.
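Unlabeled fences are easy to check for mechanically. As a minimal sketch (the function name is ours, and it follows the CommonMark rule that a closing fence uses the same character, is at least as long as the opener, and carries no info string), this reports the line number of every opening fence that lost its language hint:

```python
import re

# A fence line: up to three spaces, then three or more backticks or tildes.
FENCE = re.compile(r"^ {0,3}(`{3,}|~{3,})(.*)$")


def unlabeled_fences(markdown: str) -> list[int]:
    """Return 1-based line numbers of opening fences with no language hint."""
    missing = []
    open_fence = None  # marker that opened the current block, if we are inside one
    for number, line in enumerate(markdown.splitlines(), start=1):
        match = FENCE.match(line)
        if not match:
            continue
        marker, info = match.group(1), match.group(2).strip()
        if open_fence is None:
            open_fence = marker
            if not info:
                missing.append(number)
        elif marker[0] == open_fence[0] and len(marker) >= len(open_fence) and not info:
            open_fence = None  # this line closes the current block
    return missing
```

Run it over each converted file and fix whatever it flags by hand. It only finds fenced blocks, so indented code blocks go unnoticed.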

FAQ

What is the fastest way to convert HTML to Markdown? For a one-off snippet, paste the HTML into an online converter and copy the Markdown back: a few seconds, no install. For many files at once, Pandoc on the command line is faster overall. For a live web page you're looking at, the one-click extension wins.

How do I convert HTML to Markdown with Pandoc? Install Pandoc, then run pandoc -f html -t gfm input.html -o output.md. Use -t gfm for GitHub-Flavored Markdown (tables, task lists) and add --wrap=none so lines aren't hard-wrapped. To convert a whole folder, loop over the files in a short shell command.

Can I convert HTML to Markdown in Python or Node? Yes. In Node, the Turndown library converts an HTML string to Markdown in a couple of lines, and it's the same engine our tools use. In Python, markdownify and html2text do the same job. Reach for a library when conversion is one step in a larger automated pipeline.

How do I batch convert many HTML files to Markdown? Use Pandoc in a shell loop (for f in *.html; do pandoc -f html -t gfm "$f" -o "${f%.html}.md"; done) or a script built on a converter library. Both turn a directory of HTML into a parallel directory of .md files without manual copy-paste.

Does converting HTML to Markdown keep links, images, and code blocks? A good converter preserves all three: links become [text](url), images become ![alt](src), and <pre><code> becomes a fenced code block. The two things to watch are relative URLs (resolve them to absolute) and the code-fence language hint, which weaker converters drop.
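To make those three mappings concrete, here is a toy converter built on Python's standard library. It is a teaching sketch, not how Turndown or Pandoc work internally: it handles only links, images, paragraphs, and pre/code blocks, and it assumes the language hint arrives as a class like language-js on the code element.

```python
from html.parser import HTMLParser

FENCE = "`" * 3  # three backticks, built here to keep this example easy to embed


class TinyConverter(HTMLParser):
    """Toy HTML-to-Markdown converter: links, images, and code blocks only."""

    def __init__(self) -> None:
        super().__init__()
        self.out: list[str] = []
        self.href = ""          # target of the <a> currently open
        self.in_pre = False
        self.lang = ""          # language hint for the current code block
        self.pre_buf: list[str] = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self.in_pre:
            if tag == "code":
                # Keep the hint from class="language-xxx" if there is one.
                classes = (attrs.get("class") or "").split()
                self.lang = next(
                    (c[len("language-"):] for c in classes if c.startswith("language-")), ""
                )
        elif tag == "a":
            self.href = attrs.get("href") or ""
            self.out.append("[")
        elif tag == "img":
            self.out.append(f"![{attrs.get('alt') or ''}]({attrs.get('src') or ''})")
        elif tag == "pre":
            self.in_pre, self.lang, self.pre_buf = True, "", []

    def handle_endtag(self, tag):
        if tag == "pre" and self.in_pre:
            code = "".join(self.pre_buf).strip("\n")
            self.out.append(f"{FENCE}{self.lang}\n{code}\n{FENCE}\n")
            self.in_pre = False
        elif self.in_pre:
            return
        elif tag == "a":
            self.out.append(f"]({self.href})")
        elif tag == "p":
            self.out.append("\n\n")

    def handle_data(self, data):
        (self.pre_buf if self.in_pre else self.out).append(data)


def to_markdown(html: str) -> str:
    parser = TinyConverter()
    parser.feed(html)
    parser.close()
    return "".join(parser.out).strip()
```

The part worth noticing is the code branch: the language only survives because the converter looks for it. A converter that skips that lookup is what produces the unlabeled fences described above.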

There's no universally best way to convert HTML to Markdown. There's the one that fits what's in front of you. Have the source? Paste it into the editor. Have a folder? Script it with Pandoc. Looking at a live page? Grab the extension and do it in one click.

Best Markdown Team

Keep reading

Clean Markdown: 9 Rules for Readable, Lint-Free Docs

Markdown for ChatGPT & Claude: Stop Pasting Raw HTML

Convert Any Webpage to Markdown in One Click