You found a great article, and you want ChatGPT or Claude to work with it. So you copy the page, paste it in, and ask your question. That paste just cost you hundreds of wasted tokens and made the model's job harder — because raw web content is mostly markup the model has to read past to find the meaning. The fix is one step most people skip: convert the page to Markdown first. It's the single highest-leverage habit for feeding web pages to LLMs, and it takes one click.
Why raw HTML wastes your context window
Every model has a finite context window: the amount of text it can hold at once. When you paste a web page, you're not just pasting the words you care about. You're pasting the scaffolding around them: <div class="..."> wrappers, inline styles, data-* attributes, tracking pixels, SVG icon markup, and the navigation and footer that surround the article. None of it carries meaning, but all of it costs tokens.
The numbers are not subtle. In our testing, a content-heavy page routinely drops to well under half its original token count once it's converted to clean Markdown, and pages that are heavy on nested layout markup shrink even more. A <h2 class="text-2xl font-bold tracking-tight">Getting started</h2> is dozens of tokens; the Markdown ## Getting started is a handful. Multiply that across every heading, list item, and link on a long page and the savings compound fast.
That recovered budget isn't abstract. It's room for more of the actual article, a longer follow-up conversation, or a second source pasted alongside the first. When you're working near a context limit — a long doc, a research session, a big codebase reference — the difference between raw HTML and Markdown is the difference between "fits" and "truncated."
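To get a feel for the gap on the heading example above, here is a minimal sketch. It assumes a crude stand-in for a tokenizer (counting word runs and punctuation marks); real tokenizers split text differently, so treat the numbers as a rough proxy rather than actual token counts. The function name rough_token_count is our own invention for illustration.

```python
import re


def rough_token_count(text: str) -> int:
    """Crude proxy for a tokenizer: count word runs and single punctuation marks.

    Assumption: this is NOT how any real model tokenizes text. It only
    illustrates how much extra material markup adds.
    """
    return len(re.findall(r"\w+|[^\w\s]", text))


html = '<h2 class="text-2xl font-bold tracking-tight">Getting started</h2>'
markdown = "## Getting started"

html_count = rough_token_count(html)
md_count = rough_token_count(markdown)

print(f"HTML heading:     ~{html_count} pieces")
print(f"Markdown heading: ~{md_count} pieces")
```

The exact ratio will vary with the tokenizer and the page, but the direction is the same: the class names, quotes, and closing tag all count, and none of them are content.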

Why ChatGPT & Claude read Markdown better than HTML
Saving tokens is only half the win. The other half is comprehension. Markdown isn't just smaller than HTML — it's clearer to a language model, for a reason worth understanding.
Markdown is the lingua franca of the data these models trained on: READMEs, docs, forum posts, Stack Overflow answers, technical wikis. The models have seen billions of tokens of well-structured Markdown, so the syntax maps cleanly onto structure they already understand. A ## is unambiguously a heading. A - is unambiguously a list item. A fenced ```python block is unambiguously code, in a known language.
HTML can express the same things, but with far more ambiguity. Is a <div> styled to look like a heading actually a heading, or just big text? Is that <span> emphasis, or a color swatch? The model has to infer intent from a tangle of tags and classes, and inference is where errors creep in. Clean structure in means more reliable structure out: better summaries, more accurate extraction, and Q&A that doesn't lose the thread of a document's hierarchy.
The rule of thumb: if a human would struggle to read your content as raw source, so will the model. Markdown is readable as plain text and as rendered output — which is exactly why it's the ideal interchange format between the web and an LLM.
The workflow: web page → Markdown → AI
Here's the whole loop, start to finish. It adds one step to your normal copy-paste, and that step does all the work.
- Open the page you want the model to read and let it finish loading.
- Capture it as Markdown. Click the HTML to Markdown extension icon. It reads the rendered page, extracts the main content — dropping the nav, sidebar, and footer — and converts what's left to clean Markdown.
- Copy the Markdown. It's on your clipboard, ready to paste.
- Paste it into ChatGPT or Claude inside your prompt, then ask your question.
Because the extension works on the rendered DOM, it captures JavaScript-loaded content that "save as HTML" and most URL-fetchers miss. And because it extracts the article before converting, you don't paste the cookie banner and the "related posts" rail into your context window along with the content you wanted.
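To make the "extract the main content, then convert" idea concrete, here is a toy sketch using only Python's standard library. It is not the extension's actual implementation: the class name, the html_to_markdown function, and the list of skipped tags are assumptions for illustration, and it only handles headings, paragraphs, and list items.

```python
from html.parser import HTMLParser

# Assumption: regions treated as page chrome in this sketch.
SKIP_TAGS = {"nav", "aside", "footer", "script", "style"}
# Block-level tags this toy converter understands, with their Markdown prefix.
BLOCK_PREFIX = {"h1": "# ", "h2": "## ", "h3": "### ", "p": "", "li": "- "}


class MarkdownSketch(HTMLParser):
    """Toy converter: keeps headings, paragraphs, and list items; drops chrome."""

    def __init__(self) -> None:
        super().__init__()
        self.skip_depth = 0   # > 0 while inside a skipped region
        self.prefix = None    # Markdown prefix of the block being read
        self.buffer = []      # text pieces of the current block
        self.blocks = []      # finished Markdown blocks

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif self.skip_depth == 0 and tag in BLOCK_PREFIX:
            self.prefix = BLOCK_PREFIX[tag]
            self.buffer = []

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth == 0 and tag in BLOCK_PREFIX and self.prefix is not None:
            # Collapse whitespace; inline tags such as <em> are flattened to text.
            text = " ".join("".join(self.buffer).split())
            if text:
                self.blocks.append(self.prefix + text)
            self.prefix, self.buffer = None, []

    def handle_data(self, data):
        if self.skip_depth == 0 and self.prefix is not None:
            self.buffer.append(data)


def html_to_markdown(html: str) -> str:
    parser = MarkdownSketch()
    parser.feed(html)
    parser.close()
    lines = []
    for block in parser.blocks:
        # Keep consecutive list items tight; separate other blocks with a blank line.
        if lines and not (block.startswith("- ") and lines[-1].startswith("- ")):
            lines.append("")
        lines.append(block)
    return "\n".join(lines)


page = """
<nav><a href="/">Home</a><a href="/blog">Blog</a></nav>
<article>
  <h2 class="text-2xl font-bold">Getting started</h2>
  <p>Install the <em>tool</em> first.</p>
  <ul><li>Open the page</li><li>Capture it</li></ul>
</article>
<footer><p>Share on Twitter</p></footer>
"""

print(html_to_markdown(page))
```

The navigation links and the footer never reach the output, and the class attribute on the heading disappears with the tag. A real converter also has to handle links, tables, code blocks, and nested lists, which this sketch leaves out.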

If you only need part of a page — a single table, one section of a long doc — use the extension's select-area tool to capture just that region. Why feed the model 4,000 tokens of documentation when your question is about one 800-token section? We cover the capture mechanics in depth in our guide to converting any webpage to Markdown in one click, and the source-side equivalent in how to convert HTML to Markdown.
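If you have already captured a whole page, the same trimming can be done on the Markdown itself. The sketch below is one hedged way to do it: extract_section is a hypothetical helper of our own, not part of the extension, and it assumes the document uses # style headings.

```python
import re


def extract_section(markdown: str, heading: str) -> str:
    """Return the section under `heading`, up to the next heading of the same
    or a higher level. Returns "" if no heading matches.

    Caveat: lines starting with '#' inside fenced code blocks would be
    mistaken for headings; this sketch does not handle that case.
    """
    lines = markdown.splitlines()
    start = None
    level = 0
    for i, line in enumerate(lines):
        match = re.match(r"^(#{1,6})\s+(.*?)\s*$", line)
        if not match:
            continue
        if start is None:
            # Case-insensitive match on the heading text.
            if match.group(2).lower() == heading.lower():
                start, level = i, len(match.group(1))
        elif len(match.group(1)) <= level:
            # Reached a sibling or parent heading: the section ends here.
            return "\n".join(lines[start:i]).strip()
    return "\n".join(lines[start:]).strip() if start is not None else ""


doc = (
    "# Guide\n\nIntro.\n\n"
    "## Install\n\nRun the installer.\n\n"
    "### Options\n\nUse defaults.\n\n"
    "## Usage\n\nClick once."
)

print(extract_section(doc, "Install"))
```

Subsections travel with their parent, so asking for "Install" keeps the "Options" subsection but stops before "Usage".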
"Can't I just paste the URL?"
Sometimes — and that's exactly the problem. Pasting a URL and hoping the model browses to it is the least predictable option you have, for a few concrete reasons:
- Browsing isn't guaranteed. Depending on the model, the plan, and the mode, the assistant may not fetch the URL at all — it might answer from memory, or tell you it can't browse.
- What it retrieves is out of your hands. When browsing does work, the fetcher grabs whatever the server returns. That often includes the same navigation, ads, and boilerplate you were trying to avoid — and burns tokens on all of it.
- JavaScript and paywalls break it. Single-page apps, lazy-loaded sections, and login- or paywall-gated articles frequently return empty or partial content to a server-side fetcher, even though you can see the page fine in your browser.
Pasting Markdown you captured yourself removes every one of those variables. The model sees exactly the content you chose, in a format it reads cleanly, with no fetch step that can fail. It's the deterministic option — and determinism is what you want when the answer matters.
How to structure Markdown inside your prompt
Clean Markdown gets you most of the way; how you frame it in the prompt gets you the rest. A few habits make a measurable difference in answer quality:
- Wrap the content in a delimiter. Put the pasted Markdown inside triple backticks or an XML-style tag (<article>…</article>). This tells the model exactly where the source ends and your instructions begin, so it doesn't mistake a heading in the article for a command from you.
- Put your instruction first, the content second. Lead with the task, such as "Summarize the key arguments in the article below", then paste the Markdown. The model knows what it's reading for before it starts reading.
- Keep the structure, drop the chrome. You want the headings, lists, tables, and code. You don't want the "Share on Twitter" buttons or the author bio footer — the main-content extraction handles that for you.
- Trim to what's relevant. If your question is about one section, paste one section. Less irrelevant context means a more focused answer and a cheaper call.
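If you assemble prompts in a script rather than by hand, the first two habits above can be sketched as a small helper. The name build_prompt and the default tag are our own choices for illustration; any clear delimiter works.

```python
def build_prompt(instruction: str, markdown: str, tag: str = "article") -> str:
    """Put the task first, then the source wrapped in an XML-style tag."""
    return f"{instruction.strip()}\n\n<{tag}>\n{markdown.strip()}\n</{tag}>"


prompt = build_prompt(
    "Summarize the key arguments in the article below.",
    "## Getting started\n\nInstall the tool first.",
)
print(prompt)
```

An XML-style tag is used here instead of triple backticks because captured Markdown often contains fenced code blocks of its own, and a tag can't be confused with those.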

Tip: if the captured Markdown needs a quick trim before you paste it — cutting a stray section or tidying a table — drop it into the online editor and watch the live preview as you edit.
Common mistakes (and how to avoid them)
The same few errors come up over and over when people start feeding web content to LLMs:
- Pasting raw HTML source. "View source → copy → paste" gives the model the worst possible input: maximum tokens, minimum clarity. Convert first, always.
- Capturing before the page loads. On lazy-loading pages, grabbing too early misses content that hasn't rendered. Let the page settle — scroll to the bottom if sections load on demand.
- Feeding the whole page for a one-section question. Use select-area. The extra context isn't free; it costs tokens and can dilute the model's focus.
- No delimiter around the content. Without a fence or tag, the model can blur the line between the article and your instructions — especially if the article itself contains commands or questions.
- Trusting "paste the link" for anything that matters. For a casual lookup, fine. For analysis you'll act on, capture the Markdown yourself so you know exactly what the model read.
| Approach | Token cost | Captures JS content | Strips nav/ads | Reliable |
|---|---|---|---|---|
| Paste raw HTML source | Highest | Sometimes | No | No |
| Paste the URL (browse) | Varies | Often no | No | No |
| Paste clean Markdown | Lowest | Yes | Yes | Yes |
FAQ
What's the best way to give ChatGPT or Claude a web page?
Convert the page to Markdown first, then paste the Markdown into your prompt inside a delimiter like triple backticks. Markdown strips the token-heavy HTML markup, so more of your context window holds actual content, and models parse Markdown structure more reliably than raw HTML.
Does converting a web page to Markdown really save tokens?
Yes. Raw HTML carries div wrappers, class names, inline styles, and tracking attributes that all cost tokens but add no meaning. In our testing, a content-heavy page often drops to well under half its original token count once it's converted to clean Markdown.
Can't I just paste the URL into ChatGPT or Claude?
Sometimes, but it's unreliable. Browsing has to fetch the page, and what it retrieves often includes navigation, ads, and boilerplate, or it fails on JavaScript-rendered content and paywalls. Pasting clean Markdown you captured yourself is deterministic: the model sees exactly the content you chose.
Is Markdown better than plain text for AI prompts?
For structured content, yes. Plain text flattens headings, lists, tables, and code into an undifferentiated block. Markdown preserves that structure with lightweight syntax, so the model can tell a heading from a list item from a code block, which improves summarization, extraction, and Q&A.
How do I convert a web page to Markdown for free?
Install a free browser extension like our HTML to Markdown extension, open the page, and click once. The conversion runs locally in your browser, needs no account, and gives you clean Markdown ready to paste into ChatGPT or Claude.
The next time you want an AI model to read a web page, don't paste the page; paste its Markdown. You'll spend fewer tokens, leave more room in the context window, and hand the model structure it actually understands. The cleaner that Markdown is, the better it parses, so it's worth following the 9 rules for clean Markdown. It's one extra click that makes every answer that follows a little sharper. Grab the Chrome extension to capture any page in a click, or start in the online editor if you'd rather paste and polish first.
