You have a tab open — a docs page, a long thread, a release note — and you want one of the chat tools to reason over it. The reflex is to drop the link into ChatGPT, Claude, or Perplexity and hope. Sometimes that works; often the answer comes back vague, partial, or quietly wrong, because what the model actually received was not the page you were looking at. This post is about how to reliably send a web page to ChatGPT, Claude, and Perplexity so the model reads what you read, in the format it reasons over best: clean Markdown.
We will compare the three handoff methods — paste a link, paste raw HTML or text, paste clean Markdown — then walk through the quirks of each tool (context windows, file upload versus paste, native fetching), and finish with a one-click workflow that collapses the whole thing into a single copy. The tool we use to make that handoff a single action is BulkMD, a local Markdown converter; everything here works whether you convert pages by hand or automate the step.
Why a pasted link is the weakest handoff
When you paste a URL into a chat tool, you are not handing the model the page. You are handing it an instruction to go fetch the page, and what happens next is out of your hands. The tool's own fetcher requests the URL, runs its own extraction over whatever HTML comes back, chunks the result, and feeds some subset of that to the model. Every step in that chain can fail or degrade silently.
Three failure modes are common. First, the fetch is blocked: many sites return a bot wall, a cookie interstitial, or a login redirect to anything that is not a logged-in browser, so the tool retrieves a consent page instead of the article. Second, the extraction is lossy: the tool's server-side extractor keeps navigation, comment threads, and "related posts" while dropping the one table you cared about — and you have no visibility into what it kept. Third, and most insidious, the page is a single-page app that renders its content with JavaScript after load; a plain HTTP fetch sees an empty shell, so the model confidently answers from almost nothing.
The page you see in your browser has already been fetched, rendered, and laid out. Converting that rendered DOM to Markdown captures exactly what you read, including SPA content the model's own fetcher would miss. This is the same argument we make in detail in the comparison of server scrapers versus browser extensions: extraction at the point of rendering avoids an entire class of failures that server-side fetching cannot.
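One way to spot the empty-shell case from the third failure mode is to measure how much visible text a raw fetch actually returned. The sketch below is an illustration, not how any of the chat tools work internally: the helper names and the 200-character threshold are assumptions, and it uses only Python's standard `html.parser`.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects text nodes, skipping <script>, <style>, and <noscript> bodies."""

    SKIP = {"script", "style", "noscript"}

    def __init__(self) -> None:
        super().__init__()
        self.parts: list[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that sits outside skipped elements.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def visible_text(html: str) -> str:
    """Return the human-visible text of an HTML string, space-joined."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def looks_like_empty_shell(html: str, min_chars: int = 200) -> bool:
    """Heuristic (assumed threshold): very little visible text suggests an SPA shell."""
    return len(visible_text(html)) < min_chars
```

If a plain fetch of a page trips this check while the same page is full of content in your browser, the content is almost certainly rendered client-side, and a pasted link is unlikely to carry it.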
Paste a link, paste HTML, or paste Markdown
The three handoff methods are not equivalent. Here is how they compare on the dimensions that actually determine answer quality; for a fuller side-by-side that also weighs reader mode, see copy-paste vs reader mode vs Markdown.
| Method | What the model receives | Token cost | Fetch failure risk | SPA content | Citation accuracy |
|---|---|---|---|---|---|
| Paste raw URL | Tool's own re-extraction (unknown) | Varies, often high | High | Often missed | Inconsistent |
| Paste raw HTML | Attribute soup, nav, scripts | Highest | None | Captured if copied | Low — boilerplate dilutes |
| Paste plain text | Flat text, no structure | Medium | None | Captured if copied | Medium — headings lost |
| Paste clean Markdown | Headings, tables, links intact | Lowest (60–80% less than HTML) | None | Captured | Highest |
Raw HTML is the worst of the pasteable options, not because the model cannot parse it but because it is mostly noise. A typical article's HTML is dominated by <div> wrappers, class attributes, inline styles, tracking scripts, and navigation chrome — and every one of those tokens costs you budget while diluting the embedding the retriever uses to find the relevant chunk. Plain text fixes the token bloat but throws away the structure: the model can no longer tell a heading from a sentence, or a table row from a paragraph.
Clean Markdown sits at the optimum. It preserves the structural signals models reason over (heading hierarchy, GFM tables, link targets, language-tagged code fences) while stripping the boilerplate. For a typical article, expect 60–80% fewer tokens than the source HTML; on a boilerplate-heavy page the savings can reach roughly 93%. The token argument alone is covered end to end in how clean Markdown cuts LLM token costs, and the format-versus-format reasoning is in Markdown vs JSON vs plain text for LLM context. The short version: structure is information the model uses, and Markdown carries it at the lowest token price.
Why Markdown context yields sharper answers
There are two reasons clean Markdown produces better answers, and they compound.
The first is signal-to-noise. Many file-upload flows put a retriever between your content and the model: it chunks the input, embeds each chunk, and ranks chunks against your question. If a chunk is 70% navigation and cookie-banner text, its embedding reflects that noise, and the chunk that actually answers your question ranks lower or gets split across a boundary. Markdown's clean structural breaks give the chunker natural seams, so a single concept is more likely to stay in a single chunk. We unpack how each tool's retriever behaves on Markdown in how AI agents read Markdown context; the practical upshot is that the same content, reshaped as Markdown, surfaces the right passage far more reliably.
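To make "natural seams" concrete, here is a minimal sketch of a heading-aware splitter. It is an assumption about how a chunker could use Markdown structure, not a description of any tool's actual retriever; the function name and the returned dictionary shape are hypothetical.

```python
import re

HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")
FENCE = "`" * 3  # marker that opens or closes a fenced code block


def split_by_headings(markdown: str) -> list[dict]:
    """Split Markdown into sections, one per ATX heading.

    Lines inside fenced code blocks are never treated as headings,
    so a '# comment' in a code sample stays with its section.
    """
    sections = [{"heading": "", "level": 0, "lines": []}]
    in_fence = False
    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
        match = None if in_fence else HEADING.match(line)
        if match:
            sections.append(
                {"heading": match.group(2), "level": len(match.group(1)), "lines": []}
            )
        else:
            sections[-1]["lines"].append(line)
    # Drop the leading pseudo-section when there is no text before the first heading.
    return [
        {"heading": s["heading"], "level": s["level"], "body": "\n".join(s["lines"]).strip()}
        for s in sections
        if s["heading"] or any(l.strip() for l in s["lines"])
    ]
```

Each returned section holds one heading and everything under it, which is the unit you would want embedded and ranked as a whole.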
The second reason is attention budget. Even with large context windows, models do not attend uniformly across a long input — quality sags in the middle of a very long context, a well-documented effect. Because Markdown fits the same source content into a fraction of the tokens, you keep the model reasoning over the dense, relevant portion of its attention curve instead of padding it out with markup. A page that would consume 12,000 tokens of HTML might be 3,000 tokens of Markdown, and those 3,000 are almost all content.
Sending a page as clean Markdown instead of raw HTML typically cuts token cost by 60–80% while preserving the headings, tables, and links the model uses to find and cite the right passage, which is why the same question returns a sharper answer from the same source.
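The arithmetic behind the 12,000-versus-3,000 example can be checked with a few lines. The sketch below assumes a rough rule of thumb of about four characters per token for English text; real tokenizers differ by model and content, so treat `estimate_tokens` as a ballpark and both function names as hypothetical.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate (assumed ~4 characters per token)."""
    return math.ceil(len(text) / chars_per_token)


def savings_percent(html_tokens: int, markdown_tokens: int) -> float:
    """Percentage of tokens saved by sending Markdown instead of HTML."""
    if html_tokens <= 0:
        raise ValueError("html_tokens must be positive")
    return 100.0 * (html_tokens - markdown_tokens) / html_tokens


# The same small fragment as markup-heavy HTML and as Markdown.
html_sample = (
    '<div class="post-body"><h2 class="title" style="margin:0">Install</h2>'
    '<p class="lead">Run the installer.</p></div>'
)
markdown_sample = "## Install\n\nRun the installer."

print(estimate_tokens(html_sample), estimate_tokens(markdown_sample))
print(savings_percent(12_000, 3_000))  # 75.0
```

A 12,000-token page that shrinks to 3,000 tokens is a 75% saving, which sits inside the 60–80% range quoted for a typical article.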
Per-tool quirks: ChatGPT, Claude, Perplexity
The three tools accept context differently. The Markdown itself is portable — one clean bundle works everywhere — but the handoff mechanics and the sizing differ.
ChatGPT
ChatGPT accepts both pasted text and file uploads. For anything past a few paragraphs, prefer uploading a .md file: pasting very long text into the composer can hit input limits and clutters the thread, while a file is retrieved against your question cleanly. Crucially, when you upload pre-converted Markdown, ChatGPT skips its own server-side extractor entirely — there is nothing to extract, so there is no extraction loss. If you instead paste a URL, ChatGPT's browsing tool fetches and re-extracts, reintroducing every failure mode from the first section. Upload the Markdown and you control exactly what the retriever indexes.
Claude
Claude's strengths are a large context window and reliable structural parsing — it treats Markdown headings and tables as structure, not decoration. You can paste sizeable Markdown directly into the message, or attach it as a file via the paperclip or Projects. For a single page, pasting is fine; for a multi-page corpus, attach files so each stays a distinct retrievable unit. Claude uses its own tokenizer, but for English prose it is roughly comparable to OpenAI's, so the same Markdown that saves tokens for ChatGPT saves a similar fraction for Claude. Lead each pasted page with a one-line ## Source: <title and URL> block so Claude can attribute its answer to the right page.
Perplexity
Perplexity is the outlier: it is built to read links natively and will fetch a URL you paste, citing it inline. That makes a pasted link genuinely useful here — for public, fetchable pages. But Perplexity's fetcher faces the same bot walls and SPA blind spots as any server-side fetcher. For a page behind auth, a JavaScript-rendered app, or anything where you need the model to reason over the exact content you see rather than a re-fetch, paste the clean Markdown instead. You keep Perplexity's strength — synthesis with citations — while removing the fetch gamble.
A note for publishers rather than readers, and it applies to all three tools: adding JSON-LD schema to your page gives no measurable lift in AI-Overviews citations (per an Ahrefs analysis in 2026), and Google Search does not read llms.txt; that file is consumed by tools like Perplexity, Claude, and IDE agents. If your goal is to be cited by these engines, the lever is answer-first passages of roughly 200–500 tokens with high fact density, not schema markup.
A one-click workflow that works across all three
The manual ritual — open the page, select all, copy, paste, watch it mangle, re-clean by hand — is enough friction that most people just paste the link and accept the lossy result. The fix is to make "copy this page as clean Markdown" a single action, so the handoff to any of the three tools is just a paste.
That is the core of what BulkMD does. It is a free, 100%-local Chrome extension (Manifest V3) by Soft Web Grove: it runs Mozilla Readability and a Turndown-based serializer over the rendered DOM in your browser, produces clean Markdown, and puts it on your clipboard or sends it onward. Because the conversion happens locally on the page you are already viewing, it captures SPA-rendered content and authenticated pages that a server fetcher cannot, and it does so with no account, no telemetry, and no network call. (The optional AI summarize-and-clean feature is the single exception — it is opt-in, off by default, and uses your own API key.)
The workflow looks like this in practice:
1. Open the page you want the model to read.
2. Click the BulkMD action — or use the right-click context menu.
3. The rendered DOM is converted to clean Markdown locally.
4. Markdown lands on your clipboard (or "Send to AI" hands it off).
5. Paste into ChatGPT, Claude, or Perplexity.
6. Ask your question. The model reasons over what you actually read.
For a single page, that is a copy and a paste. For a research session across many tabs, the same engine runs in bulk — converting up to 10 tabs in parallel and retaining around 500 results per batch — so you can assemble a whole Markdown corpus and paste or upload it as one bundle. A useful prefix to prepend to each page before pasting, so the model can cite cleanly:
## Source
Title: Structured Outputs in the API
URL: https://platform.openai.com/docs/guides/structured-outputs
Retrieved: 2026-06-02
## Content
<clean markdown of the page body follows>
That four-line header costs almost nothing in tokens and tends to improve attribution, because the model has an explicit place to point when it answers. It is the same discipline that pays off in retrieval pipelines, and it is trivial to automate as part of the copy step.
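Automating the header is a few lines of code. The sketch below reproduces the layout shown above; the function name and signature are assumptions for illustration and are not part of BulkMD.

```python
from datetime import date


def with_source_header(title: str, url: str, body: str, retrieved: date) -> str:
    """Prefix a page's Markdown body with a Source block the model can cite."""
    header = [
        "## Source",
        f"Title: {title}",
        f"URL: {url}",
        f"Retrieved: {retrieved.isoformat()}",
        "",
        "## Content",
        "",
    ]
    return "\n".join(header) + body.strip() + "\n"


page = with_source_header(
    title="Structured Outputs in the API",
    url="https://platform.openai.com/docs/guides/structured-outputs",
    body="# Guide\n\nBody text.\n",
    retrieved=date(2026, 6, 2),
)
print(page)
```

Passing the retrieval date in explicitly, rather than reading the clock inside the function, keeps the output reproducible when you rebuild a bundle later.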
What to do when the page is huge
Some pages — a long changelog, an entire docs section, a sprawling forum thread — exceed what you want to hand a model in one shot, even as Markdown. Two tactics keep you inside budget.
First, convert and then trim to the relevant sections. Because Markdown preserves the heading hierarchy, it is easy to delete whole ## blocks you do not need before pasting; you are working with structured text, not a wall of HTML. Second, for genuinely large corpora, treat this as a context-budgeting problem rather than a paste problem — chunk the Markdown, keep a source header on each chunk, and feed the model only what the question needs. The mechanics of sizing a long-context budget are their own topic, and pasting a 40,000-token page when the answer lives in one 400-token section is the most common way people waste both money and answer quality.
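Both tactics can be combined in one small helper: keep only the `##` blocks whose heading matches the question, tag each with its source, and stop when the budget is spent. This is a sketch under stated assumptions: the function names are hypothetical, token cost uses a rough four-characters-per-token estimate, and the splitter is deliberately simple (it ignores text before the first `##` heading and does not special-case fenced code).

```python
import math
import re

H2 = re.compile(r"^##\s+(.*\S)\s*$")


def h2_blocks(markdown: str) -> list[tuple[str, str]]:
    """Return (heading, block_text) pairs, one per level-2 heading."""
    blocks: list[tuple[str, list[str]]] = []
    for line in markdown.splitlines():
        match = H2.match(line)
        if match:
            blocks.append((match.group(1), [line]))
        elif blocks:
            blocks[-1][1].append(line)
    return [(heading, "\n".join(lines).strip()) for heading, lines in blocks]


def trim_to_budget(
    markdown: str,
    keywords: list[str],
    source: str,
    budget_tokens: int,
    chars_per_token: float = 4.0,
) -> list[str]:
    """Keep matching ## blocks, each prefixed with its source, within a token budget."""
    chunks: list[str] = []
    used = 0
    for heading, text in h2_blocks(markdown):
        if not any(k.lower() in heading.lower() for k in keywords):
            continue  # heading is not relevant to the question
        chunk = f"Source: {source}\n\n{text}"
        cost = math.ceil(len(chunk) / chars_per_token)
        if used + cost > budget_tokens:
            break  # budget spent; stop rather than truncate a block mid-way
        chunks.append(chunk)
        used += cost
    return chunks
```

Stopping at a block boundary instead of cutting text mid-section keeps every chunk self-contained, so each one still makes sense alongside its source line.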
The principle holds at every scale: send the model clean, structured content that matches what you saw, headed with its source, sized to the question. Markdown is the format that makes all three of those cheap.
TL;DR
A pasted URL hands the model an instruction to re-fetch and re-extract a page — a step that can be blocked, lossy, or blind to JavaScript-rendered content. Pasting raw HTML costs the most tokens and buries the answer in boilerplate. Clean Markdown is the handoff that wins on every axis: it captures exactly what you read, uses 60–80% fewer tokens than the source HTML, preserves the headings and tables models reason over, and pastes identically into ChatGPT, Claude, and Perplexity. The actionable next step is to stop pasting links and start pasting Markdown — prefix each page with a ## Source block, upload as a file when it is long, and let Perplexity fetch only when the page is public and static.
To make that a one-click habit across all three tools, install BulkMD from the Chrome Web Store and run your next prompt against the Markdown instead of the link.
Frequently asked questions
Should I paste the URL or the page content into ChatGPT?
Why does Markdown give sharper answers than the raw HTML of the same page?
Perplexity reads links already — do I still need to convert to Markdown?
Does sending Markdown to the chat tools send my data anywhere extra?
What's the best way to send several pages at once?
About the author
Independent software engineer building developer tools at Soft Web Grove. Creator and maintainer of BulkMD.
Reach the team at [email protected] — typically within 24 hours, any day of the year. Soft Web Grove also takes a small number of outside engagements; details on the about page.