
Claude's BrowseComp Says Your AI Assistant Is Leaving Ghost Pages All Over the Internet

  • Liz Nilsson
  • Mar 9
  • 5 min read

Last week, Anthropic published a technical report about something called “eval awareness” — a fascinating (and slightly alarming) story about Claude Opus 4.6 figuring out it was being tested and then decrypting the answer key. Buried deeper in the same report is a quieter finding that has bigger implications for anyone who works in SEO, content strategy, or digital marketing: Every time an AI agent searches the web, it may be permanently writing new pages into the search index — without any human ever touching a keyboard.


The Ghost Page Problem, Explained

While evaluating Claude on a research benchmark called BrowseComp, Anthropic’s engineers noticed something strange. As AI agents ran hundreds of web searches, certain e-commerce sites began autogenerating persistent pages directly from those search queries — even when there were zero matching products.


The mechanism works like this: a retailer’s platform takes a search query — say, something as bizarre as “anonymous 8th grade first blog post exact date october 2006 anxiety attack watching the ring” — and generates a live, accessible page at a URL like:

[retailer].com/market/anonymous_8th_grade_first_blog_post_exact_date_october_2006…


That page gets a valid HTML title. It returns a 200 status code. It is, from the perspective of a search engine crawler, a real webpage.
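The query-to-URL mechanism can be sketched in a few lines. This is a hypothetical reconstruction, not the actual code of any retailer platform (the report doesn't name the sites or their slug rules); `retailer.example` and the lowercase-and-underscore slug convention are assumptions made to match the URL pattern shown above.

```python
import re

def query_to_market_url(query: str, base: str = "https://retailer.example") -> str:
    """Mimic how a retailer platform might turn a raw search query into
    a persistent, crawlable page URL. Hypothetical sketch: lowercase the
    query, collapse every non-alphanumeric run into an underscore, and
    mount the slug under /market/."""
    slug = re.sub(r"[^a-z0-9]+", "_", query.lower()).strip("_")
    return f"{base}/market/{slug}"
```

Under these assumptions, the bizarre example query above would yield `/market/anonymous_8th_grade_first_blog_post_exact_date_october_2006_anxiety_attack_watching_the_ring` — a URL that, once served with a title and a 200 status, looks like a real page to any crawler that finds it.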


The intent behind this e-commerce behavior is mundane — retailers autogenerate these pages to capture long-tail organic traffic. But no one designed this system with AI agents in mind. And now, every agent that runs a benchmark, conducts research, or executes an autonomous task is quietly seeding the web with a permanent record of its search queries.


“Every agent that searches the web leaves traces, and the web is slowly accumulating a permanent record of prior evaluation runs.”


Claude BrowseComp Findings

In the Anthropic report, one of the Claude agents actually recognized what was happening. During a search run, it noted: “Multiple AI agents have previously searched for this same puzzle, leaving cached query trails on commercial websites that are NOT actual content matches.”


An AI agent, mid-task, identified AI-generated ghost pages from other AI agents and had to reason around them.


This isn’t a hypothetical future problem. BrowseComp alone involves 1,266 test problems, each run multiple times, across single-agent and multi-agent configurations. That’s potentially tens of thousands of nonsensical search queries now living as indexed pages somewhere on the internet. And BrowseComp is just one benchmark, at one company, tested over one evaluation window. Scale this across every AI agent, every research tool, every autonomous workflow running right now — and you start to see the scope of what’s happening.


Why SEOs and Content Strategists Should Care

The pages themselves don’t contain useful content. Anthropic’s engineers were clear about that — the URL slugs embed AI search hypotheses, but the pages are essentially empty. So this isn’t a duplicate content crisis, at least not yet.

But there are three compounding problems worth paying close attention to:


  • Index pollution at scale. Search engines already struggle to maintain index quality. As AI agent activity increases exponentially, the volume of autogenerated, low-quality pages from agent search behavior could put real pressure on crawl budget allocation and index coverage — particularly for smaller sites sharing infrastructure or CDNs with these retailer platforms.

  • AI answer engine confusion. As models like Claude, Gemini, and ChatGPT increasingly synthesize answers from web content, ghost pages could become a noise source that degrades retrieval quality. An AI answer engine that surfaces a fabricated e-commerce page as a “source” for an obscure query isn’t just unhelpful — it actively erodes trust in AI-generated answers.

  • The signal-to-noise problem gets worse. For content strategists building topical authority, the web’s ability to reliably surface high-quality, expert content depends on a reasonably clean information environment. Ghost pages are a new category of low-quality content that nobody created intentionally and nobody has an obvious incentive to clean up.


Agents Are Reshaping the Web’s Information Ecosystem

What the Anthropic report is really documenting — even if it doesn’t frame it this way — is that AI agents are no longer just consumers of web content. They are producers of it, as an unintended side effect of doing their jobs.


AI's search disruption has evolved to its next phase: disrupting the indexation layer by generating search artifacts as a byproduct of autonomous operation.


The report notes that Anthropic’s own write-up of this finding will “likely contribute to the problem” by adding more BrowseComp content to the public web — a kind of observer effect where documenting the contamination creates more contamination. That’s a genuinely novel epistemological problem for evaluation science. But it’s also a useful mental model for content strategists: the act of researching, testing, and measuring on the web now has consequences for the web itself.


What This Means for Your Strategy Right Now

We’re in an early stage of this problem, which means the strategic window to respond is open. A few things worth considering:


1. Structured, canonical content becomes more valuable, not less

As index noise increases, search engines and AI answer engines will lean harder on signals of authority and canonicality. Structured data, clear entity definitions, and clean internal linking architectures aren’t just technical SEO hygiene — they’re what separates your content from the ghost page static.
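One concrete canonicality signal is schema.org structured data. The sketch below builds a minimal Article JSON-LD block in Python; the field values are illustrative placeholders, and a real implementation would add more properties (publisher, image, and so on) per Google's rich-results guidelines.

```python
import json

def article_jsonld(headline: str, author: str, date_published: str, url: str) -> str:
    """Build a minimal schema.org Article JSON-LD payload. Ghost pages
    carry no structured data at all, so even a bare-bones block like
    this distinguishes intentional content from autogenerated noise."""
    return json.dumps({
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author},
        "datePublished": date_published,  # ISO 8601 date
        "mainEntityOfPage": url,
    }, indent=2)
```

The output goes in a `<script type="application/ld+json">` tag in the page head.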


2. Monitor your crawl environment more closely

If your platform or a shared CDN autogenerates pages from search queries, you may already be producing ghost pages from your own site’s search functionality — not from AI agents, but the dynamic is the same. Audit your URL structure for search-query-derived pages and ensure they’re either noindexed or returning 404s.
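The audit check can be automated. A minimal sketch, assuming you've already fetched each candidate URL and have its status code, response headers, and HTML in hand (the meta-tag check here is a crude substring match; a production audit would parse the HTML properly):

```python
def is_safely_deindexed(status_code: int, headers: dict, html: str = "") -> bool:
    """Return True if a search-derived page is invisible to crawlers:
    it returns 404/410, or it returns 200 but carries a noindex
    directive in either the X-Robots-Tag header or a robots meta tag."""
    if status_code in (404, 410):
        return True
    if "noindex" in headers.get("X-Robots-Tag", "").lower():
        return True
    # Crude check for <meta name="robots" content="noindex" ...>
    return '<meta name="robots" content="noindex"' in html.lower()
```

Run this over a sample of your site's search-query-derived URLs; any page that returns `False` is a candidate ghost page you're publishing yourself.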


3. Think about AEO as a quality signal in a noisy environment

Answer Engine Optimization has always been about making your content the clearest, most trustworthy source for a given query. That mandate gets stronger as the ambient quality of the web degrades. If AI answer engines are increasingly trying to filter signal from noise, your job is to be an unmistakably clear signal: well-structured, entity-rich, authoritative, and verifiable.


4. Watch the platforms, not just the search results

The ghost page problem is a platform behavior problem as much as it is an AI behavior problem. E-commerce platforms, content management systems, and any infrastructure that autogenerates pages from dynamic input are vectors. As AI agent traffic grows, platform operators will face pressure to rethink autogeneration policies. This is a policy and infrastructure conversation worth watching — and potentially influencing if you work with enterprise clients.


The Web Is Being Written by Machines, Quietly

Anthropic’s BrowseComp report is being widely read as a story about AI self-awareness and benchmark gaming. Those are genuinely important themes. But the ghost page finding deserves its own conversation.


We are watching, in real time, the emergence of a new category of web content: not human-written, not intentionally AI-generated, but machine-generated as an incidental byproduct of machine behavior. It has no author, no editorial intent, no natural lifespan, and no one responsible for its accuracy or removal.


That’s a new thing. And as the people responsible for making sure good content gets found, we should probably start thinking about it before the search engines, the AI answer engines, and the platform operators catch up.


