[Image: Watercolor painting of a Cloudflare dashboard interface with toggle switches glowing amber]
groundwork·3 min read

The Toggle

GPTBot. ClaudeBot. PerplexityBot. A finger hovers over 'Block.' The data says that's exactly backwards.


The Brief

This article examines whether businesses should block AI crawlers like GPTBot and ClaudeBot. It presents Cloudflare crawl-to-refer ratio data and a Rutgers-Wharton study showing publishers lost 23% of traffic after blocking, and argues that structured, citable content on a server-rendered site is the better strategy in a zero-click search environment.


What happens when publishers block AI crawlers?
A Rutgers and Wharton study found that major news publishers who blocked AI crawlers experienced a 23% drop in total traffic, with human traffic falling 14%. Blocking removed them from AI-generated answers entirely; instead of protecting the publishers' value, it simply handed citations to competing sources.

What are AI crawler crawl-to-refer ratios?
Cloudflare data shows extreme imbalances between pages crawled and traffic referred. Anthropic's ClaudeBot scrapes 73,000 pages for every visitor it sends back. OpenAI's GPTBot runs at a 1,091-to-1 ratio. These numbers make blocking feel intuitive, but the data shows allowing access yields better outcomes.

How does being cited in AI answers affect website traffic?
Brands cited in AI answers earn 35% more organic clicks, and the traffic that arrives through AI citations converts at 4.4 times the rate of traditional search traffic. The volume is smaller but the intent is significantly higher, making AI citation a valuable source of qualified visitors.

What kind of content gets cited by AI systems?
AI systems cite content they can attribute: specific statistics, named sources, and quotable claims. Vague assertions get passed over. Content also needs schema markup for structure and must be server-rendered HTML, since AI crawlers don't execute JavaScript and skip client-side rendered pages.

I've been staring at a Cloudflare dashboard.

The names scroll past like a rap sheet. GPTBot. ClaudeBot. PerplexityBot. Bytespider. Next to each one, a toggle. Allow or Block. The crawl-to-refer ratios feel like theft. Anthropic's ClaudeBot scrapes 73,000 pages for every visitor it sends back. OpenAI runs at 1,091 to 1.[1] The instinct is obvious. Block them all.

My finger hovers. Then I remember the study.

[Image: A web of connected documents, some illuminated and prominent, others fading into obscurity.] Some sources get cited. Others disappear.

The Counterintuitive Math

Researchers at Rutgers and Wharton tracked what happened when major news publishers blocked AI crawlers.[2] They expected to see protected value. Instead they found a 23% drop in total traffic. Human traffic fell 14%.

The assumption had been wrong. Blocking didn't preserve anything. It removed publishers from the answer entirely.

This got me thinking about what's actually happening when someone asks an AI a question. Block the crawl on your site, and the AI still answers the user's question. It just cites someone else's site. In a world where 58% of searches end without a click, the question isn't whether to feed the system. It's whether to be part of the answer.

Making the Crawl Count

If you're going to allow access, make what they find worth citing.
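Allowing access is the easy half. On Cloudflare it's the toggle itself; on a self-hosted origin, the equivalent is a robots.txt that names the crawlers explicitly. Here's a minimal sketch in TypeScript, assuming a plain Node server; the user-agent tokens are the ones these crawlers publish, and the port and structure are purely illustrative.

```typescript
// robots-server.ts — a sketch: serve a robots.txt that explicitly allows
// the AI crawlers named on the dashboard. Adapt to whatever actually serves your site.
import { createServer } from "node:http";

const ROBOTS_TXT = [
  "User-agent: GPTBot",
  "Allow: /",
  "",
  "User-agent: ClaudeBot",
  "Allow: /",
  "",
  "User-agent: PerplexityBot",
  "Allow: /",
  "",
  "User-agent: *",
  "Allow: /",
].join("\n");

createServer((req, res) => {
  if (req.url === "/robots.txt") {
    res.writeHead(200, { "Content-Type": "text/plain" });
    res.end(ROBOTS_TXT);
    return;
  }
  res.statusCode = 404;
  res.end();
}).listen(3000);
```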

AI crawlers need structure to understand what they're reading. Schema markup labels your content the way a librarian labels books. Author, topic, date, relationships. Without it, you're a pile of loose pages.
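In practice, those labels usually take the form of JSON-LD embedded in the page. A sketch, assuming schema.org's Article type; every value below is a placeholder for whatever actually describes the page.

```typescript
// jsonld.ts — a sketch of the structured-data label a page might carry.
// Property names come from schema.org's Article type; values are placeholders.
const articleSchema = {
  "@context": "https://schema.org",
  "@type": "Article",
  headline: "The Toggle",
  datePublished: "2025-06-01", // illustrative date
  author: { "@type": "Person", name: "Author Name" },
  about: "AI crawlers and zero-click search",
};

// Embedded server-side in the page head, so crawlers see it without running any script:
const jsonLdTag =
  `<script type="application/ld+json">${JSON.stringify(articleSchema)}</script>`;

console.log(jsonLdTag);
```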

[Image: Server architecture rendered in watercolor with visible structural scaffolding and warm amber data flows.] Structure isn't decoration. It's how you get found.

The content itself matters differently now. AI cites what it can attribute. Specific statistics, named sources, quotable claims. Vague assertions get passed over. And here's the technical wrinkle most miss: AI crawlers don't execute JavaScript. If your content renders client-side, it doesn't exist to them. Headless Chrome scraping is expensive and generally avoided. Server-rendered HTML is what counts.
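To make the difference concrete, here's a small sketch of the two rendering paths, assuming a trivial Node setup. A crawler that won't execute JavaScript only ever reads the string the server sends, and only one of these strings contains the article.

```typescript
// rendering.ts — a sketch of what a non-JS-executing crawler actually receives.

// Client-side rendering: the server ships an empty shell. Without executing
// bundle.js, the article simply isn't there.
const clientRendered = `<div id="root"></div><script src="/bundle.js"></script>`;

// Server-side rendering: the HTML arrives with the content already in it.
function renderArticle(title: string, body: string): string {
  return `<!doctype html>
<html>
  <head><title>${title}</title></head>
  <body><article><h1>${title}</h1><p>${body}</p></article></body>
</html>`;
}

console.log(clientRendered.includes("Toggle"));                                   // false
console.log(renderArticle("The Toggle", "Allow the crawl.").includes("Toggle"));  // true
```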

The New Bargain

The old deal with search engines was simple. You let them crawl, they sent traffic. That deal is broken. The crawl-to-refer ratios prove it.

But a new deal is forming. You let them crawl, you become a source they trust. Brands cited in AI answers earn 35% more organic clicks.[3] The traffic that does arrive converts at 4.4 times the rate of traditional search.[4] Smaller volume, higher intent.

I'm back at the dashboard. The toggle sits there, patient. The instinct still says Block. The data says Allow. Not because it's fair. Because in a zero-click world, visibility in the answer is becoming the only visibility that matters.

Set it to Allow. Then give them something worth citing.

Update: Allowing the crawl was step one. Step two is making sure what they find actually represents you. I followed my own advice and looked at my site through a crawler's eyes. It wasn't pretty. Read the follow-up: Your Page's Resume.


References

  1. Cloudflare. (2025). "The crawl-to-click gap: Cloudflare data on AI bots, training, and referrals." Cloudflare Blog.

  2. Jiang, Y., et al. (2025). "The Impact of Blocking AI Crawlers on Publisher Traffic." Rutgers Business School / The Wharton School. Via PPC Land.

  3. WordStream. (2026). "GEO vs. SEO: Everything to Know in 2026."

  4. Superprompt. (2025). "AI Traffic Surges 527% in 2025."
