Cloudflare Launches Tool to Monetize AI Crawlers

Earlier today, the content delivery network Cloudflare announced a new tool that it hopes will enable webmasters and site creators to charge AI crawlers for access to their content.

The idea is relatively simple. Since Cloudflare is a content delivery network (CDN), it sits between a website’s host and any bots that are attempting to access it. This new tool, called Pay Per Crawl, lets site owners choose whether to allow AI bots to access their site, block them entirely or, as the name implies, charge the bots for access.

The new system is currently in beta, and existing Cloudflare customers must request access to it. It’s unclear if any AI companies have signed up to participate in the system.

The end goal is to level the playing field between large and small publishers. Currently, large publishers, like the New York Times, Associated Press and Reuters, have all struck lucrative deals with AI companies. Smaller publishers, however, lack the infrastructure and breadth to secure such contracts.

So, how does the system work, and will it succeed? There are many questions we don’t have the answer to.

How It Works

For most webmasters, the choice of how to deal with AI bots is a binary one. You can either block them or allow them.

That blocking can take several forms. The simplest is a robots.txt file that instructs bots on what they can and cannot access. However, robots.txt is only a suggestion, and many bots, including those from AI companies, ignore its restrictions.
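To illustrate why robots.txt is only advisory, here is a minimal Python sketch using the standard library's urllib.robotparser. The robots.txt content and the bot name "ExampleAIBot" are made up for illustration. The key point is that the check runs inside the crawler itself, so a bot that never calls it is not stopped by anything.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; "ExampleAIBot" is a made-up crawler name.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules above permit this bot to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A well-behaved crawler runs this check voluntarily before fetching.
    print(is_allowed("ExampleAIBot", "https://example.com/article"))
    print(is_allowed("SomeOtherBot", "https://example.com/article"))
```

Nothing on the server side enforces the answer. That is the gap that blocking at the CDN level is meant to close.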

Cloudflare’s approach is more direct. It simply prevents bots from accessing the content in question. With this approach, robots.txt becomes largely irrelevant. The unwanted bots simply can’t access the content.

Cloudflare and other CDNs have been using this approach for years. Blocking “bad” bots, including scrapers and resource-intensive bots, is already the norm.

This system adds a third option. It exploits the rarely used error code 402, which stands for “Payment Required.”

Under the Pay Per Crawl system, AI bots are initially greeted with a 402 error. Once a bot follows up, it is presented with a domain-wide price for access. The bot can then either pay for access or decline and continue to be denied access.
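The three options can be modeled in a few lines of Python. This is only a rough sketch of the decision logic as described above, not Cloudflare's actual implementation. The names Policy, Response and handle_crawl are hypothetical, and the 403 status for blocked bots is an assumption, since the only status code named in the description is 402.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Policy(Enum):
    ALLOW = "allow"
    BLOCK = "block"
    CHARGE = "charge"


@dataclass
class Response:
    status: int
    price_usd: Optional[float] = None  # price quoted (402) or charged (200)


def handle_crawl(policy: Policy, site_price_usd: float, agreed_to_pay: bool) -> Response:
    """Decide how to answer an AI bot under a site's chosen policy."""
    if policy is Policy.ALLOW:
        return Response(200)
    if policy is Policy.BLOCK:
        # Assumption: a blocked bot gets 403 Forbidden.
        return Response(403)
    # Policy.CHARGE: quote the domain-wide price until the bot agrees to pay.
    if agreed_to_pay:
        return Response(200, site_price_usd)
    return Response(402, site_price_usd)


if __name__ == "__main__":
    first = handle_crawl(Policy.CHARGE, 0.01, agreed_to_pay=False)
    print(first.status, first.price_usd)
    second = handle_crawl(Policy.CHARGE, 0.01, agreed_to_pay=True)
    print(second.status, second.price_usd)
```

The sketch leaves out how a bot proves its identity and how payment is settled, which is where the registration requirement comes in.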

However, to participate in this system, bots need to register with Cloudflare. This includes configuring their bot so that others cannot spoof it and providing accounting information to pay for sites accessed. It is unclear if any AI bot operators have signed up.

It’s an interesting idea, but will it work?

The Challenges and Opportunities

In many ways, Cloudflare is exceptionally well-placed for this kind of initiative. According to its own data, it “manages and protects” traffic for 20% of the web. This includes many of the largest and most important publishers on the internet.

AI companies, eager to ingest all the content they can, will want cooperation from Cloudflare and other large CDNs. As such, they likely have strong motivation to play along with this system. This is especially true if other large networks use the same or a similar approach.

This is bolstered by the use of the 402 error code. The code has been a part of the Hypertext Transfer Protocol (HTTP) since version 1.1, which was first published in 1997 and revised in 1999. It has simply seen little use outside of API calls and other niche cases. In short, this system relies on already existing codes to function.

But there are also many obstacles it has to overcome.

First, most sites have already been scraped heavily by AI systems. Depending on the outcomes of the various AI lawsuits, publishers may have little recourse regarding that treasure trove of content. Pay Per Crawl would only apply to new content and to the AI bots that use the system.

Second, many of the bots are owned by companies that operate search engines or social media sites. Both Google and Microsoft are major players in AI, with prominent search engines. Meta and X both operate significant social networking platforms, but also operate AI bots.

Websites are unlikely to limit access to search engines and social media sites. Those are significant sources of traffic that sites cannot live without. Some companies, like Google with Gemini, use separate bots for AI. Others, like Microsoft with BingBot, do not.

Ultimately, it’s unlikely that smaller publishers would earn enough through this to make a significant impact. Cloudflare’s data shows that Google is crawling sites more regularly but sending less traffic. Most smaller sites won’t earn enough to offset that.

In short, this effort may be too little, too late to make a significant difference in the AI landscape.

Bottom Line

To be transparent, I’ve been very critical of Cloudflare over the years. From my perspective, many of their policies have been actively harmful to the internet.

However, this is a policy and an approach I support. It gives webmasters and creators an additional tool to address AI usage of their work. While I’m skeptical about its future impacts, I appreciate the work and thought that went into this.

That said, I can’t help but feel that this is a case of trying to close the barn door after the horses have run away. If you have posted anything publicly on the internet, it’s safe to assume that it’s been ingested as part of various AI systems.

Whether that is legal or not is still being decided (despite news stories saying otherwise). Likely, we won’t have a definitive answer to that question for many years. That said, giving users tools to block or charge AI bots for access is a good step.

Oddly, I think the ones that should be most excited about this are AI companies. As Cloudflare noted, this provides certainty to them. If they pay the toll, they know they have legal access to the content and can use it. This saves negotiation and avoids any legal uncertainties.

However, if they don’t adopt it, the approach is unlikely to catch on and will likely fall by the wayside. This system requires buy-in from both creators and AI developers.

Cloudflare can only lay the pavement; it can’t make anyone drive on it.
