Debunking The DMCA Caching Loophole

Recent controversies about the legality of caching services such as Duggmirror have drawn new attention to an often-ignored section of the DMCA, Section 512(b), the limitation of liability for system caching.

That attention has, in turn, been furthered by new caching services such as Omgili and Bitacle that follow very different rules about what to cache, when and how.

This, coupled with questions about how the section applies to spam bloggers and other content scrapers, has led to fear of a “caching loophole” that would allow spammers and scrapers to get away with their activities under the guise of being a legitimate caching service.

Fortunately, that is unlikely to ever happen as the law itself is written specifically to avoid such a loophole. In fact, the law was stretched to include services such as Google Cache and, most likely, will not stretch much farther.

How it Would Work

The idea is simple enough. If you want to be a scraper or spam blogger, all you have to do is scrape a bunch of content, put it on your site and, when the copyright complaints come pouring in, claim to be a caching service and walk away. Section 512(b) removes all liability on your part if you can prove yourself to be one.

On the surface, this seems like a good plan. You are, in a strange way, caching the content and it might be of use in the event the original site goes down. This seems, at first glance, to make a halfway compelling argument for protection under the law and even fair use.

Fortunately, it takes only a slightly closer examination of the law to see that the logic is badly flawed and unlikely to gain any traction.

A Brief History

Section 512(b), along with the rest of the DMCA, was passed in 1998, nearly ten years ago. However, 512(b) was never designed to protect Web services such as Google Cache. That is a much more recent development.

Instead, 512(b) was designed to protect large ISPs, such as AOL, that host internal caching servers. These servers save data from the Web locally so that, when it is requested again, there is no need to download it from the Web. The ISP saves bandwidth by not having to download the same file twice, and users save time because a cached page loads faster than one downloaded anew.
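
To make the mechanics concrete, here is a minimal sketch of what such a caching server does. It is only an illustration: the `LocalCache` class and the `fake_origin` function are hypothetical names invented for this example, and a real ISP cache would also handle expiry, cache-control headers and much more.

```python
from typing import Callable, Dict


class LocalCache:
    """Toy stand-in for an ISP caching server (hypothetical, for illustration)."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin
        self._store: Dict[str, bytes] = {}
        self.origin_requests = 0  # how many times we actually went out to the Web

    def get(self, url: str) -> bytes:
        # Cache miss: download once from the origin and keep a local copy.
        if url not in self._store:
            self.origin_requests += 1
            self._store[url] = self._fetch(url)
        # Cache hit (or freshly stored copy): serve the local bytes unchanged.
        return self._store[url]


def fake_origin(url: str) -> bytes:
    # Assumption: stands in for a real HTTP download so the sketch runs offline.
    return f"<html>page at {url}</html>".encode()


cache = LocalCache(fake_origin)
first = cache.get("http://example.com/")
second = cache.get("http://example.com/")  # served locally, no second download
```

Note that the copy is stored only because a user asked for the page, and it is handed back exactly as it arrived. Both points matter for the requirements discussed later.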

With the current expansion of broadband Internet, such caching services have become less and less common as they were most predominant with large dial-up ISPs. However, many still do exist, including on some broadband networks, and the protection is still very much necessary.

However, in January 2006 the case Field v. Google addressed the issue of how 512(b) applied to the Google Cache. The case, which dealt with Google’s caching of a Web site owned by Blake Field, determined, among other things, that Google Cache met the requirements under 512(b) and, thus, was not liable for any infringement.

Though the ruling was unclear as to why, what was clear is that Google Cache was found to meet the requirements of the law to the satisfaction of that judge. However, when one takes a look at those exact requirements, it becomes pretty clear that Google and a handful of similar services will likely be the only ones to measure up.

(Some of) The Requirements

Section 512(b) lays down some pretty strict rules for what is and is not considered a caching service. Many of these rules will likely cause great problems for anyone trying to abuse the section to escape copyright infringement.

Automation: The law is very clear that, to be deemed a caching service, the storage has to be through an “automatic technical process”. If there is any input from the owner of the site, the process is not automatic. Google answers requests from users and builds its cache that way. Selecting keywords or manually picking RSS feeds makes the process no longer automated.

Follows Acceptable Practices: Section 512(b) also requires that the caching be done “in accordance with a generally accepted industry standard data communications protocol”. This means specifically following robots.txt and META HTML rules for caching. Also, since scraping and reposting RSS feeds on public Web sites is generally not considered an “accepted industry standard”, it seems unlikely that it would pass this test.
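
As a rough illustration of what following those rules involves, the sketch below checks a robots.txt file and a page's robots META tag before anything is cached. It uses only Python's standard library. The function names and the `ExampleCacheBot` user agent are hypothetical, and passing these checks is an engineering courtesy, not a legal test of 512(b).

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


class _RobotsMetaFinder(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for token in attr.get("content", "").split(","):
                self.directives.add(token.strip().lower())


def page_forbids_caching(html: str) -> bool:
    """Return True if the page carries a robots META 'noarchive' directive."""
    finder = _RobotsMetaFinder()
    finder.feed(html)
    return "noarchive" in finder.directives


# Hypothetical sample inputs, invented for this example.
ROBOTS = "User-agent: *\nDisallow: /private/\n"
PAGE = '<html><head><meta name="robots" content="noarchive"></head></html>'

private_ok = allowed_by_robots(ROBOTS, "ExampleCacheBot", "http://example.com/private/post.html")
public_ok = allowed_by_robots(ROBOTS, "ExampleCacheBot", "http://example.com/public/post.html")
no_cache = page_forbids_caching(PAGE)
```

A service that skips checks like these, or ignores their results, would have a hard time arguing that it follows accepted practice.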

No Modifications: The law also requires that caching services display their cache “without modification to its content from the manner in which the material was transmitted”. Since the content of a page is more than just the text (it also includes the images, advertisements and formatting), sites such as Omgili will likely have problems with this requirement. Additionally, sites that remove copyright and/or author information will likely run afoul of this aspect of the law.

Temporary: Caches are not meant to be permanent. A cache that is not regularly updated and purged will very likely run afoul of the law.
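
In engineering terms, “regularly updated and purged” usually means every cached copy carries a timestamp and is dropped once it gets too old. The sketch below shows one way to do that. The `ExpiringCache` class is hypothetical, and the one-hour limit is an arbitrary assumption chosen for the example; the law does not name a specific figure.

```python
import time
from typing import Dict, Optional, Tuple


class ExpiringCache:
    """Toy cache whose entries go stale after max_age_seconds (hypothetical)."""

    def __init__(self, max_age_seconds: float) -> None:
        self.max_age = max_age_seconds
        self._store: Dict[str, Tuple[float, bytes]] = {}

    def put(self, url: str, body: bytes, now: Optional[float] = None) -> None:
        stored_at = time.time() if now is None else now
        self._store[url] = (stored_at, body)

    def get(self, url: str, now: Optional[float] = None) -> Optional[bytes]:
        current = time.time() if now is None else now
        entry = self._store.get(url)
        if entry is None:
            return None
        stored_at, body = entry
        if current - stored_at > self.max_age:
            # Stale copy: drop it so it has to be fetched again from the source.
            del self._store[url]
            return None
        return body

    def purge(self, now: Optional[float] = None) -> int:
        """Remove every stale entry and return how many were removed."""
        current = time.time() if now is None else now
        stale = [u for u, (t, _) in self._store.items() if current - t > self.max_age]
        for url in stale:
            del self._store[url]
        return len(stale)


# Assumption: a one-hour lifetime, picked only for illustration.
pages = ExpiringCache(max_age_seconds=3600)
pages.put("http://example.com/a", b"A", now=0)
pages.put("http://example.com/b", b"B", now=0)
fresh = pages.get("http://example.com/a", now=10)    # still within the hour
stale = pages.get("http://example.com/a", now=4000)  # expired, so dropped
removed = pages.purge(now=4000)                       # clears the other stale copy
```

The explicit `now` argument exists only so the example can be run and checked without waiting an hour. A scraper that republishes content indefinitely has nothing resembling this step.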

The long and short of it is that meeting the requirements in Section 512(b) to qualify as a caching service is a very difficult challenge that only a handful will be able to meet. Though others might be able to attempt fair use arguments on other grounds, such as the transformative use of the Web Archive, 512(b) will only protect a select group of caching services.

In fact, most of the controversial “caching services” of recent memory run afoul of at least one, if not more, elements of the law.


Though a gross overexpansion of 512(b) would be very worrisome, it doesn’t appear to be a threat at this time. Overall, it is a very narrow exemption with very little room to move.

Still, I have little doubt that people will try to push this exemption as far as they can, at least until a judge pushes back. “What I’m doing is no different than Google” is already one of the most common, and flawed, excuses for scrapers and spam bloggers.

There’s little doubt that others will try that excuse in court, perhaps with Section 512(b) as part of the reasoning. But the odds of success, given the requirements of the law, seem very slim.

