CAPTCHAs and the DMCA

By Jonathan Bailey • Nov 14th, 2007 • Category: Articles, DMCA, Legal Issues, Prevention

Yesterday I received an email from Ben Maurer, one of the engineers for reCAPTCHA.

In addition to responding to a comment on a post from last week, he alerted me to a copyright case involving Ticketmaster (TM) and RMG Technologies. According to the complaint and subsequent injunction (embedded below), RMG produced an application that allowed users to bypass a CAPTCHA system on TM’s site, thus enabling users to easily purchase thousands of tickets before actual humans could even get into the system.

According to the judge, this likely constituted not only an infringement of TM’s copyright, a breach of contract and a violation of the Computer Fraud and Abuse Act, but also a violation of the DMCA anti-circumvention rules.

This ruling, if it actually stands up through the entire legal process, could have major implications for Webmasters who rely on CAPTCHA technology, including this one, and could introduce new ways to protect content on the Web, especially against automated tools such as scrapers.

Background

The anti-circumvention provisions of the DMCA are, with little doubt, the most controversial portions of the law. They are the portions that make it illegal to circumvent technological protections in order to gain access to copyrighted material, as well as to provide tools that circumvent either access or copy controls.

These rules have created a tremendous backlash due to their effect on fair use. Since it is a crime merely to produce tools that can circumvent copy protection schemes, copyright holders can lock down a work and prevent all use of the content, even use that would have likely been deemed fair if taken to court on its own merits.

However, this case puts these provisions in something of a new light. According to the injunction, the CAPTCHA that TM used to protect its purchase pages constitutes an access control mechanism and the page behind it is a copyrighted work. Thus, RMG’s software, which was designed to circumvent that CAPTCHA, amounts to a violation of the DMCA and, looking at the ruling, there seems to be good reason to think that this logic will hold up.

In short, CAPTCHAs might not just be a form of protection against spammers and bots but might also themselves be protected under the DMCA.

A Tricky Application

CAPTCHAs are one of the most popular forms of site protection. They are used by everyone from Google to brand new blogs. Obviously, any additional legal protection CAPTCHAs can get will be a very big deal.

However, the TM case is a fairly unusual one. Most bloggers use CAPTCHAs to protect their comment forms or emails, not multi-million dollar purchasing systems. To determine where a more typical use of CAPTCHA might fit in with the DMCA, we first have to look at what one would have to prove to make such a claim.

  1. Ownership of a valid copyright on a work.
  2. That is effectively controlled by a technological measure, which has been circumvented
  3. That third parties can now access.
  4. That those third parties are unauthorized in their access
  5. That the access infringes a right protected under copyright law.
  6. And that the defendant made the product primarily for the purpose of circumvention, made it available despite limited commercial significance or promoted it as a tool for circumvention.

For most bloggers, the first two requirements are the greatest challenge. Though we use CAPTCHAs to protect comment forms and even our email addresses, neither of those things is copyrightable. One might claim the comment backend as a copyrighted work, similar to Ticketmaster, but very few bloggers create their own platform, meaning they don’t hold copyright in the code they use. Besides, it would be hard to call these files “effectively controlled” as most of them can be accessed directly from the Web.

Even if the blogger protects an email address with a CAPTCHA, that is just information and is not considered copyrightable.

The only exception would be if a blogger actually used the CAPTCHA to protect a copyrighted work. For example, a CAPTCHA might be used to protect a large MP3 file from leeching, and another Webmaster might implement a service that lets their users bypass the CAPTCHA and download the file directly.

These situations can and do happen, but are exceptionally rare. Fortunately, there are other laws, many of which we talked about when discussing scraping, that better fit this kind of abuse.

Still, there might be a place for these kinds of tactics, just not with your average blogger.

The Big Guns

The question becomes: who could make the best use of this ruling? It would have to be someone who met the following criteria:

  1. Used CAPTCHAs heavily
  2. Protected copyrighted work they had ownership of with them
  3. Had the resources to target those who build such tools

Clearly, the list is short, but the obvious answers are any of the big three: Google, Yahoo or Microsoft.

Of those three, Google fits best as they make very heavy use of CAPTCHAs, especially on Blogspot, are frequent targets for circumvention and seem to be struggling to stay ahead of the software. However, it seems unlikely that they would use the law in this manner considering their hostile attitude toward the DMCA in general.

However, any other company that meets the standards could certainly benefit from this case. It seems to only be a matter of time before a blogging platform takes advantage of this ruling in order to go after comment spammers and, possibly, scrapers.

After all, the DMCA applies not only to CAPTCHAs, but to any other technological measure used to protect copyrighted works. I can think of many hosts and Webmasters eager to take advantage of that prospect.

Conclusions

I’m no fan of the anti-circumvention provisions of the DMCA, and I want to make that clear. I also want to make it perfectly clear that this discussion is purely theoretical and academic and not an indication of a future legal strategy by any entity, including Google, reCAPTCHA or anyone else mentioned in this article. The best defense against CAPTCHA cracking remains better CAPTCHAs.

Still, even a bad law can be used for some good. Though I am no fan of walled gardens either, they are necessary sometimes. To that end, protecting the content behind a technological measure, such as a CAPTCHA, greatly increases the legal options you have should someone circumvent those protections.

However, the place this is most likely to assist bloggers and Webmasters is in the area of image and file hotlinking. If you use a technological means to prevent such hotlinking and another site circumvents those protections, there is a good chance it would be a violation of the DMCA, giving you legal ammunition above and beyond just traditional copyright claims.
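As a rough illustration of the kind of technological measure discussed above, here is a minimal sketch of a Referer-based hotlink check in Python. The function name, the allow-list and the example hostnames are all hypothetical and not taken from any particular server or plugin. Whether a check this simple would count as an effective access control under the DMCA is a legal question the sketch does not settle; it only shows the mechanism.

```python
from urllib.parse import urlparse

# Hypothetical allow-list: the hosts permitted to embed your images and files.
ALLOWED_HOSTS = {"example.com", "www.example.com"}


def is_allowed_referer(referer, allowed_hosts=ALLOWED_HOSTS, allow_empty=True):
    """Return True if a request with this Referer header should be served.

    referer:     the raw Referer header value ("" or None if absent).
    allow_empty: many browsers and privacy tools omit the header entirely,
                 so a missing Referer is accepted by default (an assumption
                 of this sketch; stricter sites may reject it).
    """
    if not referer:
        return allow_empty
    # hostname is None when the value is not a parseable absolute URL.
    host = urlparse(referer).hostname
    return host is not None and host.lower() in allowed_hosts
```

A server would call this with the incoming Referer header before serving an image or file and return an error, or a placeholder image, when it comes back False. Referer headers are easy to forge, so stronger measures such as signed, expiring download links may be worth considering.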

In short, if you are going to restrict access to your content for any reason, make sure to protect it with technology that would have to be circumvented to gain access to it. Not only will this prevent a great deal of infringement, it could also greatly improve your legal options should an infringement occur.

I might disagree with that decision personally, but there is little doubt that, legally, it could open up some new doors.

Jonathan Bailey is The Webmaster and author of Plagiarism Today, which he founded in 2005 as a way to help Webmasters going through content theft problems get accurate information and stay up to date on the rapidly-changing field. He is also a consultant to Webmasters and companies to help them devise practical content protection strategies and develop good copyright policies.

7 Responses »

  1. Hi Jonathan,

    Thanks again for the great post and very detailed emails on this topic.

    I think your point that making better CAPTCHAs is the best way to prevent CAPTCHA breaking is worth repeating. In this specific case, the website owner got lucky and was able to trace down the people responsible for abuse on their website. That ability was likely due to the fact that a financial transaction occurred, which gave them more traditional tools to track down the people behind this scheme.

    Most types of abuse would be much harder to track down to actual people, and much (if not the vast majority) of abuse comes from outside of the US making it hard to apply US law to these cases.

    I think the tech world has learned that relying on rules and regulations to protect the internet from “bad people” is a lost cause. While having these rules helps discourage and creates punishments for those who cause harm on the Internet, at the end of the day, only technical means can protect users from this harm.

    As a final point, I think this case also highlights the fact that no abuse protection system can be set-and-forget. This is one of the reasons I think reCAPTCHA is so great — our team monitors for signs of abuse and can take action to protect our users. In a sense, using reCAPTCHA gives us, the reCAPTCHA team, the burden of keeping the system up to date.

  2. Ben,

    Excellent points all around there. I can’t find anything to take issue with, which is rare for me!

    It is very important to remember that changing technology is easier than trying to change the law. It takes only a short while to upgrade a CAPTCHA system, well, at least when you compare it to the years and years it can take to track the person down who cracked it and bring them to court.

    I also want to stress that I love reCAPTCHA in part for some of the reasons you stated. It makes much more sense than installing a plugin that has to be upgraded every few months as new cracks arise.

    I was very skeptical about using reCAPTCHA on my site, or any CAPTCHA system, but am glad I did.

    I have zero regrets and nothing but praise. Your system is the first real deal bloggers have had available.

  3. At the risk of repeating myself, one strives to make the system idiot-proof, but they keep making better idiots :)

    I do agree that making better CAPTCHAs is the answer, because no abuse protection system is ’set-and-forget’. One needs to be constantly vigilant.

  4. Recliners:

    Agreed on all points. Fortunately though, as long as the nice folks at reCAPTCHA are monitoring the situation, there is little need for me to follow it.

    Someone has to stay vigilant, just not always me. Doesn’t mean my eyes aren’t open for the next big move in this area though.

  5. [...] bypassing CAPTCHAs can be a violation of the DMCA anti-circumvention laws. This is really interesting because you could technically go [...]

  6. “I think your point that making better CAPTCHAs is the best way to prevent CAPTCHA breaking is worth repeating.”

    As an engineer for the reCaptcha project I would have hoped you thought differently. Anyone familiar with Fourier transforms and hidden Markov models already has your audio cracked. Why don’t you guys just simply remove the captcha portion and analyze the traffic itself? You’ve already got enough adoption to make it useful.

    On to the topic…

    The unfortunate thing about this story is the abuse and misinterpretation of the word “copy”. The act of viewing a website, and thus containing the website in your computer’s RAM, should absolutely not, under any circumstances, be considered willfully copying the content. This *entire* case hinges on the fact that this misinterpretation, unfortunately, has precedent in the courts.

    If this is considered copying the content, I invite everyone to rally up and begin a class action lawsuit against every ISP that uses a performance enhancing proxy who is willfully engaging in blatant copyright infringement on an enormous scale. Not that I hate the intarwebs, but it’d prove a point.

    Again, if they go that far they should be blinding us because I guarantee I can recall at least a full sentence of this website and reproduce it from short term memory alone.

    I think TicketMaster should have gone after something more along the lines of tortious interference, or “simply” place a value on the traffic from the automation, the likelihood of turning away future customers due to latency arguably caused by him, etc, etc.

    This is quite literally how we wind up with stupid laws. Some jackass convinces a judge who has no idea what the intertubes are that something way over his head is illegal.

    You wanna know the answer to the CAPTCHA problem? Establish precedent in the court of law that tortious interference of a TOS, and knowingly breaking a TOS while continuing to use the service, is illegal. Problem solved, and hey, no more making our copyright law even more insane.

  7. Nwill:

    I feel pretty certain that they do analyze the traffic too. There is no such thing as absolute security, just layers. Also, I doubt that the audio CAPTCHA is a target for cracking because it is much more CPU intensive to crack than an image one. Not as viable for cracking large numbers of CAPTCHAs, like what spammers have to do.

    I could be wrong on that though.

    I think you are confused about an element of the case though. Ticketmaster is not accusing the company of copying anything, but rather, accessing. The DMCA makes it illegal to traffic in tools that allow you to circumvent access controls or to use them.

    Caching, as you describe, is actually very well protected under the DMCA. Furthermore, there is an implied license to copy a page as is standard practice for viewing it on the Web. In short, I don’t think anyone is going to court over personal caching.

    As far as tortious interference goes, that is part of the suit as well. Like most lawsuits, it is a shotgun approach, with many different torts in it; we were just focused on the copyright and DMCA issues here.

    Breach of contract and interference are other matters that will be decided later as well.

    Hope that clarifies things!
