On Wednesday of last week, the 11th Circuit Court of Appeals upheld a lower court verdict finding that a content aggregator had infringed a medical blog’s copyright when it scraped the site’s RSS feed and included the content in a subscription service sold to academic libraries.
Though the decision hasn’t attracted a great deal of media attention, it addresses and answers an incredibly old and divisive issue that has been pitting aggregators against authors for over 15 years.
However, it’s also a ruling that, for most people, comes a bit late. RSS scraping fell out of fashion with spammers long ago and, with tech solutions such as RSS truncation and modification available, it’s not something that has been at the forefront of most authors’ minds in some time.
That said, it’s nice to bring closure to such a long-running issue so it makes sense to look at what will likely be the final word on RSS feeds and the alleged “implied license” thereof.
Understanding the Case
The case pits ThriveAP, a blog publisher formerly known as MidlevelU, against the content aggregator ACI Information Group, also known as Newstex.
ThriveAP sued ACI in 2018 over alleged copyright infringement. It claimed that ACI was republishing its freely available blog articles in a database of academic blogs and abstracts named the Scholarly Blog Index. Access to that index was then sold to various academic libraries.
At the lower court, the case went to a four-day trial. There, a jury ruled in favor of ThriveAP, finding that the site had proved it held a valid copyright on some 43 articles and that ACI had willfully infringed them. The jury awarded ThriveAP some $202,500 in statutory damages.
ACI appealed the case to the 11th Circuit and argued that, by distributing its content over an RSS feed without restrictions, ThriveAP had granted “implied permission to copy and publish that content on another website.”
However, the court roundly rejected that argument, saying that ACI failed to introduce evidence that other websites republished content in that manner, that the practice was customarily accepted or that ThriveAP was aware of the copying.
ACI also attempted to argue that what it was doing was no different from Google’s index. However, the Appeals Court rejected that as well, saying that an implied license to use a web crawler does not necessarily grant permission to scrape RSS feeds.
The court also rejected a fair use argument, saying that “reasonable minds could differ” on the issue.
As such, the court upheld the lower court decision, saying that the jury’s verdict was reasonable.
Implied Licenses and RSS Scraping
The question of implied licenses and RSS scraping is one of the oldest topics I’ve focused on here. Articles on the topic go back to 2005 but the first directly on the issue, Why RSS Scraping Isn’t OK, was published in August 2006, nearly 15 years ago.
At that time, blogging was still relatively new and RSS feeds were seen as an exciting (relatively) new technology for distribution. However, the boundaries of what that distribution entailed were not fully hashed out and there weren’t really norms.
Many argued that publishing an RSS feed, even if you didn’t realize you were doing it, granted an implied license for others to take that content and republish it elsewhere.
This type of RSS scraping became an extremely popular tool for creating spam sites. The idea was simple: grab a list of RSS feeds, scrape them and republish them (sometimes using an article spinner to make them seem original) as a means to fill thousands of pages without writing a single word.
However, the golden age of this kind of spamming was short-lived, lasting only from 2005 to early 2011. It was in February 2011 that Google released its Panda/Farmer updates, which were designed to target and reduce the amount of low-quality content in its index.
Suddenly, a slew of techniques such as content farms, article marketing and RSS scraping fell by the wayside as largely outdated tools for search engine optimization. Though new spammy approaches did follow, RSS scraping never really returned as a mass problem.
This was doubly true considering most sites that objected to RSS scraping simply adopted technical solutions such as blocking scrapers, truncating RSS feeds and appending strong copyright notices.
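To make two of those technical measures concrete, here is a minimal Python sketch of truncating feed items and appending a copyright notice. It is an illustration only, not how any particular blogging platform does it. The function name `truncate_feed`, the 50-word default and the notice text are all hypothetical, and the sketch assumes a plain RSS 2.0 feed whose `<description>` elements hold plain text rather than HTML.

```python
import xml.etree.ElementTree as ET


def truncate_feed(rss_xml: str, max_words: int = 50,
                  notice: str = "Copyright the original author. All rights reserved.") -> str:
    """Return a copy of an RSS 2.0 feed with shortened, notice-stamped items.

    Hypothetical helper: assumes un-namespaced RSS 2.0 and plain-text
    <description> content (no embedded HTML).
    """
    root = ET.fromstring(rss_xml)
    for item in root.iter("item"):
        desc = item.find("description")
        if desc is None or not desc.text:
            continue  # nothing to truncate for this item
        words = desc.text.split()
        excerpt = " ".join(words[:max_words])
        if len(words) > max_words:
            excerpt += " [...]"  # mark that the text was cut short
        # Append the notice so it travels with any scraped copy.
        desc.text = f"{excerpt} {notice}".strip()
    return ET.tostring(root, encoding="unicode")
```

A site would run something like this over its feed before serving it, so that a scraper only ever receives the excerpt and the notice rather than the full article. Blocking scrapers outright is a separate, server-side measure and is not shown here.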
As a result, RSS scraping remained a major controversy on the internet but rarely reached the courts, making this case one of the first times the issue has come this far.
The Non-History of RSS Scraping in Courts
For all the ink spilled over RSS scraping, little legal action came of it. Simply put, legal action was rarely practical and there have long been technological solutions that are easier and cheaper to apply.
That said, there was one notable case involving RSS usage, even if not RSS scraping.
In December 2008, GateHouse Media, a company that owns hundreds of newspapers, filed a lawsuit against the New York Times, the owner of the Boston Globe, for, among other things, copyright and trademark infringement.
The Boston Globe, as part of its Boston.com website, was publishing automated links and synopses of other websites’ content via RSS as part of its “Your Town” sites. Among the sources were the GateHouse-owned Wicked Local websites.
To be clear, the case only involved headlines, links and snippets. However, it still could have been enlightening on the issue of RSS use. Unfortunately for those following the case, it was settled before trial, with the New York Times agreeing to stop using GateHouse-owned feeds and GateHouse promising to put technical restrictions in place on its end.
That said, one case did make it to a decision but not in the United States. In February 2013 an Israeli court handed down a ruling that said publishing a full-text RSS feed was an implied license to publish that content elsewhere.
The Israeli case dealt with a webmaster, Tomer Ofaldorf, and News1, an Israeli news agency. News1 grabbed content from Ofaldorf’s feed and republished it on their site. Though the content was removed after he complained, he filed a copyright infringement lawsuit.
However, a judge ruled that his use of full RSS feeds did grant an implied license for republication and, considering that News1 stopped after he objected, no infringement had taken place.
Up until the ThriveAP case, this was the best-known court ruling on the subject. However, that is set to change.
To be clear, this is mostly a dead issue. RSS scraping hasn’t been much of a problem in over a decade, and the situation between ThriveAP and ACI is highly unusual and largely specific to them (and to the other sites ACI has been scraping).
However, the fact that this case is so unusual is what makes it interesting. Though the result isn’t shocking or particularly impactful in a practical sense, it brings some closure to a debate that has been going on for 15 years, even if it hasn’t been relevant for 10 of them.
RSS scraping may not be en vogue but the legal issues it created are still very real and may impact future technologies.
As such, it was still important to bring some resolution to this question, even if it’s about a decade late for most bloggers.