The past forty-eight hours or so have seen an uproar of anger over Spanish startup Bitacle. The uproar has been so loud that it has produced both a WordPress plugin to stop Bitacle (see English description below) and a “stop” blog on the subject.
Bitacle, for its part, seems largely defiant. Though its site has been down for much of the weekend and it has made some minor changes to address some of the easier problems, the larger issues remain unaddressed.
If anything, this has added more fuel to the controversy, generating a steady stream of angry posts and other forms of public backlash.
Odds are that this will get much worse before it gets any better.
What is Bitacle
Bitacle is a Spanish startup building a personal home page service. Much like other sites of that kind, it contains a built-in search engine for sorting through blogs, Web sites and more. One of the tabs on the search feature points to a section called “Aggregates”. A search there pulls results from blog entries, much like the regular blog search, but the results lead not to the original site but to cached copies on the Bitacle server.
It’s those cached copies that have generated the bulk of the controversy. Originally, the cached copies offered the content under a Creative Commons license that permitted commercial use, gave no clear attribution to the author, provided no permanent link back to the original piece and were surrounded by ads. Though the ads remain and no clear attribution has been offered, the CC license has been removed and a link to the original work has been added. There is even a comment form on each piece that lets readers discuss the entry without visiting the original site.
This has done little to abate the backlash and, as more and more bloggers find their material posted on Bitacle, the chorus has grown louder. The people behind the site have been called thieves and a massive splogger, and have been accused of having blown blogger ethics, among much else.
Bitacle, despite making tweaks to its site, has been oddly quiet through the whole ordeal. Outside of a comment to a July post on the issue, there have been few, if any, public statements from the company.
Emails sent to the company, including a letter of my own and messages to the “official” addresses listed in the comment above, have gone largely unanswered, and the issue seems far from resolved.
Comparisons to Other Services
In their comment, Bitacle made comparisons between their service and other, seemingly similar, services such as Google, Technorati and online RSS readers. However, on close scrutiny, those comparisons fall apart.
To be clear, Bitacle is not the first service to scrape and reuse RSS feeds. Online blog readers such as Bloglines have long displayed full RSS feeds to users who subscribe to them. However, in those cases, the full feeds are not shown to the public or surrounded by ads. Furthermore, results in the public search engines lead directly to the entry itself, not a cached copy.
Other blog search engines, such as Technorati and Icerocket, have long scraped feeds for their results, but have displayed only small clips publicly, directing users to the entry itself instead.
Finally, archiving systems such as the Coral Cache and the Web Archive cache copies of Web pages, but they do so without displaying ads and with the intent of serving some other public good, such as maintaining sites that go down or keeping a historical record. They also preserve the entire page, not just the content, reducing confusion over ownership.
Furthermore, all of those services obey the standards of the robots.txt file, something that Bitacle reportedly does not.
So, even as Bitacle and others have tried to compare Bitacle.org with these more benign services, subtle but important differences separate it from them.
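To make the robots.txt point concrete, here is a minimal sketch of the check a well-behaved crawler performs before fetching a page or feed. It uses Python's standard `urllib.robotparser` module. The robots.txt contents, the `ExampleAggregator` bot name and the example.com URLs are all hypothetical, chosen only for illustration. They are not Bitacle's real user agent or anyone's actual rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a blog. The bot name is illustrative only.
ROBOTS_TXT = """\
User-agent: ExampleAggregator
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    feed = "https://example.com/feed/"
    for agent in ("ExampleAggregator", "SomeOtherBot"):
        verdict = "may" if is_allowed(ROBOTS_TXT, agent, feed) else "may not"
        print(f"{agent} {verdict} fetch {feed}")
```

A crawler that respects the standard runs a check like this and skips anything that is disallowed. Note that robots.txt is purely advisory: it only works when the crawler chooses to honor it.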
The Legality of Bitacle
According to its whois information, Bitacle is located in Spain. Though Spain has significantly different copyright legislation than the U.S., Bitacle also takes advantage of U.S.-based services that do have to operate under U.S. law.
However, in either venue, it seems likely that Bitacle would face some stern legal challenges.
A fair use argument, at least in an American court, would be an uphill battle for Bitacle. At least three of the four factors seem to tilt heavily against them. The character of the use is decidedly commercial, the amount used is the work in its entirety and the effect upon the potential market is severe, especially considering that Bitacle offers their own comment form and gives the viewer no reason to look at the original work.
But as stated previously, the legality of scraping depends less on copyright issues and more on other matters, including trespass to chattels, TOS violations and computer fraud/abuse. This is why unwanted scraping has been called “a legal minefield”.
Still, the copyright challenges to Bitacle seem to be the strongest and will likely be the key to stopping their behavior.
If you view Bitacle as a problem and want to keep them from accessing your content, there are several steps that you can take.
- Consider truncating your RSS feed: Though shortened feeds are not good for all Webmasters, some may find it a practical solution. It won’t stop Bitacle’s scraping, but it will encourage visitors to click the link to go to your page.
- Install the Anti-Bitacle Plugin: WordPress users can install the specially-designed anti-Bitacle plugin. Some have also said that the WordPress Anti-Leech plugin is effective at stopping Bitacle.
- Use .htaccess to Block Them: One could, theoretically, use the .htaccess file to block the IP addresses associated with Bitacle.
- Use Feedburner to Protect Your Feed Copyright: Feedburner may not be able to prevent the scraping, but it can add copyright footers and other “flares” to the feed to direct visitors back to your original site. It can also detect Bitacle’s scraping and alert you to a potential problem even before your work appears on the site.
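Two of the steps above, truncating the feed and blocking by IP address, can be sketched in a few lines of Python. This is only an illustration under stated assumptions. The function names are made up for this example, the addresses are reserved documentation ranges rather than Bitacle's real ones, the truncation works on plain text rather than HTML, and the generated rules assume Apache 2.4's `Require` syntax (older Apache versions use `Order`/`Deny from` directives instead).

```python
import ipaddress


def truncate_summary(text: str, max_words: int = 50) -> str:
    """Shorten a plain-text feed entry to its first max_words words."""
    words = text.split()
    if len(words) <= max_words:
        return " ".join(words)
    return " ".join(words[:max_words]) + " [...]"


def htaccess_deny_rules(addresses) -> str:
    """Build an Apache 2.4 .htaccess block that refuses the given IPs.

    Each item may be a single address or a CIDR range. Invalid input
    raises ValueError instead of producing a broken rule.
    """
    lines = ["<RequireAll>", "    Require all granted"]
    for raw in addresses:
        network = ipaddress.ip_network(raw, strict=False)
        if network.num_addresses == 1:
            target = str(network.network_address)  # single host, no /32
        else:
            target = str(network)
        lines.append(f"    Require not ip {target}")
    lines.append("</RequireAll>")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder addresses from the reserved documentation ranges.
    print(htaccess_deny_rules(["192.0.2.10", "198.51.100.0/24"]))
    print(truncate_summary("A long post body would go here.", max_words=4))
```

Blocking by IP only works for as long as the scraper keeps using the same addresses, which is part of why the original advice calls this approach theoretical.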
Getting Content Removed
Of course, stopping future infringement does little to appease bloggers who have already had content lifted. Though stopping Bitacle in the future is relatively easy, putting an end to what has already taken place is much more difficult.
A great deal of the frustration surrounding Bitacle is not just that they have been uncooperative in copyright-related matters, but that, as a Spanish company, they are immune to the traditional means of resolution. DMCA notices are meaningless to them.
However, there may still be ways to get content removed.
- Consider Contacting the Host: Since Spain is part of the EU, and the EU has a recognized notice and takedown procedure, it may be worth the time to notify their host of their actions. However, the host is located in Spain and its site is, predictably enough, in Spanish. Any correspondence with them would likely have to be in Spanish as well. There is room here for someone far better at the language than I am to track down the contact information and, perhaps, pen a stock letter where only the links need to be changed.
- Send DMCA to Adsense: Since Bitacle uses Adsense, it may be worthwhile to send a DMCA notice to Adsense. What effect this will have is unclear, but it is clear that simply clicking on the “Ads by Google” link will not do any good.
- DMCA the Search Engines: There are reports that Bitacle has had over one million pages indexed in Google, many of them taken from other blogs. Though sending DMCA notices to Google only gets a few pages removed at a time, if Google gets enough they are likely to ban the whole domain. The same goes for Yahoo and MSN.
Though only the host or Bitacle can definitively remove the content, the other methods will either pressure Bitacle to play by the rules or reduce the negative impact that their misuse has on the work.
In the end, it seems unlikely that Bitacle will survive this backlash without changing its ways. The Aggregates section of their site seems so small and unimportant to their larger plan, creating a personal home page, that it would be foolish to keep it as is.
Odds are that the tweaks and changes they’ve been making quietly are just a part of what is coming. However, that doesn’t mean that bloggers can or should let up on the pressure.
But even if Bitacle is able to maintain its course and avoid legal responsibility for what it’s doing, it will be very hard for it to promote a Web service, one largely targeted at bloggers, when a large percentage of them feel that Bitacle deserves a special place in Hell.
Despite this history, if they are able to move past these issues and repair their relationship with the blogging community, it’s very likely that they could carve out a special niche in the crowded personal home page market.
Otherwise, it’s only a matter of time before things go from bad to worse, something I seriously doubt Bitacle can survive.