
Scrape Defender: The Anti-Scraping Company That Scraped Me

Recently, a client of mine asked me if I had heard of a company named Scrape Defender, which he was considering using to protect his site from an ongoing scraping issue I was helping him with. At the time I hadn’t, but there are many companies in this field that I’m not aware of, and new ones are springing up all the time.

He then told me that at least two of my articles were being used on the site without proper attribution. Surprised that an anti-scraping company would lift content from me (or anyone else), I investigated and found it to be true. Two articles, one about getting around an IP block (original) and the recent story about Trackment (original), were on their site in full text.


Though I make my content available for reuse through a Creative Commons License, the license has requirements that must be complied with and those terms are not met with these posts. There is no mention of the CC license, no link and no mention of me as the author. Just a non-working “View original” link and the domain.

However, it wasn’t just my posts being repurposed. The “Education” section of their site contains countless copied articles in its “News” listing. This includes at least two articles from Distil (previous coverage), which provides an anti-scraping content delivery network and is a direct competitor to Scrape Defender.

Worst of all, in each case the infringement was apparently done through some form of scraping. Most of the articles have errant code at the end of them, indicating that they were grabbed automatically, most likely through a middleman bot (such as Google or an RSS reader).

I reached out to Scrape Defender, contacting them on Thursday via both their press email and their contact form. The email bounced back and I have received no response from the form. I was seeking either the removal or the correct usage of my works on their site. So far, neither has happened.

For now though, I’ve decided to leave my articles up and not file a takedown notice. Not only is Amazon, their provider, very slow to respond to DMCA notices on their Web services, but I want to leave them up for now to illustrate what they are doing and let others know about this company.

Why I’m Writing This

I don’t like calling out bad companies. I much prefer highlighting the good ones and, fortunately, in the area of content protection, there are many great companies and services out there I can highlight. Most people who get into the industry do so with good intentions and, even if I don’t review their product highly, I hope they take my suggestions, improve it and make something great.

However, there are also companies and organizations that aren’t ethical and behave in a way that I cannot condone. Normally, it’s companies like Trackment that are making questionable claims or offering up dubious products.

Scrape Defender’s case, though, is especially egregious. While I don’t know anything about the product, the fact that a content protection company would scrape and infringe other people’s copyrights, including my own, says more than any review of their product could.

The hypocrisy is nothing short of gobsmacking. This behavior, to me, is akin to a firefighter starting fires or a doctor creating diseases. It completely destroys any trust that I might have had in the company.

Regardless of their product, this is a company to avoid.

Bottom Line

In addition to the scraping and infringement, the company itself doesn’t appear to be very active. The last (seemingly) original content was posted in May of 2012 and the lack of a response in over three days indicates that the site may be on autopilot.

Regardless, it is unconscionable for an anti-scraping company to have ever engaged in this behavior. Even if the company is not active, this still needs to be known, especially since one can still sign up for their services.

As for me, I’m leaving my content up for right now. Not only do I want to keep the above links active, but I also have some tests I want to use the matches for. However, I am helping others file DMCA notices against the site and will do so, for free, for anyone else who wants it.

The only difficult question I have is how my content monitors missed this site. I’ve been experimenting with various tools and, so far, all have missed these pages. I may be taking this up with the developers of those tools in the near future.

Still, I’m grateful to my client for bringing this to my attention and I wanted to pay that forward, letting others know so they can avoid this company and support ones that are both ethical and provide high-quality services.

(Note: All Scrape Defender links in this article are nofollowed.)
