Back in March, Cloudflare introduced a new product, ScrapeShield, that is designed to detect when scrapers have grabbed your content and show where else it appears.
Initially, I was very skeptical of the idea. There have been many systems out there designed to detect scraping and other copying and none of them have really worked. The reason is that all of the services have required the scraper to grab a beacon of some type, usually a small image, that is then tracked back to where it is republished.
While this sounds great, the problem is that scrapers are often very selective about what they grab and republish, routinely stripping out images or skipping them altogether. If the trackers aren’t scraped or aren’t republished, the copying goes undetected. Generally, it’s much easier to block scrapers outright and to track text content through the use of statistically improbable phrases or, in other cases, digital fingerprints.
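To make the fingerprinting idea a bit more concrete, here is a minimal Python sketch of one common way to do it: break the text into overlapping runs of words ("shingles"), hash each one, and check how many of the original's hashes turn up in a suspect page. This is only an illustration under my own assumptions, not how ScrapeShield or any particular service works; the function names, the five-word shingle size and the truncated SHA-1 hash are all choices made for the example.

```python
import hashlib
import re


def shingles(text, k=5):
    """Return the set of overlapping k-word phrases in text (hypothetical helper)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    # Texts shorter than k words yield an empty set.
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def fingerprint(text, k=5):
    """Hash each shingle so only short digests need to be stored and compared."""
    return {
        hashlib.sha1(s.encode("utf-8")).hexdigest()[:16]
        for s in shingles(text, k)
    }


def overlap(original, suspect, k=5):
    """Fraction of the original's fingerprints that also appear in the suspect text."""
    a = fingerprint(original, k)
    b = fingerprint(suspect, k)
    if not a:
        return 0.0
    return len(a & b) / len(a)
```

Because the comparison relies only on the words themselves, it still works when a scraper strips out every image and tracking beacon. A score near 1.0 suggests the suspect page contains most of the original text, though short or heavily quoted passages can produce false positives.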
Still, I wanted to give ScrapeShield a fair chance, so I set up Cloudflare on my site and ended up letting it run for several months (I had intended to test it for only a few weeks, but hurricanes, illnesses and other factors pushed back the testing).
So what did I learn after several months of using ScrapeShield? Not a whole lot. In fact, I didn’t see a great deal that I couldn’t easily have done myself.