In a recent interview on b-l-o-g-g-e-r.com, Denis de Bernardy commented on the issue of content theft and spam blogs.
Bernardy, the creator of Semiologic Pro, a pack of themes and plugins for WordPress targeted at business users, said that the problem was due to RSS feeds and recommended what many would call drastic action.
“There is absolutely nothing you can do about that (content theft)…. Except putting your content outside the RSS feed which is what I usually recommend my customers to do.”
In short, Bernardy recommended putting all of your blog’s important content on static pages outside of the feed and using the feed primarily to point to those entries and announce changes to them.
Though the idea is appealing in many ways due to its simplicity, it runs counter to how most blogs operate, what most RSS readers expect and what search engines seem to want.
However, its worst crime is that it won’t stop content theft and may, in the long run, actually make the situation worse.
The logic itself is fairly straightforward: if the content is not in the RSS feed, it cannot be scraped. If it isn’t scraped, spam bloggers won’t be able to pick it up and put it on their sites. This is the same logic behind using partial feeds.
Many blogs already use this technique to some degree, though not necessarily for content theft protection. This site, for example, has a Stopping Internet Plagiarism section that remains static but is updated from time to time.
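To make the difference concrete, here is a minimal Python sketch of what a scraper can copy from a single feed item under three approaches: a full feed, a partial feed and the pointer-only feed that Bernardy's advice implies. The function names, the mode labels and the 40-word cutoff are my own assumptions for illustration; they do not come from the interview or from any particular blogging platform.

```python
import xml.etree.ElementTree as ET


def build_item(title, link, body, mode="full", summary_words=40):
    """Build an RSS <item>; the description depends on the (hypothetical) feed mode."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "link").text = link

    if mode == "full":
        # Full feed: the entire post travels with the feed.
        description = body
    elif mode == "partial":
        # Partial feed: only the first few words are exposed.
        words = body.split()
        description = " ".join(words[:summary_words])
        if len(words) > summary_words:
            description += " [...]"
    elif mode == "pointer":
        # Pointer-only feed: announce the static page, include none of its text.
        description = f"New or updated page: {link}"
    else:
        raise ValueError(f"unknown feed mode: {mode}")

    ET.SubElement(item, "description").text = description
    return item


def scrape(item):
    """Return the text a feed scraper could lift from one item."""
    return item.findtext("description")
```

Calling scrape(build_item(title, link, body, "pointer")) returns only the announcement line, while the "full" mode hands over the whole body. In every mode, though, whatever does appear in the description is still available to a scraper.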
This content is often called cornerstone content and frequently plays a key role in a blog, even one that is mostly dynamic.
However, even in those cases, the vast majority of the content of the site is still placed into actual blog entries and is picked up in the RSS feed.
For most blogs, using this method is out of the question. It would require either creating a huge number of static pages or simply writing a great deal less.
Fortunately, there is little reason to actually consider taking these steps.
Flaws in the Thinking
Contrary to what Bernardy said in his interview, putting all of your important content into static pages does not make it “theft proof”.
For one, it doesn’t prevent RSS scraping. Anything that is actually posted in the RSS feed, be it a pointer to the static content or some less important content, can and will still be scraped. Since search engines can’t easily tell important content from unimportant content within a single site, especially when the unimportant works are on the home page and the cornerstone works are buried, this can still have consequences for your search results.
Furthermore, even if you include a link to your content, many scrapers strip out HTML when they republish. Even if they don’t, the link doesn’t point back to your original post, so there is little guarantee the search engine will recognize yours as the authentic copy. Search engines have done exceptionally badly at this in the past.
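As a rough illustration of why an embedded link offers so little protection, the sketch below uses Python's standard html.parser module to do what a tag-stripping scraper might do to a feed entry. I don't know how any particular scraper is actually written; the class and function names here are hypothetical.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect only the text nodes, discarding every tag and attribute."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(markup):
    """Return the markup with all HTML removed, as a tag-stripping scraper might."""
    parser = TextOnly()
    parser.feed(markup)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)


entry = 'Read the full guide at <a href="https://example.com/guide">my site</a>.'
print(strip_html(entry))  # prints: Read the full guide at my site.
```

The anchor text survives, but the URL lived in the href attribute and disappears with the tag, so the republished copy carries no link back to the source at all.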
Second, though it may prevent the content from being scraped by spam bloggers, it increases the likelihood that a human plagiarist will discover it. Where a blog post might have a viable life-span of a few days, static content can be discovered months or years later and be lifted by a plagiarist taking a more hands-on approach. These cases, though less common, are more dangerous in terms of author reputation, search engine confusion and reader siphoning.
Finally, as much as readers have rebelled over partial RSS feeds, this would seem to be much worse, providing absolutely no content from the article itself.
Following these suggestions, especially to their extreme, would seem to cripple your blog without providing much additional protection for your content.
It is a poor trade-off that I doubt many will accept.
My Biggest Issue
What I find most worrisome about this interview, though, is not the suggestion that bloggers bury their content on static pages, but rather the notion that, when it comes to content theft, “there is absolutely nothing you can do about that.”
That is, of course, completely incorrect.
If that were true, this site would not exist, I would not have been able to shut down over 600 plagiarists in the past few years and the Web would be a very different place.
Fortunately, there are laws, including the DMCA, that make fighting content theft practical, and there are plugins such as CopyFeed and Antileech that enable you to protect your feed from scraping, be it full or partial.
If, after reading this site, you are worried about content theft and feel you have no other options than to try this technique, consider switching to a partial feed first.
Doing so will offer you at least the same amount of protection, if not more, and will also provide the same reciprocal links in the event of scraping. Furthermore, it won’t require you to change the structure of your blog, will upset readers less and follows blogging conventions much more closely.
Personally, given the tools available and the backlash against partial feeds, I think even that step is often misguided. However, it is certainly more reasonable than turning a blog into a predominantly static Web site just for the sake of avoiding content theft.
Simply put, extreme measures are only needed when less extreme ones don’t work. Fortunately, this doesn’t seem to be the case here.