Protecting Content by Using Static Pages

In a recent interview on b-l-o-g-g-e-r.com, Denis de Bernardy commented on the issue of content theft and spam blogs.

Bernardy, the creator of Semiologic Pro, a pack of themes and plugins for WordPress targeted at business users, said that the problem was due to RSS feeds and recommended what many would call drastic action:

There is absolutely nothing you can do about that (content theft)…. Except putting your content outside the RSS feed which is what I usually recommend my customers to do.

In short, Bernardy recommended putting all of your blog’s important content on static pages outside of the feed and using the feed primarily to point to those entries and announce changes to them.

Though the idea is, in many ways, appealing in its simplicity, it runs counter to how most blogs operate, what most RSS readers expect and what search engines seem to want.

However, its worst crime is that it won’t stop content theft and may, in the long run, actually make the situation worse.

The Logic

The logic itself is fairly straightforward: if the content is not in the RSS feed, it cannot be scraped. If it is not scraped, spam bloggers cannot pick it up and put it on their sites. This is the same logic behind using partial feeds.
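
To make the mechanics concrete, here is a minimal sketch of such a “pointer” feed in Python, using only the standard library; the blog name, page URLs and update notes are hypothetical placeholders. Each item carries a title, a link to a static page and a short announcement, but never the article body itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical static pages; in practice these would come from the CMS.
PAGES = [
    {"title": "Stopping Internet Plagiarism",
     "link": "https://example.com/stopping-internet-plagiarism/",
     "note": "Updated: new section on filing DMCA notices."},
]

def build_pointer_feed(pages):
    """Build an RSS 2.0 feed whose items only announce and link to
    static pages; no article body is ever exposed to feed scrapers."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "Example Blog"
    ET.SubElement(channel, "link").text = "https://example.com/"
    ET.SubElement(channel, "description").text = "Announcements only"
    for page in pages:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = page["title"]
        ET.SubElement(item, "link").text = page["link"]
        # The description is a pointer to the content, not the content.
        ET.SubElement(item, "description").text = page["note"]
    return ET.tostring(rss, encoding="unicode")

print(build_pointer_feed(PAGES))
```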

Many blogs already use this technique to some degree, though not necessarily for content theft protection. This site, for example, has its Stopping Internet Plagiarism section, which remains static but is updated from time to time.

This content is often called cornerstone content and frequently plays a key role in a blog, even one that is mostly dynamic.

However, even in those cases, the vast majority of the content of the site is still placed into actual blog entries and is picked up in the RSS feed.

For most blogs, using this method is out of the question. It would require either creating a huge number of static pages or simply writing a great deal less.

Fortunately, there is little reason to actually consider taking these steps.

Flaws in the Thinking

Contrary to what Bernardy said in his interview, putting all of your important content into static pages does not make it “theft-proof.”

First, it doesn’t prevent RSS scraping. Anything that is actually posted in the RSS feed, be it a pointer to the static content or some less important content, can and will still be scraped. And since search engines can’t easily tell important content from unimportant content within a single site, especially when the unimportant works sit on the home page and the cornerstone works are buried, this can still have consequences in the search results.

Furthermore, even if you include a link to your content, many scrapers strip out HTML when they republish and, even when they don’t, the link doesn’t point back to the original post itself, so there is little guarantee the search engine will recognize yours as the authentic version. Search engines have done exceptionally poorly at this in the past.

Second, though this approach may prevent the content from being scraped by spam bloggers, it increases the likelihood that a human plagiarist will discover it. Where a blog post might have a viable lifespan of a few days, static content can be discovered months or years later and lifted by a plagiarist taking a more hands-on approach. These cases, though less common, are more dangerous in terms of author reputation, search engine confusion and reader siphoning.

Finally, as much as readers have rebelled against partial RSS feeds, this would seem much worse, providing absolutely no content from the article itself.

Following this suggestion, especially to its extreme, would seem to cripple your blog without providing much additional protection for your content.

It is a poor trade-off, and one that I doubt many will accept.

My Biggest Issue

What I find most worrisome about the interview, though, is not the suggestion that bloggers bury their content on static pages, but rather the notion that, when it comes to content theft, “there is absolutely nothing you can do about that.”

That is, of course, completely incorrect.

If that were true, this site would not exist, I would not have been able to shut down over 600 plagiarists in the past few years and the Web would be a very different place.

Fortunately, there are laws, including the DMCA, that make fighting content theft practical and plugins such as CopyFeed and Antileech that enable you to protect your feed from scraping, be it full or partial.
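
To give a sense of how feed fingerprinting works in general, here is a minimal sketch in Python. This is my own illustration of the underlying idea, not CopyFeed’s actual code, and the function name, secret handling and token format are all hypothetical. Each feed item is stamped with a unique, searchable token; if the token later turns up on another site, a search for it locates the scraper and ties the copy back to the feed.

```python
import hashlib
import uuid

def fingerprint_item(item_html: str, feed_secret: str, item_id: str) -> str:
    """Append a unique, searchable token to a feed item's HTML.

    If the token later appears on another site, a search-engine query
    for it identifies the scraper and ties the copy back to this feed.
    """
    token = hashlib.sha256(f"{feed_secret}:{item_id}".encode()).hexdigest()[:16]
    # A hidden span is invisible to readers but is carried along by
    # scrapers that republish the HTML verbatim.
    return item_html + f'<span style="display:none">fp-{token}</span>'

feed_secret = str(uuid.uuid4())  # generated once and stored with the feed
print(fingerprint_item("<p>Post body…</p>", feed_secret, "post-123"))
```

Since, as noted above, many scrapers strip HTML when republishing, a visible plain-text attribution line tends to survive more copies than a hidden span, at the cost of some clutter for legitimate readers.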

Telling people to give up achieves nothing, especially when they are business owners who have a financial stake in their content.

Conclusions

If, after reading this site, you are worried about content theft and feel you have no option other than to try this technique, consider switching to a partial feed first.

Doing so offers at least the same amount of protection, if not more, and provides the same reciprocal links in the event of scraping. Furthermore, it won’t require you to change the structure of your blog, will upset readers less and will follow blogging conventions much more closely.
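
For a sense of what that looks like in practice, here is a rough sketch of a partial feed item in Python; the post text and URL are hypothetical. The excerpt gives feed readers something to preview, while the embedded link back to the original post is the reciprocal link that survives a verbatim scrape.

```python
def partial_feed_item(post_text: str, post_url: str, limit: int = 300) -> str:
    """Reduce a post to a short excerpt plus a link back to the original.

    The excerpt gives aggregators and readers something useful, while the
    embedded link means even a verbatim scrape still points back here.
    """
    if len(post_text) <= limit:
        excerpt = post_text
    else:
        # Cut at the last word boundary before the limit.
        excerpt = post_text[:limit].rsplit(" ", 1)[0] + "…"
    return (f"<p>{excerpt}</p>"
            f'<p><a href="{post_url}">Read the full post</a></p>')

print(partial_feed_item("A much longer post body… " * 40,
                        "https://example.com/2007/12/example-post/"))
```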

Personally, given the tools available and the backlash against partial feeds, I think even that step is often misguided. However, it is certainly more reasonable than turning a blog into a predominantly static Web site just for the sake of avoiding content theft.

Simply put, extreme measures are only needed when less extreme ones don’t work. Fortunately, this doesn’t seem to be the case here.

Comments
JB

Jeremy: Sounds good. First, we have to start registering all of our sites with the USCO and then pay lots of money to big pricey attorneys to file suit and then have to pay more attorneys to find them in their home countries and states.

Any bloggers not bankrupt by the start of the trial would be eligible to collect the huge winnings, which would be negated when the spammer was declared indigent.

And yes, I am kidding too...

Still, I like my earlier idea of using Grokster to stop scraping...

http://www.plagiarismtoday.com/2006/07/28/using-grokster-to-stop-scraping/

Jeremy Steele

Hmm, maybe if the RIAA/MPAA win their campaign against torrents and scare users from using bittorrent we will have to start suing the pants off of all those sploggers for $150,000 per infringement and scare them so much they won't want to mess with bloggers anymore ;)

Just kidding of course.

JB

RS: I've been in the trenches for a while. That just counts plagiarists of my own work through my various sites. Not the work I've done for others here.

You're very welcome for the site but I feel that I should be thanking you for your comments and thoughts, they are always appreciated.

Recording Studio

Wow! I did not know that you have closed down over 600 sites for plagiarism. That is impressive.

I am learning a lot from visiting your blog and am truly grateful for the opportunity.

Thank you.

Trackbacks

  1. [...] Bailey of Plagiarism Today talks about the idea of Protecting Content by Using Static Pages. Sounds like a lot of extra work and, according to Jonathan, may not be worth the [...]

  2. [...] Protecting Content by Using Static Pages One way to help protect Web content from scraping. Too drastic for me, but it is an option. On Plagiarism Today. (tags: plagiarism blogging feed) [...]

  3. [...] From the feeds Several posts that I would like to recommend: Five Media Hosts for Media Offloading Are Creative Commons Licenses Confusing? MyFreeCopyright: Free Copyright Verification Protecting Content by Using Static Pages [...]

  4. [...] Offloading Are Creative Commons Licenses Confusing? MyFreeCopyright: Free Copyright Verification Protecting Content by Using Static Pages Limitations of Fair Use [...]

  5. [...] Protecting Content by Using Static Pages – One way to help protect Web content from scraping. Too drastic for me, but it is an option. On Plagiarism Today. [...]

  6. [...] Protecting Content by Using Static Pages – One way to help protect Web content from scraping. Too drastic for me, but it is an option. On Plagiarism Today. [...]