If you need another reason to worry about content theft, consider the recent case involving the gossip blog Celebrity Hack.
Yesterday, a much lower-quality scraped version of the same article was submitted to Digg. The article, which has since been removed, was promoted to the front page and not buried, earning over 800 Diggs.
This has led many, including Muhammad Saleem at Pronet Advertising, to worry that social networking is making the problem of content theft worse.
Sadly, Saleem seems to be right. Even as social news sites like Digg are increasing the rewards for creating original and interesting content, they are also rewarding those who steal it.
Content Theft Goes Social
For years Webmasters have been worried about the duplicate content penalty. The fear is that, since search engines place a higher value on unique content and intentionally bump down duplicate pages, a search engine might bump down a site simply because it is being scraped and reposted elsewhere, even if the copying is taking place without permission.
Though Google has addressed that fear and said that it should not be a major concern, I have worked with several businesses that have had their marketing material scraped, only to have the plagiarist rank as high as or higher than they do.
Even if there is no “penalty”, scraping is a cheap and fast way for sites to compete for the same keywords with the same strategy and, with good linking such as comment spam, achieve similar or greater results.
But with social networking and social news comes a new problem. If search engines, with their advanced algorithms and overhead view of the Web, have a difficult time telling duplicate content from original material, individual users have at least an equal challenge.
Sadly, it is often a challenge that they fail to meet, especially when they are in a rush to submit an article to their favorite social networking site.
A Ripe Target
With sites like Digg, Slashdot and Reddit capable of directing incredible volumes of traffic to a site (Note: This site has been both Dugg and Slashdotted at varying points), they’ve become very ripe targets for spammers.
The desire for placement on Digg has even generated a few startups, most recently Subvert and Profit. These companies, reviled by most Digg users, let submitters purchase Diggs and pay members for Digging up stories.
The popularity of these social news sites has also attracted spammers who often copy high-quality, legitimate content, post it to their own site and then actively promote it on these networks.
Legitimate Webmasters, who passively promote these social networks with badges, links and buttons, often don’t get as much push as the sites created almost exclusively for the networks. Legitimate users of the network, genuinely interested in the content, unwittingly promote the scraped version until it reaches the front page, thus rewarding the plagiarist for his acts.
Sometimes the theft is caught in time, either by an astute reader or the original author. Other times, as in the Celebrity Hack case, the plagiarized work goes straight to the front page and the plagiarist enjoys the traffic and revenue that the front page brings in.
The beauty of this approach from the angle of the spammer is that it only takes one story to take off to justify the effort. Much like with spam blogging, many attempts can fail so long as one succeeds. The rewards for success far exceed the penalties for failure.
Reasons for Concern
This type of plagiarism is especially damaging to Webmasters as it can impact them very deeply and very directly.
- Only the Best Content: Unlike spam bloggers, who scrape indiscriminately, plagiarists targeting social network sites are only interested in the best-quality content. They select articles, copy them by hand and paste them to their own site without attribution or under their own name.
- A Public Plagiarism: This plagiarism is especially damaging as it takes place in the most public forum on the Web and in the most deceptive way. The plagiarists involved often use hand-crafted sites, not just easily-detected computer-generated sites like spam bloggers, and many viewers may genuinely believe the plagiarized copy to be the original work, even if there is significant evidence to the contrary.
- Closing The Door: Worse still, it detracts from one of the major incentives to post original, high-quality content on the Web. When a plagiarist takes the content to the social news sites first, the original author loses the ability to get the work in front of that large audience. It greatly impacts the original author’s ability to use the social news sites to build a reputation and readership.
In short, this type of plagiarism doesn’t just risk replacing the original in the eyes of the SERPs or a few readers, but in the minds of tens of thousands of people and on some of the largest, most popular sites in the world.
Reasons Not To Worry
On the flip side of the coin, there are several reasons why spammers may not be drawn to the social networking sites over the long run.
- Limited Ad Revenue: Users on social networking sites tend not to promote sites that have many ads, and studies have shown that traffic from social networking is much less likely to result in ad clicks than traffic from search engines.
- Limited Search Engine Benefit: The SEO benefit of getting a page listed on Digg or another social news site is dubious at best. A single link that scrolls off the home page quickly will not carry as much weight as multiple links on more static sites.
- Higher Costs: Surviving a Digg or Slashdot effect can be difficult, as I have personally found out. Doing so requires either investing more into hosting or setting up shop on services such as Blogspot that have less credibility on the sites.
In short, while these sites can definitely generate a large amount of traffic, they don’t always generate a lot of money for the people they link to. Professional spammers will likely be seeking out other methods that should result in larger checks.
The more likely candidate for this type of plagiarism is a new Webmaster who is trying to grow his blog or blog network and is badly misguided about how to do so. Such plagiarists are frustrating and annoying, but are generally easily stopped once they are spotted.
Simply put, in these cases the shame of being discovered a plagiarist is usually, in and of itself, enough to send the plagiarist into hiding.
What Webmasters Can Do
Despite the drawbacks, it is obvious that the problem is both real and ongoing. Bloggers and Webmasters, especially those that wish to leverage social news sites, need to consider taking a few steps to guard against this.
- Track Popular Articles: Monitoring and protecting the RSS feed will stop spam bloggers, but not plagiarists who manually select content. You need to track your best articles to ensure they aren’t copied without permission. Consider using Google Alerts to monitor those articles automatically.
- Monitor the Social News Sites: Sometimes the plagiarized articles can hit the social news sites before Google or even Technorati picks them up. Follow the social news sites in relevant categories (you probably should be doing so regardless) and catch plagiarists early.
- Report Infringement to the Social News Sites: In addition to the usual steps of getting the works removed from the plagiarist site and, possibly, getting their ad revenue severed, report them to the social news site they submitted to. Most sites will ban domains and sites that host plagiarized content.
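As a rough illustration of the tracking step above, the following Python sketch scores how much of an original article reappears in a suspect page. It is a minimal sketch, not a tool named in this article: the function names (shingles, overlap_score, looks_copied), the five-word window and the 0.5 threshold are all assumptions chosen for illustration, and it expects you to have already obtained the plain text of both pages.

```python
import re


def shingles(text, size=5):
    """Return the set of overlapping word sequences ("shingles") in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Text shorter than one window: treat the whole text as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def overlap_score(original, suspect, size=5):
    """Fraction of the original's shingles that also appear in the suspect."""
    original_shingles = shingles(original, size)
    if not original_shingles:
        return 0.0
    shared = original_shingles & shingles(suspect, size)
    return len(shared) / len(original_shingles)


def looks_copied(original, suspect, threshold=0.5, size=5):
    """Hypothetical rule of thumb: flag the suspect above an assumed threshold."""
    return overlap_score(original, suspect, size) >= threshold
```

Because the score is measured against the original only, a copy that has been wrapped in a new headline, byline or ads should still score high. A high score is a reason to look at the page by hand, not proof of plagiarism, since quoted excerpts with attribution will also match.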
Though these steps cannot guarantee that a plagiarist will not slip through and get an article of yours on the front page of a social news site, they can help thwart the vast majority of those who might try.
The important thing to remember is that, even with blogs, plagiarism goes well beyond just RSS scraping. As social news sites have bolstered the rewards for hosting and creating content that stands out, many plagiarists will start taking only that content, often bypassing the RSS feed completely.
It’s another concern for Webmasters and bloggers, many of whom spend hours a day creating content only to have it ripped off by others almost immediately. It’s also a concern that is difficult to deal with.
It is also an issue that we will be talking about in more depth over the next few weeks, as it doesn’t appear to be going away any time soon.
In fact, it seems likely that we’ve just now seen the beginning of it.