Google Knol: The Future of Spam

Welcome Guardian Readers! I am continuing to follow this story and will be updating on Google Knol in the coming weeks. If you are interested, feel free to subscribe to this site’s RSS feed.

Over the past few days, Google has been taking a huge pounding over its Google Knol service. One of the harshest criticisms came from Aaron Wall, who said that it was “Google’s Latest Attack on Copyright” and cited threats to search engine rankings by both well-intended Knol submitters and spammers alike.

Others have accused Google of favoring Knol in its search results, either directly or indirectly, noting that many of the early articles are already ranking very high in the search engines.

Though some have come up to defend Knol, at least to an extent, the general consensus seems to be one of unease with Google Knol and what its implications for bloggers are.

Unfortunately, for the most part, I have to agree with this feeling and share the concern of my fellow bloggers. Google’s answer to Wikipedia does not inspire the same sense of community spirit that its predecessor does and seems to raise more questions than it does answers, especially at this time.

Knowledge Served with a Side of Spam

If Google Knol is Google’s answer to Wikipedia, it stands to reason that Knol will face the same challenges as its forefather, at least in the long run.

At the top of that list of challenges will, most likely, be spam. Unfortunately, Google has shown us that, on its hosted services at least, it is almost incapable of dealing with spam.

For example, though several offensives against spam on Blogspot have improved things there, Blogspot remains one of the most popular services for hosting spam content and one still has little trouble finding spam blogs on the service. Likewise, Google Groups has become a popular spam target and it seems predictable that Knol will as well.

For Google, the problem is two-fold. Even if they are not artificially favoring Knol, there will be a perception that they are and the site will still, without assistance, likely rank well. Thus, it will be a major target for spammers for the exact same reasons Blogspot and other Google services have traditionally been targeted.

The second is that Google’s spam defenses have historically been extremely weak. Their CAPTCHA system has been broken, reporting spam to them works both slowly and irregularly and filing a copyright notice is a painful process that produces a slow response.

I can only imagine that many spammers already have their aim set on Google Knol or are at least watching it to see when they should consider exploiting it.

Should the spammers come, Google will likely be caught off guard and that could both sink Knol as a project and trash the rankings of legitimate Webmasters who have had their content misused.

Google’s Precautions

However, Google almost certainly saw this problem coming and took several steps to counter it. After looking at the site, I see several features that were designed to make it more spam resistant than its wiki brother.

Login Required: With Wikipedia, anyone can edit almost every entry, regardless of whether or not they are logged in. Google Knol requires users to log in with a Google Account before posting.

Nofollow Links: Google Knol automatically “nofollows” all links in its entries, meaning that they will not receive any PageRank from the Knol, thus limiting the site’s usefulness as a link farm.

Flag Inappropriate Content: Though Google’s “flag” feature on Blogspot is notoriously useless, the one on Knol seems to carry more weight and lets users not only specify the nature of the complaint, but leave comments and specific information.
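
The nofollow behavior described above can be checked on any page. As a minimal sketch, assuming Python and its standard html.parser module, the following lists each link in a fragment of HTML and whether it carries rel="nofollow". The sample markup and the names LinkAudit and audit_links are illustrative only and are not taken from Knol itself.

```python
from html.parser import HTMLParser


class LinkAudit(HTMLParser):
    """Collect every <a href> and whether its rel attribute includes nofollow."""

    def __init__(self):
        super().__init__()
        self.links = []  # list of (href, is_nofollow) tuples

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if href is None:
            return
        # rel is a space-separated token list, e.g. rel="nofollow noopener"
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        self.links.append((href, "nofollow" in rel_tokens))


def audit_links(html):
    """Return (href, is_nofollow) for each link found in the HTML string."""
    parser = LinkAudit()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical sample markup: one nofollowed credit link, one normal link.
sample = (
    '<p>Source: <a href="http://example.com/original" rel="nofollow">original</a> '
    'and <a href="http://example.com/other">other</a></p>'
)
for href, nofollow in audit_links(sample):
    print(href, "nofollow" if nofollow else "followed")
```

Running something like this against a page that credits your work would show whether the credit link is one that search engines are asked to ignore.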

Unfortunately, requiring a login is unlikely to have any significant effect on preventing spam. Spammers already have Google accounts and seem to be able to generate them almost at will. Also, this removes a safeguard found in Wikipedia where, if you found your content was infringed, you could reach in and remove it yourself without even signing up in many cases.

That is not possible with Knol. Instead, we have to rely on quick action from Google, something that is unlikely given Google’s history.

While the nofollow links will likely keep some spammers at bay, they are also a source of ire for many Webmasters. Even if a Knol author attempts to give credit for work they copied (with permission), the link will be meaningless to the search engines, virtually ensuring that the duplicate will outrank the original given Knol’s likely standings.

The final element remains untested but it is worth noting that it provides a link to yet another DMCA policy, Google’s fourth, that has the same requirements of a handwritten signature sent via fax or email.

Fortunately, my workaround should still be applicable to Knol.

All in all, these are good precautions, but they raise more issues than they solve and will likely have almost no impact on any spam wave that does strike.

Other Knol Issues

Spam is not the only issue for Knol to worry about and certainly not the only reason bloggers should keep an eye on the service. There are other oddities and concerns that may strain Google Knol’s relationship with the rest of the Web.

  1. Duplicate Content: As Wall pointed out in his post, if content appears in both Knol and another site, even a major one, Google’s duplicate content filter seems to favor Knol over the original. This opens the door to ranking sabotage and even one accidentally removing themselves from the search index by sharing content with Knol.
  2. Default Licensing: Though it is exciting to see Google get behind Creative Commons, the default license for all content posted into Knol is CC-BY, meaning that it can be used for any purpose with attribution. This is even less restrictive than the license on Wikipedia and few will be likely to change it, even if they don’t intend to give away so many rights. This is illustrated by the fact that most existing Knols use that license though contributors can select “All Rights Reserved” when creating the post. The problem is that, if a spammer scrapes content and posts it to Knol, other sites may feel, incorrectly, that they have the right to use it and that could help spread duplicate content past Knol.
  3. The Profit Motive: Google Knol allows anyone who creates a Knol to place ads next to it. This will not only likely attract even more spammers, the same way the Adsense policy on Blogspot has done so there, but also attract more human plagiarists. With click-through rates and payouts as they are, the only people that can expect to earn significant money from this are those that focus on quantity over quality. This is likely to reward those who bend and break the rules more than those who follow them.

The end result is that it is not a matter of if Knol will be spammed, but when it will happen and what form it will take. Will it be SEO spammers using it to kick competitors out of the rankings? Will human plagiarists use the Adsense system to submit lifted content for the promise of high search results and ad revenue? Will it be something else altogether?

Google Knol is simply too tempting of a spam target to ignore and, if it achieves the rankings many predict it will, it will definitely have a bulls-eye on its back.

We can only hope that Google is ready for it.

What Webmasters Can Do

Since participating in Knol could easily cannibalize your own rankings, especially since you can’t effectively link to your own content, deciding what to do is tricky.

The best thing, right now, is to keep on top of how your content is used. Use Google Alerts with static content and Digital Fingerprints to track any content in an RSS feed.
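
To illustrate the fingerprint idea, here is a minimal sketch in Python. It assumes that a fingerprint is simply a unique string appended to each feed item, so that any scraped copy carries it and can be found by searching for it. This is not the implementation of any particular fingerprinting tool, and the function names, the site name and the secret are all hypothetical.

```python
import hashlib


def make_fingerprint(site_name, secret):
    """Derive a short, hard-to-guess token that is unique to this site."""
    digest = hashlib.sha256(f"{site_name}:{secret}".encode("utf-8")).hexdigest()
    return "fp" + digest[:16]


def tag_feed_item(body, fingerprint):
    """Append the fingerprint to the text of a feed item before publishing."""
    return f"{body}\n\n{fingerprint}"


def contains_fingerprint(page_text, fingerprint):
    """Check whether a fetched page carries our fingerprint (case-insensitive)."""
    return fingerprint.lower() in page_text.lower()


# Hypothetical values, for illustration only.
fingerprint = make_fingerprint("example-blog", "keep-this-private")
feed_item = tag_feed_item("Full text of a post as it appears in the feed.", fingerprint)

print(contains_fingerprint(feed_item, fingerprint))
print(contains_fingerprint("An unrelated page.", fingerprint))
```

The same token can then be used as the search term in an alert, so a copy that keeps the fingerprint surfaces soon after it is indexed.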

The good news is that, even if Knol achieves search engine dominance, there will likely be a period of time between when a new article hits the search engines and when it ranks well. If one acts quickly after an infringement, there is a good chance that any damage can be mitigated.

Fortunately, right now, the problem is just one sitting on the horizon. If we address the issue now, we can prepare, speak up and stop any negative effects before they happen.

Conclusions

Though many of the criticisms hurled at Knol also apply to Wikipedia (nofollow links, default licensing, etc.), Wikipedia has the benefit of being a true community effort. This provides safeguards against spammers and lets copyright holders protect themselves. With Knol, there are very limited community safeguards and we have to rely on Google to remove any infringing works.

Also, Google does not seem to trust Wikipedia’s authority to the exclusion of other sites, something it has already done in a couple of cases with Knol, and that, in turn, raises other questions.

It seems that the biggest problem many have with Knol is that Google, despite claims to the contrary, is becoming a publisher. However, it is important to note that Google has operated a blogging service, a Web site publishing tool, a forum service and other hosted offerings for many years, all of which are currently indexed in its search engine.

Though it is worrisome to see a search engine like Google become so involved in the creation of content, even going so far as to pay authors directly, other search engines, including Yahoo! and MSN, have much greater conflicts of interest.

The problem is not that Google is paying authors to write for its site and then ranking the content well; other search engines do that all of the time. The problem is that Google is exposing the jugular vein of Webmasters when it comes to SEO and leaving the door open for spammers, competitors and plagiarists to strike it.

Though only time will tell what becomes of Google Knol and what its impact is, there is a lot of good reason for Webmasters and authors to be very worried.

Update 07/30

The YouTube video below demonstrates clearly that the automated generation of Knol articles is already possible. The spam wave is already set to begin:

The video also illustrates the mindset of a spammer in a way that I cannot with thousands of words of text, so it is worth watching simply to understand how and why spammers operate.
