On Friday, Reddit user mi-16evil, an administrator in the /r/Movies subreddit, posted an entry announcing that he and the other admins had banned a site, Gaoom, from being posted on the subreddit.
According to mi-16evil, the site was engaged in rampant plagiarism, copying and reposting articles and other content from various sites on the Web and then encouraging others, including many legitimate users, to post the links to Reddit.
The administrators were tipped off to the issue when the spam filters kept pulling down Gaoom links despite seemingly legitimate users submitting them. This meant something in the voting pattern was tripping the spam filters, even though the links might have originally been submitted by valid users.
That, in turn, prompted them to investigate and that was when they discovered the plagiarism issues, which resulted in the full ban and the post. However, the ban might have been a moot point as, shortly after the post became popular on Reddit, Gaoom went down, giving a “Bandwidth Exceeded” error. It hasn’t returned as of this writing.
But while the case of Gaoom seems to have a happy ending, mi-16evil and the rest of Reddit may have, unintentionally, exposed the future of plagiarism and spam on the Web, making this a less-happy ending for legitimate creators everywhere.
What Gaoom Was and Why It Was Dangerous
Though Gaoom is closed, the basics of what it was are fairly simple. Gaoom was a site that took articles from various other sites on the Web and reposted them under the names of authors, likely fake authors, that it claimed wrote for the site.
Based upon the screenshots and what users have said, Gaoom looked and behaved like a legitimate website. It had a professional, if simple, design, high-quality content drawn from more than one source and, by all appearances, a team of authors writing for it. It even has a Twitter account that, since the post, has fallen inactive.
Of course, it was all fake and the site was nothing more than plagiarized content. Though it’s not known if the content was automatically scraped or copy/pasted by a human, given that some of the articles were edited slightly, the latter seems more likely.
While it’s tempting to dismiss Gaoom as just another spam site, it’s different from the mass plagiarism and Web spam we’ve been confronted with in the past.
Historically, most spam operations have been sites that focused on grabbing as much content as they could for the purpose of fooling search engines. It was very much a quantity over quality issue, trying to beat the odds through a brute force effort rather than improving the chances.
Gaoom, and sites like it, are a different approach. One with a different tactic and, most importantly, a different target.
A Shifting Climate, A Shifting Focus
Life has been tough for Web spammers over the past year or two. Recent updates by Google, including Panda and Penguin, have done a great deal to penalize and demote sites that host duplicate or otherwise unoriginal content. This has had huge impacts not just on scrapers and Web spammers, but also article marketers and some “content farm” websites that focus on quantity of writing rather than quality.
While this hasn’t stopped spammers from finding creative ways to grab your content, it has made it less lucrative. Though automated content scraping is still dangerous to the legitimate sites it pulls from, especially since Google sometimes accidentally penalizes the original site, it’s begun to fall out of favor with many spammers.
This has been coupled with another trend that has been going on for much longer, namely the rise of social media and social networking as a source of traffic. For many sites, Google and other search engines are taking a back seat to Facebook, Twitter, Google+, Reddit and other social networks/communities in terms of the traffic they provide.
With search engines harder to fool and human networks playing a larger role in driving traffic, the solution is clear. Make spam sites not for the search engines, but with the goal of getting tweets, likes and, in the case of Gaoom, upvotes.
Simply put, humans are much easier to fool with duplicate content than search engines. A semi-professional site combined with an air of legitimacy can convince most that a site is original and worthy of being shared. While Google will likely know where the content came from, a human visitor is less likely to know, especially if it comes from an obscure source.
Couple that with the fact that many spammers already have networks of fake users on major social networking and community sites, and they can further game the system to get their content in front of far more eyeballs, far more consistently, than by shooting for Google alone.
And that, in turn, is the real worry with this type of spam. With search engine spam, the copycats were at a disadvantage from the outset. Their content came later and Google, usually, would recognize the original. With this human-oriented spam, the spammers have the upper hand in a significant way.
For the playing field to be leveled, social networking sites are going to have to do better work detecting spammers and, much like /r/Movies, work actively to ban and blacklist the sites that violate their rules.
Most likely, the next ten years of the fight against content theft and Web spam won’t look like the last ten. While traditional scraping, splogging (spam blogging) and mass content theft will remain a continuing problem to a degree, the landscape is shifting and more spammers will start to look like Gaoom than your average computer-generated spam blog.
Unfortunately, while making a search engine smarter is just a matter of updating an algorithm, making people smarter is much more difficult. Despite being the butt of jokes for many years, the Nigerian email scam, along with several variants, still routinely ensnares many unsuspecting people.
If an outdated scam with such publicity still fools people, the odds of your average social media user being able to reliably tell the difference between well-disguised spam and legitimate content are, sadly, almost nil.
Content creators are going to have to be vigilant. While the Gaoom case was detected and stopped by a community effort, as these types of sites grow in number and variety, the burden is going to fall on the people who create original content to tend their gardens and shut down the sites that seek to do them, and others, harm.