Earlier this week, Newsguard, a company that provides tools for vetting news sources, released a report on “Unreliable AI-Generated News and Information websites” or UAINs.
According to the report, their UAIN site tracker has already grown from 49 sites to 277. They claim that the sites have generic-sounding names, but feature news content created through the use of generative AI, with little to no human oversight.
This includes one site that churns out an average of 1,200 articles per day, nearly all of which are produced by AI.
But, while it would be easy to dismiss these sites as harmless garbage, the sites are likely very lucrative thanks to the widespread presence of automated advertising platforms on them.
According to the report, the sites featured some 393 programmatic ads from some 141 major brands. Though the ads appeared on only a minority of the junk sites studied (55 of 217), over 90% of those ads were served via Google’s AdSense program.
In a statement to Gizmodo, a Google spokesperson said that their preexisting policies address this issue, with the policies themselves noting that “text generated through automated processes without regard for quality or user experience” is not allowed.
Though Google seems to have been caught somewhat off balance on this issue, anyone who has been on the internet long enough knows that we’ve been here before.
This isn’t the first time that garbage content has flooded the internet and made a mint doing so. It also isn’t the first time that Google has enabled it.
Party Like It’s 2005
The idea behind article spinning was very straightforward. A spammer would take an original article and then use a “spintax” to replace various words and phrases with close synonyms. This made it easy for spammers to generate hundreds, thousands, or even millions of pages of “original” content.
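To make the mechanics concrete, here is a minimal sketch of how a spun template expands into many "original" sentences. It assumes the commonly seen `{option|option}` brace-and-pipe notation for spintax, which the description above does not spell out, and it only handles flat (non-nested) groups. The function name `expand` and the sample sentence are purely illustrative.

```python
import itertools
import re

# Matches one {option|option|...} group with no nested braces inside.
# The brace-and-pipe syntax is an assumed convention, not taken from the article.
GROUP = re.compile(r"\{([^{}]*)\}")


def expand(template: str) -> list:
    """Return every variant of a flat (non-nested) spintax template."""
    # With one capture group, re.split alternates: literal, options, literal, ...
    parts = GROUP.split(template)
    choices = [
        part.split("|") if i % 2 else [part]
        for i, part in enumerate(parts)
    ]
    # The Cartesian product of all choices yields every possible sentence.
    return ["".join(combo) for combo in itertools.product(*choices)]


variants = expand("The {quick|fast} fox {jumps|leaps|hops} over the dog.")
for sentence in variants:
    print(sentence)
```

Two groups with two and three options already produce six variants. Because the counts multiply, a full article with a few dozen such groups quickly reaches the thousands or millions of pages mentioned above.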
This helped kick off a low-quality content war that lasted over half a decade. Spammers started out spinning original content, but quickly took to either stealing content from other sites, sometimes through RSS scraping, or misusing article marketing website libraries to find source material.
This low-quality content began to take over significant portions of the internet, including Twitter, Facebook and other social media platforms.
What fueled these sites, to put it simply, was Google.
Google search results drove traffic to these sites, and Google AdSense provided an easy way to turn that traffic into revenue.
However, the ads were often not directly on the spam sites. The sites would often use link marketing to drive traffic (both direct and indirect) to other, less spammy sites that carried the ads.
That said, AdSense was far from the only revenue tool spammers used. Others linked to questionable storefronts, some pointed to malware and phishing sites, and still others simply sold the links to clients hoping to boost their search engine performance.
Regardless of the revenue stream, the reliance on Google for search ranking became the weak point in the operation. In February 2011, Google released a series of algorithm updates meant to target “content farms”, which were sites that used low-paid human authors to write large amounts of low-quality content.
Though they may not have been the direct target (at least not publicly), spinning sites were certainly caught in the crossfire. With those changes, article spinning became less desirable as spammers moved on to other approaches.
And, now, those other approaches appear to include AI.
Running in Circles
If all that sounds familiar, it’s because that is where we are today when it comes to AI-generated spam.
Spammers have found a quick and easy way to generate a large amount of “content” that, at least currently, seems to fly under Google’s radar both in terms of search engine rankings and AdSense.
Much as it was back in 2005 with spun content, the detection of AI-generated text is currently subpar, though it seems likely that advances in this space will make it easier to find as time goes on.
What is different this time is that, in 2005, Google didn’t own an article spinning company or publish software for the purpose of creating “original” text. In 2023, Google has released their own AI tool, Bard, and has invested heavily in generative AI.
This puts Google in a difficult position and explains why, though it has reiterated its position on “low quality” content, it hasn’t taken a firm position on AI. Instead, the company says that the use of AI content doesn’t disqualify a site from appearing in search results or using AdSense.
However, the Newsguard report points to the flip side of this coin. Though AI may have many benefits and legitimate uses, spammers were always going to be among the earliest of the early adopters. The rise of AI-generated spam was one of the most easily predicted abuses of AI, and it appears that Google is either unsure of how to handle it or, even worse, may have a conflict of interest.
Google, in a rush to get involved in the AI craze, has turned a blind eye to a very obvious and very damaging abuse of the tool. They have failed to plan for how to deal with these abuses and are now feeding them in multiple ways.
Right now, Google may not think this is a problem. The Newsguard report points to drastic growth in this space but, for the moment, the problem is fairly small. It probably doesn’t have a significant impact on search results or on AdSense revenues.
However, 2005 showed us just how fast these issues can grow. In 2005, article spinning was something of a sideshow. By 2006, several major competitors were entering the space, all ethics were being thrown out the window and, not much time later, it became a more pressing issue for Google and the internet at large.
This time around, the growth will likely be faster. Not only is the technology better, but there is widespread mainstream investment in it. Adoption among those who wish to abuse it will be quicker as a result.
If Google doesn’t want a repeat of the late 2000s, the time to start preparing is now. Finding ways to detect AI writing, improving tools to spot low-quality efforts, and setting clear rules about the use of AI are all important.
Unfortunately, by the time Google begins to feel the impact in other areas of their business, it may well be too late.
In 2011, Google had to do a major shake-up to purge its search results of garbage content. In a few years’ time, that may not be possible.
Sadly, the internet’s fate is tied heavily to Google’s. So, their failure won’t just hurt them, but the internet at large. But even though it’s our shared internet at stake, all we can do is hope that Google recognizes the issue and takes the needed action.