Article Spinning: Generative Writing Before Generative AI
Generative artificial intelligence (AI) is undoubtedly the biggest copyright and plagiarism story of 2022 and 2023.
Whether you are looking at the issue from a legal perspective, an ethical one or, as this site does, at the interplay of both, AI is the dominant story and has been for some time.
But for those of us who have been around a long time, parts of the AI story sound more than a little bit familiar. Perhaps not an example of history repeating itself, but of it rhyming.
That’s because generative AI, more than anything else, makes it easy to generate large amounts of “original” content. That content can then, theoretically, be used for anything, including classroom assignments, spam email, fake news articles and much, much more.
But technology to do exactly that existed nearly 20 years ago. A tool named Articlebot and an approach named “article spinning” made it easy to generate thousands of “original” articles that could be used for a variety of purposes and, if done well, could be difficult for even humans to detect.
So while it’s not exactly generative AI, it’s worth taking a look at what content generation looked like nearly two decades ago and seeing what it can teach us about the issues that are ongoing today.
What Is (Was) Article Spinning
The idea of article spinning is fairly simple: Most words in the English language (as well as other languages) have a number of synonyms. Therefore, by changing out words for their close synonyms, you can create “new” content that means largely the same thing as the original.
It’s easy to generate a very large amount of new content this way.
For example, take the sentence “The man ate a hamburger.” You can replace the words man, ate and hamburger with close synonyms while retaining the original (or a similar) meaning. That sentence can become “The boy devoured a burger”, “The gentleman chewed a sandwich” and “The fellow inhaled a cheeseburger”, to name a few.
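To make the idea concrete, here is a minimal sketch of synonym-swap spinning in Python. It is not how Articlebot or any particular spinner actually worked. The synonym table and the function name `spin` are assumptions made up for illustration, using only the words from the hamburger example, and the sketch ignores punctuation, grammar and context entirely.

```python
from itertools import product

# Hypothetical synonym table, built only from the example sentence.
SYNONYMS = {
    "man": ["boy", "gentleman", "fellow"],
    "ate": ["devoured", "chewed", "inhaled"],
    "hamburger": ["burger", "sandwich", "cheeseburger"],
}


def spin(sentence, synonyms=SYNONYMS):
    """Return every variant of `sentence` made by swapping in listed synonyms."""
    words = sentence.split()
    # Each slot holds the original word plus any synonyms listed for it.
    slots = [[word] + synonyms.get(word.lower(), []) for word in words]
    # The Cartesian product of the slots enumerates every combination.
    return [" ".join(choice) for choice in product(*slots)]


variants = spin("The man ate a hamburger")
print(len(variants))  # 4 * 4 * 4 = 64, counting the original sentence
print(variants[:3])
```

Three swappable words with three synonyms each already yield 64 versions of a five-word sentence, which is why the approach scaled so easily.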
If you have a 500-word article and can replace 125 words with three synonyms each, you can create vastly more than 300,000 combinations, because the choices for each word multiply together.
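The arithmetic is easy to check. The sketch below assumes each replaceable word can either stay as written or take one of its synonyms, and that every choice is independent. The function name `count_variants` is hypothetical, and real spinners would discard many combinations that read badly.

```python
def count_variants(replaceable_words, synonyms_per_word, keep_original=True):
    """Count the distinct versions when every word choice is independent."""
    # Each replaceable word offers its synonyms, plus itself if it may stay.
    choices = synonyms_per_word + (1 if keep_original else 0)
    return choices ** replaceable_words


total = count_variants(125, 3)
print(len(str(total)))         # number of digits in 4 ** 125
print(count_variants(10, 3))   # just ten replaceable words
```

Under those assumptions, 125 replaceable words give 4 to the 125th power, a number 76 digits long. Ten replaceable words are enough to pass the 300,000 mark, with 1,048,576 versions.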
The technique was often combined with other approaches, such as stitching elements of separate articles together, to further multiply the number of “original” versions that could be created.
Predictably, this approach was loved by web spammers who, between 2005 and 2010, used article spinning to great effect. The approach was equally hated by human authors on the internet, who routinely found their content being turned into fodder for spun content.
However, in 2011, Google made a series of algorithm changes that directly targeted spun and other “low quality” types of content. Content farms, including those that were fed by spun content, were devastated.
But, by 2020, article spinning began to make something of a comeback. Google changes favoring the newest content made it so that spammers found a use for the old approaches. Though never reaching the highs of the late 2000s, many sites found their content being used in this way.
However, most likely, those who were looking back to article spinning have moved on to AI-generated text. It is more difficult to detect, raises fewer copyright issues (for the spammer at least) and seems to be at least tacitly endorsed by Google.
Still, it’s easy to see the similarities between 2005 and 2023, even if the lessons from 18 years ago were not learned.
The Similarities
The similarities are pretty obvious. Both generative AI and article spinning are ways of “generating” large amounts of seemingly original content. The approaches are different, but the hoped-for outcome is the same.
Both also raised serious copyright and ethical issues. A lot of the questions being asked today about AI-generated content were being asked nearly two decades ago, as creator and spammer alike struggled to identify what made content “original” in this context.
This debate became particularly pointed as web content creators were unhappy that their work was being ingested and spit out again by article spinners. They pushed back, often using copyright as their primary weapon, something we’re seeing today with AI systems.
Also, as with AI spam, Google initially felt that the situation was well in hand and that they didn’t need to take any action. It took six years for Google to take direct action on the issue, and that was only after multibillion-dollar empires had been built on low-quality content, and they were impacting a large percentage of results.
Another similarity is how both technologies debuted. Though article spinning didn’t reach a mainstream audience the way generative AI has, both debuted and became extremely popular very quickly. Both also underwent a process of rapid evolution as new automations were introduced and new competitors jumped into the field.
In both cases, there was an initial splash followed by a rush to tap the market. However, what is different is who the splash reached and who was rushing to fill and grow the market.
The Differences
The differences are equally obvious, the biggest being that article spinning never really reached a mainstream audience.
Students didn’t use article spinning to hide plagiarism from their teachers, lawyers didn’t use spinning to generate legal briefs and so forth. The reason was simple: Article spinning was and is a tool to generate thousands of articles. It’s not a tool to write one of anything.
(Note: As pointed out by @beilinglaoshi, automated “paraphrasing” tools operate on some of the same principles and have been used by students in a bid to defeat plagiarism detectors. I was more focused on the article spinning software itself, but this is definitely a good point and illustrates how the tech evolved.)
Article spinning is/was a tool almost exclusively useful for generating spam. That’s what it does. Generative AI, rightly or wrongly, has many more applications and, though spammers are definitely exploiting it, that is only a piece of the picture.
There’s also a big difference in who is responsible for each. Article spinning was an act done entirely by the spammer. Those that sold article spinners were, in essence, selling the tools to allow others to spin articles. Spammers pulled their content from a variety of sources, some legal and some illegal, but it was the user who was in control of the process.
In most cases, that’s not true for generative AIs, where the user only sees the output and has no control over the process. That changes the legal focus and is part of why AI systems are facing multiple copyright infringement lawsuits.
It’s also worth noting that there never was an article spinner or equivalent that worked on images, audio or video. Generative AIs exist across a wide variety of media types and that was not something that article spinners ever did or really could ever do.
And, finally, there’s a huge difference in who is developing and building those systems. Article spinners were largely built by individual developers or small companies acting on the fringes of the internet. They were simple applications that didn’t require large teams to create.
AIs, on the other hand, are being developed by Google, Meta, Microsoft, Apple, Adobe and other tech giants. The very companies that form the backbone of the internet are the ones developing and pushing generative AIs.
That, in the end, may be the most important difference between the two. We’ve gone from Google dismissing and then combatting text generation to literally spending millions of dollars to create a text generator.
Bottom Line
Obviously, there are major differences between article spinning applications and modern generative AI systems. How they work, what they do, who uses them and who is creating them are all significant points of separation.
But the fact remains that we have been down elements of this path before. We’ve seen what happens when reckless and unethical people get a hold of tools that enable them to generate large amounts of “original” text. We’ve seen the abuses that come with that and how it impacts human creators.
The lessons from 2005 were not learned. The abuses and misuses we’re seeing from AI today were easily predicted by anyone who was around then. But, then again, they were easily predicted without that history too.
There’s no doubt that generative AI is going to bring about significant changes. But the goal should be that those changes lead to a better future for both creators and consumers of media.
That can’t happen unless we work to mitigate the negative effects of such technology and ensure that the implementation of it is both ethical and legal.
Sadly, that is not where we are with AI, and the current rush into the field is not conducive to finding that path forward.