6 Ways Web Spammers Get Content
Though it’s easy to dismiss Web spammers as being lazy leeches on the Web, the truth is that spamming isn’t an easy profession.
There are frequent Google algorithm shifts, any of which can erase months of work in a heartbeat. There are hosts that aren’t too keen to have your content around and can quickly terminate your accounts. Finally, there’s the most difficult question of all: Where do you get your content from?
After all, spamming, if nothing else, is a numbers game. If one out of every 10,000 pages you generate becomes successful, you have to generate 10,000 pages to have a hit. Doing that, however, means populating 10,000 pages with content.
To put it mildly, that is not simple. To have humans write that content, even low-quality content, would be cost-prohibitive. After all, even at just $1 per page, it’s unlikely the writing would generate enough revenue to be worthwhile.
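The arithmetic behind that claim can be sketched in a few lines. The 1-in-10,000 hit rate and the $1-per-page price are the figures used above; the function name is purely illustrative.

```python
def cost_per_hit(pages_per_hit: int, cost_per_page: float) -> float:
    """Writing cost needed, on average, to produce one successful page."""
    return pages_per_hit * cost_per_page


# Figures from the article: one hit per 10,000 pages, $1 per page.
print(cost_per_hit(10_000, 1.00))
```

In other words, under those assumptions a single successful page would have to earn more than $10,000 just to cover the writing.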
As such, spammers have to automate their content creation, and that means finding a reliable source of content that is both keyword-relevant and likely to pass muster with Google.
So where does this content come from? While there are many sources, here are the six main ones that most spammers seem to get their work from.
1. Online Article Sites
For a long time there have been websites that have encouraged marketers to submit articles for others to freely use. The basic idea was that authors could submit content for others to grab and, in exchange, they would get an author credit with a link from every site that used it.
While these sites have fallen out of favor due to recent Google algorithm changes, they historically have been (and still are) popular sources of content for spammers.
Advantages for Spammers: Large amounts of legally available content in a format that’s both easy to search and easy to grab.
Disadvantages for Spammers: Content is widely duplicated on the Web already. Author boxes can also be used by search engines to identify the content as spammy, forcing a choice between stripping them out (and infringing) and using less-effective content.
2. Private Label Rights Content
Private label rights (PLR) content is content that is sold, usually in bulk, for others to do with, more or less, as they please.
Sometimes referred to as “White Label” or “Resale Rights” content, it is usually sold in large bundles at a heavily discounted price. Spammers are then free to republish it, rewrite it, break it apart or otherwise modify it, all without worrying about attribution.
Advantages for Spammers: Eliminates legal issues of obtaining content and there’s far less of a duplicate content issue than with article sites since only others who bought the content have the legal right to use it.
Disadvantages for Spammers: PLR content is usually bought sight unseen and quality varies wildly. Also, PLR content, despite its low per-item price, is cost-prohibitive for large projects.
3. Scraping
Content scraping is simply grabbing content from another source and republishing it on another site. Spammers have traditionally done this through RSS feed scraping but, increasingly, site scraping has become an issue.
Scraping is, in most cases, a copyright infringement. However, spammers can often set up new sites as fast as or faster than they are shut down, making it difficult to fight them through copyright means.
Advantages for Spammers: The content that can be obtained through scraping is nearly limitless in both quantity and variety. RSS feeds are plentiful, filled with useful higher-quality content and are easily scraped.
Disadvantages for Spammers: The copyright infringement issues cause problems for spammers, in particular with their hosts. Also, scraped content, by its very nature, is already duplicative and usually pulled from better-established sites that will likely outrank any copies.
4. Spinning
Spinning is a way of generating new content out of existing content. There are several ways to do it, but the most common are either to swap out a large number of the words in the article for synonyms, often called synonymized plagiarism, or to use an automated tool to translate the work into another language and then back again.
The goal of these techniques is to scramble the content enough that Google (and the original owner) won’t recognize it, while keeping it close enough to the original to be intelligible.
Advantages for Spammers: It’s possible to generate a nearly unlimited supply of content from a single source through spinning. One article can become thousands and, if the original is from a legal source, such as PLR content, it can be legal.
Disadvantages for Spammers: Quality of content is generally weak. Automated spinning programs can’t detect subtle nuances in language that even the most novice writers can. Also, one spin often isn’t very different from another, creating a mass of nearly identical content.
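One way to see why near-identical spins are a liability is to measure how much word-level overlap survives between two versions. The sketch below uses word shingles and Jaccard similarity, a common near-duplicate measure. It is only an illustration: the function names and sample sentences are made up, and it is not a description of how any search engine actually works.

```python
import re


def shingles(text: str, k: int = 3) -> set:
    """Return the set of k-word sequences (shingles) found in the text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: str, b: str, k: int = 3) -> float:
    """Share of shingles the two texts have in common (0.0 to 1.0)."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 1.0  # two texts too short to shingle are treated as identical
    return len(sa & sb) / len(sa | sb)


# Hypothetical sample texts: a single synonym swap versus unrelated copy.
original = "the quick brown fox jumps over the lazy dog near the old river bank"
spun = "the quick brown fox leaps over the lazy dog near the old river bank"
unrelated = "our store sells garden tools and outdoor furniture at low prices"

print(jaccard(original, spun))
print(jaccard(original, unrelated))
```

A spin that changes only a handful of words still shares most of its shingles with the source, which is why a batch of spins from one article tends to look like a cluster of copies.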
5. Patchworking
Patchworking involves stitching together a new article from a group of similar articles found online. The goal is to create an article that is both somewhat cohesive and, on the whole, original.
The process is automated through a variety of programs that search the Web for relevant content and, through their algorithms, try to generate a new work by stitching together sentences and fragments from various sources.
Advantages for Spammers: If done well, duplicate content and copyright issues largely disappear. So little is used from each source that it’s not likely to be seen as a duplicate or an infringement.
Disadvantages for Spammers: Quality of content is, at best, dubious. Articles are often unintelligible and human readers, usually, can spot a patchwork article quickly. Search engines are also growing wiser to this technique.
6. Content Generation
Content generators are just that: programs that generate content out of thin air. How they do it varies, but some examples include programs that try to mimic human writing directly and combination spinners/patchworkers that attempt to mash up existing content to the point where nothing from the original remains.
Regardless of the method, the result is the same: an entirely new work is created.
Advantages for Spammers: Copyright and duplicate content issues are done away with. It’s also possible to generate a nearly unlimited number of articles, each of which, if done well, will be as different from its peers as it is from other content on the Web.
Disadvantages for Spammers: Generated content is, generally, the lowest in terms of content quality, often becoming unreadable. Also, search engines have increasingly gotten wise to content generation and are better at spotting artificial language patterns.
Bottom Line
When it comes to fighting Web spam, legitimate content creators are trapped in the crossfire between the search engines and the spammers themselves.
This crossfire takes many shapes, including spammers infringing on content, Google enacting penalties that hurt legitimate sites and so on.
Unfortunately, this isn’t a conflict that’s likely to ever end. As long as search engines wield so much power, there will always be those who seek to game them for their own purposes.
That puts the rest of us somewhere in the middle, trying to create good content but needing the help of the search engines to drive others to it, all the while hoping not to run afoul of rules that are in place to block the spammers who want to abuse the system.
It’s an awkward position to be in and the contortion act isn’t going to stop. The best we can do is keep creating, watch our gardens for misuse and hope for the best.