What is spam?
Spam, in the early days of the Internet, was a pretty simple concept. The term was largely synonymous with unsolicited commercial email (UCE). This is why the famous CAN-SPAM Act of 2003 deals solely with email and nothing else.
But that act is nearly a decade old and the Web has changed a lot since then. This means it’s time to revisit the question and ask ourselves “What is spam in 2013?”
These days, there’s no simple answer to that question, no easy synonym we can draw. Even our real world parallels have fallen apart. Is spam like junk mail? Or is it like snipe signs? Or something else altogether?
This isn’t an easy question but it is an important one. How we define spam is important because it determines how we treat it and what we do to stop it.
So it’s worth taking a moment to analyze the term “spam” and what it means as we approach the 10-year anniversary of the act that carries its name.
A Brief History of Spam
Dictionary.com defines spam as, “Disruptive messages, especially commercial messages posted on a computer network or sent as e-mail.”
The definition is a solid working definition and it stays true to the roots of the term, which comes from a Monty Python sketch about the luncheon meat of the same name. In the scene, a diner owner tries to push spam on a guest who doesn’t want any while Vikings constantly chant “Spam” over the conversation.
This engrained the idea of “Spam” being something that is thrust upon you against your wishes and something that interrupts the natural conversation, making it an easy analogue for the rise of spam online.
But the first spam wasn’t email. It was on Usenet, where people posted junk messages in groups, often cluttering them up for people who wanted to read the content. However, as email became more popular and more mainstream than Usenet, spam rose on it as well, beginning a cat-and-mouse game of spammers and spam blockers trying to outdo each other.
That trend has continued on the Web. As new technologies have emerged to foster communication between people, spammers have sought to abuse them. This includes Web spam, instant messenger spam, text message spam, forum spam, comment spam, social media spam, social news spam and the list goes on.
In that regard, spam has diversified beyond what it was in 2003. While new forms of it are constantly being created, the old ones aren’t going away. There are still Usenet spammers and, though filtering technology has improved to keep most email spam out of our inbox, over 70% of all email is junk.
As a result, every method of electronic communication we have invented has spam, or an equivalent, on it.
While this makes dealing with spam important, it makes defining the term in a practical way difficult.
However, we have to get past the broad definition if we’re going to really nail down what makes spam spam in the new year.
The Problem with the Definition
As workable as the above definition is, it really has only one defining characteristic of what is or is not spam: that it is “disruptive”. However, none of the other definitions are really any better. Google defines spam as “Send the same message indiscriminately to (large numbers of recipients) on the Internet” and Merriam-Webster’s definition doesn’t even make room for spam beyond email.
But even the focus on disruptive is difficult, because what is or is not disruptive is relative. If I post a job opening I have on a career site, it’s the exact content the site wants. If I post the same message on a Pokemon forum, it’s disruptive.
However, the differences aren’t always so extreme. Look at a large community like Reddit, which has many sub-communities, or subreddits, in it. Individuals constantly want to post their content in the most popular subreddits they can, ideally one of the default subreddits, so they can get more views and more exposure. However, the debate about where a piece of content really belongs is endless.
For example, does a funny picture belong in r/funny or r/pics? Does a story about a major technological breakthrough belong in /r/technology or /r/worldnews? These are just a few of the questions involving the default subreddits, without even looking at the subcommunities that have sprung up around some of the larger communities over the years. (Note: Obviously, the communities have their set of rules to answer the questions, but even then there’s often overlap.)
So, if a person mistakenly posts something that the community says should be placed in another forum, that is certainly disruptive, but I think most people would be reluctant to call it spam, especially if it were a first offense. Likewise, a hateful commenter or a “troll” can be disruptive but we rarely call that behavior spam.
So not all disruptive messages are spam and, as the Dictionary.com definition alludes to, not all spam is commercial. If I emailed out my political manifesto to millions of strangers, that would be spam the same as if it were an ad for foreign pharmaceuticals.
But the biggest problem is that not all spam is even in message format (at least not in the sense of a message from one person to another). Consider web spam or blog spam, which is posted as websites. The goal isn’t to disrupt any message platform, but to disrupt Google itself.
So, basically not all disruptive messages are spam, not all actual spam is commercial and, finally, not all spam even involves messages.
This leaves us with a good, but highly imperfect, definition of spam and it forces us to look deeper at the problem.
Striving for a Better Definition
With that in mind, we have to step back and look at what all of the things we think of as spam have in common. Unfortunately, that list is fairly short.
- Unwanted: Spam, no matter where it is posted, is unwanted by everyone other than the spammer.
- Malicious Intent: The sender/submitter has to recognize that the content is not wanted and send it regardless. At the very least, there must be gross negligence.
- Digital Communication Based: Though telemarketing and junk mail are, in many ways, similar to Internet spam, they are generally not called spam.
There were a couple of terms I initially included in that list, only to remove them when I realized that not all spam fit that description:
- Promotional: Though better than saying “commercial”, not all spam promotes anything either. Some spam tests filters without any promotional material and other spam attempts to sabotage sites rather than promote one.
- Bulk: Though most spam is done in bulk, even one egregious posting or submission can be considered spam, especially in communities that quickly clamp down on such posts. Furthermore, Google routinely de-indexes pages for on-site behavior, regardless of duplication.
What this leaves us with is an overly broad and useless definition: Spam is any unwanted content posted or submitted via a digital means by someone with malicious intent or gross negligence.
But that doesn’t fit either. With that definition, there is nothing to separate spam from other unwanted online behavior, such as anti-social comments or activities, including “trolling”.
However, if you want to include every seemingly accepted use of the word spam, that’s the problem you run into. We’ve used the word so heavily that it’s become almost meaningless, used to represent almost any unwanted content and this creates a serious challenge in terms of enforcement.
If you were going to draft a terms of service, a policy or even legislation against spam, would you approach web spam the same as email spam? Comment spam the same as instant messenger spam? The problems are endless.
Talking about spam broadly has become almost meaningless because the term itself has. This opens up some very serious real world challenges when talking about the problem and there aren’t any easy answers.
The problem with the definition of “spam” is the way it has evolved. Every time a new technology has come along and others have abused it for their personal gain, we call the behavior spam, even if technically, legally and even ethically it’s a very different behavior. The term spam is now more applied to a feeling than it is to an act.
In that regard, spam is akin to Justice Potter Stewart’s famous quip about pornography, “I know it when I see it.” Basically, rather than trying to define pornography, Justice Stewart turned to his gut instincts on the matter. While probably accurate, it’s almost useless from a practical standpoint.
Spam has become an “I know it when I see it” problem. However, that definition creates problems with battling it as there can be real differences in opinion as to what spam is. The best thing you can do right now is work to define what you consider spam on your sites and stick to those definitions.
For the broader debate, we need to be vigilant when talking about the different types of spam. With the generic term almost meaningless, we need to look at using “email spam”, “web spam”, “Twitter spam” and so forth. Those terms have a much clearer meaning than just “spam”.
In the end though, how we define spam plays a big role in how we fight it. Right now, on the broadest level, we’re fighting a ghost, a shape-shifting foe with no real substance. It’s only when we break it down into its parts that we begin to find something more tangible and actionable.
Spam is no longer just email spam. In fact, spam is not even a thing. Spam is now an idea unto itself and, while you can fight ideas with other ideas, you can’t beat one with technical measures or legislation. If we’re going to really address the issue, the focus of how we talk and think about spam is going to have to change.
Otherwise, the next iteration of spam could catch us completely off guard.