On the Web, aggregation is generally defined as taking content from multiple sources and combining it in one place.
That, understandably, covers a wide variety of activities on the Web. Search engines and social media sites, for example, can be considered aggregators, but so can spam blogs and unethical news sites.
Yet, when the topic of aggregation comes up, it’s inevitable that some attempt to lump all aggregation into one category and then define it as either good or bad.
The truth, though, is that aggregation isn’t a single act. It is an umbrella term that describes an incredible variety of ways people interact with content every day. While a lot of it is very harmful to creators, much of it is also helpful.
So rather than either demonizing or canonizing aggregation as a whole, we need to look at what makes an act of aggregation either good or bad and try to provide guidance to help aggregators stay on the right side of both the law and the ethics.
However, it doesn’t take long to see that such a task is nearly impossible.
We’re All Aggregators
To be clear, this isn’t a particularly new debate. Five years ago, Mark Cuban penned a lengthy piece slamming aggregators, calling them “vampires”.
The year before, GateHouse Media reached a settlement with its competitor the New York Times Company, which then owned the Boston Globe, that forced the Times Company to stop scraping GateHouse’s RSS feeds, even though it was only displaying headlines and links.
But the debate in these stories, and in others like them, has followed a two-part theme.
- We Are All Aggregators: Whether you post on a blog, post on Facebook or use Pinterest, you are taking text, images, audiovisual content and ideas from multiple sources and combining them with original work to make your site.
- The Exact Boundaries of Aggregation Are Hotly Debated: There is no consensus on best practices for aggregation and, no matter what you do, someone will likely take issue with it.
In short, if you are posting content online at all, even if it is only loosely inspired by the works of others, odds are that someone, even if only a small minority, takes issue with what you’re doing. For example, as recently as 2010, Iceland’s most-visited website had a policy against “deep linking”, or simply linking to individual pages on the site.
Conversely, no matter how egregious your misuse of the content might be, there will always be others who agree that it is ethical and should be legal.
Most people, however, are somewhere in the middle. But it’s precisely in that middle where aggregation enters its murkiest legal and ethical waters.
WWI-Era Aggregation Debates
In 1918, almost 100 years ago, the U.S. Supreme Court ruled in a case known as International News Service (INS) v. Associated Press (AP). According to the AP, after INS was barred from using Allied telegraph lines to report on WWI, it gained access to AP reports and began to rewrite and republish the news without attribution.
The court found that, while the facts reported on were not copyrightable, there existed a “quasi property” right in the facts since they were gathered at expense and that the practices of the INS amounted to unfair competition.
The ruling was largely forgotten for nearly a century but resurfaced when the AP invoked it again, this time to target an organization named All Headline News (AHN), which it also accused of rewriting AP news reports and selling them without attribution.
This case, however, didn’t reach a trial. After it survived a motion to dismiss, the two parties entered mediation and reached a settlement, under which AHN paid an undisclosed sum to the AP and admitted that it had improperly used content from the wire service.
Together, these cases illustrate the difficulty of turning to the law to deal with aggregation issues. The primary legal doctrines that govern aggregation, namely copyright (fair use in particular) and hot news misappropriation, are nebulous and deliberately flexible. Without a court ruling, there is no definitive answer as to what is or is not a fair use or a hot news misappropriation.
Then there is the issue of the law not always lining up with broadly accepted ethical standards. For example, if you take someone’s idea for a blog post and rewrite it or copy a bunch of individual sentences into a new post, it’s probably not a copyright infringement, regardless of attribution. However, copying slightly more than acceptable under fair use (whatever that may be in the case) but with appropriate attribution is still an infringement.
That’s because the law doesn’t directly take attribution into consideration when looking at whether a use is copyright infringing. While it is sometimes factored in, an infringement without attribution is, most likely, still an infringement with attribution.
In short, staying within the law and staying within the bounds of ethics are two different things and, in both cases, what counts as the “right side” is nebulous and often varies from person to person and case to case.
Symbiotic vs. Parasitic
Back in 2010, when I was responding to Mark Cuban’s post, I drew a distinction that I still hold today: the distinction between symbiotic and parasitic aggregators.
Some aggregators, through a combination of limited use and proper attribution, seek to support and help those that they pull from, creating a mutually beneficial relationship. Others seek simply to exploit the work of others for their personal gain and return nothing of value to the creator.
So when I find myself looking at an aggregation, I ask myself a simple question:
Is the aggregator providing more value to the original creator than it is taking away?
If the answer to that question is yes, then I tend to think of the aggregation as positive. If it’s not, then I either think the aggregator needs to find ways to better support those it pulls from or, in some cases, shutter.
To be clear, this has nothing to do with the value that aggregator provides its audience. If aggregators sap value from creators and make original content creation non-viable, then they’ve done a disservice to their audience in the long run.
How can aggregators ensure that they are treating content fairly? There are four elements I generally look for:
- Attribution: Strong, clear and with a link. Attribution should be front and center and in a way that doesn’t confuse the reader or block search engines.
- Limited Use: Take only what you need. Thumbnails, headlines, intro paragraphs, etc. are usually adequate. Facebook, Google and others have set down standards in this area.
- Added Value: Simply aggregating a bunch of content from various sources isn’t particularly useful. Aggregators should add value to the content whether through editorial selection, algorithms, commentary or a combination thereof. They should provide something that can’t be gleaned by just reading from the source.
- Right of Refusal: Finally, even if you do everything the best you can, some will still not want to be included. An ethical aggregator removes those that don’t want to be included, even if there is no legal obligation to do so. There are exceptions to this rule though, in particular for aggregators that merely link and for creators who are simply trying to avoid criticism.
However, the more fundamental key is this: ethical aggregators try to find ways to build upon and add value to the works of others while supporting the original creator.
Proper and ethical aggregation is a win-win for the creator and the aggregator. Anything else is just an unfair, and likely illegal, extraction of value.
Obviously, this is greatly simplified. With anything as complex as aggregation, four guidelines can’t account for every scenario.
Still, if you’re aggregating content, these are some good guidelines to think about and to weigh as you decide how you will integrate the work of others.
Best of all, these guidelines promote content reuse that is not only ethical but also legal. If you add value (creating a transformative use) and minimize the amount of content you copy, you’re likely ensuring that your use is a fair use.
So while it’s easy to agonize over what word count is acceptable or how large a thumbnail can be, the much more useful questions take a broader look and examine why we want to limit the way we use other people’s work.
Both the legal and ethical codes are there to ensure that those who invested the time, energy, money and expertise in creating a work enjoy the benefits from it. That too should be the goal of the aggregator, who can in turn reap the benefits of their added value, whatever it may be.