RSS Brief: Another Scraping/Spam Threat

Yesterday, the makers of the controversial Pay Per Post service launched RSS Brief, a new tool designed to make blog reading faster.

The idea is that the service takes long posts, like what you might expect here on Plagiarism Today, and condenses them down into a few short sentences.

Though the service sounds convenient and useful, it also raises significant copyright and spam issues that the company has not addressed as of yet.

Although the service is only in alpha, now is the time to consider these issues, before it is finished, becomes a regular part of many people’s blogging lives and it is too late to change course.

How it Works… In Brief

The idea behind RSS Brief is pretty simple. You punch in the URL of your favorite blog, RSS Brief will read the entries in the feed and use what its creators refer to as “natural language technology” to parse the text down to a few sentences.

The idea is that, unlike traditional truncating that simply cuts off everything but the first few sentences, you will receive an effective summary of the post. This should, in theory, allow you to get the basic idea of the post and move on.
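To make the difference between truncating and summarizing concrete, here is a minimal sketch in Python. It is not RSS Brief's actual method, which has not been published; the word-frequency scoring, the tiny stopword list and the function names are all assumptions made purely for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "it",
             "that", "this", "for", "on", "as", "with", "was", "are", "be"}


def split_sentences(text):
    # Naive splitter: break on whitespace that follows ., ! or ?
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def content_words(text):
    # Lowercase word tokens with stopwords removed.
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def truncate(text, n=2):
    # Traditional truncation: keep the first n sentences, whatever they say.
    return split_sentences(text)[:n]


def summarize(text, n=2):
    # Naive extractive summary: score each sentence by the average frequency
    # of its content words across the whole post, keep the n best, and
    # return them in their original order.
    sentences = split_sentences(text)
    freq = Counter(content_words(text))

    def score(sentence):
        tokens = content_words(sentence)
        return sum(freq[w] for w in tokens) / len(tokens) if tokens else 0.0

    best = sorted(range(len(sentences)),
                  key=lambda i: score(sentences[i]), reverse=True)[:n]
    return [sentences[i] for i in sorted(best)]
```

On a post that opens with a greeting and only later gets to the point, truncate() hands back the greeting while summarize() tends to pick the sentences that repeat the post's main terms. A real summarizer would need far more than word counts, which may be part of why the alpha results are uneven.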

The technology, however, is questionable at this point. Plagiarism Today’s RSS Brief page shows some of the weaknesses. Though PT is the type of site targeted by this service, it utterly fails to give a meaningful summary of any of the stories in the RSS feed. Instead, on most stories, it seems to simply do the kind of truncating it claimed to avoid.

However, finding glitches in alpha-stage technology is not as disturbing as the copyright and spam issues that this service raises. It seems that, in the rush to create this service, the programmers gave no thought to the copyright questions it might raise or to how their technology might be abused.

Copyright Issues

What RSS Brief does, fundamentally, is take a lengthy post and make a derivative work of it. Under copyright law, the creation of derivative works is the sole right of the copyright holder.

Though there is a decent fair use argument for RSS Brief in that the use is largely transformative and only takes a small portion of the original, there is a strong argument against them as well. Their use of the work, by their own design, takes the heart of the original material, it does so for a commercial purpose, and RSS Brief is designed to replace the original work, thus damaging the market for the author’s work, especially if the author has ads in the feed.

Worse still, the service continues to “summarize” even shorter works, some as short as sixty words. This sharply raises the proportion of the original work used and lowers the likelihood that the use will be found fair.

However, most damning of all is the 1841 case Folsom v. Marsh, which found the following when dealing with the issue of “transformative” use:

(if a user) cites the most important parts of the work, with a view, not to criticise, but to supersede the use of the original work, and substitute the review for it, such a use will be deemed in law a piracy.

Though it is impossible to predict whether or not a use will be deemed “fair” until it goes before a judge and/or jury, there seems to be a lot of reason to doubt whether or not RSS Brief will pass muster in that situation.

Equally damning are its stated attempt to replace the original work and the lack of any opt-out mechanism, such as the one Google uses to ensure its cache is fair use.

The Spam Issue

Though many readers would love an “important parts only” feed, so would spammers. Fortunately for them, RSS Brief offers up just such a feed on their service, one that essentially scrapes, processes and rebroadcasts the original feed in their “brief” format.

Spammers will, most likely, grow to love these feeds. Not only are they keyword rich and to the point, but they can also easily be combined with other feeds from the same service to create rapid-fire blogs with short posts, something search engines seem to love.

Already spammers take advantage of Technorati, Icerocket and Google Blog Search feeds for much the same purpose. They enjoy the keyword density those feeds provide and the fact that they raise fewer copyright issues than scraping full feeds.

Though an RSS Brief feed might be less keyword rich, it would also be much more modified from the original, making it harder for search engines and Webmasters to spot. Depending on the nature of the spammer, they might find these RSS Brief feeds preferable to the existing alternatives.

Also, much like the search feeds, RSS Brief strips out any and all digital fingerprints as well as copyright information contained in the feed. Its rush to get to “just the facts” causes it to leave out some elements that are critical to bloggers. This also makes the use of RSS Brief feeds impossible to track, unless they report usage to FeedBurner, and leaves Webmasters in the dark about how many people are subscribing to the feed and how they are using it.

Finally, since Pay Per Post is not a search company, it’s not in a position to punish people who do scrape its feeds. Technorati and Google can blacklist sites that scrape their search results; Pay Per Post has no such card to play.

If spammers aren’t already looking at RSS Brief as a new tool, they likely will be soon. They seem to seize on new technology as fast as they can and I doubt this service will be any exception.

Conclusions

As interesting as the idea of RSS Brief is, it is poorly executed. As of this writing, there is no means for Webmasters to opt out, no clear safeguards against spam blogging and no consideration for Webmasters.

Though Pay Per Post has always been a controversial company, they have always seemed to value bloggers and the role they play, albeit in a somewhat backhanded way. That is why it seems so odd to me that they created this service with so little consideration for bloggers.

One day they are paying bloggers for reviews, the next they are taking their feeds, without permission or an opt out mechanism, and creating derivative works to be redistributed over RSS.

Hopefully they can get these issues, as well as their technical glitches, straightened out. The idea is interesting, but pursuing it the way they are is very dangerous to them, to bloggers and to the Internet at large.

It borders on the irresponsible, and if Pay Per Post is going to change their image, they need to put the good of the Web and of bloggers first. They failed to do that when they first launched their primary service and it seems that history is, in a strange way, repeating itself.

Hopefully that won’t be the case.

Note: If there is interest in an excerpt-only “just the facts” feed for this site, I will create one. WordPress has the tools to do that and I’ll simply create the second feed this weekend. If interested, please post a comment below or send me an email.
