<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Scraping | Plagiarism Today</title>
	<atom:link href="http://www.plagiarismtoday.com/tag/scraping/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.plagiarismtoday.com</link>
	<description>Content Theft, Plagiarism, Copyright Infringement</description>
	<lastBuildDate>Mon, 13 Feb 2012 06:51:37 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Distil: The Anti-Scraping Content Protection Network</title>
		<link>http://www.plagiarismtoday.com/2012/01/26/distil-the-anti-scraping-content-delivery-network/</link>
		<comments>http://www.plagiarismtoday.com/2012/01/26/distil-the-anti-scraping-content-delivery-network/#comments</comments>
		<pubDate>Thu, 26 Jan 2012 19:00:40 +0000</pubDate>
		<dc:creator>Jonathan Bailey</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[cdn]]></category>
		<category><![CDATA[Content-Theft]]></category>
		<category><![CDATA[Copyright]]></category>
		<category><![CDATA[Copyright-Infringement]]></category>
		<category><![CDATA[Copyright-Law]]></category>
		<category><![CDATA[distil]]></category>
		<category><![CDATA[DNS]]></category>
		<category><![CDATA[Plagiarism]]></category>
		<category><![CDATA[RSS scraping]]></category>
		<category><![CDATA[Scraping]]></category>

		<guid isPermaLink="false">http://www.plagiarismtoday.com/?p=12404</guid>
		<description><![CDATA[Distil is a new company promising to combat scraping while improving your site's performance. But how well does it work?]]></description>
			<content:encoded><![CDATA[<p><img style=' float: left; padding: 4px; margin: 0 7px 2px 0;'  src="http://www.plagiarismtoday.com/wp-content/uploads/2012/01/distil-logo.jpg" alt="Distil Logo" title="Distil Logo" width="240" height="84" class="alignleft size-full wp-image-12417" />I&#8217;ve talked a lot on Plagiarism Today about the dangers of scraping, including both <a href="http://www.plagiarismtoday.com/2011/05/09/faqs-the-basics-of-rss-scraping/">RSS scraping</a>, where someone copies the content in your RSS feed and, usually, republishes it elsewhere, <a href="http://www.plagiarismtoday.com/2011/11/16/scraping-not-just-for-rss-feeds-anymore/">and site scraping</a>, where search-engine-like crawlers grab your site&#8217;s content for various purposes. </p>
<p>Defending against scraping, however, is incredibly difficult. Though some plugins and tools like <a href="http://wordpress.org/extend/plugins/bad-behavior/">Bad Behavior for WordPress</a> and <a href="http://www.javascriptkit.com/howto/htaccess13.shtml">simple blocking of bots</a> can help, they aren&#8217;t perfect or complete solutions and, in some cases, can deeply drain both your time and your site&#8217;s resources.</p>
<p>However, <a href="http://www.distil.it/">the team over at Distil thinks they have found a better way</a>. By acting as an intermediary between the Web and your site, they claim to not only be able to filter out most scrapers and infringers, but also to speed up your site and improve its performance.</p>
<p>It works by combining their anti-scraping and bad-bot technology with a robust content delivery network. This enables them to not only filter out threats to your site, but also serve much of your static content quickly and from servers located nearest to your visitors. </p>
<p>But is Distil worth the time and money? I decided to give it a trial and see what I found.<span id="more-12404"></span></p>
<h4>What is Distil?</h4>
<p><img style=' float: right; padding: 4px; margin: 0 0 2px 7px;'  src="http://www.plagiarismtoday.com/wp-content/uploads/2012/01/threat-summary.jpg" alt="Distil Threat Summary" title="Distil Threat Summary" width="312" height="269" class="alignright size-full wp-image-12419" />The closest comparison one can make to Distil is <a href="http://www.cloudflare.com">Cloudflare</a> as both use DNS changes to better protect and speed up your site. </p>
<p>With Distil (or Cloudflare) you edit your DNS settings, which can usually be found at your domain registrar or in your site&#8217;s control panel, to direct visitors not to your server, but to servers run by Distil. Visitors will then query Distil for your site, which first filters out any malicious users and then delivers any content it can from its servers, which are spread all across the world. Anything it can&#8217;t deliver, it requests from your server and then provides to the user directly. </p>
<p>The end result, if all goes well, is that most of the content of your site is delivered directly from Distil&#8217;s servers, which should be faster than coming from your own, and most malicious users, including scrapers, are filtered out before they ever reach your site or your content. Best of all, the process is completely invisible to end users (other than the potential speed increase).</p>
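<p>As a rough illustration of the flow described above, here is a minimal Python sketch of a filtering edge proxy. It is not Distil&#8217;s or Cloudflare&#8217;s actual implementation: the <code>EdgeProxy</code> class, the <code>BAD_AGENT_HINTS</code> blocklist and the <code>demo_origin</code> function are all hypothetical names, and a real service would likely cache only static content and use far more signals than the user agent.</p>
<pre>
```python
# Hypothetical sketch of a filtering edge proxy; not any vendor's real code.
BAD_AGENT_HINTS = ("scrapy", "httrack")  # assumed example blocklist


class EdgeProxy:
    def __init__(self, fetch_from_origin):
        self.fetch_from_origin = fetch_from_origin  # callable: path to body
        self.cache = {}        # content already stored at the edge
        self.origin_hits = 0   # how often the origin server was contacted

    def is_malicious(self, user_agent):
        # Very naive check: look for known bad-bot names in the user agent.
        agent = user_agent.lower()
        return any(hint in agent for hint in BAD_AGENT_HINTS)

    def handle(self, path, user_agent):
        # Step 1: filter out suspected bad bots before they reach the site.
        if self.is_malicious(user_agent):
            return 403, "blocked"
        # Step 2: serve from the edge cache when possible.
        if path not in self.cache:
            # Step 3: otherwise ask the origin server and keep a copy.
            self.cache[path] = self.fetch_from_origin(path)
            self.origin_hits += 1
        return 200, self.cache[path]


def demo_origin(path):
    # Stand-in for the real web server.
    return "content of " + path


proxy = EdgeProxy(demo_origin)
print(proxy.handle("/logo.jpg", "Mozilla/5.0"))
print(proxy.handle("/logo.jpg", "Mozilla/5.0"))
print(proxy.handle("/post", "HTTrack 3.0"))
```
</pre>
<p>In this sketch the second request for the same path never touches the origin server, which is where the potential speed increase comes from, while the request with a blocklisted user agent is turned away before it reaches the site at all.</p>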
<p>To find out if it works as advertised, I switched Plagiarism Today over to Distil last weekend and, as of this writing, have been using it for the better part of a week.</p>
<h4>Setting Up and Using Distil</h4>
<p>To start using Distil, you have to first sign up for an account and have it activated. Once that&#8217;s done, you&#8217;ll be given an address to which, using your DNS settings, you will point both www.domain.com and domain.com (as well as any other subdomains you want to redirect).</p>
<p>Then, after the DNS changes propagate, you should be using Distil&#8217;s service. From there, you can log into the Distil dashboard, which lets you configure a variety of options including:</p>
<ul>
<li>Site Acceleration Settings (if available)</li>
<li>Rate Limiting</li>
<li>Blocking Known Violators</li>
<li>Blocking Bad User Agents</li>
<li>Browser Integrity Checks</li>
<li>Filter By Country</li>
<li>Block Bad Referrers</li>
<li>Whitelist/Blacklist</li>
<li>WWW/Non-WWW Routing</li>
</ul>
<p>You also get a bevy of statistical data including information about the number of unique sessions, the total number of requests, total human requests and the total bot requests. Bot requests are then further broken down by the number of search engine requests (which are always allowed) and the number of blocked requests (as well as the reasons for being blocked). The blocked bots are then further broken down by bot type, IP address and more.</p>
<p>The result is that you get an overall perspective of what&#8217;s going on with your site, both in terms of human traffic and, more directly, the security threats you&#8217;re facing. </p>
<p>But does that make Distil worth trying? A lot of it depends on your needs and what you&#8217;re looking to get out of it.</p>
<h4>The Good of Distil</h4>
<p>The one thing that immediately struck me about Distil is the granular level of control it gives you over security issues. Though Cloudflare offers a good deal of site security, it&#8217;s focused on spammers and attackers and only lets you set a broad level of security (low, medium, high or basically off). With Distil, you can set individual options to your liking both to target the threats most relevant to your site and, more importantly, make sure you don&#8217;t interfere with legitimate users.</p>
<p><img src="http://www.plagiarismtoday.com/wp-content/uploads/2012/01/distil-settings-sample-500x163.jpg" alt="Distil Settings Image" title="Distil Settings Image" width="500" height="163" class="alignnone size-large wp-image-12433" /></p>
<p>Over the past few days I&#8217;ve had no reports of legitimate visitors being hassled by Distil, something that was an occasional problem with Cloudflare, especially for visitors from outside the U.S. and Europe. </p>
<p>So, even though Distil did not block as many bots as Cloudflare (likely because I have the security settings for most features turned down or off), it did a better job staying out of the way and still seemed to stop the most egregious offenders. Over time, I plan on slowly increasing the settings to see if they block more and continue to be non-intrusive.</p>
<p>Beyond security, my first concern after switching to Distil was that my site might take a performance hit. Having been a Cloudflare user for many months, I was used to the power of a robust CDN. However, I did a series of tests both before and after the change and found that Distil was usually slightly faster than Cloudflare, often shaving off 30% of the site&#8217;s loading time. </p>
<p>Compare these two example results, first before: </p>
<p><img src="http://www.plagiarismtoday.com/wp-content/uploads/2012/01/PT-cloudflare-performance-500x197.jpg" alt="" title="PT CloudFlare Performance" width="500" height="197" class="alignnone size-large wp-image-12405" /></p>
<p>And then after:</p>
<p><img src="http://www.plagiarismtoday.com/wp-content/uploads/2012/01/pt-distil-test-500x174.jpg" alt="PT Distil Test" title="PT Distil Test" width="500" height="174" class="alignnone size-large wp-image-12406" /></p>
<p>(Note: While this example isn&#8217;t an apples-to-apples test due to differing endpoints, the results were consistent regardless of endpoint. Also, obviously there were other changes made in the four days between the tests, though no major alterations, frontend or back, were made.)</p>
<p>Finally, the support team at Distil is, simply put, the best of any company I&#8217;ve worked with. They answered every question I had very promptly, usually within 15 minutes, and it didn&#8217;t seem to matter what time of the day I was asking. This enabled me both to get my site set up quickly with Distil, despite some confusion and questions, and to deal with an issue with Google Analytics (that turned out to be my own fault). </p>
<p>All in all, Distil did a good job in providing granular security control, a site performance boost and great support.</p>
<h4>The Problems with Distil</h4>
<p>The biggest initial problem with Distil is that, in its current form, it is not very simple to use. Not only do you have to wait for your account to be activated by a human, but the process of switching over your DNS is not as straightforward as it is with Cloudflare. </p>
<p>If you aren&#8217;t comfortable working with DNS and aren&#8217;t familiar with how to edit CNAME and A records, the process is going to be intimidating. Sadly, unlike Cloudflare, there isn&#8217;t a great deal of hand holding unless you contact support. While I agree with Distil that it&#8217;s better not to hand over total DNS control to a third party, as you have to do with Cloudflare, it&#8217;s also the much more difficult route for the user.</p>
<p>Another issue I have with Distil is the current pricing structure. The free account, which does not have content acceleration, offers only 5 GB of traffic per month, an amount even a modest blogger will likely blow through quickly. A site Plagiarism Today&#8217;s size fits (barely) under the cap for the small account, which offers 50 GB of transfer for $29 per month. However, Cloudflare&#8217;s free plan allows for unlimited traffic and its pro account, which offers additional statistics and monitoring, is only $20 per month. Other CDNs, such as MaxCDN, charge only $50 for 1 TB (1000 GB) of data. </p>
<p>Distil told me that they are considering restructuring their pricing in the coming weeks, a move that, most likely, will help with this problem.</p>
<p>For now at least, Distil is a terrible deal as a CDN, though its security features may help to make it more compelling to webmasters concerned about scraping and content misuse.</p>
<p>Finally, Distil, obviously, won&#8217;t be able to help with at least some kinds of scraping. RSS scraping likely won&#8217;t be blocked unless the bot doing it is already in the system and it is unclear just how many are. However, if you know the bot you can add it yourself in your control panel. Also, <a href="http://www.plagiarismtoday.com/2012/01/19/plagiarism-for-hire-the-changing-business-of-plagiarism/">any human copying won&#8217;t be blocked</a> because the system is designed precisely to allow humans to access your site.</p>
<p>Despite these limitations, there are still a lot of webmasters who would likely benefit from Distil, even if that number could be a great deal larger down the road.</p>
<h4>Bottom Line</h4>
<p>Distil isn&#8217;t perfect. It&#8217;s a new company and its product certainly has its share of flaws. Right now, it&#8217;s aimed at a fairly niche market of webmasters who are technically savvy, want a great deal of granular control over their site&#8217;s security and are willing to pay extra to make it happen.</p>
<p>However, with some changes to its setup procedure, pricing and control panel, it could become a compelling option for many more sites. </p>
<p>In short, Distil is going to be a company to watch in the coming months and years. As it refines its tools and pricing, it could become a major force for helping content creators protect their work. </p>
<p>In the meantime though, other webmasters just wanting a CDN to improve their site&#8217;s performance will, most likely, want to look at other solutions, such as Cloudflare and MaxCDN, as they are significantly cheaper and, in the case of Cloudflare, provide better analytics, easier setup and some decent, if simplified, security features.</p>
<p>Still, if you&#8217;re in Distil&#8217;s niche, which is likely to grow, I can see why it would be a very powerful solution to a complex problem. </p>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismtoday.com/2012/01/26/distil-the-anti-scraping-content-delivery-network/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Feed Expanders On Rise, End Short RSS Advantage</title>
		<link>http://www.plagiarismtoday.com/2011/11/30/feed-expanders-on-rise-end-short-rss-advantage/</link>
		<comments>http://www.plagiarismtoday.com/2011/11/30/feed-expanders-on-rise-end-short-rss-advantage/#comments</comments>
		<pubDate>Wed, 30 Nov 2011 19:30:00 +0000</pubDate>
		<dc:creator>Jonathan Bailey</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[Content-Theft]]></category>
		<category><![CDATA[Copyright]]></category>
		<category><![CDATA[Copyright-Infringement]]></category>
		<category><![CDATA[Copyright-Law]]></category>
		<category><![CDATA[Plagiarism]]></category>
		<category><![CDATA[RSS]]></category>
		<category><![CDATA[RSS scraping]]></category>
		<category><![CDATA[Scraping]]></category>

		<guid isPermaLink="false">http://www.plagiarismtoday.com/?p=11897</guid>
		<description><![CDATA[For those who still believe that truncated RSS feeds protect you from scraping, here are two services that prove just how wrong you are.]]></description>
			<content:encoded><![CDATA[<p><img style=' float: left; padding: 4px; margin: 0 7px 2px 0;'  src="http://www.plagiarismtoday.com/wp-content/uploads/2011/11/feed-expander-logo-300x75.jpg" alt="Feed Expander Logo" title="Feed Expander Logo" width="300" height="75" class="alignleft size-medium wp-image-11905" />Earlier in November, <a href="http://www.plagiarismtoday.com/2011/11/16/scraping-not-just-for-rss-feeds-anymore/">I talked about how scraping was no longer limited to feeds</a>. That, in turn, <a href="http://www.plagiarismtoday.com/2005/10/18/truncated-rss-feeds-a-temporary-solution/">was a follow-up to a 2005 article which explained why truncating RSS feeds was a temporary solution</a>.</p>
<p>Today, after some serious thought, I&#8217;ve decided to show exactly how it&#8217;s being done and highlight two services that, along with providing a service to end users who are tired of partial RSS feeds, are also helping to feed spammers. </p>
<p>These services are RSS expanders, meaning they convert short RSS feeds to full ones. Both of these services are public, free to use and have been in operation for some time. However, more recently, I&#8217;ve been seeing them used by spammers as clients come to me confused as to how a site is scraping their full content when they have a partial feed.</p>
<p>To understand how they work and what they mean, we have to take a look at two example services and what they can do.</p>
<h4>What is a Feed Expander?</h4>
<p>The idea behind a feed expander is fairly simple: it takes a short RSS feed, one that either has truncated content only or is headline-only, and converts that into a full RSS feed. This is done by looking at the URLs in the feed, extracting the content from the Web page and then creating a new feed out of that.</p>
<p>Traditionally, spammers have performed feed expansion on their servers, reading the RSS feed and doing the scraping themselves. However, several new tools have been made available to the public, including <a href="http://www.feedex.net/" rel="nofollow">FeedEx</a> and <a href="http://www.feedexpander.com/" rel="nofollow">FeedExpander</a>, that provide this service for free and to anyone willing to paste in a short RSS feed.</p>
<p>In short, there&#8217;s no need for a potential spammer to set up software of their own. They can simply feed your partial feed into the feed expander and scrape the full one it produces, no work or expense required.</p>
<p>Though the services don&#8217;t work on all sites, in particular those with unusual formats, they do work on most and, even though they aren&#8217;t perfect, they are already more than reliable enough for spammers, as evidenced by the ones I&#8217;ve seen using these and other services.</p>
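<p>To show how little is involved, here is a minimal Python sketch of the expansion process described above: read the items in a short feed, follow each link and rebuild the feed with the full text. It is an illustration only and is not how FeedEx or FeedExpander are actually built. The sample feed, the <code>expand_feed</code> function and the <code>fake_fetch</code> stand-in (used instead of a real page download and content extraction) are all assumptions made for the example.</p>
<pre>
```python
import xml.etree.ElementTree as ET


def build_sample_feed():
    # Build a tiny truncated RSS 2.0 feed in memory (hypothetical data).
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    for number in (1, 2):
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = "Post %d" % number
        ET.SubElement(item, "link").text = "http://example.com/post-%d" % number
        ET.SubElement(item, "description").text = "First few words..."
    return ET.tostring(rss, encoding="unicode")


def expand_feed(feed_xml, fetch_full_text):
    # Replace each truncated description with text pulled from the item's link.
    root = ET.fromstring(feed_xml)
    for item in root.iter("item"):
        link = item.findtext("link")
        item.find("description").text = fetch_full_text(link)
    return ET.tostring(root, encoding="unicode")


def fake_fetch(url):
    # Stand-in for downloading the page and extracting the article body.
    return "Full article text from " + url


expanded = expand_feed(build_sample_feed(), fake_fetch)
print(expanded)
```
</pre>
<p>The hard part in practice is the step that <code>fake_fetch</code> glosses over, namely pulling just the article body out of an arbitrary Web page, which is likely why these services struggle with sites that use unusual formats.</p>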
<h4>Is Feed Expansion Legal?</h4>
<p>To help with this article, I reached out to both sites I mentioned above. I only heard back from FeedEx, where I got a response from Nikolay, who says his site respects robots.txt and doesn&#8217;t scrape content where robots are barred. </p>
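<p>For readers unfamiliar with robots.txt, here is a minimal Python sketch of what respecting it looks like, using the standard library&#8217;s <code>urllib.robotparser</code>. The bot name <code>ExampleExpanderBot</code> and the rules shown are hypothetical. I don&#8217;t know what user agent FeedEx actually sends, so treat this as an assumption-laden example rather than a recipe for blocking any particular service.</p>
<pre>
```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that bars one named bot from the whole site.
ROBOTS_TXT = """\
User-agent: ExampleExpanderBot
Disallow: /

User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_fetch(bot_name, url):
    # True when the parsed rules allow this bot to request the URL.
    return parser.can_fetch(bot_name, url)


print(may_fetch("ExampleExpanderBot", "http://example.com/2011/11/post/"))
print(may_fetch("SomeOtherBot", "http://example.com/2011/11/post/"))
```
</pre>
<p>A well-behaved crawler runs a check like this before every request. Nothing forces a bot to do so, which is why robots.txt only helps against services that choose to honor it.</p>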
<p>However, it&#8217;s unclear if such action is enough to make these services completely legal. The reason is that, while robots.txt works for search engines, search engines don&#8217;t redistribute the content and distribution is one of the rights that copyright protects. Furthermore, any implied license argument about this kind of use would be weak at best as the webmaster, by having a truncated feed, indicated pretty clearly that they don&#8217;t want their content distributed via that means.</p>
<p>While these services could mitigate this by truncating their outgoing feeds (<a href="http://www.plagiarismtoday.com/2010/01/28/google-reader-now-for-non-rss-sites/">as Google did with its service to produce feeds from any page</a>), that would defeat the purpose of the service.</p>
<p>In short, these services take the content from other websites, copy it and post it on another page (remember, an RSS feed is fundamentally a specially-formatted webpage). This is, quite frankly, the very definition of what copyright infringement online is. However, if these services are used by non-spammers, the rightsholder is unlikely to know or care.</p>
<p>However, this just deals with the copyright issues. As discussed previously, <a href="http://www.plagiarismtoday.com/2006/08/24/linkworthy-scraping-as-a-legal-minefield/">the scraping of a site brings about a variety of other issues</a> including trespass to chattels, Computer Fraud and Abuse Act violations and more. (Note: The original PDF linked to is offline and I&#8217;m working to find a replacement. In the meantime, <a href="http://www.plagiarismtoday.com/2011/08/17/five-years-later-why-rss-scraping-still-is-not-ok/">see this article as well</a>.)</p>
<p>All of this combines to paint a pretty bleak legal picture, yet these services soldier on.</p>
<h4>FeedEx Responds</h4>
<p>As mentioned above, I reached out to both sites before writing this article but only FeedEx responded (However, I will update this article should I hear back from FeedExpander). </p>
<p>When told that some webmasters are upset at his service, Nikolay responded that the complaints may not be as high as some would expect, saying that, &#8220;During all years of feedex.net presence, I have received just 2 complaints. And something around 100 of improvements requests.&#8221;</p>
<p>Nikolay also stated that, even though his site has a DMCA policy, he handled the requests with a simple email, blocking his bots from accessing those feeds. </p>
<p>When asked why he created the service, Nikolay said that he did it first for himself as he wanted a more mobile way to view websites and was tired of sites with partial feeds and of alternatives such as Readability, especially on his tablet and phone.</p>
<p>That being said, Nikolay did acknowledge that spammers have used his service but that he has no means of stopping them, &#8220;I know that some spammers using my service bad way. At the moment I have no automated methods to ban them all and I cannot do that manually. So, I ban only those feeds, for which I have received complaint.&#8221;</p>
<p>FeedExpander, <a href="http://www.feedexpander.com/faq.html#" rel="nofollow">in its FAQ</a>, says something similar, calling itself a &#8220;double edged sword&#8221; that is used both for legitimate purposes and for scraping.</p>
<p>Both sites claim that they designed their service for legitimate uses only and that the misuse of it is a side effect of the intended purpose.</p>
<p>Whether you feel that&#8217;s true or not, it&#8217;s clear that these services and ones like them are here to stay and the legal issues are, for the most part, purely hypothetical.</p>
<h4>Fighting Back</h4>
<p>As I mentioned in my article earlier this month, there are ways you can fight back. This includes linking to yourself regularly, including footers in your posts and breaking apart content. </p>
<p>However, my goal with this post is not to cause these services to get flooded with removal requests. I really don&#8217;t think it would do much good. While I believe they will block your feeds (if they are even using them), they are only two services, and most spammers that use this method still prefer to use their own technology rather than rely on a third party.</p>
<p>What&#8217;s important to note is that the use of truncated RSS feeds is an almost complete waste. The only way it will help protect your content is by stopping those too lazy to use a feed expander. Given that there are so many full feed RSS sites out there, that protection might be worth something, but a dedicated scraper can easily get your content if motivated.</p>
<p>That being said, there is another benefit: Time. If you truncate your feed and someone passes it through one of these services, there&#8217;s going to be a delay in when it appears on their site. According to Nikolay, that delay can be between 5 and 10 hours if your feed isn&#8217;t popular with the service. That, hopefully, will be plenty of time for Google to spot your site as the original and treat the scrapers as the spammers they are.</p>
<h4>Bottom Line</h4>
<p><a href="http://www.plagiarismtoday.com/2006/09/26/why-my-feeds-are-long/">As I said back in 2006</a>, I have always felt that partial RSS feeds were a bad trade. These services do little more than highlight how ineffective truncation is against spamming.</p>
<p>That being said, there are better ways to track and prevent reuse of your content. All you have to do is plan in advance and not believe truncated feeds to be a silver bullet against the problem.</p>
<p>This is why my goal isn&#8217;t so much to take these services to task but to merely highlight that they exist. The spammers already know about them and you should too. Their existence doesn&#8217;t change much on the front of content theft and spamming, but hopefully they will raise awareness of what has been around for many, many years.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismtoday.com/2011/11/30/feed-expanders-on-rise-end-short-rss-advantage/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Scraping: Not Just for RSS Feeds Anymore</title>
		<link>http://www.plagiarismtoday.com/2011/11/16/scraping-not-just-for-rss-feeds-anymore/</link>
		<comments>http://www.plagiarismtoday.com/2011/11/16/scraping-not-just-for-rss-feeds-anymore/#comments</comments>
		<pubDate>Wed, 16 Nov 2011 21:43:33 +0000</pubDate>
		<dc:creator>Jonathan Bailey</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[Content-Theft]]></category>
		<category><![CDATA[Copyright]]></category>
		<category><![CDATA[Copyright-Infringement]]></category>
		<category><![CDATA[Copyright-Law]]></category>
		<category><![CDATA[Plagiarism]]></category>
		<category><![CDATA[RSS scraping]]></category>
		<category><![CDATA[Scraping]]></category>
		<category><![CDATA[webmasters]]></category>

		<guid isPermaLink="false">http://www.plagiarismtoday.com/?p=11828</guid>
		<description><![CDATA[For spammers, scraping has usually required an RSS feed, but that is increasingly not the case as more spammers are now using the site itself.]]></description>
			<content:encoded><![CDATA[<p><img style=' float: left; padding: 4px; margin: 0 7px 2px 0;'  src="http://www.plagiarismtoday.com/wp-content/uploads/2011/11/pencil-copy-sample-300x129.jpg" alt="Pencil Copy Image" title="Pencil Copy Image" width="300" height="129" class="alignleft size-medium wp-image-11829" />Back in 2005, I wrote an article entitled &#8220;<a href="http://www.plagiarismtoday.com/2005/10/18/truncated-rss-feeds-a-temporary-solution/">Truncated Feeds: A Temporary Solution</a>&#8221;. It was about a trend, popular at the time, of truncating or shortening RSS feeds to discourage scraping. </p>
<p>The reason for feed truncation is simple: most RSS scraping takes place through the RSS feed, so truncating it, or showing only the first few paragraphs of each post, meant scrapers couldn&#8217;t grab the whole post and could only do limited damage to your site and its content.</p>
<p>However, as I pointed out in my article, the technology was already available (and already a decade or more old) to scrape content out of the Web page itself. This meant that truncating an RSS feed, while useful against most scrapers (at the risk of angering readers), was a temporary solution until spammers started using more advanced methods.</p>
<p>Now, it appears I might have been ahead of the curve. At least two clients and several others I&#8217;ve talked with have reported that, despite either having no RSS feed or only a truncated one, their site&#8217;s full content is being scraped. The problem seems to be growing and it seems likely that it will get a lot worse before it gets any better.<span id="more-11828"></span></p>
<h4>Why RSS Scraping Was (Is) King</h4>
<p>RSS Scraping became popular because of how simple it was. RSS feeds, unlike regular HTML, have a predictable format and structure that makes it easy to extract the content from them. It&#8217;s how RSS readers work as well as RSS scrapers.</p>
<p>RSS feeds are plentiful and their ease of use also opens up the doors to other kinds of manipulation, such as changing out words, inserting links and clipping off unwanted portions. </p>
<p>However, RSS feeds also have some serious problems for spammers. First, webmasters, after they learn about the scraping, often truncate the feed or insert warnings into it to make it useless to the spammer. Second, webmasters often alert Google and other search engines via RSS when a new post goes live, making it so that the spammer has a difficult time getting to the search engines first.</p>
<p>Combine that with the fact that recent Google updates, <a href="http://www.seomoz.org/blog/googles-farmer-update-analysis-of-winners-vs-losers">including Panda and Farmer</a>, have been pounding sites seen as content farms or spam, and it&#8217;s easy to see why many spammers have been seeking alternate ways to get their content in recent months.</p>
<h4>Non-RSS Scraping Comes to Town</h4>
<p>Non-RSS scraping came to the forefront in January of last year when Google announced that <a href="http://googlereader.blogspot.com/2010/01/follow-changes-to-any-website.html">Google Reader could track changes on any site</a>. <a href="http://www.plagiarismtoday.com/2010/01/28/google-reader-now-for-non-rss-sites/">Though the feature wasn&#8217;t evil in and of itself</a>, only displaying truncated content, it brought the issue to the forefront and proved that the technology was practical, scalable and functional.</p>
<p><a href="http://googlereader.blogspot.com/2010/09/turning-off-track-changes-feature.html">Even though it was shut off just nine months later</a>, the proof of concept was still there and it seems at least some spammers took notice.</p>
<p>However, Google wasn&#8217;t the first nor was it the last; it was merely the biggest. <a href="http://pipes.yahoo.com/pipes/">Yahoo! Pipes</a> and <a href="http://page2rss.com/">Page2RSS</a> have done the same thing for years (Note: Once again, I&#8217;m not saying these are the services the spammers are using; they are mere proofs of concept) and there are countless downloadable applications that can run on a server. The latter is, most likely, the approach being taken by spammers.</p>
<p>Obviously though, the tech is there and has been for some time, but now it seems to have the attention of at least some spammers and that, in turn, is going to change the game for content creators sooner rather than later.</p>
<h4>How Webmasters Can Fight Back</h4>
<p>These tools work because A) it&#8217;s trivial to detect changes in a website and B) even though sites are different from one another, the pages within a site tend to remain fairly consistent.</p>
<p>This is a big part of how sites are operated today as most are template-driven. To make matters worse, with CMSes like WordPress, Drupal, Joomla, etc., there&#8217;s often a lot of consistency between sites that use the same platform, making it even easier.</p>
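<p>To give a sense of how trivial the first point is, here is a minimal Python sketch of change detection by comparing hashes of two snapshots of a page. It is one simple approach among many and not necessarily what any particular scraper or service uses. The <code>page_digest</code> and <code>has_changed</code> names and the sample page text are made up for the example.</p>
<pre>
```python
import hashlib


def page_digest(html):
    # Fingerprint the page body so two snapshots can be compared cheaply.
    return hashlib.sha256(html.encode("utf-8")).hexdigest()


def has_changed(previous_digest, current_html):
    # A differing digest means the page content changed since the last visit.
    return page_digest(current_html) != previous_digest


old_page = "Old post list"
new_page = "Old post list plus a brand new post"
saved = page_digest(old_page)
print(has_changed(saved, old_page))
print(has_changed(saved, new_page))
```
</pre>
<p>A scraper only needs to store one short digest per page and re-check it on a schedule. Once a change is spotted, the consistent template does the rest of the work, since the new content sits in the same place on every page.</p>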
<p>This makes fighting back against this kind of scraping much more difficult. However, the usual tips for fighting against scraping remain relevant, just no longer solely for the RSS feed:</p>
<ol>
<li><strong>Link to Your Content Regularly:</strong> Try to include one or two links in each of your posts that reference a page or a post on your site. This helps tell the search engines which site is the authentic one and pass along the credibility accordingly.</li>
<li><strong>Include a Footer:</strong> Including a footer in the content area of your site may cause it to get picked up along with your text, especially with highly automated scrapers. <a href="http://www.plagiarismtoday.com/2006/10/04/digital-fingerprints-to-detect-rss-scraping/">You can also include a digital fingerprint</a> in the content area, though when searching for it later you&#8217;ll have to omit your site.</li>
<li><strong>Breaking Apart Content:</strong> Breaking content apart across multiple pages is a controversial strategy (one that your readers will likely not care for) but it can also hinder this style of scraping similar to how truncated RSS feeds work.</li>
</ol>
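<p>As a rough illustration of the footer and fingerprint idea from the list above, here is a minimal Python sketch that derives a unique token for a site and appends it, along with an attribution line, to a post. The function names, the footer wording and the way the token is generated are all assumptions for the example. Any string that is unlikely to appear anywhere else on the Web would serve the same purpose.</p>
<pre>
```python
import hashlib


def make_fingerprint(site_name, secret):
    # Derive a short, unlikely-to-collide token that is unique to this site.
    digest = hashlib.sha256((site_name + ":" + secret).encode("utf-8"))
    return "fp" + digest.hexdigest()[:16]


def add_footer(post_text, site_url, fingerprint):
    # Append an attribution line plus the fingerprint to the post body.
    footer = "Originally published at %s [%s]" % (site_url, fingerprint)
    return post_text + "\n\n" + footer


token = make_fingerprint("example-blog", "change-this-secret")
post = add_footer("Body of the post.", "http://example.com", token)
print(post)
```
</pre>
<p>Searching for the token later, while excluding your own site from the results, should surface copies that were scraped along with the footer.</p>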
<p>All in all, this type of scraping is going to be much more difficult to combat as there is no feed you can simply disable, alter or truncate if things get out of hand. It&#8217;s a problem that will have to be met head on and one webmasters will have to be vigilant about.</p>
<h4>Bottom Line</h4>
<p>The good news in all of this is that this type of scraping is still not very common, at least not that we know of. However, if you have a full RSS feed, it may be difficult to know where spammers are getting the content from.</p>
<p>Still, this type of scraping does appear to be on the rise. The only question is if it will be a long-term trend or a short-term fad. Right now, it&#8217;s looking more likely it will be the former. </p>
<p>This is why webmasters need to be aware of this problem so they can be vigilant against it and, when needed, deal with the problem.</p>
<p>In the end though, the only real question I have is why did it take so long for spammers to pick this up? Usually on the cutting edge, they seem to be at least five years behind the curve on this one, if not more.</p>
<p>Maybe someone can provide an explanation for that one below&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismtoday.com/2011/11/16/scraping-not-just-for-rss-feeds-anymore/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>5 Ways Technology Is Changing RSS</title>
		<link>http://www.plagiarismtoday.com/2011/10/18/5-ways-technology-is-changing-rss/</link>
		<comments>http://www.plagiarismtoday.com/2011/10/18/5-ways-technology-is-changing-rss/#comments</comments>
		<pubDate>Tue, 18 Oct 2011 18:00:00 +0000</pubDate>
		<dc:creator>Jonathan Bailey</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[Content-Theft]]></category>
		<category><![CDATA[Copyright]]></category>
		<category><![CDATA[Copyright-Infringement]]></category>
		<category><![CDATA[copyrightlaw]]></category>
		<category><![CDATA[facebook]]></category>
		<category><![CDATA[Plagiarism]]></category>
		<category><![CDATA[RSS]]></category>
		<category><![CDATA[RSS scraping]]></category>
		<category><![CDATA[Scraping]]></category>
		<category><![CDATA[Social-Networking]]></category>
		<category><![CDATA[twitter]]></category>

		<guid isPermaLink="false">http://www.plagiarismtoday.com/?p=11283</guid>
		<description><![CDATA[RSS is dying, Long Live RSS! RSS is changing and, with it, how content creators use it must shift too. What does the future of RSS look like?]]></description>
			<content:encoded><![CDATA[<p><img style=' float: left; padding: 4px; margin: 0 7px 2px 0;'  src="http://www.plagiarismtoday.com/wp-content/uploads/2011/10/feed-icons-logo-300x61.jpg" alt="Feed Icons Logo" title="Feed Icons Logo" width="300" height="61" class="alignleft size-medium wp-image-11520" /><a href="http://www.plagiarismtoday.com/2005/08/02/were-live-baby/">When I started Plagiarism Today in 2005</a>, RSS was a fairly mundane technology that was growing rapidly in popularity. The most common use of it was RSS subscription services, such as Google Reader and Bloglines. It was, basically, a way for people to get your latest content in a place that was convenient for them and to ensure they got your updates regularly.</p>
<p>However, times have definitely changed. Last year I wrote about how <a href="http://www.plagiarismtoday.com/2010/09/13/the-changing-face-of-rss/">the role of RSS was changing</a>. By most accounts, the use of feed readers peaked in 2008 at about 11% and has been declining since. The broader public found feed readers too complicated and not useful enough for regular consumption.</p>
<p>But at the same time, RSS usage has grown in very big ways. Currently millions of people are reading RSS feeds without realizing they&#8217;re doing so. Countless Twitter accounts and Facebook Pages are being fed via RSS and are serving them much like a feed reader was supposed to, sending people near-instant updates and letting them read all of their content in one place.</p>
<p>This shift is changing what RSS is and means, turning it away from being a means to read a site and into the engine that enables sharing and content discovery.</p>
<p>This, in turn, is impacting how webmasters and bloggers use and interact with RSS and is also shifting the ways in which content creators protect their works and how users interact with them. </p>
<p>Here are just five examples of how that is happening right now.<span id="more-11283"></span></p>
<h4>1. Fewer, If Any, RSS Buttons</h4>
<p>If you go to <a href="http://techcrunch.com/">TechCrunch</a>, you won&#8217;t find a single RSS button on their home page. Since their recent redesign, the RSS link has been moved to the footer, three little letters at the bottom of their site.</p>
<p>Meanwhile, their Facebook &#8220;Like&#8221; box is prominently displayed in their sidebar and Twitter sharing buttons line the entire site. Webmasters have been steadily downplaying RSS subscription in favor of social networking. </p>
<p>RSS just doesn&#8217;t have the &#8220;cool&#8221; factor any more and it&#8217;s been moved to a behind-the-scenes player in content distribution. This is why many webmasters, myself included, have been slowly scaling back RSS subscription efforts in lieu of other, more popular alternatives.</p>
<h4>2. Better RSS Control</h4>
<p>RSS by its nature has historically been completely open. Anyone could be accessing it. A visitor to an RSS feed could be a single user looking at it in Outlook or it could be Google Reader preparing to send it to hundreds of subscribers. This opened the door for scrapers and others who wanted to misuse the content in the feed as everyone had to be let in.</p>
<p>However, the number of distribution channels is dropping. This makes it possible to limit who has access to the feed and only let in permitted clients. <a href="http://www.plagiarismtoday.com/2007/07/02/using-htaccess-to-stop-content-theft/">Though you&#8217;ve always been able to block scrapers</a>, this would change the system from one where everyone has access until they&#8217;re booted to one where only the permitted users are let in at all.</p>
<p>This could stop scrapers before they start, or at least force them to pull from other channels to get the content.</p>
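<p>To make the &#8220;only permitted clients&#8221; idea concrete, here is a minimal Python sketch of an allowlist check. The list of user-agent fragments and the function names are hypothetical and chosen only for illustration. Keep in mind that user agents are easy to fake, so a real setup would likely combine this with other signals such as IP ranges or access keys.</p>

```python
# Hypothetical allowlist of user-agent fragments for clients permitted to fetch the feed.
ALLOWED_AGENT_FRAGMENTS = ("feedfetcher-google", "mailchimp", "rss graffiti")


def is_permitted(user_agent: str) -> bool:
    """Return True only if the user agent matches an entry on the allowlist."""
    agent = user_agent.lower()
    return any(fragment in agent for fragment in ALLOWED_AGENT_FRAGMENTS)


def feed_status(user_agent: str) -> int:
    """Return the HTTP status to send: 200 for permitted clients, 403 for everyone else."""
    return 200 if is_permitted(user_agent) else 403
```

<p>The design choice here is that unknown clients are denied by default, which is the reverse of blocking scrapers one at a time after they have been spotted.</p>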
<h4>3. Greater Tolerance of Truncated Feeds</h4>
<p>Five years ago, having a truncated feed was a sure-fire way to turn away potential subscribers. The issue was such a hot-button topic that <a href="http://www.plagiarismtoday.com/2006/10/03/petition-against-partial-feeds/">a petition was circulated around against partial feeds</a> and it gained a bit of traction. </p>
<p>However, with the new subscription channels, people are more used to getting a preview and clicking through. They are more about content discovery than content consumption, making partial feeds roughly as useful as full ones.</p>
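<p>For those curious what truncating a feed amounts to in practice, here is a minimal Python sketch that cuts a post down to a preview. The function name and the 55-word default are assumptions for illustration; most blogging platforms offer this as a built-in setting rather than something you would write by hand.</p>

```python
def truncate_for_feed(text: str, max_words: int = 55) -> str:
    """Return the first max_words words of text, plus a marker if anything was cut."""
    words = text.split()
    if len(words) > max_words:
        return " ".join(words[:max_words]) + " [...]"
    return " ".join(words)
```

<p>Note that this simple version collapses runs of whitespace and ignores any HTML in the post, so it is only suitable for plain-text previews.</p>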
<h4>4. Loss of Platform Control</h4>
<p>While the ability to control access and the greater openness to partial feeds give webmasters more control, these changes also come with drawbacks.</p>
<p>Previously, if a single RSS reader or site using your content did something you didn&#8217;t like you could always block them, file a takedown notice or take other action. However, if Facebook decides to display RSS feeds in an inappropriate or controversial way, there&#8217;s not much one can do as that is a large percentage of the audience.</p>
<p>The good news is that Facebook and Twitter both don&#8217;t integrate RSS directly and, instead, use third party apps to do it. However, that&#8217;s no guarantee in and of itself as decisions by these two can impact and even cut off how RSS flows through their systems.</p>
<p>In short, even though you can always switch apps, Facebook and Twitter are still very much in control. </p>
<h4>5. Losing Sight of What RSS Even Is</h4>
<p>With RSS disappearing from sites and fewer bloggers even using them, it seems likely that even fewer people will be aware of RSS in just a few years&#8217; time. Even those who know of it and use it somewhat now will, with time, probably forget about it as both the name RSS as well as the famous icons will be all-but-meaningless to end users.</p>
<p>This also means that fewer webmasters will be thinking about it and fewer will be weighing the issues and decisions that come with having an RSS feed on your site.</p>
<p>This may, in turn, open the doors for others with less-than-pure intentions to exploit the naivete of webmasters, who are unaware of how they are gaining access to their site&#8217;s content. </p>
<h4>Bottom Line</h4>
<p>All in all, the changing role of RSS is a mixed bag for webmasters and content creators. While it will make it easier to block and reduce the impact of traditional scrapers, the loss of control over the platform and lack of front-of-mind understanding of what RSS is and how it works still opens up some serious vulnerabilities.</p>
<p>However, this is a transition that is happening slowly and will continue to do so for some time. Most likely we still have several more transition years before we truly reach the point with RSS where it is meaningless to users. </p>
<p>That being said, with so many major blogs eschewing or downplaying RSS, it may be that the transition is happening much faster than once thought possible. It may simply be that the simplicity and large presence of Facebook, Twitter and other social networks are just overpowering to the traditional RSS model and we may be mourning RSS&#8217; demise as a destination sooner rather than later. </p>
<p>Either way though, RSS will live on, behind the scenes, driving social media and marketing for content creators of all stripes. That much is definitely certain. </p>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismtoday.com/2011/10/18/5-ways-technology-is-changing-rss/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>3 Count: Goooooal!</title>
		<link>http://www.plagiarismtoday.com/2011/10/04/3-count-goooooal/</link>
		<comments>http://www.plagiarismtoday.com/2011/10/04/3-count-goooooal/#comments</comments>
		<pubDate>Tue, 04 Oct 2011 16:53:08 +0000</pubDate>
		<dc:creator>Jonathan Bailey</dc:creator>
				<category><![CDATA[Copyright News]]></category>
		<category><![CDATA[acta]]></category>
		<category><![CDATA[canada]]></category>
		<category><![CDATA[Content-Theft]]></category>
		<category><![CDATA[Copyright]]></category>
		<category><![CDATA[Copyright-Infringement]]></category>
		<category><![CDATA[Copyright-Law]]></category>
		<category><![CDATA[EU]]></category>
		<category><![CDATA[football]]></category>
		<category><![CDATA[greens]]></category>
		<category><![CDATA[Plagiarism]]></category>
		<category><![CDATA[real estate]]></category>
		<category><![CDATA[Scraping]]></category>
		<category><![CDATA[soccer]]></category>

		<guid isPermaLink="false">http://www.plagiarismtoday.com/?p=11309</guid>
		<description><![CDATA[EU Soccer matches may get cheaper to watch, Greens reject ACTA in the EU and Century 21 wins a scraping verdict in Canada.]]></description>
			<content:encoded><![CDATA[<p><img style=' float: left; padding: 4px; margin: 0 7px 2px 0;'  src="http://www.plagiarismtoday.com/wp-content/uploads/2010/07/3count004-trim.png" alt="" title="3count004-trim" class="alignleft size-full wp-image-7303" height="162" width="175"></p>
<p><em>Have any suggestions for the 3 Count? Let me know via Twitter <a href="http://twitter.com/plagiarismtoday">@plagiarismtoday</a>.</em></p>
<h4>1: <a href="http://www.cbsnews.com/stories/2011/10/04/ap/business/main20115240.shtml">EU Backs Fans Watching Soccer on Cheap Decoders</a></h4>
<p>First off today, a ruling by the EU&#8217;s highest court may change the way fans all over the continent view soccer matches. The case centers around a British pub owner, Karen Murphy, who was sued by England&#8217;s Premier League after she used a cheap Greek decoder to view league matches in her pub. The decoder cost a fraction of what it would have cost to subscribe to Sky TV, the broadcaster with the UK rights. That prompted the suit from the league. Though a lower court ruled against Murphy, the higher court said that a local law barring the importation and sale of such decoders could not be justified and that the matches themselves do not qualify for copyright protection, though pre-game material, overlays and other elements might. The decision now goes back to the lower court, which has to take the advisement and apply it.</p>
<h4>2: <a href="http://www.pcworld.com/businesscenter/article/241061/legal_expert_says_anticounterfeit_deal_should_be_scrapped.html">Legal Expert Says Anti-counterfeit Deal Should Be Scrapped</a></h4>
<p>Next up today, Douwe Korff, a professor of international law at London Metropolitan University, was recently hired by the Greens in the EU Parliament to study the human rights implications of the Anti-Counterfeiting Trade Agreement (ACTA). He concluded that the agreement may negatively impact free speech and personal expression by not allowing trivial or otherwise beneficial infringements. However, the EU Parliament has asked its own legal committee, the JURI committee, to look into the legality of ACTA as well.</p>
<h4>3: <a href="http://business.financialpost.com/2011/09/12/century-21-canada-wins-lawsuit-against-rogers-subsidiary-zoocasa/">Century 21 Canada wins lawsuit against Rogers subsidiary Zoocasa</a></h4>
<p>Finally today, in a story that I missed when it first broke, Century 21 Canada has won its lawsuit against Zoocasa, a real estate searching site that Century 21 accused of scraping content from their Web presence. According to the ruling, the scraping was a violation of both copyright law and Century 21&#8217;s terms of service. The court ordered Zoocasa to stop misusing the site and to pay $1,000 in damages.</p>
<h4>Suggestions</h4>
<p>That&#8217;s it for the three count today. We will be back tomorrow with three more copyright links. If you have a link that you want to suggest for the column or have any proposals to make it better, feel free to leave a comment or send me an email. I hope to hear from you. </p>
<h4>Want the Full Story?</h4>
<p>Tune in <a href="http://www.plagairsimtoday.com/podcast">every Wednesday evening at 5 PM ET for the live recording of the Copyright 2.0 Show</a> or wait and get the edited version <a href="http://www.plagiarismtoday.com/category/podcast/">Friday right here on Plagiarism Today</a>. </p>
<p><em>The 3 Count Logo was created by <a rel="nofollow" href="http://www.cloudjunkies.com/">Justin Goff</a> and is licensed under a <a rel="nofollow" href="http://creativecommons.org/licenses/by/3.0/">Creative Commons Attribution License</a>. </em></p>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismtoday.com/2011/10/04/3-count-goooooal/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Five Years Later: Why RSS Scraping Still is Not OK</title>
		<link>http://www.plagiarismtoday.com/2011/08/17/five-years-later-why-rss-scraping-still-is-not-ok/</link>
		<comments>http://www.plagiarismtoday.com/2011/08/17/five-years-later-why-rss-scraping-still-is-not-ok/#comments</comments>
		<pubDate>Wed, 17 Aug 2011 16:27:57 +0000</pubDate>
		<dc:creator>Jonathan Bailey</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[Content-Theft]]></category>
		<category><![CDATA[Copyright]]></category>
		<category><![CDATA[Copyright-Infringement]]></category>
		<category><![CDATA[Copyright-Law]]></category>
		<category><![CDATA[feeds]]></category>
		<category><![CDATA[Implied-License]]></category>
		<category><![CDATA[Plagiarism]]></category>
		<category><![CDATA[RSS]]></category>
		<category><![CDATA[RSS scraping]]></category>
		<category><![CDATA[Scraping]]></category>

		<guid isPermaLink="false">http://www.plagiarismtoday.com/?p=10723</guid>
		<description><![CDATA[Five years after first writing about RSS scraping, the legal realities of scraping haven't changed but the scrapers definitely have.]]></description>
			<content:encoded><![CDATA[<p><img style=' float: left; padding: 4px; margin: 0 7px 2px 0;'  src="http://www.plagiarismtoday.com/wp-content/uploads/2011/08/icon_rss-250x250.png" alt="Sample RSS Icon" title="RSS Icon" width="250" height="250" class="alignleft size-medium wp-image-10732" />Five years ago I penned an article entitled &#8220;<a href="http://www.plagiarismtoday.com/2006/08/29/why-rss-scraping-isnt-ok/">Why RSS Scraping Isn&#8217;t OK</a>&#8221;. The goal of the article was to take a look at the arguments scrapers used, legal and ethical, and explain why the realities of the law were not on their side. </p>
<p>Basically, at that time, RSS scrapers were arguing that, by putting content into an RSS feed, one was giving permission to use it on other sites, essentially creating an implied license to republish it. However, as I talked about in the previous article, the legal realities are much different and RSS scraping without permission is, almost certainly, a copyright infringement.</p>
<p>However, while the legal realities haven&#8217;t changed much in the past five years, the people doing the scraping have. Spammers and sploggers, now wary of duplicate content issues, have largely abandoned RSS scraping in favor of other techniques. Today, the scrapers are fewer but place themselves as editors, curators and collectors, people building moderated lists of great content.</p>
<p>This shift hasn&#8217;t done much to alter the legal realities of scraping nor has it done much to placate creators who still see this as one of the most common issues they face.</p>
<p>The truth is that, even with this new veneer, RSS scraping is still not legally or ethically acceptable. Whether it&#8217;s curators or spammers, those who scrape from RSS feeds are in a dubious position and one that seems to be getting worse every day.<span id="more-10723"></span></p>
<h4>The Past Five Years Of Law and Scraping</h4>
<p>The past five years of legal history have been strangely quiet on the issue of RSS scraping. Despite how common the behavior is, very few suits have dealt with the issue.</p>
<p>The best known of those cases was <a href="http://www.boston.com/business/ticker/2009/01/nyt_gatehouse_r.html">Gatehouse Media vs. The New York Times</a>, which saw Gatehouse Media, the owners of &#8220;Wicked Local&#8221; brand sites as well as hundreds of smaller papers, sue the New York Times for the Times&#8217; scraping of their RSS feeds for inclusion on Boston.com&#8217;s &#8220;Your Town&#8221; section. </p>
<p>The suit only centered around the headlines and excerpts from the stories involved but the Times felt their position was weak enough to warrant settling the matter publicly and quickly. In the end, the New York Times agreed to stop scraping Gatehouse feeds and respect restrictions placed by Gatehouse Media on the content.</p>
<p>Related cases on the issue of data scraping, sometimes called data mining, have largely been equally negative for the scrapers. Though only at the summary judgment phase at last report, the <a href="http://blog.ericgoldman.org/archives/2010/04/court_denies_su_1.htm">Snap-on Business Solutions Inc. v. O&#8217;Neil &#038; Assocs., Inc.</a> case highlights the other legal perils of scraping.</p>
<p>In that case, Snap-on produced and maintained a database of auto parts for Mitsubishi. After two years, Mitsubishi began to look at other vendors for the contract but Snap-on would not give up control over the data. Mitsubishi eventually hired an outside contractor, O&#8217;Neil, to scrape the content out of the database and bring it into a new system. When Snap-on learned of the scraping, they filed suit.</p>
<p>In the summary judgment phase of the case, the judge ruled that Snap-on likely had arguments regarding the Computer Fraud and Abuse Act (CFAA), Trespass to Chattels and Breach of Contract. The court rejected a copyright infringement argument, but only because the content copied did not qualify for copyright protection, unlike with RSS feeds.</p>
<p>The case shows, as I pointed out years ago, that <a href="http://www.plagiarismtoday.com/2006/08/24/linkworthy-scraping-as-a-legal-minefield/">scraping is a legal minefield</a>. Even cases that seem to go the way of the scraper, such as the <a href="http://blog.ericgoldman.org/archives/2010/09/antiscraping_la.htm">Cvent, Inc. v. Eventbrite, Inc. case</a>, are highly fact-specific and seem to hinge more on poor case preparation than the law itself. (Note: Even in that &#8220;victory&#8221; the copyright claims and the unjust enrichment claims survived dismissal.)</p>
<p>Instead, most seem to follow the route of the <a href="http://blog.ericgoldman.org/archives/2007/10/ticketmaster_wi.htm">Ticketmaster L.L.C. v. RMG Technologies, Inc.</a> case, a 2007 win for Ticketmaster against a sniping service that was snatching up popular tickets using an automated process. In that case, the court ruled RMG was infringing copyright by merely browsing the relevant pages since they were doing so in violation of Ticketmaster&#8217;s &#8220;browserwrap&#8221; license.</p>
<p>In short, the legal realities for scrapers are even more bleak than they were five years ago. The implied license argument that&#8217;s so popular among scrapers has been eroded and, all in all, it&#8217;s almost impossible to scrape legally, RSS or otherwise. </p>
<p>Yet, what&#8217;s changed in the last five years isn&#8217;t so much the law, but the scrapers themselves and that&#8217;s where things have truly gotten interesting.</p>
<h4>The Death of the Spammer Scraper</h4>
<p>Back in 2006, your &#8220;typical&#8221; RSS scraper was probably a spammer, someone seeking a quick, hands off way of filling a large number of sites with search engine friendly content to rise in the rankings and, eventually, usurp the original work for certain keywords.</p>
<p>Those days, however, are gone. Though scraping spammers still exist, most spammers moved on from this method as Google and the other search engines improved their duplicate content detection, making it a less effective technique. Methods such as content spinning, content generation and even cheap outsourcing have proved to be more effective and equally reliable.</p>
<p>This decline has largely mirrored the <a href="http://www.readwriteweb.com/archives/rss_reader_market_in_disarray.php">overall decline in traditional (reader-based) RSS usage</a>. RSS is falling out of vogue, at least as a tool for scraping and reading, but not as a tool for &#8220;curating&#8221;. </p>
<p>The reason is that tools for integrating RSS into existing websites have grown much more common and easier to use in the past five years. Though some were developed for the use of spamming, other tools were meant to allow authors to integrate all of their sites in one place. However, some authors have latched onto these tools as a way of bringing in the work of others without permission.</p>
<p>This has created a situation where the people doing the scraping are fewer in number, but likely much more dangerous. Where search engines were relatively effective at filtering out spammers, these sites tend to appear to be much more legitimate, increasing the likelihood they could be mistaken as originals.</p>
<p>Fortunately, the law doesn&#8217;t make a great distinction between spammers and those who scrape with less nefarious intentions, but many who engage in this practice have, according to emails I&#8217;ve seen, claimed to have an ethical or even legal right to engage in the scraping, calling themselves &#8220;editors&#8221;.</p>
<p>This has set the stage for some ugly battles that, while they haven&#8217;t reached the courtroom yet, have certainly been heated on the Web.</p>
<p>Indeed, this argument seems to be one that&#8217;s moving out of the courtroom and into the court of public opinion, a place where it&#8217;s likely to stay given how straightforward the legal issues seem.</p>
<h4>Bottom Line</h4>
<p>In the end, considering that the New York Times Company, one of the most powerful media institutions on the planet, couldn&#8217;t or didn&#8217;t want to defend scraping of just headlines and summaries, there&#8217;s little hope for a successful defense of full RSS scraping. This is especially true in the light of other, related scraping cases.</p>
<p>However, those who want to scrape and those who are willing to allow their feeds to be scraped do have options. Creative Commons, for example, <a href="http://wiki.creativecommons.org/Syndication">has modules for RSS feeds</a> that enable applications to detect what they are allowed to do with a feed. </p>
<p>To those who don&#8217;t wish to allow it, I encourage you to put in your feed itself a notice stating that you do not wish to allow republishing and that the feed is for private personal use only. Though it shouldn&#8217;t be necessary under the law, it&#8217;s a wise move that blocks many of the potential arguments a scraper might raise. Furthermore, such footers can greatly help with the detection of scrapers.</p>
<p>All in all, RSS scraping has definitely changed in terms of who is using it and why, but the threat isn&#8217;t all that different and the legal realities have hardly changed at all. </p>
<p>This means that RSS scraping can still be fought easily, though the people you&#8217;re moving against may be a bit more vocal in their views. </p>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismtoday.com/2011/08/17/five-years-later-why-rss-scraping-still-is-not-ok/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>FAQs: The Basics of RSS Scraping</title>
		<link>http://www.plagiarismtoday.com/2011/05/09/faqs-the-basics-of-rss-scraping/</link>
		<comments>http://www.plagiarismtoday.com/2011/05/09/faqs-the-basics-of-rss-scraping/#comments</comments>
		<pubDate>Mon, 09 May 2011 18:21:39 +0000</pubDate>
		<dc:creator>Jonathan Bailey</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[Content-Theft]]></category>
		<category><![CDATA[Copyright]]></category>
		<category><![CDATA[Copyright-Infringement]]></category>
		<category><![CDATA[Copyright-Law]]></category>
		<category><![CDATA[Plagiarism]]></category>
		<category><![CDATA[RSS]]></category>
		<category><![CDATA[RSS scraping]]></category>
		<category><![CDATA[Scraping]]></category>
		<category><![CDATA[Spam-Blogging]]></category>
		<category><![CDATA[Spamming]]></category>

		<guid isPermaLink="false">http://www.plagiarismtoday.com/?p=9659</guid>
		<description><![CDATA[RSS Scraping is a problem nearly every webmaster is going to have to face at some point. Here are the basics on what it is and what to do about it.]]></description>
			<content:encoded><![CDATA[<p><img style=' float: left; padding: 4px; margin: 0 7px 2px 0;'  src="http://www.plagiarismtoday.com/wp-content/uploads/2011/05/rss-big-icon1-250x250.png" alt="" title="rss-big-icon" width="250" height="250" class="alignleft size-medium wp-image-9664" />RSS scraping is one of the most common and most frustrating types of content theft bloggers, forum admins and other site owners will face as they grow their presence online. Not only does it, often, allow the scraper to grab all of the content from the original site easily, but it also is a tactic used by spammers, who not only are able to exploit the content for search engine gains, but are also among the most despised infringers online.</p>
<p>As such, it&#8217;s important for all webmasters and content creators to be aware of what RSS scraping is, how it works and where it&#8217;s going in the future. Even though <a href="http://www.staynalive.com/2011/05/twitter-and-facebook-both-quietly-kill.html">RSS as a protocol may be on the ropes</a>, RSS scraping is not a problem that&#8217;s going away and, in fact, may be getting a lot worse in the coming years.</p>
<p>With that in mind, here is a quick FAQ on some of the more common questions asked about RSS scraping and what can be done about it.<span id="more-9659"></span></p>
<h4>What is RSS?</h4>
<p>RSS, sometimes referred to as Really Simple Syndication or <a href="http://www.whatisrss.com/">Rich Site Summary</a>, is a protocol that makes it easy for other sites and tools to access the content in your site by formatting your content in a consistent, easy-to-parse way.</p>
<p>Contrary to an HTML document, which could have the content be anywhere on the page, RSS indicates clearly what is the headline, body and other elements of the content. This makes it easy to grab the content and display it elsewhere without the surrounding formatting and HTML code.</p>
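<p>To show how little effort this takes, here is a minimal Python sketch using only the standard library. It builds a tiny, made-up RSS 2.0 document and then pulls the headline, link and body out of each item. The sample data and function names are assumptions for illustration, and real feeds often carry extra elements (such as full content in a separate tag) that this sketch ignores.</p>

```python
import xml.etree.ElementTree as ET


def build_sample_feed() -> str:
    """Build a tiny RSS 2.0 document (made-up data) and return it as a string."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "Example Blog"
    item = ET.SubElement(channel, "item")
    ET.SubElement(item, "title").text = "Hello World"
    ET.SubElement(item, "link").text = "http://example.com/hello-world/"
    ET.SubElement(item, "description").text = "The body of the post."
    return ET.tostring(rss, encoding="unicode")


def read_items(feed_xml: str) -> list:
    """Return one dict per item with its headline, link and body."""
    root = ET.fromstring(feed_xml)
    return [
        {
            "title": item.findtext("title"),
            "link": item.findtext("link"),
            "body": item.findtext("description"),
        }
        for item in root.iter("item")
    ]
```

<p>The same few lines serve a feed reader and a scraper equally well, which is exactly why the format is so convenient for both.</p>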
<h4>How is RSS Normally Used?</h4>
<p>Traditionally, RSS has been used to enable readers to subscribe to a site using various RSS readers such as <a href="http://www.google.com/reader">Google Reader</a>, <a href="http://www.feeddemon.com/">Feed Demon</a> and even many mail clients. </p>
<p>However, RSS has also been used to power other services, such as <a href="http://www.mailchimp.com/features/rss-to-email/">email newsletters</a> and even <a href="http://www.facebook.com/RSS.Graffiti">Facebook integration</a>.</p>
<h4>What is RSS Scraping?</h4>
<p>RSS scraping is when a third party, usually a spammer, grabs the content in an RSS feed and republishes it wholesale on another site. </p>
<p>In this regard, RSS scrapers work a great deal like Google Reader, grabbing your site&#8217;s content and displaying it on a site but, where Google Reader places the content behind a password protected wall that can only be accessed by the subscriber (or those who are shared the individual story), scrapers instead place the content on a public site for anyone to view, including search engines.</p>
<h4>Why do People Scrape RSS Feeds?</h4>
<p>Spammers seek high rankings in search engines so they can get traffic to display their ads against or sell products with. To do this, they need content but creating content by hand is time-consuming and difficult, especially when much of it is going to make no difference in the search engines.</p>
<p>RSS scraping is an easy way for spammers, and other sites, to quickly fill their pages with content, even if the content comes solely from other sites.</p>
<h4>How Can RSS Scraping Hurt Me?</h4>
<p>In most cases, RSS scraping doesn&#8217;t hurt. Google and other search engines have become savvy enough about spam that most of the time, they don&#8217;t give much credence to spam sites, keeping them from getting a lot of traffic or harming you in the rankings. </p>
<p>However, the system is far from perfect and there are many times spammers outrank the sites they scrape from for relevant terms. This is especially true with new sites or those that don&#8217;t have a strong search engine presence.</p>
<p>Less likely is that others may confuse the spam site as either being the original site or as being one endorsed by you, thus actively taking traffic from you. Few people, however, make this mistake with spam sites as the distinction is usually very clear.</p>
<p>All in all, the risk from an individual case of RSS scraping is actually fairly low, but the problem is that there is rarely just one or two such scrapers working at any given time.</p>
<h4>What Can I Do About RSS Scraping?</h4>
<p>Dealing with RSS scraping starts with good SEO practices. If you link between your posts, get good inbound mentions and earn social networking shares, odds are that RSS scraping won&#8217;t greatly impact you.</p>
<p>If it does, you can always seek to have the content removed by either <a href="http://www.plagiarismtoday.com/stopping-internet-plagiarism/4-contacting-the-host/">filing a DMCA notice with the spammer&#8217;s host</a> or, if that fails, <a href="http://www.plagiarismtoday.com/stopping-internet-plagiarism/6-when-all-else-fails/">sending one to Google</a>. </p>
<p>If RSS scraping becomes a more serious and more recurring problem, you may want to consider truncating your feeds or eliminating them, <a href="http://www.plagiarismtoday.com/2007/01/04/the-six-worst-ways-to-protect-content/">though that would be an extreme last resort</a>.</p>
<h4>Is RSS Scraping Illegal?</h4>
<p>Some have made arguments that distributing your content via an RSS feed, even if you didn&#8217;t realize you were doing it, creates an implied license to use it in this manner. However, <a href="http://www.plagiarismtoday.com/2006/08/29/why-rss-scraping-isnt-ok/">there are many problems with that and other related arguments on RSS scraping</a>. </p>
<p>Generally, RSS scraping is considered to be copyright infringement, though there are <a href="http://www.plagiarismtoday.com/2006/08/24/linkworthy-scraping-as-a-legal-minefield/'">other legal arguments against RSS scraping</a> as well. </p>
<h4>What if I Want to Encourage RSS Scraping and Reuse?</h4>
<p>If you want others to scrape your RSS feed, you can actually give blanket permission to do that by <a href="http://wiki.creativecommons.org/Syndication">inserting a Creative Commons license into your feed</a>. This will let bots that do scraping know your intentions, and those that are complying with the law should be able to follow your wishes.</p>
<h4>How Can I Track RSS Scraping?</h4>
<p>Many people will find RSS scrapers by accident when they search for keywords relevant to their blog or site. However, you can keep track of your content using automated tools like <a href="https://fairshare.attributor.com/fairshare/">Fairshare</a> that are designed for tracking dynamic content.</p>
<p>In the end, though, it&#8217;s best to keep an eye on the search engines for the terms that others commonly use to find your site, as scrapers will often show up in those same results, though initially they will likely rank lower than your site.</p>
<h4>What is the Future of RSS Scraping?</h4>
<p>Though it&#8217;s difficult to predict what spam tactics will be popular in the coming years, RSS scraping has been a problem for at least six years and is continuing today.</p>
<p>That being said, it has fallen out of favor with many spammers, who prefer content generation or scraping excerpts from feeds to avoid duplicate content penalties in the search engines. Still, many active spammers use the method, though their tactics have clearly become more diversified.</p>
<h4>Bottom Line</h4>
<p>There&#8217;s no doubt that RSS scraping can be and often is very annoying and very problematic. That being said, there&#8217;s no reason that it should be a major headache or that it should become a reason to walk away from your site. Most cases of RSS scraping don&#8217;t have a major impact on a blog and those that do can usually be dealt with.</p>
<p>That being said, if you are having a serious problem with RSS scraping, please <a href="http://www.plagiarismtoday.com/contact-pt/">feel free to drop me a line</a> or, if you think you may need outside help, feel free to <a href="http://copybyte.com">see if I can help via my consulting services</a>. </p>
<p>All in all, RSS scraping is a reality most bloggers and webmasters will have to deal with, but it&#8217;s not one that should sink your site if you&#8217;re savvy about how to handle it.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismtoday.com/2011/05/09/faqs-the-basics-of-rss-scraping/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Copyright 2.0 Show &#8211; Episode 150</title>
		<link>http://www.plagiarismtoday.com/2010/05/07/copyright-2-0-show-episode-150-2/</link>
		<comments>http://www.plagiarismtoday.com/2010/05/07/copyright-2-0-show-episode-150-2/#comments</comments>
		<pubDate>Fri, 07 May 2010 20:09:31 +0000</pubDate>
		<dc:creator>Jonathan Bailey</dc:creator>
				<category><![CDATA[Podcast]]></category>
		<category><![CDATA[art]]></category>
		<category><![CDATA[Content-Theft]]></category>
		<category><![CDATA[Copyright]]></category>
		<category><![CDATA[Copyright-Infringement]]></category>
		<category><![CDATA[Copyright-Law]]></category>
		<category><![CDATA[germany]]></category>
		<category><![CDATA[global grind]]></category>
		<category><![CDATA[Plagiarism]]></category>
		<category><![CDATA[rapidshare]]></category>
		<category><![CDATA[RSS]]></category>
		<category><![CDATA[scholastic]]></category>
		<category><![CDATA[Scraping]]></category>

		<guid isPermaLink="false">http://www.plagiarismtoday.com/?p=6586</guid>
		<description><![CDATA[It is Friday again and that means that it is time for another episode of the Copyright 2.0 Show. It is our second week with our &#8220;longer story&#8221; format and we have a lot of great news for you. There&#8217;s an update to the Global Grind case, a major plagiarism controversy and perhaps the most...]]></description>
			<content:encoded><![CDATA[<p><img style=' float: left; padding: 4px; margin: 0 7px 2px 0;'  src="http://www.plagiarismtoday.com/wp-content/uploads/2010/05/rapidshare-logo-1.jpg" alt="" title="rapidshare-logo-1" width="250" height="173" class="alignleft size-full wp-image-6588"></p>
<p>It is Friday again and that means that it is time for another episode of the Copyright 2.0 Show.</p>
<p>It is our second week with our &#8220;longer story&#8221; format and we have a lot of great news for you. There&#8217;s an update to the Global Grind case, a major plagiarism controversy and perhaps the most meta copyright infringement ever. </p>
<p>All delivered with our usual mix of reporting, in-depth discussion and humor. It&#8217;s a show that, hopefully, is as entertaining as it is thought-provoking and informative.</p>
<p>This week&#8217;s stories include:</p>
<ul id="null">
<li>Global Grind Stops Scraping</li>
<li>A Plagiarism Controversy Shakes a Major Art Contest</li>
<li>RapidShare Scores a Big Win in Germany</li>
<li>Can You Pirate a Pirate Ship?</li>
</ul>
<p>You can <a href="http://recordings.talkshoe.com/TC-22590/TS-354606.mp3">download the MP3 file here</a> (direct download). Those interested in subscribing to the show can do so via <a href="http://www.copyright20.com/podcasts/rss">this feed</a>.</p>
<p><a href="http://www.diigo.com/list/plagiarismtoday/episode-150">Show Notes</a></p>
<h4>About the Hosts</h4>
<p><strong>Jonathan Bailey</strong></p>
<p><img style=' float: left; padding: 4px; margin: 0 7px 2px 0;'  src="http://files.plagiarismtoday.com/wp-content/uploads/2009/06/jonathan-box-150x150.png" alt="jonathan-box" title="jonathan-box" width="150" height="150" class="alignleft size-thumbnail wp-image-3842"></p>
<p>Jonathan Bailey (<a href="http://twitter.com/plagiarismtoday">@plagiarismtoday</a>) is the Webmaster and author of Plagiarism Today (Hint: You&#8217;re there now) and works as a copyright and plagiarism consultant. Though not an attorney, he has resolved over 700 cases of plagiarism involving his own work and has helped countless others protect their work and develop strategies for making their content work as hard as possible toward their goals.</p>
<p><strong>Patrick O&#8217;Keefe</strong></p>
<p><img style=' float: right; padding: 4px; margin: 0 0 2px 7px;'  src="http://files.plagiarismtoday.com/wp-content/uploads/2009/06/patrick.jpg" alt="patrick" title="patrick" width="150" height="150" class="alignright size-full wp-image-3848"></p>
<p>Patrick O&#8217;Keefe (<a href="http://twitter.com/iFroggy">@iFroggy</a>) is the owner of the <a href="http://www.ifroggy.com">iFroggy Network</a>, a network of websites covering various interests. He&#8217;s the author of the book <a href="http://www.managingonlineforums.com/">&#8220;Managing Online Forums,&#8221;</a> a practical guide to managing online communities and social spaces. He maintains a blog about online community management at <a href="http://www.managingcommunities.com/">ManagingCommunities.com</a> and a personal blog at <a href="http://www.patrickokeefe.com/">patrickokeefe.com</a>.</p>
<p><object type="application/x-shockwave-flash" width="220" height="160" data="http://bigcontact.com/feed-player/8912_16725/r:0;t:1001"><param name="quality" value="best"><param name="wmode" value="window"><param name="allowScriptAccess" value="always"><param name="allowFullScreen" value="true"><param name="movie" value="http://bigcontact.com/feed-player/8912_16725/r:0;t:1001"></object></p>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismtoday.com/2010/05/07/copyright-2-0-show-episode-150-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
<enclosure url="http://recordings.talkshoe.com/TC-22590/TS-354606.mp3" length="49509355" type="audio/mpeg" />
		</item>
		<item>
		<title>Update: Global Grind Stops Scraping</title>
		<link>http://www.plagiarismtoday.com/2010/05/06/update-global-grind-stops-scraping/</link>
		<comments>http://www.plagiarismtoday.com/2010/05/06/update-global-grind-stops-scraping/#comments</comments>
		<pubDate>Thu, 06 May 2010 18:00:05 +0000</pubDate>
		<dc:creator>Jonathan Bailey</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[Content-Theft]]></category>
		<category><![CDATA[Copyright]]></category>
		<category><![CDATA[Copyright-Infringement]]></category>
		<category><![CDATA[global grind]]></category>
		<category><![CDATA[hip hop]]></category>
		<category><![CDATA[Plagiarism]]></category>
		<category><![CDATA[RSS]]></category>
		<category><![CDATA[Scraping]]></category>

		<guid isPermaLink="false">http://www.plagiarismtoday.com/?p=6579</guid>
		<description><![CDATA[Global Grind has taken several steps to mitigate its content theft issues. Here's what it's done. ]]></description>
			<content:encoded><![CDATA[<p><img style=' float: left; padding: 4px; margin: 0 7px 2px 0;'  src="http://www.plagiarismtoday.com/wp-content/uploads/2010/04/globalgrind-logo.jpg" alt="" title="globalgrind-logo" width="209" height="68" class="alignleft size-full wp-image-6471"></p>
<p>Last week, <a href="http://www.plagiarismtoday.com/2010/04/27/global-grind-copies-content-publishes-it-to-google-news/">I covered a controversy involving the well-known hip-hop site Global Grind</a>. The story, <a href="http://www.patrickokeefe.com/2010/04/26/global-grind-copies-content-submits-it-to-google-news/">originally reported by my friend and Copyright 2.0 Show co-host Patrick O&#8217;Keefe</a>, had become something of a large-scale controversy, especially in the rap and hip-hop community, garnering many mentions on popular blogs and Twitter accounts.</p>
<p>In short, what O&#8217;Keefe accused Global Grind of doing, and the site later admitted, was scraping content from various sources, including one of O&#8217;Keefe&#8217;s blogs, and publishing the content to Google. Since Global Grind had been accepted to Google News, the scraped content appeared there as well. </p>
<p>However, today I have <a href="http://www.patrickokeefe.com/2010/05/01/global-grind-kills-top-frame-bar-full-content-scraping-adds-direct-source-links/">an update on the case</a>. After a week of the firestorm, it seems that Global Grind has not only stopped scraping, but has also removed all old scraped content from its site and stopped using its frame for outbound links. </p>
<p>All in all, the news is extremely good. Though it is upsetting that Global Grind engaged in these practices in the first place and didn&#8217;t respond to O&#8217;Keefe&#8217;s initial, private inquiries on the issue, it is clear they are taking the right steps now.</p>
<p>Currently, on their home page, all the stories pulled from other sources, of which there are only a few, cite just a few words and link directly to the original source, without the frame. <a href="http://globalgrind.com/channel/news/content/1561899/Faisal-Shahzad-Did-Dry-Run-Of-Times-Square-Bombing/">Only one story contains more than an acceptable level of copied content</a> and, judging from the way others were handled, it seems that it was likely an error made by a staff member, not a change in policy.</p>
<p>So, for now, this case seems to be largely resolved. Though I am still hoping for further changes from Global Grind, including a public statement on this issue and a designation of an actual DMCA agent, the major issues have been dealt with.</p>
<p>Hopefully this case will serve as a warning to other sites that may try a similar tactic: bloggers and Webmasters do notice and are not happy about this kind of infringement. </p>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismtoday.com/2010/05/06/update-global-grind-stops-scraping/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Copyright 2.0 Show &#8211; Episode 149</title>
		<link>http://www.plagiarismtoday.com/2010/04/30/copyright-2-0-show-episode-150/</link>
		<comments>http://www.plagiarismtoday.com/2010/04/30/copyright-2-0-show-episode-150/#comments</comments>
		<pubDate>Fri, 30 Apr 2010 18:39:25 +0000</pubDate>
		<dc:creator>Jonathan Bailey</dc:creator>
				<category><![CDATA[Podcast]]></category>
		<category><![CDATA[Content-Theft]]></category>
		<category><![CDATA[Copyright-Infringement]]></category>
		<category><![CDATA[Copyright-Law]]></category>
		<category><![CDATA[global grind]]></category>
		<category><![CDATA[Plagiarism]]></category>
		<category><![CDATA[RSS scraping]]></category>
		<category><![CDATA[Scraping]]></category>
		<category><![CDATA[spam-blog]]></category>
		<category><![CDATA[Splogging]]></category>

		<guid isPermaLink="false">http://www.plagiarismtoday.com/?p=6522</guid>
		<description><![CDATA[It is Friday again and that means that it is time for another episode of the Copyright 2.0 Show. It is a very special week for the Copyright 2.0 Show as we spend the hour on just one news story, the Global Grind controversy originally reported on by Patrick O&#8217;Keefe, the esteemed co-host of the show....]]></description>
			<content:encoded><![CDATA[<p><img style=' float: left; padding: 4px; margin: 0 7px 2px 0;'  src="http://www.plagiarismtoday.com/wp-content/uploads/2010/04/globalgrind-logo.jpg" alt="" title="globalgrind-logo" width="209" height="68" class="alignleft size-full wp-image-6471"></p>
<p>It is Friday again and that means that it is time for another episode of the Copyright 2.0 Show.</p>
<p>It is a very special week for the Copyright 2.0 Show as we spend the hour on just one news story, <a href="http://www.plagiarismtoday.com/2010/04/27/global-grind-copies-content-publishes-it-to-google-news/">the Global Grind controversy</a> originally <a href="http://www.patrickokeefe.com/2010/04/26/global-grind-copies-content-submits-it-to-google-news/">reported on by Patrick O&#8217;Keefe</a>, the esteemed co-host of the show. We also debuted a new chatroom, <a href="http://www.plagiarismtoday.com/podcast">which can be found here</a>, and had a very long, involved discussion with those who dropped by for the show. </p>
<p>It was a great show and we hope to see you there every Wednesday at 6 PM ET for the live recording!</p>
<p>In this show we covered:</p>
<ul id="null">
<li>The Background of the Global Grind Case</li>
<li>What Has Been Done About It</li>
<li>How Global Grind Has Responded</li>
<li>What Affected Webmasters Can Do</li>
<li>What&#8217;s Next for the Case</li>
<li>And Many More&#8230;</li>
</ul>
<p>You can <a href="http://recordings.talkshoe.com/TC-22590/TS-352308.mp3">download the MP3 file here</a> (direct download). Those interested in subscribing to the show can do so via <a href="http://www.copyright20.com/podcasts/rss">this feed</a>.</p>
<p><a href="http://www.diigo.com/list/plagiarismtoday/episode-149">Show Notes</a></p>
<h4>About the Hosts</h4>
<p><strong>Jonathan Bailey</strong></p>
<p><img style=' float: left; padding: 4px; margin: 0 7px 2px 0;'  src="http://files.plagiarismtoday.com/wp-content/uploads/2009/06/jonathan-box-150x150.png" alt="jonathan-box" title="jonathan-box" width="150" height="150" class="alignleft size-thumbnail wp-image-3842"></p>
<p>Jonathan Bailey (<a href="http://twitter.com/plagiarismtoday">@plagiarismtoday</a>) is the Webmaster and author of Plagiarism Today (Hint: You&#8217;re there now) and works as a copyright and plagiarism consultant. Though not an attorney, he has resolved over 700 cases of plagiarism involving his own work and has helped countless others protect their work and develop strategies for making their content work as hard as possible toward their goals.</p>
<p><strong>Patrick O&#8217;Keefe</strong></p>
<p><img style=' float: right; padding: 4px; margin: 0 0 2px 7px;'  src="http://files.plagiarismtoday.com/wp-content/uploads/2009/06/patrick.jpg" alt="patrick" title="patrick" width="150" height="150" class="alignright size-full wp-image-3848"></p>
<p>Patrick O&#8217;Keefe (<a href="http://twitter.com/iFroggy">@iFroggy</a>) is the owner of the <a href="http://www.ifroggy.com">iFroggy Network</a>, a network of websites covering various interests. He&#8217;s the author of the book <a href="http://www.managingonlineforums.com/">&#8220;Managing Online Forums,&#8221;</a> a practical guide to managing online communities and social spaces. He maintains a blog about online community management at <a href="http://www.managingcommunities.com/">ManagingCommunities.com</a> and a personal blog at <a href="http://www.patrickokeefe.com/">patrickokeefe.com</a>.</p>
<p><object type="application/x-shockwave-flash" width="220" height="160" data="http://bigcontact.com/feed-player/8912_16725/r:0;t:1001"><param name="quality" value="best"><param name="wmode" value="window"><param name="allowScriptAccess" value="always"><param name="allowFullScreen" value="true"><param name="movie" value="http://bigcontact.com/feed-player/8912_16725/r:0;t:1001"></object></p>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismtoday.com/2010/04/30/copyright-2-0-show-episode-150/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
<enclosure url="http://recordings.talkshoe.com/TC-22590/TS-352308.mp3" length="64267493" type="audio/mpeg" />
		</item>
	</channel>
</rss>

