<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>copygator | Plagiarism Today</title>
	<atom:link href="http://www.plagiarismtoday.com/tag/copygator/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.plagiarismtoday.com</link>
	<description>Content Theft, Plagiarism, Copyright Infringement</description>
	<lastBuildDate>Mon, 13 Feb 2012 06:51:37 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Plagium: A Copyscape Alternative</title>
		<link>http://www.plagiarismtoday.com/2009/05/07/plagium-a-copyscape-alternative/</link>
		<comments>http://www.plagiarismtoday.com/2009/05/07/plagium-a-copyscape-alternative/#comments</comments>
		<pubDate>Thu, 07 May 2009 19:05:28 +0000</pubDate>
		<dc:creator>Jonathan Bailey</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[Products]]></category>
		<category><![CDATA[Content-Theft]]></category>
		<category><![CDATA[copy detection]]></category>
		<category><![CDATA[copygator]]></category>
		<category><![CDATA[Copyright-Infringement]]></category>
		<category><![CDATA[copyscape]]></category>
		<category><![CDATA[fairshare]]></category>
		<category><![CDATA[Plagiarism]]></category>
		<category><![CDATA[plagium]]></category>

		<guid isPermaLink="false">http://www.plagiarismtoday.com/?p=3418</guid>
		<description><![CDATA[A new plagiarism service promises to shake up the scene by providing a solid competitor to Copyscape. But can it hold up?]]></description>
			<content:encoded><![CDATA[<p><img style=' float: left; padding: 4px; margin: 0 7px 2px 0;'  src="http://files.plagiarismtoday.com/wp-content/uploads/2009/05/plagium-logo-300x71.jpg" alt="plagium-logo" title="plagium-logo" width="300" height="71" class="alignleft size-medium wp-image-3419" /></p>
<p>When it comes to tracking content across the Web, <a href="http://www.copyscape.com">Copyscape</a> is, for the most part, the brand name to know.</p>
<p>This reputation has been very well earned. They recently took <a href="http://www.plagiarismtoday.com/2008/11/04/copyscape-tops-plagiarism-checker-testing/">top honors in a round of plagiarism checker testing</a>, which pitted them against several much more expensive services.</p>
<p>However, competitors have begun to emerge. Some, such as <a href="http://fairshare.cc">FairShare</a>, offer <a href="http://www.plagiarismtoday.com/2008/11/04/copyscape-tops-plagiarism-checker-testing/">more features and more free results</a> and others, such as <a href="http://www.copygator.com">CopyGator</a>, <a href="http://www.plagiarismtoday.com/2009/01/20/copygator-a-game-changer/">offer great convenience</a>. Despite this, especially for static content, Copyscape has remained the gold standard.</p>
<p>But a new service hopes to provide a new challenge. <a href="http://www.plagium.com/index.cfm?mode=text">Plagium</a>, a copy detection system by <a href="http://www.septetsystems.com/">Septet Systems</a>, provides a very similar service to Copyscape but adds additional free features and uses Yahoo! rather than Google to perform its searches.</p>
<p>The question is how it stacks up. To measure that, I put the service through a battery of tests, using my well-copied and plagiarized literary works as the measuring stick.<span id="more-3418"></span></p>
<h4>About Plagium</h4>
<p>The comparisons between Plagium and Copyscape are obvious. However, Plagium&#8217;s default interface is not a URL field, as with Copyscape, but a textbox for pasting your text. Though this is less convenient, in my experience it actually provides better results because the plagiarism checker examines only the content, not the surrounding text (navigation, footer, etc.).</p>
<p>However, if you prefer the convenience of just providing the URL, you can click the &#8220;Check URL&#8221; link and get a more Copyscape-like interface.</p>
<p>Plagium&#8217;s results add an interesting new feature called the &#8220;Timeline&#8221;, which shows roughly when the various reuses went online. This lets you prioritize your actions based upon either the most recent or the least current matches. However, as neat as the feature is, it can get cluttered on works that have a lot of copies and it isn&#8217;t exactly clear in the beginning what all of the elements mean, especially the sizes of the bubbles.</p>
<p><img src="http://files.plagiarismtoday.com/wp-content/uploads/2009/05/timeline-2.jpg" alt="timeline-2" title="timeline-2" width="450" height="168" class="alignnone size-full wp-image-3430" /></p>
<p>However, the most powerful feature of Plagium is its alert system. If you register for a free account, you can have the service track your text and alert you in a weekly email to any new copies it finds. You can also subscribe to an RSS feed of the results. </p>
<p>With this feature, Plagium becomes something of a FairShare targeted at static content. Where FairShare requires an RSS feed to parse (<a href="http://www.associatedcontent.com/article/1657226/how_to_create_a_custom_google_reader.html">though there are hacks that can be used to get static content into the system</a>), this can work on any text that can be pasted into the system.</p>
<p>What is amazing about this is that Copyscape only offers the URL search and ten results free. <a href="http://copyscape.com/signup.php?pro=1&#038;o=f">Its paid accounts</a>, at five cents a search, allow users to paste text and receive unlimited results. They also <a href="http://copyscape.com/copysentry.php">provide a sentry service</a>, which monitors 10 pages once a week for about $5 per month. </p>
<p>However, Plagium currently offers all of these features for free. A representative for the company said that they are providing it for free to &#8220;attract paying customers for custom information tracking system development work,&#8221; though the site does also accept donations.</p>
<p>But not much of this matters if the plagiarism detection isn&#8217;t up to code. So I decided to put the system to a quick test to see how it handles some of my most plagiarized works.</p>
<h4>The Tests</h4>
<p>For the purpose of this test I ran five of my works through both Plagium and Copyscape (using the text paste feature) and, as a baseline, ran a statistically improbable phrase from each work through Google. </p>
<p>In each case I looked and attempted to verify that at least most of the results were not false positives. However, it is possible that there are some non-matches or additional duplicates included within the mix.</p>
<p>The results of the tests are below:</p>
<p><strong>Poem 1</strong></p>
<p>The first poem was a 224-word poem that was known to be widely plagiarized.</p>
<table cellspacing=10>
<tr>
<td><strong>Plagium</strong></td>
<td><strong>Copyscape</strong></td>
<td><strong>Google</strong></td>
</tr>
<tr>
<td>34</td>
<td>29</td>
<td>351</td>
</tr>
</table>
<p>The first test showed that Plagium found approximately 17% more matches than Copyscape. Copyscape, for example, did not find my own site though Plagium listed it first. The page is listed in Google. </p>
<p>Still, the Google results handily trumped both and provided a large number of additional results. However, the actual number of results is far lower than the number provided, as it appears many of the Google results were duplicates where the same page had multiple URLs.</p>
<p><strong>Poem 2</strong></p>
<p>The second poem is a 279-word poem also known to be heavily plagiarized.</p>
<table cellspacing=10>
<tr>
<td><strong>Plagium</strong></td>
<td><strong>Copyscape</strong></td>
<td><strong>Google</strong></td>
</tr>
<tr>
<td>21</td>
<td>9</td>
<td>201</td>
</tr>
</table>
<p>In this test, Plagium outperformed Copyscape by over 100%. However, Plagium does suffer from some duplication issues. For example, my site has two pages listed with the work on it though, once again, it doesn&#8217;t appear at all in Copyscape. However, even with this, there are far more unique results in Plagium.</p>
<p>Google once again trumped both of them, but the duplication in Google makes its count useful only as a baseline, not as an exact number.</p>
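<p>As a quick check of the percentages quoted for the two poems, here is a minimal Python sketch of the arithmetic. The helper name <code>percent_more</code> is hypothetical and used only for illustration; the counts are the Plagium and Copyscape figures from the tables above.</p>

```python
def percent_more(a, b):
    """Return how many percent more matches count a is than count b."""
    return (a - b) / b * 100


# Counts taken from the Poem 1 and Poem 2 tables (Plagium vs. Copyscape).
poem1 = percent_more(34, 29)
poem2 = percent_more(21, 9)

print(f"Poem 1: Plagium found {poem1:.1f}% more matches")
print(f"Poem 2: Plagium found {poem2:.1f}% more matches")
```

<p>These figures compare raw match counts only; they do not account for the duplicate results noted in the text.</p>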
<p><strong>Story 1</strong></p>
<p>For this test I used a 1550 word short story with very limited reuse. </p>
<table cellspacing=10>
<tr>
<td><strong>Plagium</strong></td>
<td><strong>Copyscape</strong></td>
<td><strong>Google</strong></td>
</tr>
<tr>
<td>5*</td>
<td>1</td>
<td>5*</td>
</tr>
</table>
<p>(*) In this test all three essentially tied. The difference between the 5s from Plagium and Google and the 1 from Copyscape was the four matches the former two found on my own site. All three found the exact same reuse, which is a legitimate copy of the work on another site.</p>
<p>In this case, all three performed the same.</p>
<p><strong>Prose 1</strong></p>
<p>For this test, I used a 785 word short story with a modest amount of known reuse.</p>
<table cellspacing=10>
<tr>
<td><strong>Plagium</strong></td>
<td><strong>Copyscape</strong></td>
<td><strong>Google</strong></td>
</tr>
<tr>
<td>6</td>
<td>10</td>
<td>41</td>
</tr>
</table>
<p>In this case, Copyscape was the clear winner. Not only did Plagium return fewer results, but the six results were really just 2 as 4 results were from my site and the other 2 from the same forum. Copyscape, on the other hand, delivered 10 matches, at least 4 of which were unique.</p>
<p>Google&#8217;s results, on the other hand, contained 20-25 duplicates, which puts its real number at roughly 16 to 21.</p>
<p><strong>Prose 2</strong></p>
<p>For this test I used a 202 word prose piece with a moderate amount of known plagiarism.</p>
<table cellspacing=10>
<tr>
<td><strong>Plagium</strong></td>
<td><strong>Copyscape</strong></td>
<td><strong>Google</strong></td>
</tr>
<tr>
<td>4</td>
<td>1</td>
<td>26</td>
</tr>
</table>
<p>In this case, Plagium found three unique matches, including my site, that were not in Copyscape. Google did find more matches than both, but once again there was a serious duplication issue. At least nine items in Google&#8217;s results were duplicates, meaning that the number is closer to 15-18 results.</p>
<p>Still, this was a clear case where Plagium found results that Copyscape missed.</p>
<h4>Results</h4>
<p>In all five tests, Google outperformed both Plagium and Copyscape. However, it contained a very high number of duplicate results and the benefit was likely minimal. In the contest between Plagium and Copyscape, Plagium found more matches in three of the tests, Copyscape did better in one and they tied in one.</p>
<p>It appeared to me that Copyscape was not producing the number of matches it once did. The second poem, for example, is the same one I used when <a href="http://www.plagiarismtoday.com/2007/10/02/copyscape-improved-again/">comparing Copyscape to itself in 2007</a>. In that testing, it first found no results, then ten results, then 31. With today&#8217;s test, it found 9 even though the actual number of copies has remained fairly flat. </p>
<p>Whether this is because Copyscape does not work as well with pasted text (the first tests were done with the URL function) or because changes have limited the results it is producing, it is clear that it is not as effective as it once was for finding all of the results for a work.</p>
<p>However, it is important to note that this is far from a comprehensive comparison of the two services. These are just five very limited cases. Everyone else&#8217;s mileage will vary. </p>
<h4>Bottom Line</h4>
<p>In the end, Plagium&#8217;s results were very solid and it actually performed better than Copyscape in most tests. Whether this is a fluke or a sign of something greater remains to be seen.</p>
<p>However, since Plagium is completely free, there&#8217;s no harm in trying it out and I actively encourage you to do so. You can also experiment with the alerts feature and see if it works well for your content (I haven&#8217;t seen any results yet in the few that I set up). </p>
<p>I&#8217;m not ready to recommend Plagium as the sole plagiarism checker one should use, and I don&#8217;t think I&#8217;ll ever reach that point with any product, but it is a very solid addition that pulls in some very competitive matching numbers.</p>
<p>If Plagium isn&#8217;t a part of your plagiarism detection toolbox, it should be. The results are solid from what I&#8217;ve seen, the features are very powerful and, best of all, it is completely free. You can&#8217;t ask for much more out of a plagiarism checker.</p>
<p>Personally, I&#8217;ll probably start relying more on Plagium for my static content and continue to use FairShare for items already within an RSS feed. This works well with the intentions and limitations of the two services. </p>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismtoday.com/2009/05/07/plagium-a-copyscape-alternative/feed/</wfw:commentRss>
		<slash:comments>28</slash:comments>
		</item>
		<item>
		<title>Copygator: A Game Changer?</title>
		<link>http://www.plagiarismtoday.com/2009/01/20/copygator-a-game-changer/</link>
		<comments>http://www.plagiarismtoday.com/2009/01/20/copygator-a-game-changer/#comments</comments>
		<pubDate>Tue, 20 Jan 2009 19:55:04 +0000</pubDate>
		<dc:creator>Jonathan Bailey</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[Products]]></category>
		<category><![CDATA[Content-Theft]]></category>
		<category><![CDATA[copygator]]></category>
		<category><![CDATA[Copyright-Infringement]]></category>
		<category><![CDATA[Copyright-Law]]></category>
		<category><![CDATA[Plagiarism]]></category>
		<category><![CDATA[RSS]]></category>
		<category><![CDATA[RSS scraping]]></category>
		<category><![CDATA[urlfan]]></category>

		<guid isPermaLink="false">http://www.plagiarismtoday.com/?p=2534</guid>
		<description><![CDATA[A new service called Copygator promises to change the way you detect your content. With a simplified detection and reporting system, it seems to have a lot to offer bloggers, but can the detection live up to its marketing?]]></description>
			<content:encoded><![CDATA[<p><img style=' float: left; padding: 4px; margin: 0 7px 2px 0;'  align="left" src="http://www.plagiarismtoday.com/wp-content/uploads/2009/01/copygator-logo.png" alt="copygator-logo" width="250" height="86" class="attachment wp-att-2543 alignleft" />Over the past few days, I have received comments from an individual claiming to use Copygator, a new service (the domain was registered on Jan. 11) that claims to &#8220;monitor your RSS feed and find where your content has been republished in the blogosphere.&#8221;</p>
<p>The idea behind Copygator is that you take your site URL or your RSS feed, submit it to their site and then they monitor it for any potential matches. The site will notify you of results that it gets either via email, RSS or a color-changing badge that you can place on your site.</p>
<p>Though the service definitely sounds interesting, in its initial testing the results were less than impressive and I have several concerns about the service that need to be addressed before I can give it any recommendation.</p>
<p>Please note that this is not intended to be a thorough review of the service, just an overview of what it offers and some of my initial experiences.<span id="more-2534"></span></p>
<h4>The Basics</h4>
<p>When you first visit the Copygator home page, you are given several options for how you may want to submit your content. The easiest way is to just provide either the URL of your site (and let Copygator parse your content from the autodetected feed) or submit the feed directly.</p>
<p><img src="http://www.plagiarismtoday.com/wp-content/uploads/2009/01/copygator-example1.png" alt="copygator-example1" width="523" height="36" class="attachment wp-att-2535 " /></p>
<p>From there, the site will give you a series of options on how to subscribe for additional updates, including, as mentioned above, email, RSS and more.</p>
<p><img src="http://www.plagiarismtoday.com/wp-content/uploads/2009/01/copygator-example.png" alt="copygator-example" width="500" height="191" class="attachment wp-att-2538 " /></p>
<p>And below that you see the matches that it has already found. Typically, when adding a new feed, it takes some time before any matches appear. Though the site says 10-15 minutes, in my testing it was a little bit longer, though not horribly so, at about 30 minutes.</p>
<p>Each of the matches gives you the option of either visiting that site&#8217;s page on Copygator, visiting the page directly or &#8220;comparing&#8221; the two works, looking at any matching text, called &#8220;collisions&#8221;, that the site has found.</p>
<p>Unlike other sites, including Copyscape, that display the full page and highlight relevant portions, Copygator only displays general information about the percent matching and comparisons of the similar text. </p>
<p><img src="http://www.plagiarismtoday.com/wp-content/uploads/2009/01/copygator-example1-1.png" alt="copygator-example1-1" width="500" height="210" class="attachment wp-att-2540 " /></p>
<p>Overall, the process is pretty straightforward and easy to understand. The service does have a great deal going for it, especially if you&#8217;re a blogger that has been dealing with a large amount of scraping.</p>
<h4>Reasons to Cheer</h4>
<p>The good news is that Copygator is free and brain-dead easy to use. You simply submit your feed and Copygator does the rest. You even get some great choices on how you want to be notified of new matches. I am not sure about the color-changing button, as I worry that it could be a means to spam the search engines, but I could see installing it and using my WordPress theme to display it only to me when I am logged in.</p>
<p>The basic premise is to make this service as simple to use as possible, and that permeates the other features. For example, rather than presenting statistics about matching words or percentages, the service describes the match by saying that it either &#8220;bears a slight resemblance&#8221;, &#8220;share many similar elements&#8221; or &#8220;is an exact copy&#8221;.</p>
<p>Though these vague descriptions may not please stat junkies or those that are dealing with very large amounts of infringement, to most bloggers, this takes a lot of the burden off figuring out which matches to look at.</p>
<p>In short, it is an approachable service in terms of price, tools and terminology. But it is far from a perfect one. There are still many elements in it that have me concerned.</p>
<h4>Missteps and Drawbacks</h4>
<p>The biggest problem that I have with the service is that the matching does not seem to be working correctly at this time. Though I know well that PT&#8217;s content is reused on other sites (much of it with permission), none of it seems to be picked up. Even after setting up my feed and tracking it for approximately a day, no other sites are appearing.</p>
<p>In fact, the only matches I am seeing right now are matches from within the site itself. Clearly, Plagiarism Today should not be showing up for its own matches (many posts, such as the podcasts, are template-based and will always bear some resemblance). The site seems to be confused because PT has both a regular on-server RSS feed (/feed) and a FeedBurner one, and the former redirects to the latter (I&#8217;ve been meaning to clean this up for some time but have had other issues).</p>
<p>Part of the issue is a limitation with Copygator where it can only parse content in RSS feeds. It can only find matches on content found in one feed against content found in others. If the matched content does not appear in a site&#8217;s RSS feed or that blog is not among the 2 million feeds being monitored right now (in 2005 <a href="http://www.blogherald.com/2005/05/25/world-wide-blog-count-for-may-now-over-60-million-blogs/">there were already an estimated 60 million blogs</a>), it isn&#8217;t going to show up as a match.</p>
<p>As a result, <a href="http://www.plagiarismtoday.com/2008/08/19/tineye-protecting-images-preventing-orphans/">much like Tineye</a>, the results of the matches are very incomplete. However, the service is new and it may grow and become more valuable in the coming weeks and months. The bad news is that it has a long way to go.</p>
<h4>Some Personal Problems</h4>
<p>Usually, when doing write ups and reviews about new products, I do not mention any personal issues that I have with the service. However, this is a rare case where those personal issues may reflect on the site and service.</p>
<p>First, in the past two days I have received two comments from &#8220;James S&#8221; regarding Copygator. The first was to a <a href="http://www.plagiarismtoday.com/2008/02/13/using-copyscape-to-detect-derivative-works/comment-page-1/#comment-124297">post about using Copyscape to detect derivative works</a> and the second was on <a href="http://www.plagiarismtoday.com/2008/10/14/copyrightspot-new-copy-detection-service/comment-page-1/#comment-124304">my recent post about CopyrightSpot</a>. </p>
<p>The two comments were clearly created using the same template, though the first was clearly not a complete post, and both identified James as a user of the product. However, the email address used to post the comments is the <a href="http://whois.domaintools.com/copygator.com">same email that is listed in the whois record</a>, and only one of the comments, the second, had the name linked to Copygator.</p>
<p>Though I am hard pressed to call this comment spam, as it was posted to two relevant posts and the posting does not appear to have been automated, I find it disconcerting that the operator would post comments without clearly identifying himself as the creator. I have no rule that forbids product and service creators from posting information on their sites as comments, though I prefer they <a href="http://www.plagiarismtoday.com/contact-pt/">contact me directly</a>.</p>
<p>The other concern is that the site is connected to <a href="http://www.urlfan.com/" rel="nofollow">URLFan</a>, a search engine and aggregator <a href="http://www.plagiarismtoday.com/2008/04/17/search-engines-three-to-beware/">I&#8217;ve expressed concern about in the past</a>. Though URLFan seems to have stopped engaging in the controversial behavior (truncating articles, requiring users to visit the page to view the full content, etc.), it is a pedigree that is going to concern many.</p>
<p>In the end, while I am worried about these elements, I did not want them to keep me from covering a potentially useful service and I decided it was best to present my readers with the information that I know and let them decide.</p>
<p>I did contact &#8220;James&#8221; via the email provided before writing this article but he has not responded as of yet. However, it has only been a few hours since the initial contact (I only learned of these issues this morning). I will update and expand this article should I hear back.</p>
<h4>Conclusions</h4>
<p>Copygator shows a great deal of promise but it has serious limitations that prevent it from being a very useful tool at this time. Relying on it as a main source of content theft detection would be foolish at this moment. Even putting aside my personal issues and the pedigree of this site, there are just too many limitations to trust it solely.</p>
<p>That being said, it is nice to see some innovation and a real focus on simplicity. If this service can improve its matching, then there could be a bright future for it. The flaw might not be that it is a bad service, but that it was launched before it was truly ready.</p>
<p>Still, given the history of the site and how it was promoted, there are a lot of reasons to be wary of it. I intend to do some more thorough testing with it in the coming days, which I&#8217;ll likely report on next week, but I can&#8217;t see myself relying on it unless some very difficult questions are answered.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismtoday.com/2009/01/20/copygator-a-game-changer/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
	</channel>
</rss>

