<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>plagium | Plagiarism Today</title>
	<atom:link href="http://www.plagiarismtoday.com/tag/plagium/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.plagiarismtoday.com</link>
	<description>Content Theft, Plagiarism, Copyright Infringement</description>
	<lastBuildDate>Mon, 13 Feb 2012 06:51:37 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>The Limitation of Every Plagiarism Checker</title>
		<link>http://www.plagiarismtoday.com/2011/12/07/the-limitation-of-every-plagiarism-checker/</link>
		<comments>http://www.plagiarismtoday.com/2011/12/07/the-limitation-of-every-plagiarism-checker/#comments</comments>
		<pubDate>Wed, 07 Dec 2011 18:34:47 +0000</pubDate>
		<dc:creator>Jonathan Bailey</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[Content-Theft]]></category>
		<category><![CDATA[Copyright]]></category>
		<category><![CDATA[Copyright-Infringement]]></category>
		<category><![CDATA[Copyright-Law]]></category>
		<category><![CDATA[copyscape]]></category>
		<category><![CDATA[Plagiarism]]></category>
		<category><![CDATA[plagiarism-detection]]></category>
		<category><![CDATA[plagium]]></category>
		<category><![CDATA[turnitin]]></category>
		<category><![CDATA[wcopyfind]]></category>

		<guid isPermaLink="false">http://www.plagiarismtoday.com/?p=11985</guid>
		<description><![CDATA[As teachers and content creators rely more and more on plagiarism detection, they often lose sight of just how limited even the best tools are...]]></description>
			<content:encoded><![CDATA[<p><img style=' float: left; padding: 4px; margin: 0 7px 2px 0;'  src="http://www.plagiarismtoday.com/wp-content/uploads/2011/12/turnitin-logo.jpg" alt="Turnitin Logo" title="Turnitin Logo" class="alignleft size-full wp-image-11993" />When it comes to plagiarism, technology has been both a blessing and a curse. Though it has made it easier than ever to find and copy work from others without attribution, it&#8217;s also made it easier to track and handle plagiarism when it happens.</p>
<p>With tools that can search billions of documents in seconds and can find matches only a few words in length, it might seem as if plagiarism would be as easily detected as finding information in Google, a matter of merely punching in your query and going through the results.</p>
<p>Unfortunately, that isn&#8217;t the case.</p>
<p>Plagiarism detectors have a huge limitation and one that isn&#8217;t likely to go away any time soon. That limitation is, simply put, that plagiarism detectors can&#8217;t actually detect plagiarism and, instead, do something very different altogether.<span id="more-11985"></span></p>
<h4>How Plagiarism Detection Works</h4>
<p>This problem might seem a bit odd to those unfamiliar with the technology. After all, dishwashers wash dishes and car starters start cars, but plagiarism detectors don&#8217;t actually detect plagiarism. </p>
<p>Instead, what they actually detect is sections of identical text. Though there is a variety of techniques for doing this, the end results are pretty much always the same. A plagiarism detection service looks for matching strings of words between the document it&#8217;s looking at and the ones it has in its index. This is true for a local plagiarism checker, such as <a href="http://plagiarism.bloomfieldmedia.com/z-wordpress/software/wcopyfind/">WCopyFind</a>, search engine-based systems such as <a href="http://www.copyscape.com">Copyscape</a> and <a href="http://www.plagium.com">Plagium</a> and high-end systems such as <a href="https://turnitin.com">Turnitin</a>.</p>
<p>They all work on the same principle and basically function much like we would expect Google or another search engine to work, finding the words we want in other sources and providing the best results they can.</p>
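<p>To make the idea of &#8220;matching strings of words&#8221; concrete, here is a minimal Python sketch. It is not how any of the services named above are actually implemented; the function names, the five-word window and the sample sentences are all assumptions chosen for illustration. It simply collects every run of five consecutive words in two texts and reports the runs they share.</p>

```python
import re


def word_ngrams(text, n=5):
    """Return the set of lowercase n-word sequences found in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_phrases(suspect, source, n=5):
    """Return the n-word sequences that appear in both documents."""
    return word_ngrams(suspect, n).intersection(word_ngrams(source, n))


# Illustrative sample texts (assumptions, not taken from any real test).
original = "The quick brown fox jumps over the lazy dog near the river bank."
verbatim_copy = "He wrote that the quick brown fox jumps over the lazy dog."
paraphrase = "A fast, dark-furred fox leaps above a sleepy hound by the water."

print(len(shared_phrases(verbatim_copy, original)))  # 5 shared five-word runs
print(len(shared_phrases(paraphrase, original)))     # 0 shared five-word runs
```

<p>The verbatim copy shares five runs with the original, while the paraphrase, which carries the same idea in different words, shares none.</p>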
<p>While this makes them powerful tools (doing the same comparison by hand would be impossible given all of the sources these tools can check), it does mean that they have some tremendous blind spots.</p>
<p>However, those blind spots are only a problem if people aren&#8217;t aware or don&#8217;t believe that they are there. Then they become huge issues that can lead to both false positives and false negatives.</p>
<h4>The Limitations of Plagiarism Detection</h4>
<p>Since plagiarism detection tools can only detect copying, or more specifically similar phrases, there are two areas where they are particularly weak.</p>
<ol>
<li><strong>Non-Verbatim Plagiarism:</strong> Plagiarism that involves rewriting, translating or otherwise redrafting the text can&#8217;t be detected. This can be difficult to get away with as most plagiarism detectors are extremely sensitive, but since plagiarism detectors don&#8217;t analyze the content of the work, just the words, they can&#8217;t see if you lifted the idea or information if you didn&#8217;t also lift the words. This is a common problem in academia, which treats this kind of plagiarism just as seriously as verbatim plagiarism.</li>
<li><strong>Common Phrasing/Attributed Use:</strong> Second, though many plagiarism checkers will make an attempt to separate out attributed use, given the variety of attribution styles it isn&#8217;t always possible. Also, given how common some phrases are in the English language, many plagiarism checkers will report matches that are actually just coincidence.</li>
</ol>
<p>In short, plagiarism detection tools are just machines and they can make mistakes. However, that is true with any tool as, for example, you don&#8217;t discard Microsoft Word because you can make a typo. </p>
<p>Also, like any other tools, plagiarism checkers are useless without humans to use them intelligently, which is the biggest problem such tools have.</p>
<h4>The Human Element</h4>
<p>The answer to all of this is simple: the decision as to what is and what is not plagiarism should be left to human beings. Humans are the only ones who can detect non-verbatim plagiarism and are the only ones who can make determinations about the likelihood that the matches are coincidence and whether the attribution was adequate or not.</p>
<p>Professors who have a hard rule about papers not being more than X% matching or authors who don&#8217;t let others copy more than X number of words before seeking legal action aren&#8217;t fighting plagiarism, but are doing more to confuse the issue.</p>
<p>While bright line rules are always tempting because they are easy to remember and follow, with plagiarism, there are few such rules and you can&#8217;t turn your judgment over to a machine.</p>
<h4>Bottom Line</h4>
<p>None of this is meant as a slight to any of these tools. I use all of the tools listed regularly and am grateful for the valuable service they provide. The problem doesn&#8217;t lie with the technology, but with those who treat these tools as magical solutions that are capable of making perfect judgments about plagiarism.</p>
<p>They are anything but.</p>
<p>As tempting as it is to turn over our judgment on plagiarism matters to the machines, it simply doesn&#8217;t work. Not only will a lot of plagiarism go undetected, but a lot of people will be accused falsely.</p>
<p>Though plagiarism detection tools are a part of the solution, they have to be used in tandem with human judgment and discretion to do any good.</p>
<p>If used correctly, a plagiarism detection service will alert someone to the possibility of plagiarism, not to its actual existence.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismtoday.com/2011/12/07/the-limitation-of-every-plagiarism-checker/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>PlagScan Review: Solid Plagiarism Detection</title>
		<link>http://www.plagiarismtoday.com/2011/09/06/plagscan-review-solid-plagiarism-detection/</link>
		<comments>http://www.plagiarismtoday.com/2011/09/06/plagscan-review-solid-plagiarism-detection/#comments</comments>
		<pubDate>Tue, 06 Sep 2011 18:30:00 +0000</pubDate>
		<dc:creator>Jonathan Bailey</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[Content-Theft]]></category>
		<category><![CDATA[Copyright]]></category>
		<category><![CDATA[Copyright-Infringement]]></category>
		<category><![CDATA[Copyright-Law]]></category>
		<category><![CDATA[copyscape]]></category>
		<category><![CDATA[Plagiarism]]></category>
		<category><![CDATA[plagium]]></category>
		<category><![CDATA[plagscan]]></category>

		<guid isPermaLink="false">http://www.plagiarismtoday.com/?p=10894</guid>
		<description><![CDATA[PlagScan, a German plagiarism detection service, certainly has fans all over the globe, but can it track your content online?]]></description>
			<content:encoded><![CDATA[<p><img style=' float: left; padding: 4px; margin: 0 7px 2px 0;'  src="http://www.plagiarismtoday.com/wp-content/uploads/2011/09/plagscan-logo-300x103.jpg" alt="PlagScan Logo" title="PlagScan Logo" width="300" height="103" class="alignleft size-medium wp-image-10936" />PlagScan is a plagiarism checker that certainly has its share of fans. In <a href="http://plagiat.htw-berlin.de/software-en/2010-2/">Dr. Weber-Wulff&#8217;s most recent round of plagiarism tests</a>, <a href="http://www.plagiarismtoday.com/2011/01/13/plagaware-takes-top-honors-in-plagiarism-checker-showdown/">PlagScan was listed as &#8220;partially useful&#8221;</a>, the highest honor those tests awarded. Overall, PlagScan placed fourth.</p>
<p>That placed it well above better-known services, including both Copyscape and Plagium, both of which are more widely used by webmasters wanting to track their content.</p>
<p>Also, <a href="http://www.plagscan.com">PlagScan</a> has earned the trust of <a href="http://www.plagscan.com/organizations_searching_plagiarism_with_plagscan">dozens of academic institutions and businesses</a>, most of which are in the company&#8217;s native Germany. </p>
<p>However, Dr. Weber-Wulff&#8217;s tests were aimed at an education environment. The question remains, how well does it test when it comes to protecting web-based content from infringement? I was recently approached by Markus Goldbach, PlagScan&#8217;s CEO, who asked me to do such a test. </p>
<p>So, I decided to put PlagScan through a brief test to find out how effective it was for this particular usage case.<span id="more-10894"></span></p>
<h4>What is PlagScan</h4>
<p><img style=' float: right; padding: 4px; margin: 0 0 2px 7px;'  src="http://www.plagiarismtoday.com/wp-content/uploads/2011/09/plagscan-sample-image-300x165.jpg" alt="" title="plagscan-sample-image" width="300" height="165" class="alignright size-medium wp-image-10938" />PlagScan originally saw life as <a href="http://www.plagscan.com/seesources/">SeeSources</a>, a free plagiarism detection tool that focused on language patterns in plagiarism detection. However, to continue development, the people behind it took the product commercial and created PlagScan, which operates now as a professional and academic plagiarism detection service.</p>
<p>PlagScan is built on the Yahoo! BOSS API, the same API many similar services are built upon, and lets you either upload a document or copy and paste the text you want to check. PlagScan then analyzes the text involved and returns the results in an email that includes PDF, plain text and docx formatted results. The results are also stored in your account.</p>
<p>The question is how well does the system work? To find out, I decided to test five different documents in PlagScan (using the copy and paste method) and see what the results were.</p>
<h4>The Test Results</h4>
<p>To test PlagScan, I decided to pit it against two of the more common tools webmasters use to track text, Copyscape and Plagium. I had all three services analyze five separate and structurally different works, each with differing amounts of known plagiarism, to see how many results the services would uncover and how relevant those results would be.</p>
<p>Note: For all tests, I used the most &#8220;Pro&#8221; feature available, including a pro account in Copyscape and &#8220;Deep Scan&#8221; in Plagium. I did not include matches in PlagScan that the system considered to be too low to be unoriginal. In each case, I used the text upload feature for consistency.</p>
<p><strong>Test 1 &#8211; Poem (Medium Reuse)</strong></p>
<p>For the first test, I ran through an old poem of mine (243 words) that I knew still had a large number of copies on the Web, both legitimate and plagiarized.</p>
<ol>
<li><strong>PlagScan:</strong> 20</li>
<li><strong>Copyscape:</strong> 10</li>
<li><strong>Plagium:</strong> 4</li>
</ol>
<p>The results here are pretty striking. PlagScan flat out found more sources. That being said, there was some source duplication and, once you eliminate the confirmed dupes, as well as a small number of false positives, there are still several pages that PlagScan found that Copyscape did not.</p>
<p>That being said, Copyscape did find most of the critical domains, if not all of them, but it was still outmatched in this test.</p>
<p><strong>Test 2 &#8211; Prose Piece (Low Reuse)</strong></p>
<p>For the second test, I ran through an old short essay of mine (202 words) that I knew had seen only limited reuse.</p>
<ol>
<li><strong>PlagScan:</strong> 2</li>
<li><strong>Plagium:</strong> 1</li>
<li><strong>Copyscape:</strong> 0</li>
</ol>
<p>In this test, PlagScan managed to detect both the original work and one plagiarism of it. Plagium managed to find the original and Copyscape reported it as clean.</p>
<p>In short, PlagScan found a plagiarist of this work that the other two did not.</p>
<p><strong>Test 3 &#8211; Short Story Piece (Low Reuse)</strong></p>
<p>For the third test, I used an old short story of mine (1682 words) that, as with the prose, saw only limited reuse.</p>
<ol>
<li><strong>PlagScan:</strong> 18*</li>
<li><strong>Plagium:</strong> 1</li>
<li><strong>Copyscape:</strong> 0</li>
</ol>
<p>Once again, Copyscape reported the work as being clean and Plagium only found the original. This time, however, PlagScan found some 18 matches. Most of those matches were for very short sections of the work and all except one, the original, were not actually copies.</p>
<p>This was a clear case of PlagScan returning a large number of false positives, though it did alert me to one or two pieces that might be considered of interest.</p>
<p><strong>Test 4 &#8211; Marketing Copy (High Reuse)</strong></p>
<p>For the fourth test, I decided to run through a client of mine&#8217;s marketing copy (460 words) that had seen moderate reuse and plagiarism over the past few months. The goal was to test how up to date the systems were with their databases.</p>
<ol>
<li><strong>Copyscape:</strong> 39</li>
<li><strong>PlagScan:</strong> 23</li>
<li><strong>Plagium:</strong> 11</li>
</ol>
<p>PlagScan lost this one, finding only 23 results to Copyscape&#8217;s 39. However, both sites suffered from a false positive problem as, toward the bottom of both results, several sites were listed even though they only had a small handful of words in common.</p>
<p>That being said, Copyscape seemed to have slightly fewer false positives in this test, making it the clear winner.</p>
<p><strong>Test 5 &#8211; Information Copy (High Reuse)</strong></p>
<p>Finally, a test to look at one of the more commonly plagiarized pieces of text I work with, an informational piece (1074 words) widely lifted by competitors.</p>
<ol>
<li><strong>Copyscape:</strong> 39</li>
<li><strong>PlagScan:</strong> 25</li>
<li><strong>Plagium:</strong> 11</li>
</ol>
<p>Once again, Copyscape found more matches though, this time, both were plagued very badly by false positives and listed many results that weren&#8217;t actually matches. Still, Copyscape seemed to fare a little bit better with accuracy of results and, at the same time, turn out better quality results.</p>
<p>All in all, this one is another win for Copyscape.</p>
<h4>Beyond the Numbers</h4>
<p>Numerically, this is a very impressive showing for PlagScan as it either beat or held its own against much better-known competitors. However, this accuracy comes at a cost, both financially and time-wise.</p>
<p>First, PlagScan does cost a good deal more than Copyscape. Copyscape, for example, charges 5 cents a search (up to 2,000 words), meaning this test cost me a mere 25 cents. PlagScan, on the other hand, charges one credit per 100 words. At its cheapest, a credit costs about 1.1 cents. If you do more than 500 words in a search, Copyscape will always be cheaper (barring a different plan).</p>
<p>All in all, I spent 39 credits on this test, which at its cheapest would be just shy of 43 cents. Still not prohibitively expensive, but worth noting, especially for those with much larger projects.</p>
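<p>For anyone who wants to check the arithmetic, here is a small Python sketch using the prices quoted above and the word counts of the five test documents. The one assumption is that PlagScan rounds each document up to the next full 100 words; that rounding rule is a guess, though it does reproduce the 39 credits reported above. The constant and function names are illustrative only.</p>

```python
import math

COPYSCAPE_CENTS_PER_SEARCH = 5      # per search of up to 2,000 words
PLAGSCAN_CENTS_PER_CREDIT = 1.1     # cheapest credit price quoted above
WORDS_PER_CREDIT = 100

# Word counts of the five test documents.
test_word_counts = [243, 202, 1682, 460, 1074]


def plagscan_credits(word_count):
    """Credits for one document, assuming partial blocks round up."""
    return math.ceil(word_count / WORDS_PER_CREDIT)


total_credits = sum(plagscan_credits(count) for count in test_word_counts)
plagscan_total = total_credits * PLAGSCAN_CENTS_PER_CREDIT
copyscape_total = len(test_word_counts) * COPYSCAPE_CENTS_PER_SEARCH

print(total_credits)              # 39 credits
print(round(plagscan_total, 1))   # 42.9 cents
print(copyscape_total)            # 25 cents
```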
<p>The bigger drawback to PlagScan is the system itself. First, results with PlagScan take far longer than with either Copyscape or Plagium. With every test I would start PlagScan and then perform the test with the others and have the other two results back before PlagScan was finished. It was by no means torturously slow, taking no more than a couple of minutes on the longest test, but that time can add up if you&#8217;re testing a lot of work.</p>
<p>Finally, PlagScan, from a user interface standpoint, is definitely more geared to detecting the authenticity of an unknown work than to finding plagiarism of a known original online. The reports deal more with how much plagiarism was detected and lack features such as case management or case prioritization. There&#8217;s also no way to automatically monitor for plagiarism, as with both <a href="http://www.copyscape.com/copysentry.php">Copyscape&#8217;s Copysentry</a> and <a href="http://blog.plagium.com/2009/02/track-text-through-plagium-alerts.html">Plagium&#8217;s alerts system</a>.</p>
<p>In the end, though PlagScan definitely does a great job in finding matches, possibly a better job than Copyscape in many cases, using it will be unwieldy for most webmasters, bloggers, authors and other content creators.</p>
<p>That doesn&#8217;t mean that it isn&#8217;t worthwhile, it definitely is, just that its use may be somewhat limited.</p>
<h4>Bottom Line</h4>
<p>In the end, PlagScan fared very well in these tests. It competed well with Copyscape, both of which, numerically at least, trounced Plagium. That being said, Plagium was nearly immune to the false positives issue that plagued both Copyscape and PlagScan, making it more accurate, though probably not the most complete.</p>
<p>Clearly though, PlagScan did better with some kinds of works than others. However, this is typical with plagiarism checkers. I wouldn&#8217;t recommend you rely on it, or any other plagiarism checker, as your sole tool. Not only would you likely be missing results, but its setup would make that unwieldy.</p>
<p>That being said, it&#8217;s definitely a new tool to put in the metaphorical toolbox, something to use to support your plagiarism detection efforts. It might not replace anything else, but using it certainly can help you see and catch more.</p>
<p>I will certainly be keeping my account active and using it, at least from time to time. As it adds tools and services aimed at these goals, I&#8217;ll begin to use it even more.</p>
<p>All in all, PlagScan held its own and, with a few tweaks, can grow and become a front runner in this field in very short order.</p>
<p><em><strong>Note:</strong> I can provide copies of the test works to all parties if they are interested.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismtoday.com/2011/09/06/plagscan-review-solid-plagiarism-detection/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Plagium Introduces Deep Search</title>
		<link>http://www.plagiarismtoday.com/2011/04/13/plagium-introduces-deep-search/</link>
		<comments>http://www.plagiarismtoday.com/2011/04/13/plagium-introduces-deep-search/#comments</comments>
		<pubDate>Wed, 13 Apr 2011 18:32:59 +0000</pubDate>
		<dc:creator>Jonathan Bailey</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[Content-Theft]]></category>
		<category><![CDATA[Copyright]]></category>
		<category><![CDATA[Copyright-Infringement]]></category>
		<category><![CDATA[Copyright-Law]]></category>
		<category><![CDATA[Plagiarism]]></category>
		<category><![CDATA[plagiarism-detection]]></category>
		<category><![CDATA[plagium]]></category>

		<guid isPermaLink="false">http://www.plagiarismtoday.com/?p=9463</guid>
		<description><![CDATA[Plagiarism detection service Plagium has introduced a "Deep Search" tool to help you find more matches and better search through longer works.]]></description>
			<content:encoded><![CDATA[<p><img style=' float: left; padding: 4px; margin: 0 7px 2px 0;'  src="http://www.plagiarismtoday.com/wp-content/uploads/2009/05/plagium-logo-300x71.jpg" alt="" title="plagium-logo" width="300" height="71" class="alignleft size-medium wp-image-3419" /></p>
<p>Earlier this week, <a href="http://plagium.com">Plagium</a> <a href="http://blog.plagium.com/2011/04/plagium-announces-deep-search.html">announced its new &#8220;Deep Search&#8221; feature</a>, which it hopes will make it easier to spot duplicates and more subtle plagiarisms/copies in longer works.</p>
<p>The new feature works by separating a longer work into multiple sections, each roughly a paragraph in length, locating duplicated content within each section and displaying the matching content contained within each detected page. </p>
<p>The idea is to make it easier to go through longer documents, to more quickly understand which copies are the most important, the content they are using and how much matching material there is.</p>
<p>The question, however, is how well does the system work and is it worth the money that Plagium is charging? To find out, I ran Plagium through a series of quick tests to see how well it performed.<span id="more-9463"></span></p>
<h4>How Plagium Deep Search Works</h4>
<p>Previously I talked about Plagium and <a href="http://www.plagiarismtoday.com/2009/05/07/plagium-a-copyscape-alternative/">compared it favorably to Copyscape</a> and other, similar plagiarism checkers for the purpose of finding plagiarisms and other copies of your work online. I even mentioned it in a case study showing how it was useful in <a href="http://www.plagiarismtoday.com/2010/10/05/case-study-tracking-a-sneaky-plagiarist-poet/">catching a plagiarizing poet</a>.</p>
<p>However, one of my gripes about Plagium was that it has always been difficult to parse the results. Plagium has always provided good information about the infringing pages, but not necessarily about what was being copied. This was especially problematic with longer documents where the copied text might be buried deep within the page. </p>
<p>Plagium&#8217;s Deep Search attempts to fix that. By breaking lengthy documents into sections and showing match results for each part of the document, it makes it easy both to get a general overview of the entire document via its &#8220;summary&#8221; feature and to see results for each part of the document.</p>
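<p>The section-by-section approach can be sketched in a few lines of Python. This is only an illustration of the general idea, not Plagium&#8217;s actual code: the function names, the four-word phrase window, the blank-line paragraph split and the example.com pages are all assumptions.</p>

```python
import re


def split_sections(document):
    """Split a document into paragraph-sized sections on blank lines."""
    return [part.strip() for part in re.split(r"\n\s*\n", document) if part.strip()]


def phrases(text, n=4):
    """Return the set of lowercase n-word sequences found in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def deep_report(document, pages, n=4):
    """Map each section number to the pages sharing an n-word phrase with it."""
    report = {}
    for number, section in enumerate(split_sections(document), start=1):
        section_phrases = phrases(section, n)
        report[number] = sorted(
            url for url, page_text in pages.items()
            if section_phrases.intersection(phrases(page_text, n))
        )
    return report


# Hypothetical document and candidate pages, for illustration only.
document = (
    "Plagiarism detectors look for matching strings of words.\n\n"
    "Human judgment is still needed to decide what the matches mean."
)
pages = {
    "http://example.com/a": "We agree that plagiarism detectors look for matching strings.",
    "http://example.com/b": "Nothing here overlaps with the document at all.",
}
report = deep_report(document, pages)
print(report)
```

<p>Here the first section is matched to one page and the second to none, which is the kind of per-section breakdown a detailed report gives you.</p>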
<p><img src="http://www.plagiarismtoday.com/wp-content/uploads/2011/04/plagium-sample-500x304.jpg" alt="" title="plagium-sample" width="500" height="304" class="alignnone size-large wp-image-9464" /></p>
<p>This deep searching, however, comes at a cost. Unlike Plagium&#8217;s &#8220;Quick Search&#8221; feature, which is the equivalent of its previous service, deep searching is not free. Deep searches from Plagium cost $1 for 100,000 characters (approximately 20,000 words), $2 for 200,000 characters (approximately 40,000 words) and $10 for 1,100,000 characters (approximately 220,000 words). </p>
<p>So is Deep Search worth the money? I put the process through a few tests to find out.</p>
<h4>Usability and Interface</h4>
<p>Right off the bat, there were a few things that annoyed me about Plagium&#8217;s Deep Search feature. First and foremost was that these searches were far from quick. </p>
<p>Even for medium-length documents, these searches routinely took longer than 40 seconds. While that might not seem long, bear in mind that other services, including Plagium&#8217;s Quick Search tool, usually take less than four seconds. Basically, if you perform a Deep Search, be prepared to wait.</p>
<p>The interface itself was functional but not exactly attractive or impressive. The results are broken up into a summary and a detailed report. The first provides just an overview of the pages detected and the latter does the section-by-section breakdown.</p>
<p>One useful feature is the ability to delete sites from the results. This is useful both if you have domains that you aren&#8217;t interested in, such as a permitted use or even your own site, or to remove sites you&#8217;ve already processed.</p>
<p>Still, it&#8217;s frustrating that there was no way to do a full side-by-side comparison of the original and the duplicate. Though hovering your mouse over each result in the detailed report would show you the matching text in that section, getting a complete view of the suspected copying is something that&#8217;s impossible with Plagium but easy with Copyscape.</p>
<p>That being said, I do like the way Plagium breaks down the similarities by words, sentences and highest search engine rank. It makes it very easy to get an &#8220;at a glance&#8221; understanding of how serious the copying really is and lets you prioritize matches easily.</p>
<p>However, all of these features are meaningless without good matching. To find out how well Plagium&#8217;s Deep Search performed, I ran it through a series of tests designed to compare it to similar services.</p>
<h4>Matching Tests</h4>
<p>To better understand how well Plagium&#8217;s Deep Search tool did at detecting plagiarism, I decided to do several side-by-side tests comparing it against both their free offering and Copyscape&#8217;s Premium offering. </p>
<p>The results are below:</p>
<p><strong>Client Page 1</strong></p>
<p>First I decided to run <a href="http://www.ravensrants.com/loner/">an old prose work</a> of mine that had relatively limited copying to see how well the various engines did when dealing with older works in a more traditional format.</p>
<table cellspacing=15>
<tr>
<td><strong>Plagium Deep Scan</strong></td>
<td><strong>Plagium Quick Scan</strong></td>
<td><strong>Copyscape Premium</strong></td>
</tr>
<tr align="center">
<td>5</td>
<td>2</td>
<td>1</td>
</tr>
</table>
<p><strong>Press Release</strong></p>
<p>Second, I tested <a href="http://www.businesswire.com/news/home/20110412005305/en/Copyright-Clearance-Center-Launches-%E2%80%98Get-Now%E2%80%99-Academic">a recent press release by the Copyright Clearance Center</a> to see how well it detected copying that had taken place very recently. </p>
<table cellspacing=15>
<tr>
<td><strong>Plagium Deep Scan</strong></td>
<td><strong>Plagium Quick Scan</strong></td>
<td><strong>Copyscape Premium</strong></td>
</tr>
<tr align="center">
<td>95</td>
<td>20</td>
<td>25</td>
</tr>
</table>
<p><strong>Poem</strong></p>
<p>Finally, I tested <a href="http://www.ravensrants.com/friends-or-lovers/">an old poem of mine</a> that I knew had widespread copying and reuse, both legitimate and illegitimate, to determine how well it handled poetry and works without traditional paragraph breaks. </p>
<table cellspacing=15>
<tr>
<td><strong>Plagium Deep Scan</strong></td>
<td><strong>Plagium Quick Scan</strong></td>
<td><strong>Copyscape Premium</strong></td>
</tr>
<tr align="center">
<td>15</td>
<td>11</td>
<td>12</td>
</tr>
</table>
<p>In every case, Plagium Deep Search came out on top, finding more results. However, going through the results, especially with the second case, I found that many of the results were false positives, sharing less than 50 words. With no way to set the threshold for what is considered a match, I would have had to eliminate about half of the results from being actual matches.</p>
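<p>If you export or copy the results, applying your own threshold afterward is straightforward. The Python sketch below shows the idea; the result list is made up for illustration (it is not data from these tests) and the 50-word cutoff is simply the figure mentioned above.</p>

```python
MIN_SHARED_WORDS = 50  # the cutoff mentioned above; adjust to taste

# Hypothetical results: (page URL, number of words shared with the original).
results = [
    ("http://example.com/full-copy", 412),
    ("http://example.com/partial-copy", 96),
    ("http://example.com/boilerplate-only", 18),
    ("http://example.com/common-phrase", 7),
]

# Keep only the pages that share at least the minimum number of words.
likely_matches = [
    (url, shared) for url, shared in results if shared >= MIN_SHARED_WORDS
]

for url, shared in likely_matches:
    print(url, shared)
```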
<p><img style=' float: right; padding: 4px; margin: 0 0 2px 7px;'  src="http://www.plagiarismtoday.com/wp-content/uploads/2011/04/plagium-error.jpg" alt="" title="plagium-error" width="258" height="259" class="alignright size-full wp-image-9471" />That being said, even with the false positives removed, Plagium Deep Search outperformed both its free offering and Copyscape in finding matches. This is likely due to the fact that Plagium Deep Search seems to poll a wider range of sources, including Yahoo! News, Bing and Bing News. Though Plagium also polls Yahoo! search, that is now powered by Bing, making that search of limited usefulness.</p>
<p>One minor issue I did have with Plagium&#8217;s match detection was that, in its detailed report, perfect matches often had gaps in the highlighting, indicating that parts of the match were not detected. This didn&#8217;t seem to affect the overall accuracy in terms of finding pages, but it could limit Plagium&#8217;s usefulness for certain types of plagiarism analyses where greater precision is needed.</p>
<p>All in all, from a matches found perspective, Plagium seems to have a very compelling product on its hands and one that others may wish to start making broader use of.</p>
<h4>Bottom Line</h4>
<p>Despite some interface issues, Plagium&#8217;s Deep Search tool is a pretty compelling service offering and the $10 account for 1.1 million characters will likely last most searchers a full year, which is as long as the credits are good for.</p>
<p>Even with all of the tests that I did, I have only gone through about 50,000 characters on my account.</p>
<p>Personally, I&#8217;ll be using the Deep Search tool in lieu of the free offering, which I was already using to supplement Copyscape, Google Alerts and other search tools. </p>
<p>However, for most I would recommend testing your search with the free offering before deciding if there is any cause to use the Deep Search. If you find no results on the free offering, there isn&#8217;t any reason to spend the money, even if it is only a dollar or two.</p>
<p>That being said, if you do find a cause to dig deeper, you&#8217;ll likely find Plagium Deep Search to be well worth the cost. </p>
<p><em><strong>Disclosure:</strong> I was given 1.1 million characters of deep searches for free for the purpose of performing this review. This is valued at $10.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismtoday.com/2011/04/13/plagium-introduces-deep-search/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The 3 Uses for Plagiarism Detection Tools</title>
		<link>http://www.plagiarismtoday.com/2011/03/03/the-3-uses-for-plagiarism-detection-tools/</link>
		<comments>http://www.plagiarismtoday.com/2011/03/03/the-3-uses-for-plagiarism-detection-tools/#comments</comments>
		<pubDate>Thu, 03 Mar 2011 20:29:29 +0000</pubDate>
		<dc:creator>Jonathan Bailey</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[Content-Theft]]></category>
		<category><![CDATA[Copyright]]></category>
		<category><![CDATA[Copyright-Infringement]]></category>
		<category><![CDATA[Copyright-Law]]></category>
		<category><![CDATA[copyscape]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[ithenticate]]></category>
		<category><![CDATA[Plagiarism]]></category>
		<category><![CDATA[plagiraism detection]]></category>
		<category><![CDATA[plagium]]></category>
		<category><![CDATA[turnitin]]></category>

		<guid isPermaLink="false">http://www.plagiarismtoday.com/?p=9123</guid>
		<description><![CDATA[Plagiarism detection tools have to serve a variety of functions. Here are the three big ones that you need to be aware of.]]></description>
			<content:encoded><![CDATA[<p><img style=' float: left; padding: 4px; margin: 0 7px 2px 0;'  src="http://www.plagiarismtoday.com/wp-content/uploads/2011/03/detective-full-300x200.jpg" alt="Detective Badge Image" title="Detective Badge" width="300" height="200" class="alignleft size-medium wp-image-9125" /><a href="http://www.plagiarismtoday.com/2011/03/02/my-secret-plagiarism-detection-weapon/">In my last post about WCopyFind</a>, I talked briefly about the different usage scenarios that plagiarism checking tools have to deal with. Each, however, requires a different skill set and, unfortunately, it seems no one tool is ideal for any two situations, much less all of them.</p>
<p>So what are the usage scenarios a plagiarism checker will have to face? There are three overriding scenarios, and any specific case will likely either fall into one of them or combine two or even all three.</p>
<p>These situations highlight why it is important to be aware of the different plagiarism and copy detection tools out there and not just rely on one or two. Just as using a screwdriver is wrong when trying to hammer in a nail, it is important to use the right tool when checking for plagiarism and, to do that, you need to understand the different jobs there are.<span id="more-9123"></span></p>
<h4>1. Verifying Originality</h4>
<p>In this scenario, you are given a piece of content from an unknown origin, whether an essay, a new article, poem, etc. and you need to check and see if the work is original. </p>
<p>This is the situation faced by countless professors, teachers and other educators every day. It&#8217;s also the one faced by editors in newsrooms and for sites across the Web. The goal is to either verify that the work is original or determine if it might be plagiarized.</p>
<p><strong>What it Needs</strong></p>
<p>Generally, for plagiarism checkers in this area, accuracy and breadth of database content are the most crucial things. Such plagiarism checkers don&#8217;t have to find every result, just the one correct result. However, they must be able to return that result to serve any purpose at all.</p>
<p>Speed is slightly less important, though simplicity is crucial, as many of the people reading the reports often know little about the original material or the suspected source content.</p>
<p><strong>Leaders</strong></p>
<p>Currently, <a href="http://www.iparadigms.com/">iParadigms</a> is the undisputed leader in this field with its two main products, <a href="http://turnitin.com/static/index.php">Turnitin</a>, for schools, and <a href="http://www.ithenticate.com/">iThenticate</a> for businesses.</p>
<p><a href="http://safeassign.com/">SafeAssign</a>, which is owned by Blackboard, is a common alternative.</p>
<h4>2. Tracking Content Misuse</h4>
<p>This is the more common situation we talk about on Plagiarism Today. A content creator has written a piece of material they know to be authentic and wants to track how it is being used on the Web. This involves not merely returning one accurate result, but rather, all the results available.</p>
<p><strong>What it Needs</strong></p>
<p>Breadth and accuracy are still important, but are less so. The reason is that there&#8217;s a higher tolerance for false positives, as it is easier to make human judgements when starting with a known authentic source and, generally, there is only an interest in looking on the Web, not databases of academic content.</p>
<p>What is more important is the ability of the checker to return a large number of accurate results and to do so quickly. It&#8217;s not enough for the plagiarism checker to spot misuse and stop; instead, it has to find and report every incident it can.</p>
<p><strong>Leaders</strong></p>
<p>For casual users, <a href="http://copyscape.com">Copyscape</a> and <a href="http://plagium.com">Plagium</a> are likely the best tools. For businesses, services such as <a href="http://attributor.com">Attributor</a> and <a href="http://icopyright.com">iCopyright Discovery</a> are more robust solutions.</p>
<h4>3. In-Depth Plagiarism Analysis</h4>
<p>The final situation is one where one already suspects the work of being a plagiarism and has reduced the field of candidates down to one or a few documents. The checker needs to either confirm those suspicions or give a more accurate picture of just how extensive the plagiarism is.</p>
<p><strong>What it Needs</strong></p>
<p>If you already know where the work was likely plagiarized from, you don&#8217;t need any kind of Internet searching capability. Instead, you can focus on comparing the two documents in depth and that requires a flexible plagiarism checker that can easily sift through the works involved for similarities and produce detailed results.</p>
<p><strong>Leaders</strong></p>
<p><a href="http://plagiarism.phys.virginia.edu/Wsoftware.html">WCopyFind</a> is one of the best-known and most loved apps in this area, though there are also a slew of document comparison tools that can work.</p>
<h4>Bottom Line</h4>
<p>Most people reading this are going to wonder what this means for them. The answer is simple: If you ever find yourself in need of a plagiarism or copy detection tool, it&#8217;s important to stop before making a decision and ask the important question of &#8220;What do I need to do with it?&#8221;</p>
<p>What you need the tool for is going to determine how you&#8217;re going to use it and that, in turn, will determine which tool is likely the best.</p>
<p>There are still other differences between the tools; some seem to work better for certain types of content or plagiarism than others. But when deciding which tool to use, the first thing to consider is the job it will be doing.</p>
<p>Once you know that, the rest of the decision gets much easier.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismtoday.com/2011/03/03/the-3-uses-for-plagiarism-detection-tools/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Case Study: Tracking a Sneaky Plagiarist Poet</title>
		<link>http://www.plagiarismtoday.com/2010/10/05/case-study-tracking-a-sneaky-plagiarist-poet/</link>
		<comments>http://www.plagiarismtoday.com/2010/10/05/case-study-tracking-a-sneaky-plagiarist-poet/#comments</comments>
		<pubDate>Tue, 05 Oct 2010 16:41:33 +0000</pubDate>
		<dc:creator>Jonathan Bailey</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[Content-Theft]]></category>
		<category><![CDATA[Copyright]]></category>
		<category><![CDATA[Copyright-Infringement]]></category>
		<category><![CDATA[Copyright-Law]]></category>
		<category><![CDATA[copyscape]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Plagiarism]]></category>
		<category><![CDATA[plagiarism-detection]]></category>
		<category><![CDATA[plagium]]></category>

		<guid isPermaLink="false">http://www.plagiarismtoday.com/?p=7996</guid>
		<description><![CDATA[Tracking plagiarism is rarely a straightforward task. Sometimes you have to rely on your intuition. This is one of those cases.]]></description>
			<content:encoded><![CDATA[<p><img style=' float: left; padding: 4px; margin: 0 7px 2px 0;'  src="http://www.plagiarismtoday.com/wp-content/uploads/2010/10/mask-sample-300x194.jpg" alt="" title="mask-sample" width="300" height="194" class="alignleft size-medium wp-image-8032" />Miriam Short-Poncer is someone whose judgement I trust. A member of the <a href="http://www.paganspace.net/">PaganSpace</a> community site, Miriam has done a great deal of good work in fighting plagiarism on the site. Back in May, she notified me that she suspected a member of plagiarizing my poetry and tracked me down to my Plagiarism Today account. </p>
<p>So, when she forwarded me a poem posted by another community member, I had no reason to question that she was right.</p>
<p>But what should have been just a routine case of tracking down a potentially plagiarized work turned out to be an interesting case study in plagiarism detection, especially with poetry, and may have highlighted a weakness in Copyscape and a strength in Plagium.</p>
<p>This makes it a case worth reviewing as it may help others in a similar position track their work online and detect infringing works posted to their sites.</p>
<p>With that in mind, here&#8217;s what happened and what we can learn from it.<span id="more-7996"></span></p>
<h4>Tracking a Plagiarist Poet</h4>
<p>PaganSpace, like a lot of other niche communities, allows members to create blogs and profiles and post to forums. Also, like any community that becomes large enough, it has a few people who don&#8217;t post original works and, instead, post the works of others. This is a major violation of the community&#8217;s rules and is handled swiftly.</p>
<p>According to Miriam, one particular member had been behaving very suspiciously, posting images of herself that were clearly professional models (I was able to confirm this in at least two cases), claiming to have written a book and other inconsistencies. So when this person published a poem to her blog, Miriam was instantly suspicious. </p>
<p>However, she was unable to track down the source herself and turned it over to me in hopes that I would have better luck. </p>
<p>I cursed the fact it had to be a poem, knowing first hand how difficult it is to track poetry plagiarism on the Web, but decided to give it a shot. After a quick read to find some unique lines, I started off by doing the same thing she probably did, copying three lines from the poem and pasting them into Google. </p>
<p>Each one turned up negative.</p>
<p>I then decided to try a different technique: I copied and pasted the poem into my Copyscape premium account. Once again, it came up with nothing. </p>
<p><img src="http://www.plagiarismtoday.com/wp-content/uploads/2010/10/poem-copyscape-500x182.jpg" alt="" title="poem-copyscape" width="500" height="182" class="alignnone size-large wp-image-7997" /></p>
<p>At this point, I was beginning to wonder if the work could be genuine or at least so heavily modified that there was no way to track it back. Still, being both her friend and being in her debt for her help, I decided to give it another try.</p>
<p>I read the poem more thoroughly, analyzing it line by line. It had all the hallmarks of a plagiarized piece that had been partly rewritten. There were inconsistencies in the language, odd word choices and shifts in tone. Where most of the poem seemed to be in very plain tones, some of the passages shifted to a wordy, almost formal tone. </p>
<p>Realizing I had selected the wordy passages in my original tests, largely because they were more likely to be unique, I changed strategies and searched for one of the more plain lines. It was there <a href="http://www.best-love-poems.com/poems.php?id=1152538">I hit pay dirt and found this poem</a>.</p>
<p>After a side-by-side comparison of the two works, I could see that there were a lot of differences (the plagiarized copy was barely half the length of the original and had several new lines), but it was clearly a plagiarism. Not only were the first stanzas in both identical, but both contained the same punctuation error, namely the lack of an apostrophe in &#8220;I&#8217;m&#8221;.</p>
<p>Curious, I decided to give <a href="http://plagium.com">Plagium</a> a shot at detecting the poem and it actually came through quite well.</p>
<p><img src="http://www.plagiarismtoday.com/wp-content/uploads/2010/10/poem-plagium-500x174.jpg" alt="" title="poem-plagium" width="500" height="174" class="alignnone size-large wp-image-7998" /></p>
<p>Still, all of this raises some serious questions about why the detection was so difficult and why one of the best plagiarism checkers couldn&#8217;t spot it. After a brief look at the two poems, it was easy to see why.</p>
<h4>Comparing the Works</h4>
<p>What was interesting about this case is that the plagiarized work had a very interesting mix of old content and new content. For example, the first stanzas in both of the works were identical, including the typo (original on left):</p>
<p><img src="http://www.plagiarismtoday.com/wp-content/uploads/2010/10/poem-compare-1-500x89.jpg" alt="" title="poem-compare-1" width="500" height="89" class="alignnone size-large wp-image-8006" /></p>
<p>But the second stanza shows the kind of rewriting done through most of the poem (original on left):</p>
<p><img src="http://www.plagiarismtoday.com/wp-content/uploads/2010/10/poem-compare-2-500x70.jpg" alt="" title="poem-compare-2" width="500" height="70" class="alignnone size-large wp-image-8007" /></p>
<p>The first two lines in the second stanza were combined in the latter work. The second line in the latter matches perfectly the third line in the original and the fourth line of that stanza for the duplicate appears to be original.</p>
<p>That type of rewriting continues for much of the work, combining lines, combining stanzas, adding new material and repurposing many sections wholesale. All in all, about a third of the content is directly lifted from the original, another third is rewritten and the last third is original.</p>
<p>It certainly is not an original work, but it is very difficult to detect as a plagiarism.</p>
<h4>Lessons Learned</h4>
<p>Through it all, there were several lessons that can be gleaned from this very strange and difficult case:</p>
<ol>
<li><strong>Editing a Work is Inefficient for Avoiding Detection:</strong> Though it is possible to edit a work enough to make it impossible to detect its source, doing so requires a great deal of time and work. Already a great deal went into editing this poem and it was still able to be detected, albeit with trouble.</li>
<li><strong>Plagium May Have an Upper Hand:</strong> Though it&#8217;s just one case, it is clear Plagium came through when Copyscape didn&#8217;t. It may be a sign Plagium is better for this type of detection, especially with poetry. <a href="http://www.plagiarismtoday.com/2009/05/07/plagium-a-copyscape-alternative/">This mirrors my own results</a> from when I first learned of the service.</li>
<li><strong>You Can&#8217;t Replace Intuition:</strong> Though technology can help a great deal, nothing replaces intuition. The number of times intuition has failed me is far less than the number of times technology has failed me in this area. There is no substitute for human analysis.</li>
</ol>
<p>Though this case may not be the most difficult I&#8217;ve handled or one of the worst I&#8217;ve seen, it&#8217;s an interesting microcosm of just some of the challenges I face every day in detecting and stopping plagiarism. </p>
<h4>Bottom Line</h4>
<p>In the end, I want to give thanks to Miriam for her permission in using this case as an example. I greatly appreciate the opportunity to talk about the types of things I do all day.</p>
<p>Also, as you may have noticed, I have not linked to or pasted in the plagiarized work, out of respect to the original author (I was not able to contact her without signing up for an account on her poetry site) and because PaganSpace is still completing its investigation. However, the plagiarized work, along with the images I discovered, have been removed and the profile of the person involved has been set to private. The administrator of the site responded very swiftly when notified of these problems.</p>
<p>All in all though, it does seem to be a pretty clear-cut case and, when combined with the use of professional photos to represent herself, it seems almost certain that there is more to this story. However, at this time I&#8217;ve done all that I can and, in the process, created a great case study for helping others (and myself) better track down plagiarists.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismtoday.com/2010/10/05/case-study-tracking-a-sneaky-plagiarist-poet/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>5 Major Changes in the Past 5 Years of Content Theft</title>
		<link>http://www.plagiarismtoday.com/2010/06/15/5-major-changes-in-the-past-5-years-of-content-theft/</link>
		<comments>http://www.plagiarismtoday.com/2010/06/15/5-major-changes-in-the-past-5-years-of-content-theft/#comments</comments>
		<pubDate>Tue, 15 Jun 2010 16:37:31 +0000</pubDate>
		<dc:creator>Jonathan Bailey</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[Punditry]]></category>
		<category><![CDATA[Content-Theft]]></category>
		<category><![CDATA[Copyright]]></category>
		<category><![CDATA[Copyright-Infringement]]></category>
		<category><![CDATA[Copyright-Law]]></category>
		<category><![CDATA[copyscape]]></category>
		<category><![CDATA[DMCA]]></category>
		<category><![CDATA[fairshare]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Plagiarism]]></category>
		<category><![CDATA[plagium]]></category>
		<category><![CDATA[takedown]]></category>

		<guid isPermaLink="false">http://www.plagiarismtoday.com/?p=6878</guid>
		<description><![CDATA[With PT celebrating its 5-year anniversary this week, I'm now taking a look back at five things that have changed since 2005. ]]></description>
			<content:encoded><![CDATA[<p><img style=' float: left; padding: 4px; margin: 0 7px 2px 0;'  src="http://www.plagiarismtoday.com/wp-content/uploads/2010/06/time-simple1.jpg" alt="" title="time-simple" width="278" height="184" class="alignleft size-full wp-image-6881"></p>
<p>Yesterday, <A href="http://www.plagiarismtoday.com/2010/06/14/plagiarism-today-5-years-later/">I talked about the 5-year anniversary of Plagiarism Today</A> and what it has meant for me and the site. It&#8217;s been a great 5 years to say the least, but it has also been an interesting five years for content creators.</p>
<p>When I walked into Plagiarism Today, I was inspired to do so by the spate of human plagiarists, people taking my work and claiming it to be their own (despite a license to freely use it with attribution). That is why this site is named Plagiarism Today and not Copyright or Content Theft Today. However, a lot has changed over those five years and the site has had to adapt and change to those movements.</p>
<p>So, with that in mind, here are the five biggest changes for webmasters and bloggers that I have observed over the past five years and what they mean for those who are interested in tracking and protecting their work on the Web.<span id="more-6878"></span></p>
<p><H4>5. Growing Awareness of the Issue</H4></p>
<p>When I started Plagiarism Today, I had an uphill battle convincing people that plagiarism online was a serious problem. In fact, <A href="http://www.plagiarismtoday.com/2006/02/04/housekeeping-links-and-more/">my first mention on This Week in Tech</A> was a fairly negative one as the hosts didn&#8217;t see the plagiarism issue I did.</p>
<p>Things have changed though and, perhaps the greatest sign is that I am now a recurring guest on This Week in Law, which is on the TWiT network, <A href="http://twit.tv/twil63">including most recently on Episode 63</A>.</p>
<p>Clearly, people are more aware of the issues of content theft and plagiarism on the Web and that has made my job, as well as the job of webmasters, much easier.</p>
<p><H4>4. Hosts Less Cooperative</H4></p>
<p>One discouraging trend I have noticed is that hosts are becoming more and more hostile to dealing with copyright matters. This isn&#8217;t universally true, most hosts that were great still are and some have improved, <A href="http://www.plagiarismtoday.com/2009/04/14/google-accepts-online-dmcas-for-blogger/">such as Google Blogger</A>, but most paid hosts in particular have been aggressive at trying to avoid compliance.</p>
<p>This isn&#8217;t an issue I&#8217;ve faced much personally, likely due to my site being well-known in these circles, but I&#8217;ve been getting increasing reports of hosts being aggressive in trying to not comply with notices. In one recent case, a host even accepted a counter-notice before the takedown notice was filed, in violation of the protocol.</p>
<p>I&#8217;ll have more on this problem in the future.</p>
<p><H4>3. Social Networking Boon</H4></p>
<p><A href="http://www.bizreport.com/2007/11/facebook_shows_125_growth_year_over_year.html#">In 2006 Facebook had a mere 8 million unique visitors</A> and had just opened for public use. In 2009, Facebook <A href="http://www.facebook.com/press/info.php?statistics">has an estimated 200 million unique visitors log in every single day</A>.</p>
<p>This has had a huge impact not just on how people use the Internet, but where they post works and the types of copyright infringement that are more common. Where, five years ago, forums and free blogging sites were the most common sources of human plagiarism, today it&#8217;s Facebook and other social networking sites. </p>
<p>This is an example of a shift in the broader Web having a dramatic impact on the way content is used (and misused).</p>
<p><H4>2. Rise (and Fall) of Scraping</H4></p>
<p>Much of the initial interest in Plagiarism Today was not generated by human plagiarists but by automated spammers who were scraping RSS feeds, sending people to this site to find ways to stop them. </p>
<p>When I started PT, full RSS scraping was fairly rare, <A href="http://www.plagiarismtoday.com/2005/11/03/splogs-plagiarism-en-masse/">though I did mention it first within a few months of starting the site</A>, but it became extremely common between 2006 and 2008. However, the method has fallen out of favor with many spammers since then, in part due to improved duplicate content filters and also in part due to copyright complaints from bloggers.</p>
<p>Though there are still plenty of RSS scrapers out there, other types of spam blogging are increasing in popularity, including scraping search engine results, truncated feed scraping and content generation. In short, full-feed RSS scraping, though common, is losing some favor and the newer methods skirt most copyright issues. </p>
<p>However, from what I am seeing, human plagiarism is on the rise again, meaning that there is still plenty of work for me to do. </p>
<p><H4>1. Improved Technology/Tools</H4></p>
<p>In 2005, the best tool for tracking your content was <A href="http://google.com/alerts">Google Alerts</A> and Copyscape was still fairly new. So much has changed in 5 years that it is difficult to put it into words. </p>
<p>For one, we have great new tools for tracking blog content, including free services such as <A href="https://fairshare.attributor.com/fairshare/">FairShare</A>, Copyscape alternatives such as <A href="http://plagium.com">Plagium</A> and even free tools for tracking images, such as <A href="http://tineye.com">Tineye</A>. This doesn&#8217;t count the spate of more traditional plagiarism checkers, the licensing applications and non-repudiation tools that register and datestamp copyrighted works.</p>
<p>The tools available today put what was available in 2005 to shame and give me a great deal of trouble trying to stay on top of all the changes. If you&#8217;re a creator of content and eager to track your work, now is a great time to be active on the Web and it seems poised only to get better.</p>
<p><H4>Bottom Line</H4></p>
<p>The past five years have been a period of rapid change for webmasters in this area and the next five will be the same. However, I am curious about what you think will happen over the next few years in this area.</p>
<p>Please leave your comments below or drop me a line to send me your thoughts.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismtoday.com/2010/06/15/5-major-changes-in-the-past-5-years-of-content-theft/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>5 Reasons Google is My Primary Plagiarism Checker</title>
		<link>http://www.plagiarismtoday.com/2010/02/09/5-reasons-google-is-my-primary-plagiarism-checker/</link>
		<comments>http://www.plagiarismtoday.com/2010/02/09/5-reasons-google-is-my-primary-plagiarism-checker/#comments</comments>
		<pubDate>Tue, 09 Feb 2010 18:19:47 +0000</pubDate>
		<dc:creator>Jonathan Bailey</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[Attributor]]></category>
		<category><![CDATA[Content-Theft]]></category>
		<category><![CDATA[Copyright]]></category>
		<category><![CDATA[Copyright-Infringement]]></category>
		<category><![CDATA[Copyright-Law]]></category>
		<category><![CDATA[copyscape]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[icopyright]]></category>
		<category><![CDATA[Plagiarism]]></category>
		<category><![CDATA[plagiarism checker]]></category>
		<category><![CDATA[plagium]]></category>

		<guid isPermaLink="false">http://www.plagiarismtoday.com/?p=5530</guid>
		<description><![CDATA[With all of the powerful tools out there for detecting plagiarism, is it possible Google is still the best?]]></description>
			<content:encoded><![CDATA[<p><img style=' float: left; padding: 4px; margin: 0 7px 2px 0;'  alt="Google&#039;s Logo" src="http://files.plagiarismtoday.com/wp-content/uploads/2010/02/google-logo4.jpg" title="Google Logo" class="alignleft" width="304" height="119"></p>
<p>Whether you are a writer looking for plagiarized copies of your work or a teacher/professor checking academic papers for plagiarism, Google is your friend.</p>
<p>Google provides, by far, the easiest way to perform quick plagiarism checks, whether to find out if a work is plagiarized or has been the victim of plagiarism. It does so for free and in a very robust way.</p>
<p>Though there are a lot of great tools out there with many great uses, Google remains my first stop for plagiarism checks in most cases as it is simply faster, cheaper and more accurate than most other tools.</p>
<p>Though you shouldn&#8217;t use it exclusively and definitely should not shy away from using additional tools, you need Google in your arsenal and you need to learn how to use it well. Otherwise, you may find yourself spending more time and money than needed while not getting the results you desire.<span id="more-5530"></span></p>
<h4>Why Google</h4>
<p>When deciding where to start with your plagiarism check, consider the five following reasons to start with Google:</p>
<ol>
<li><strong>Human Analysis is Best:</strong> It is pretty trivial for a human to find a statistically improbable phrase that is likely to be reused. Some plagiarism checkers don&#8217;t ignore quoted and cited content and all search for content that is likely repeated without plagiarism. This means a few seconds spent on the front end finding a good phrase can save hours on the backend filtering through false positives. Furthermore, over-reliance on more automated systems can result in users taking the results as gospel and not performing adequate human evaluation. This can be a tremendous mistake.</li>
<li><strong>Immediate, Accessible and Free:</strong> Even a complicated Google search is returned within a few seconds. Some other checkers take days to process matches, while even the faster ones usually take a few minutes, which hinders their usefulness in checking hunches. Also, Google is free to use and is available anywhere you have an Internet connection, even via your phone. The service that fits in your schedule and budget is the one you will use and if you don&#8217;t use a plagiarism checker, it can do no good at all.</li>
<li><strong>Accuracy:</strong> In my experience, Google produces far fewer false positives than even more advanced plagiarism checkers. It also has a very large database with billions of pages, including PDFs, Word files and other non-HTML formatted content. It also updates in very close to real time with Google News and blog search, making it great for finding instances of plagiarism that take place quickly after publication.</li>
<li><strong>It&#8217;s What You Care About:</strong> If your work is plagiarized and the plagiarism isn&#8217;t in Google, does it exist? It&#8217;s a valid question and, if you&#8217;re a content creator worried about SEO, the answer is probably no. Other checkers that don&#8217;t work off Google&#8217;s database may cause you to spend time and resources on leads that don&#8217;t matter. Other databases are usually slower to update. Also, Google tends to do a good job of prioritizing matches for you, starting with those that are more important. Finally, Google, in my experience, is the most popular means for students to plagiarize their work, making it a logical tool to backtrack any suspected plagiarism.</li>
<li><strong>It&#8217;s Dead Simple:</strong> Everyone knows how to do a Google search. Not everyone knows how to format a paper for submission to another service. It&#8217;s a method anyone can use with almost no training at all, including those easily intimidated by technology.</li>
</ol>
<p>In short, Google is easy to use, very fast and provides very accurate, broad results for the total price of free. Though it isn&#8217;t the perfect plagiarism checker by any stretch, when others ask me to quickly check a work for them, it is where I usually start. If something trips my sensors, I will oftentimes use another checker, such as Plagium or Copyscape, to drill down deeper. </p>
<p>That said, no slight toward other plagiarism checkers is intended; in fact, there are many legitimate needs that they fill.</p>
<h4>Google&#8217;s Limitations</h4>
<p>As great as Google is, there are still limitations to what it can do and those limitations are often filled very well through other services. Consider the following:</p>
<ol>
<li><strong>Organization and Resolution Assistance:</strong> Google simply provides results, it is up to you to organize them and take action on them. Services like <a href="http://attributor.com">Attributor</a> and <a href="http://icopyright.com">iCopyright Conductor</a>, which are aimed at larger content creators, and <a href="http://turnitin.com/static/index.html">Turnitin</a> and <a href="http://www.safeassign.com/">SafeAssign</a>, which are aimed at schools, provide that organization. This makes managing large case loads much more bearable.</li>
<li><strong>Additional Sources:</strong> Plagiarism checkers that specialize in academic environments, including Turnitin, include additional databases that are not available to Google, including private article databases and research papers.</li>
<li><strong>Full-Work Matching:</strong> Though Google is great for quick checks and finding potential matching pages, determining what content is matching and which isn&#8217;t is a headache by hand. More robust checkers will highlight the duplicate content and make it easy to see at-a-glance what has been copied. Plagiarism checkers such as <a href="http://www.copyscape.com">Copyscape</a>, which is based on Google, and <a href="http://plagium.com">Plagium</a> are natural additions to Google in this area. Also, collusion detection such as <a href="http://www.plagiarism.phys.virginia.edu/Wsoftware.html">WCopyFind</a> can check two suspect documents, such as one Google suspects, and highlight matching portions.</li>
</ol>
<p>In short, these tools have a time and a place. I still recommend them highly and use them widely depending on the project and situation. However, they do some of their best work after Google or another search engine has alerted the searcher to the possibility of plagiarism and a deeper look is needed to determine how significant the potential infraction is.</p>
<h4>Bottom Line</h4>
<p>When someone asks me to check and see if a work is plagiarized, especially if they are wanting me to see if the work appears anywhere else on the Web, I usually turn to Google first. Though other checkers are great, Google simply does the best job of letting me know how much copying the work has seen, who the most important infringers/likely sources are and if further research is needed.</p>
<p>Unless Google alerts me that there is a likely problem, I know that other services will most likely be a waste of time that will possibly have me swimming through false positives or simply waiting for results. All in all, it is time lost that could be better spent elsewhere. </p>
<p>For most searches, Google is my primary tool of choice. Though it isn&#8217;t usually the last word on whether or not a work has been plagiarized, it tells me what I need to know and helps me better determine what I need to do next. It is my first choice for plagiarism checker, the default tool I reach for, but that doesn&#8217;t make it the only one I use.</p>
<p>Regardless, learning how to use Google for plagiarism detection and learning how to use it well should be the first priority for anyone wanting to find duplicate content, whether of their own work or to detect plagiarism in others&#8217;. Without it, you won&#8217;t be as effective at plagiarism detection nor as able to perform the task.</p>
<p>Simply put, relying on a plagiarism checker to make decisions for you is a poor move, especially with the danger of false positives. Human judgement is the best and Google lets you exercise it some before bringing in the bigger guns.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismtoday.com/2010/02/09/5-reasons-google-is-my-primary-plagiarism-checker/feed/</wfw:commentRss>
		<slash:comments>15</slash:comments>
		</item>
		<item>
		<title>Plagium: A Copyscape Alternative</title>
		<link>http://www.plagiarismtoday.com/2009/05/07/plagium-a-copyscape-alternative/</link>
		<comments>http://www.plagiarismtoday.com/2009/05/07/plagium-a-copyscape-alternative/#comments</comments>
		<pubDate>Thu, 07 May 2009 19:05:28 +0000</pubDate>
		<dc:creator>Jonathan Bailey</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[Products]]></category>
		<category><![CDATA[Content-Theft]]></category>
		<category><![CDATA[copy detection]]></category>
		<category><![CDATA[copygator]]></category>
		<category><![CDATA[Copyright-Infringement]]></category>
		<category><![CDATA[copyscape]]></category>
		<category><![CDATA[fairshare]]></category>
		<category><![CDATA[Plagiarism]]></category>
		<category><![CDATA[plagium]]></category>

		<guid isPermaLink="false">http://www.plagiarismtoday.com/?p=3418</guid>
		<description><![CDATA[A new plagiarism service promises to shake up the scene by providing a solid competitor to Copyscape. But can it hold up?]]></description>
			<content:encoded><![CDATA[<p><img style=' float: left; padding: 4px; margin: 0 7px 2px 0;'  src="http://files.plagiarismtoday.com/wp-content/uploads/2009/05/plagium-logo-300x71.jpg" alt="plagium-logo" title="plagium-logo" width="300" height="71" class="alignleft size-medium wp-image-3419" /></p>
<p>When it comes to tracking content across the Web, <a href="http://www.copyscape.com">Copyscape</a> is, for the most part, the brand name to know.</p>
<p>This reputation has been very well earned. They recently took <a href="http://www.plagiarismtoday.com/2008/11/04/copyscape-tops-plagiarism-checker-testing/">top honors in a round of plagiarism checker testing services</a>, which put them against several much more expensive services.</p>
<p>However, competitors have begun to emerge. Some, such as <a href="http://fairshare.cc">FairShare</a> offer <a href="http://www.plagiarismtoday.com/2008/11/04/copyscape-tops-plagiarism-checker-testing/">more features and more free results</a> and others, such as <a href="http://www.copygator.com">CopyGator</a>, <a href="http://www.plagiarismtoday.com/2009/01/20/copygator-a-game-changer/">offer great convenience</a>. Despite this, especially for static content, Copyscape has remained the gold standard.</p>
<p>But a new service hopes to provide a new challenge. <a href="http://www.plagium.com/index.cfm?mode=text">Plagium</a>, a copy detection system by <a href="http://www.septetsystems.com/">Septet Systems</a>, provides a very similar service to Copyscape but adds additional free features and uses Yahoo! rather than Google to perform its searches.</p>
<p>The question is how it stacks up. To measure that, I put the service through a battery of tests, using my well-copied and plagiarized literary works as the measuring stick.<span id="more-3418"></span></p>
<h4>About Plagium</h4>
<p>The comparisons between Plagium and Copyscape are obvious. However, the default interface of Plagium does not ask for a URL to be checked, as Copyscape does, but offers a textbox for pasting your text. Though this is less convenient, in my experience it actually provides better results, as the plagiarism checker examines only the content, not the surrounding text (navigation, footer, etc.).</p>
<p>However, if you prefer the convenience of just providing the URL, you can click the &#8220;Check URL&#8221; link and get a more Copyscape-like interface.</p>
<p>Plagium&#8217;s results add an interesting new feature called the &#8220;Timeline&#8221;, which shows roughly when the various reuses went online. This lets you prioritize your actions based upon either the most recent or the least current matches. However, as neat as the feature is, it can get cluttered on works that have a lot of copies and it isn&#8217;t exactly clear in the beginning what all of the elements mean, especially the sizes of the bubbles.</p>
<p><img src="http://files.plagiarismtoday.com/wp-content/uploads/2009/05/timeline-2.jpg" alt="timeline-2" title="timeline-2" width="450" height="168" class="alignnone size-full wp-image-3430" /></p>
<p>However, the most powerful feature of Plagium is its alert system. If you register for a free account, you can have the service track your text and alert you in a weekly email to any new copies it finds. You can also subscribe to an RSS feed of the results. </p>
<p>With this feature, Plagium becomes something of a FairShare targeted at static content. Where FairShare requires an RSS feed to parse (<a href="http://www.associatedcontent.com/article/1657226/how_to_create_a_custom_google_reader.html">though there are hacks that can be used to get static content into the system</a>), this can work on any text that can be pasted into the system.</p>
<p>What is amazing about this is that Copyscape offers only the URL search and ten results for free. <a href="http://copyscape.com/signup.php?pro=1&#038;o=f">Its paid accounts</a>, at five cents a search, allow users to paste text and receive unlimited results. They also <a href="http://copyscape.com/copysentry.php">provide a sentry service</a>, which monitors 10 pages once a week for about $5 per month. </p>
<p>However, Plagium currently offers all of these features for free. A representative for the company said that they are providing it for free to &#8220;attract paying customers for custom information tracking system development work,&#8221; though the site does also accept donations.</p>
<p>But not much of this matters if the plagiarism detection isn&#8217;t up to code. So I decided to put the system to a quick test to see how it handles some of my most plagiarized works.</p>
<h4>The Tests</h4>
<p>For the purpose of this test I ran five of my works through both Plagium and Copyscape (using the text paste feature) and, as a baseline, ran a statistically improbable phrase from each work through Google. </p>
<p>In each case I looked and attempted to verify that at least most of the results were not false positives. However, it is possible that there are some non-matches or additional duplicates included within the mix.</p>
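<p>For readers who want to reproduce the Google baseline, the sketch below shows one way to turn a run of words from a work into an exact-phrase search URL. Choosing a phrase that is actually improbable is still a judgment call made by eye; the function name <code>exact_phrase_query</code> and its <code>start</code> and <code>length</code> parameters are hypothetical and only pick out which words to quote.</p>

```python
from urllib.parse import quote_plus


def exact_phrase_query(text, start=0, length=8):
    """Build a Google search URL for a quoted run of words from text."""
    words = text.split()
    phrase = " ".join(words[start:start + length])
    # Double quotes ask the search engine for the exact phrase.
    return "https://www.google.com/search?q=" + quote_plus('"' + phrase + '"')
```

<p>Pasting the resulting URL into a browser runs the same quoted search you would otherwise type by hand.</p>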
<p>The results of the tests are below:</p>
<p><strong>Poem 1</strong></p>
<p>The first poem was a 224-word poem that was known to be widely plagiarized.</p>
<table cellspacing=10>
<tr>
<td><strong>Plagium</strong></td>
<td><strong>Copyscape</strong></td>
<td><strong>Google</strong></td>
</tr>
<tr>
<td>34</td>
<td>29</td>
<td>351</td>
</tr>
</table>
<p>The first test showed that Plagium found approximately 17% more matches than Copyscape. Copyscape, for example, did not find my own site, though Plagium listed it first. The page is listed in Google. </p>
<p>Still, the Google results trumped both very handily and provided a large number of additional results. However, the actual number of results is far lower than the number provided, as it appears many of the Google results were duplicates where the same page had multiple URLs.</p>
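<p>One way to get a rough handle on that duplication is to collapse URLs that probably point at the same page before counting. The sketch below is only an approximation under stated assumptions: it treats two URLs as the same page when they share a host (ignoring a leading <code>www.</code>) and a path (ignoring case, a trailing slash, the query string and the fragment). The names <code>page_key</code> and <code>unique_pages</code> are hypothetical, and sites that select pages through the query string would be wrongly merged.</p>

```python
from urllib.parse import urlsplit


def page_key(url):
    """Reduce a URL to a rough identity: host without www, path without trailing slash."""
    parts = urlsplit(url.lower())
    host = parts.netloc
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def unique_pages(urls):
    """Keep the first URL seen for each page key, preserving order."""
    seen = {}
    for url in urls:
        seen.setdefault(page_key(url), url)
    return list(seen.values())
```

<p>Comparing the length of the original list with the length of <code>unique_pages(urls)</code> gives an estimate of how inflated a raw result count is.</p>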
<p><strong>Poem 2</strong></p>
<p>The second poem is a 279-word poem also known to be heavily plagiarized.</p>
<table cellspacing=10>
<tr>
<td><strong>Plagium</strong></td>
<td><strong>Copyscape</strong></td>
<td><strong>Google</strong></td>
</tr>
<tr>
<td>21</td>
<td>9</td>
<td>201</td>
</tr>
</table>
<p>In this test, Plagium outperformed Copyscape by over 100%. However, Plagium does suffer from some duplication issues. For example, my site has two pages listed with the work on it though, once again, it doesn&#8217;t appear at all in Copyscape. However, even with this, there are far more unique results in Plagium.</p>
<p>Google once again trumped both of them, but the duplication in Google makes its count useful only as a baseline, not as an exact number.</p>
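<p>The percentages quoted for the two poems are simple relative differences between the Plagium and Copyscape counts. As a quick check, the helper below (the name <code>percent_more</code> is just for illustration) reproduces them from the numbers in the tables.</p>

```python
def percent_more(a, b):
    """Percentage by which count a exceeds count b."""
    return (a - b) / b * 100


print(round(percent_more(34, 29), 1))  # Poem 1: Plagium 34, Copyscape 29
print(round(percent_more(21, 9), 1))   # Poem 2: Plagium 21, Copyscape 9
```

<p>This works out to roughly 17% for the first poem and about 133% for the second, which is where &#8220;approximately 17%&#8221; and &#8220;over 100%&#8221; come from.</p>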
<p><strong>Story 1</strong></p>
<p>For this test I used a 1,550-word short story with very limited reuse. </p>
<table cellspacing=10>
<tr>
<td><strong>Plagium</strong></td>
<td><strong>Copyscape</strong></td>
<td><strong>Google</strong></td>
</tr>
<tr>
<td>5*</td>
<td>1</td>
<td>5*</td>
</tr>
</table>
<p>(*)In this test all three essentially tied. The difference between the 5s from Plagium and Google and the 1 from Copyscape was the four matches they found on my site. All three found the exact same reuse, which is a legitimate copy of the work on another site.</p>
<p>In this case, all three performed the same.</p>
<p><strong>Prose 1</strong></p>
<p>For this test, I used a 785-word short story with a modest amount of known reuse.</p>
<table cellspacing=10>
<tr>
<td><strong>Plagium</strong></td>
<td><strong>Copyscape</strong></td>
<td><strong>Google</strong></td>
</tr>
<tr>
<td>6</td>
<td>10</td>
<td>41</td>
</tr>
</table>
<p>In this case, Copyscape was the clear winner. Not only did Plagium return fewer results, but the six results were really just 2 as 4 results were from my site and the other 2 from the same forum. Copyscape, on the other hand, delivered 10 matches, at least 4 of which were unique.</p>
<p>Google&#8217;s results, on the other hand, contained 20-25 duplicates, making its number closer to the mid 20s.</p>
<p><strong>Prose 2</strong></p>
<p>For this test I used a 202-word prose piece with a moderate amount of known plagiarism.</p>
<table cellspacing=10>
<tr>
<td><strong>Plagium</strong></td>
<td><strong>Copyscape</strong></td>
<td><strong>Google</strong></td>
</tr>
<tr>
<td>4</td>
<td>1</td>
<td>26</td>
</tr>
</table>
<p>In this case, Plagium found three unique matches, including my site, that were not in Copyscape. Google did find more matches than both, but once again there was a serious duplication issue. At least nine items in Google&#8217;s results were duplicates, meaning that the number is closer to 15-18 results.</p>
<p>Still, this was a clear case where Plagium found results that Copyscape missed.</p>
<h4>Results</h4>
<p>In all five tests, Google returned at least as many results as both Plagium and Copyscape. However, it contained a very high number of duplicate results and the benefit was likely minimal. In the contest between Plagium and Copyscape, Plagium found more matches in three of the tests, Copyscape did better in one and they tied in one.</p>
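<p>The head-to-head tally between Plagium and Copyscape can be reproduced from the tables above with a few lines of Python. One caveat: on raw counts alone Plagium would come out ahead in four tests, so this sketch subtracts the four own-site matches from Plagium&#8217;s Story 1 total, which is the adjustment that makes that test a tie. The names <code>results</code>, <code>own_site</code> and <code>tally</code> are hypothetical, and no other own-site adjustments are applied.</p>

```python
# Result counts copied from the five tables above.
results = {
    "Poem 1": {"Plagium": 34, "Copyscape": 29},
    "Poem 2": {"Plagium": 21, "Copyscape": 9},
    "Story 1": {"Plagium": 5, "Copyscape": 1},
    "Prose 1": {"Plagium": 6, "Copyscape": 10},
    "Prose 2": {"Plagium": 4, "Copyscape": 1},
}

# Story 1: four of the five Plagium matches were the author's own site.
own_site = {"Story 1": 4}


def tally(results, own_site):
    """Count which service returned more matches in each test."""
    wins = {"Plagium": 0, "Copyscape": 0, "tie": 0}
    for test, counts in results.items():
        plagium = counts["Plagium"] - own_site.get(test, 0)
        copyscape = counts["Copyscape"]
        if plagium == copyscape:
            wins["tie"] += 1
        elif plagium > copyscape:
            wins["Plagium"] += 1
        else:
            wins["Copyscape"] += 1
    return wins


print(tally(results, own_site))
```

<p>With that single adjustment the tally comes out to three tests for Plagium, one for Copyscape and one tie.</p>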
<p>It appeared to me that Copyscape was not producing the number of matches it once did. The second poem, for example, is the same one I used when <a href="http://www.plagiarismtoday.com/2007/10/02/copyscape-improved-again/">comparing Copyscape to itself in 2007</a>. In that testing, it first found no results, then ten results, then 31. With today&#8217;s test, it found 9 even though the actual number of copies has remained fairly flat. </p>
<p>Whether this is because Copyscape does not work as well with pasted text (the first tests were done with the URL function) or because changes have limited the results it is producing, it is clear that it is not as effective as it once was for finding all of the results for a work.</p>
<p>However, it is important to note that this is far from a comprehensive comparison of the two services. These are just five very limited cases. Everyone else&#8217;s mileage will vary. </p>
<h4>Bottom Line</h4>
<p>In the end, Plagium&#8217;s results were very solid and it actually performed better than Copyscape in most tests. Whether this is a fluke or a sign of something greater remains to be seen.</p>
<p>However, since Plagium is completely free, there&#8217;s no harm in trying it out and I actively encourage you to do so. You can also experiment with the alerts feature and see if it works well for your content (I haven&#8217;t seen any results yet in the few that I set up). </p>
<p>Though I&#8217;m not ready to recommend Plagium as the sole plagiarism checker one should use, and don&#8217;t think I&#8217;ll ever reach that point with any product, it is a very solid addition that pulls in some very competitive matching numbers.</p>
<p>If Plagium isn&#8217;t a part of your plagiarism detection toolbox, it should be. The results are solid from what I&#8217;ve seen, the features are very powerful and, best of all, it is completely free. You can&#8217;t ask for much more out of a plagiarism checker.</p>
<p>Personally, I&#8217;ll probably start relying more on Plagium for my static content and continue to use FairShare for items already within an RSS feed. This works well with the intentions and limitations of the two services. </p>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismtoday.com/2009/05/07/plagium-a-copyscape-alternative/feed/</wfw:commentRss>
		<slash:comments>28</slash:comments>
		</item>
	</channel>
</rss>

