<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>plagiarism-detection | Plagiarism Today</title>
	<atom:link href="http://www.plagiarismtoday.com/tag/plagiarism-detection/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.plagiarismtoday.com</link>
	<description>Content Theft, Plagiarism, Copyright Infringement</description>
	<lastBuildDate>Mon, 13 Feb 2012 06:51:37 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>The Limitation of Every Plagiarism Checker</title>
		<link>http://www.plagiarismtoday.com/2011/12/07/the-limitation-of-every-plagiarism-checker/</link>
		<comments>http://www.plagiarismtoday.com/2011/12/07/the-limitation-of-every-plagiarism-checker/#comments</comments>
		<pubDate>Wed, 07 Dec 2011 18:34:47 +0000</pubDate>
		<dc:creator>Jonathan Bailey</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[Content-Theft]]></category>
		<category><![CDATA[Copyright]]></category>
		<category><![CDATA[Copyright-Infringement]]></category>
		<category><![CDATA[Copyright-Law]]></category>
		<category><![CDATA[copyscape]]></category>
		<category><![CDATA[Plagiarism]]></category>
		<category><![CDATA[plagiarism-detection]]></category>
		<category><![CDATA[plagium]]></category>
		<category><![CDATA[turnitin]]></category>
		<category><![CDATA[wcopyfind]]></category>

		<guid isPermaLink="false">http://www.plagiarismtoday.com/?p=11985</guid>
		<description><![CDATA[As teachers and content creators rely more and more on plagiarism detection, they often lose sight of just how limited even the best tools are...]]></description>
			<content:encoded><![CDATA[<p><img style=' float: left; padding: 4px; margin: 0 7px 2px 0;'  src="http://www.plagiarismtoday.com/wp-content/uploads/2011/12/turnitin-logo.jpg" alt="Turnitin Logo" title="Turnitin Logo" class="alignleft size-full wp-image-11993" />When it comes to plagiarism, technology has been both a blessing and a curse. Though it has made it easier than ever to find and copy work from others without attribution, it&#8217;s also made it easier to track and handle plagiarism when it happens.</p>
<p>With tools that can search billions of documents in seconds and can find matches only a few words in length, it might seem as if plagiarism would be as easily detected as finding information in Google: a matter of merely punching in your query and going through the results.</p>
<p>Unfortunately, that isn&#8217;t the case.</p>
<p>Plagiarism detectors have a huge limitation and one that isn&#8217;t likely to go away any time soon. That limitation is, simply put, that plagiarism detectors can&#8217;t actually detect plagiarism and, instead, do something very different altogether.<span id="more-11985"></span></p>
<h4>How Plagiarism Detection Works</h4>
<p>This problem might seem a bit odd to those unfamiliar with the technology. After all, dishwashers wash dishes and car starters start cars, but plagiarism detectors don&#8217;t actually detect plagiarism. </p>
<p>Instead, what they actually detect is sections of identical text. Though there is a variety of techniques for doing this, the end results are pretty much always the same. A plagiarism detection service looks for matching strings of words between the document it&#8217;s looking at and the ones it has in its index. This is true for a local plagiarism checker, such as <a href="http://plagiarism.bloomfieldmedia.com/z-wordpress/software/wcopyfind/">WCopyFind</a>, search engine-based systems such as <a href="http://www.copyscape.com">Copyscape</a> and <a href="http://www.plagium.com">Plagium</a> and high-end systems such as <a href="https://turnitin.com">Turnitin</a>.</p>
<p>They all work on the same principle and basically function much like we would expect Google or another search engine to work, finding the words we want in other sources and providing the best results they can.</p>
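<p>To make the &#8220;matching strings of words&#8221; idea concrete, here is a minimal sketch in Python. It is not how any of the services named above is actually implemented; the function names, the five-word window and the sample texts are all assumptions chosen for illustration.</p>

```python
import re


def shingles(text, n=5):
    """Return the set of n-word sequences ("shingles") found in text."""
    # Lowercase and keep only word characters so punctuation does not block a match.
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def matching_phrases(submission, source, n=5):
    """Return the n-word phrases that appear in both documents."""
    return shingles(submission, n).intersection(shingles(source, n))


# Hypothetical sample texts, used only to exercise the functions.
source = "It was the best of times, it was the worst of times, it was the age of wisdom"
submission = "My essay opens like this: it was the best of times, it was the worst of times, or so I felt."

for phrase in sorted(matching_phrases(submission, source)):
    print(phrase)
```

<p>Note that the sketch only reports overlapping wording. Whether that overlap is a quotation, a coincidence or plagiarism is not something the code can know.</p>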
<p>While this makes them powerful tools (doing the same comparison by hand would be impossible given all of the sources these tools can check), it does mean that they have some tremendous blind spots. </p>
<p>However, those blind spots are only a problem if people aren&#8217;t aware or don&#8217;t believe that they are there. Then they become huge issues that can lead to both false positives and false negatives.</p>
<h4>The Limitations of Plagiarism Detection</h4>
<p>Since plagiarism detection tools can only detect copying, or more specifically similar phrases, there are two areas where they are particularly weak.</p>
<ol>
<li><strong>Non-Verbatim Plagiarism:</strong> Plagiarism that involves rewriting, translating or otherwise redrafting the text can&#8217;t be detected. This can be difficult to get away with as most plagiarism detectors are extremely sensitive, but since plagiarism detectors don&#8217;t analyze the content of the work, just the words, they can&#8217;t see if you lifted the idea or information if you didn&#8217;t also lift the words. This is a common problem in academia, which treats this kind of plagiarism as seriously as verbatim plagiarism.</li>
<li><strong>Common Phrasing/Attributed Use:</strong> Second, though many plagiarism checkers will make an attempt to separate out attributed use, given the variety of attribution styles it isn&#8217;t always possible. Also, given how common some phrases are in the English language, many plagiarism checkers will report matches that are actually just coincidence.</li>
</ol>
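<p>Both weaknesses are easy to reproduce with a toy matcher. The sketch below is an illustration only: the four-word window, the function names and the sample sentences are assumptions, and real services are considerably more sophisticated.</p>

```python
import re


def word_runs(text, n=4):
    """Return every run of n consecutive words in text, as tuples."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_runs(a, b, n=4):
    """Return the n-word runs that two texts have in common."""
    return word_runs(a, n).intersection(word_runs(b, n))


# Blind spot 1: a paraphrase keeps the idea but shares no four-word run.
original = "Plagiarism detectors compare strings of words, not the ideas behind them."
paraphrase = "Software that hunts for copying looks at wording and ignores the underlying concepts."
print(len(shared_runs(original, paraphrase)))

# Blind spot 2: two unrelated sentences match on a stock phrase.
essay_a = "At the end of the day, the committee approved the budget."
essay_b = "We were exhausted at the end of the day and went home."
print(sorted(shared_runs(essay_a, essay_b)))
```

<p>The first pair is the false negative (same idea, no match) and the second is the false positive (a match with nothing copied), which is why a person still has to read the results.</p>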
<p>In short, plagiarism detection tools are just machines and they can make mistakes. However, that is true of any tool. You don&#8217;t discard Microsoft Word, for example, because you can make a typo in it.</p>
<p>Also, like any other tools, plagiarism checkers are useless without humans to use them intelligently, which is the biggest problem such tools have.</p>
<h4>The Human Element</h4>
<p>The answer to all of this is simple: the decision as to what is and what is not plagiarism should be left to human beings. Humans are the only ones who can detect non-verbatim plagiarism and the only ones who can make determinations about the likelihood that the matches are coincidence and whether the attribution was adequate or not.</p>
<p>Professors who have a hard rule about papers not being more than X% matching or authors who don&#8217;t let others copy more than X number of words before seeking legal action aren&#8217;t fighting plagiarism, but are doing more to confuse the issue.</p>
<p>While bright line rules are always tempting because they are easy to remember and follow, with plagiarism, there are few such rules and you can&#8217;t turn your judgment over to a machine.</p>
<h4>Bottom Line</h4>
<p>None of this is meant as a slight to any of these tools. I use all of the tools listed regularly and am grateful for the valuable service they provide. The problem doesn&#8217;t lie with the technology, but with those who treat these tools as magical solutions that are capable of making perfect judgments about plagiarism.</p>
<p>They are anything but.</p>
<p>As tempting as it is to turn over our judgment on plagiarism matters to the machines, it simply doesn&#8217;t work. Not only will a lot of plagiarism go undetected, but a lot of people will be accused falsely.</p>
<p>Though plagiarism detection tools are a part of the solution, they have to be used in tandem with human judgment and discretion to do any good.</p>
<p>If used correctly, a plagiarism detection service will alert someone to the possibility of plagiarism, not to its actual existence.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismtoday.com/2011/12/07/the-limitation-of-every-plagiarism-checker/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Why Plagiarism is on the Rise</title>
		<link>http://www.plagiarismtoday.com/2011/11/11/why-plagiarism-is-on-the-rise/</link>
		<comments>http://www.plagiarismtoday.com/2011/11/11/why-plagiarism-is-on-the-rise/#comments</comments>
		<pubDate>Fri, 11 Nov 2011 19:24:33 +0000</pubDate>
		<dc:creator>Jonathan Bailey</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[Content-Theft]]></category>
		<category><![CDATA[Copyright]]></category>
		<category><![CDATA[Copyright-Infringement]]></category>
		<category><![CDATA[Copyright-Law]]></category>
		<category><![CDATA[Plagiarism]]></category>
		<category><![CDATA[plagiarism-detection]]></category>

		<guid isPermaLink="false">http://www.plagiarismtoday.com/?p=11775</guid>
		<description><![CDATA[While it's clear that plagiarism is on the rise, what is less clear is why. However, the answers are surprisingly easy to see.]]></description>
			<content:encoded><![CDATA[<p><img style=' float: left; padding: 4px; margin: 0 7px 2px 0;'  src="http://www.plagiarismtoday.com/wp-content/uploads/2011/11/discipline-sample-image-166x250.jpg" alt="Image of Student Discipline" title="Student Discipline Image" width="166" height="250" class="alignleft size-medium wp-image-11777" />The research, sadly, is pretty clear. <a href="http://www.zdnet.com/blog/igeneration/college-plagiarism-on-the-rise-blame-the-web-or-blame-the-student/12528">Academic plagiarism is on the rise</a>. Even back in 2000, <a href="http://www.zdnet.com/blog/igeneration/college-plagiarism-on-the-rise-blame-the-web-or-blame-the-student/12528">well over half of all students in one survey admitted to having plagiarized</a> at least some content from the Web and the numbers are not getting any better.</p>
<p>The problem is bad and it&#8217;s getting worse. But the question many teachers, professors and administrators are asking is simple: Why?</p>
<p>Though there&#8217;s a tendency to talk about &#8220;kids these days&#8221; or blame it on some cultural disconnect between the generations, the ethics of plagiarism are largely unchanged from previous generations and previous studies showed <a href="http://www.plagiarismtoday.com/2010/09/23/what-age-to-children-see-plagiarism-as-wrong/">children see the right and wrong of plagiarism as early as 5</a>. </p>
<p>So why is plagiarism on the rise? The answer is surprisingly simple: Because it&#8217;s easier.<span id="more-11775"></span></p>
<h4>How Plagiarism Became Easy</h4>
<p>If you go back just 25 years, plagiarism was hard work. One had to go to the library, find sources to copy from, retype those sources and then turn them in as one&#8217;s own. By the time one did all of that, one was a large part of the way to doing a non-plagiarized assignment, so there was little benefit to risking punishment and shame.</p>
<p>But with the Web making finding content easy and copy/paste making it take only seconds to bring it into your word processor, plagiarism is now incredibly easy and a tremendous time/effort saver. </p>
<p>For example, pre-Web and pre-computers, plagiarism might save 35%-45% of the time it would take to actually write an assignment. Now, plagiarism saves more than 95% in many cases. </p>
<p>As any marketer will tell you, the easier you make an action, the more likely it is people will do it. People are much more likely to go to the store down the street than one across town, for example.</p>
<p>In short, the ethics haven&#8217;t changed (at least not as significantly as the numbers would seem) but the ease of the technology makes it so that plagiarists feel the benefits of plagiarism outweigh the risks and the ethical concerns.</p>
<p>That, in turn, is a big part of why students are increasingly turning to plagiarism to solve their academic problems. However, there are still other reasons that may play a role.</p>
<h4>Other Factors to Weigh</h4>
<p>Of course, it isn&#8217;t entirely that simple. There are several other potential causes including:</p>
<ol>
<li><strong>Improved Plagiarism Detection:</strong> Advancements in plagiarism detection mean more plagiarists are caught and that means we know about more plagiarists. This is similar to how improved crime reporting can actually raise crime statistics.</li>
<li><strong>Changes in Attitude:</strong> There has definitely been at least some change in attitude in how students approach content. Though these changes are likely overstated, they do play a role.</li>
<li><strong>Changes in the Education Environment:</strong> Larger classes, more standardization and less emphasis on creativity help create an environment where students feel that cheating is acceptable and that they can get away with it.</li>
</ol>
<p>These factors, and others, definitely play a role but, almost certainly, the advancements in technology alone would cause plagiarism to rise, even without the other shifts.</p>
<p>However, this doesn&#8217;t mean that teachers can or should attempt to remove technology from the equation. The benefits of the Web and computers in general still far outweigh the drawbacks; it&#8217;s just a matter of introducing them in a way that&#8217;s productive and appropriate.</p>
<h4>Combating the Trend</h4>
<p>Battling this trend is not going to be easy. Removing computers is not the answer and it is unlikely such an attempt would be successful, especially on take home assignments.</p>
<p>Instead, it&#8217;s important to put some of the difficulty back into plagiarism and this is going to require teachers put forth extra effort, especially in crafting assignments.</p>
<p>Consider the following ideas:</p>
<ol>
<li><strong>Choose Plagiarism-Proof Topics:</strong> Pick topics that test students&#8217; knowledge but can&#8217;t be easily Googled or written by someone not in the course, such as odd comparisons or something personal to the student.</li>
<li><strong>Require Multiple Drafts:</strong> Require that students submit multiple drafts and show their progress between them.</li>
<li><strong>Handwritten Portions:</strong> In some cases, you can have students submit portions of the assignment in handwritten form.</li>
<li><strong>Use In-Class Portions:</strong> In-class segments of an assignment, such as presentations, quizzes and even in-class writing are nearly impossible to plagiarize.</li>
<li><strong>Require Paper-Only Sources:</strong> Force students to have at least a certain number of print-only sources and specify they must be from your school&#8217;s library. This makes copying a paper online much more difficult.</li>
</ol>
<p>These are just a few suggestions (there are many more out there), but the basic idea is to add speed bumps to the process of plagiarism, making it less tempting (or even impossible) to do. That will do more to discourage plagiarism than any threat of punishment you can levy.</p>
<h4>Bottom Line</h4>
<p>Psychology shows us that <a href="http://www.northshorefamilies.com/whypunishmentdoesntwork.html">the threat of punishment doesn&#8217;t work</a> unless the punishing agent is there or the punishment is so severe that it is debilitating (and only after the punishment).</p>
<p>Since the teacher can&#8217;t be there when the student is writing their paper in their room (at least I&#8217;d hope not), many schools have adopted a &#8220;more punishment&#8221; approach. However, that does little good as most students don&#8217;t believe they&#8217;ll get caught and feel the risk is worth the gains.</p>
<p>In short, the only way to really impact plagiarism is to make it more difficult. While plagiarism should still be a punishable offense, if we&#8217;re going to talk about changing behaviors and reducing plagiarism, the emphasis has to be put on making it harder to plagiarize and skewing the gains one gets from cheating.</p>
<p>Without that, all of the detection and enforcement isn&#8217;t going to make any difference in the bigger picture.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismtoday.com/2011/11/11/why-plagiarism-is-on-the-rise/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Teachers: You&#8217;re Handling Plagiarism Wrong</title>
		<link>http://www.plagiarismtoday.com/2011/09/21/teachers-youre-handling-plagiarism-wrong/</link>
		<comments>http://www.plagiarismtoday.com/2011/09/21/teachers-youre-handling-plagiarism-wrong/#comments</comments>
		<pubDate>Wed, 21 Sep 2011 18:07:44 +0000</pubDate>
		<dc:creator>Jonathan Bailey</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[academia]]></category>
		<category><![CDATA[Content-Theft]]></category>
		<category><![CDATA[Copyright]]></category>
		<category><![CDATA[Copyright-Infringement]]></category>
		<category><![CDATA[Copyright-Law]]></category>
		<category><![CDATA[instructors]]></category>
		<category><![CDATA[iparadigms]]></category>
		<category><![CDATA[Plagiarism]]></category>
		<category><![CDATA[plagiarism-detection]]></category>
		<category><![CDATA[turnitin]]></category>
		<category><![CDATA[writecheck]]></category>

		<guid isPermaLink="false">http://www.plagiarismtoday.com/?p=11136</guid>
		<description><![CDATA[If there's one thing the controversy over WriteCheck shows, it's that instructors have lost sight of what's important in the plagiarism battle.]]></description>
			<content:encoded><![CDATA[<p><img style=' float: left; padding: 4px; margin: 0 7px 2px 0;'  src="http://www.plagiarismtoday.com/wp-content/uploads/2011/09/school-closed-image-300x222.jpg" alt="School Closed Image" title="School Closed Image" width="300" height="222" class="alignleft size-medium wp-image-11140" />Recently, I had the good fortune to be <a href="http://www.insidehighered.com/news/2011/09/09/turnitin_writecheck_lights_fire_in_plagiarism_debate">interviewed for and quoted in a recent article for Inside Higher Ed</a>. However, the issue on hand was a thorny one, especially among teachers and professors: <a href="http://www.writecheck.com">WriteCheck</a>.</p>
<p>WriteCheck is a service that lets students submit a paper and have it checked for grammar and plagiarism. While there are many such services, this one is powered by iParadigms, the company that makes Turnitin, the most popular plagiarism checker for academic institutions. </p>
<p>This caused many to feel that iParadigms was, as one instructor put it, &#8220;warlords who are arming both sides in this plagiarism war.&#8221;</p>
<p>But while I can certainly understand this feeling of betrayal and this concern that students might use WriteCheck for the purpose of skirting plagiarism detection systems, perhaps using it to vet a purchased paper or to make sure their efforts to hide plagiarism were adequate, it is a misguided fear.</p>
<p>It is a fear that stems from faulty logic when it comes to fighting plagiarism and, sadly, that logic seems to be getting more pervasive as time goes on.</p>
<p>I&#8217;ve already talked at length about <a href="http://www.plagiarismtoday.com/2010/05/10/how-schools-are-hurting-the-fight-against-plagiarism/">how schools are hurting the fight against plagiarism</a>, but, as this WriteCheck controversy proves, schools have yet to really understand the issues involved and what they need to do to keep students from plagiarizing.<span id="more-11136"></span></p>
<h4>Stopping the War</h4>
<p>As I mentioned in the previous article, there&#8217;s a pervasive climate of fear when it comes to matters of plagiarism. Students are threatened with severe punishments over an act that is often poorly explained and seemingly decided by a computer that they never see. Even honest students can be afraid of getting caught plagiarizing and that, in turn, does no good for the academic climate. </p>
<p>But as the &#8220;warlord&#8221; comment points out, many instructors feel that they are at war against plagiarists and that iParadigms and similar companies are arms dealers of sorts. This turns catching a plagiarist into a victory and one escaping into a defeat, an attitude that doesn&#8217;t bring about any progress or understanding.</p>
<p>The truth is that every plagiarist caught by Turnitin or a similar automated checker is a failure of the system and a miserable one at that.</p>
<p>Yes, that line of defense is necessary and it should be there, but it is the absolute last resort and, despite the impressive technology, the least effective as well.</p>
<p>Consider the many opportunities to stop a plagiarist BEFORE they are caught by an automated detection system:</p>
<ol>
<li><strong>Education:</strong> Educating students about what is and is not plagiarism, from a practical standpoint, is important for helping them avoid accidentally plagiarizing and understanding why it is important not to do so deliberately.</li>
<li><strong>Assignment Building:</strong> A well-crafted assignment is virtually plagiarism proof. Building good assignments that are original, test the student&#8217;s knowledge and can&#8217;t be trivially copy/pasted is a huge step forward in the fight.</li>
<li><strong>Academic Resources:</strong> Schools need to make academic resources available, such as assignment assistance programs, to their students where they can ask questions and get help as a means to dissuade plagiarism and encourage a useful conversation on the topic.</li>
<li><strong>Instructor Connection:</strong> Though not always practical, in many classes an instructor should know their students reasonably well and be able to detect when they are struggling, enabling the instructor to reach out to them and provide greater help.</li>
<li><strong>Instructor Intuition:</strong> Once again, if an instructor is familiar with a student&#8217;s writing, they should be able to detect plagiarism without having to run it through an automated system.</li>
</ol>
<p>In short, with so many ways to stop or detect plagiarism BEFORE it reaches an automated checker, every case that gets that far has to be seen as a failure, a breakdown in the chain before that point.</p>
<p>Granted, some of these shortfalls have more to do with the education system at large, which often puts far too many students into a class, dividing up instructor attention too many ways.</p>
<p>However, others are more case specific, calling on the schools and instructors to think about plagiarism in a different way and shift their focus from &#8220;winning the war&#8221; to actually educating and dealing with the issue.</p>
<h4>Moving Forward</h4>
<p>The simple truth is, the developers of plagiarism detection systems, iParadigms in particular, never intended themselves to be the plagiarism police. Obviously, that is going to be part of their function but such tools hold a great deal of potential to make students better researchers, something that WriteCheck can also do.</p>
<p>While there will always be hardcore cheaters who will &#8220;write&#8221; papers via copy/paste in hopes of getting a good grade or just getting out of an assignment, they still are far outnumbered by the legitimate students caught up in the climate of fear.</p>
<p>However, it&#8217;s that climate of fear that may actually be encouraging plagiarism. Since many students feel they can&#8217;t control if they will or will not be accused of plagiarism, they feel they might as well be a plagiarist.</p>
<p>Warped logic, to be certain, but further proof of how relying so heavily on the last line of defense can actually make the plagiarism problem much worse.</p>
<p>The way forward is to end the war on plagiarism, open up a dialog about it and focus the punitive efforts solely on the hardcore cheaters.</p>
<p>This approach much better serves the bulk of the students and limits the &#8220;arms race&#8221; discussion that&#8217;s taking place now.</p>
<h4>Bottom Line</h4>
<p>Will this shift in attitude be easy? No. Teachers are angry. They feel a lot of their students are trying to cheat their way to better grades both by lying to them and stepping over honest students who worked hard.</p>
<p>This anger is understandable, but it&#8217;s rare that good policy comes from an emotional response.</p>
<p>If schools and instructors look at plagiarism as a practical problem, the issue becomes much more clear. Over-reliance on plagiarism checking technology isn&#8217;t solving the problem, but creating a climate of fear and producing smarter plagiarists.</p>
<p>In short, WriteCheck isn&#8217;t the enemy, but the hatred of it is a symptom of a much greater problem and one that has to be addressed now lest the situation go completely out of control.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismtoday.com/2011/09/21/teachers-youre-handling-plagiarism-wrong/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>How to Use PlagScan Screencast</title>
		<link>http://www.plagiarismtoday.com/2011/09/19/how-to-use-plagscan-screencast/</link>
		<comments>http://www.plagiarismtoday.com/2011/09/19/how-to-use-plagscan-screencast/#comments</comments>
		<pubDate>Mon, 19 Sep 2011 18:49:14 +0000</pubDate>
		<dc:creator>Jonathan Bailey</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[Content-Theft]]></category>
		<category><![CDATA[Copyright]]></category>
		<category><![CDATA[Copyright-Infringement]]></category>
		<category><![CDATA[Copyright-Law]]></category>
		<category><![CDATA[Plagiarism]]></category>
		<category><![CDATA[plagiarism-detection]]></category>
		<category><![CDATA[plagscan]]></category>

		<guid isPermaLink="false">http://www.plagiarismtoday.com/?p=11106</guid>
		<description><![CDATA[If you read the earlier review of PlagScan and want to give it a try, here's a quick screencast to get you started.]]></description>
			<content:encoded><![CDATA[<p><img style=' float: left; padding: 4px; margin: 0 7px 2px 0;'  src="http://www.plagiarismtoday.com/wp-content/uploads/2011/09/plagscan-logo-300x103.jpg" alt="PlagScan Logo" title="PlagScan Logo" width="300" height="103" class="alignleft size-medium wp-image-10936" />Earlier this month, <a href="http://www.plagiarismtoday.com/2011/09/06/plagscan-review-solid-plagiarism-detection/">I did a review of the German plagiarism detection service PlagScan</a>. In that review, I found that <a href="http://www.plagscan.com">PlagScan</a>, overall, did very well in detecting copies of a work online and that it compared favorably to competing services such as <a href="http://copyscape.com">Copyscape</a> and <a href="http://plagium.com">Plagium</a>.</p>
<p>All of this is despite the fact that PlagScan is aimed at a more academic audience and wasn&#8217;t originally designed for this kind of use. However, thanks to PlagScan&#8217;s powerful detection engine, its results are more than adequate.</p>
<p>Still, there are some UI issues with PlagScan, largely because of the audience it was aimed at. As such, it might be a bit difficult to use at first, especially if you aren&#8217;t familiar with the user interface and the &#8220;points&#8221; system.</p>
<p>So, with that in mind, I created a very short screencast that demonstrates how PlagScan works by running a demo check on the first few paragraphs of Charles Dickens&#8217; &#8220;A Tale of Two Cities&#8221;.</p>
<p>All in all, you can easily see how PlagScan works and begin trying it out for yourself.</p>
<p>However, if you are a more advanced user, you may be interested to know that <a href="http://www.plagscan.com/api/guide">there is an API that you can use</a> if you want to integrate PlagScan with any CMS or other application. While this probably isn&#8217;t something your average writer will be interested in, plugin authors may find this to be a useful tool for building plagiarism-checking into existing tools.</p>
<p>So, without further ado, the video is embedded below!</p>
<p><iframe width="560" height="315" src="http://www.youtube.com/embed/qre80TaJlZw" frameborder="0" allowfullscreen></iframe></p>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismtoday.com/2011/09/19/how-to-use-plagscan-screencast/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>What Yahoo!&#8217;s Downfall Might Mean for Plagiarism Detection</title>
		<link>http://www.plagiarismtoday.com/2011/09/08/what-yahoos-downfall-might-mean-for-plagiarism-detection/</link>
		<comments>http://www.plagiarismtoday.com/2011/09/08/what-yahoos-downfall-might-mean-for-plagiarism-detection/#comments</comments>
		<pubDate>Thu, 08 Sep 2011 17:56:14 +0000</pubDate>
		<dc:creator>Jonathan Bailey</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[api]]></category>
		<category><![CDATA[bing]]></category>
		<category><![CDATA[Content-Theft]]></category>
		<category><![CDATA[Copyright]]></category>
		<category><![CDATA[Copyright-Infringement]]></category>
		<category><![CDATA[Copyright-Law]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Plagiarism]]></category>
		<category><![CDATA[plagiarism-detection]]></category>
		<category><![CDATA[Yahoo]]></category>
		<category><![CDATA[yahoo boss]]></category>

		<guid isPermaLink="false">http://www.plagiarismtoday.com/?p=10975</guid>
		<description><![CDATA[The turmoil at Yahoo! should give pause to everyone who is partners with the company and that includes many plagiarism detection services.]]></description>
			<content:encoded><![CDATA[<p><img style=' float: left; padding: 4px; margin: 0 7px 2px 0;'  src="http://www.plagiarismtoday.com/wp-content/uploads/2011/09/yahoo-logo-300x68.jpg" alt="Yahoo! Logo" title="Yahoo! Logo" width="300" height="68" class="alignleft size-medium wp-image-10977" /><a href="http://www.forbes.com/sites/ericsavitz/2011/09/06/for-yahoo-it-looks-the-beginning-of-the-end/">Times are clearly tough at Yahoo!</a>. With its current CEO recently fired, slipping marketshare and rumors of a pending sale, Yahoo! has certainly seen better days.</p>
<p><a href="http://www.statowl.com/search_engine_market_share.php">With a search engine marketshare of less than 10%</a>, Yahoo! is already largely seen as irrelevant when it comes to general search, especially since it began outsourcing its search results to Bing.</p>
<p>However, there is at least one area where Yahoo! has remained a critical player: Plagiarism detection.</p>
<p>Simply put, many of the most popular plagiarism detection services take advantage of the <a href="http://developer.yahoo.com/search/boss/">Yahoo! Search BOSS API</a> (Application Programming Interface), which has made creating a plagiarism detection service both affordable and relatively simple. </p>
<p>So, as Yahoo!&#8217;s future hangs in the balance, so does the future and capability of many of the Web&#8217;s best-known plagiarism detection services including <a href="http://www.copyscape.com">Copyscape</a>, <a href="http://plagium.com">Plagium</a>, <a href="http://www.plagscan.com/">PlagScan</a> and <a href="http://www.plagaware.com/">PlagAware</a>, all of which use Yahoo! either exclusively or in part to find their results.</p>
<p>To be clear, there&#8217;s no immediate threat to Yahoo! BOSS and its closure has not even been mentioned. This is purely an academic exercise.</p>
<p>However, with such uncertain times ahead for Yahoo!, the question gets raised, what would a Yahoo!-less plagiarism detection landscape look like? The answer isn&#8217;t very clear.<span id="more-10975"></span></p>
<h4>Why Yahoo! is Important</h4>
<p><img style=' float: right; padding: 4px; margin: 0 0 2px 7px;'  src="http://www.plagiarismtoday.com/wp-content/uploads/2011/09/yahoo-boss-unsized-300x126.jpg" alt="Yahoo Boss API" title="Yahoo Boss API" width="300" height="126" class="alignright size-medium wp-image-10979" /></p>
<p>Without using a search API of some sort, a plagiarism detection service would have to crawl websites and create its own index, a time-consuming and expensive process that would cause the services to be prohibitively expensive. Fortunately, most major search engines offer APIs that enable plagiarism checkers, as well as other services, to tap into their indexes relatively easily and cheaply.</p>
<p>Many plagiarism detection services began using Yahoo! BOSS over competing offerings for a simple reason: Cost.</p>
<p>Historically, Yahoo! BOSS was a free service. But, even after Yahoo! began to charge for the service (shortly after it began to use Bing for search results) the <a href="http://techcrunch.com/2011/02/08/yahoo-boss-cost/">cost of using Yahoo! BOSS was still many times cheaper than using Google&#8217;s Search API</a>.</p>
<p>This is why many of the best-known plagiarism detection services are built either in whole or in part on Yahoo! BOSS.</p>
<p>This cost is important because performing a single plagiarism check usually requires multiple API queries (the exact number depends on how the service handles queries, the length of the work involved and other factors). As such, these API costs often become a major expense for these services.</p>
<p>But more than just a cost issue, the presence of a competing API to Google also offers a different perspective. Being able to tap multiple indexes of the Web rather than just one has the potential to ensure the maximum number of results are returned, especially since the different indexes often catch different content. </p>
<p>In short, without the Yahoo! BOSS API, we are likely looking at a much more expensive and more limited future for plagiarism detection.</p>
<h4>What Does a Yahoo!-less Future Look Like?</h4>
<p>If Yahoo! BOSS were to go away, the future would definitely be a difficult one for many plagiarism detection services.</p>
<p>Some, such as Copyscape and Plagium, already mix results from multiple sources (Google/Yahoo! and Yahoo!/Bing respectively) and would likely just lose some fidelity in their results. Copyscape would, arguably, be in a better position than most as it began life using the Google API.</p>
<p>Others, such as PlagAware and PlagScan, both of which use (or seem to use) Yahoo! exclusively, would be forced to write completely new backends for their services. This could have a drastic impact on how they detect duplicate content and how effective they are (for better or worse).</p>
<p>Higher-end services, like <a href="http://www.attributor.com">Attributor</a>, which use their own index of the Web would be unaffected by any change or closure of Yahoo! BOSS and may even have their position strengthened.</p>
<p>All in all though, there would be a major shuffle ahead for plagiarism detection services as they looked to fill the void left by Yahoo! BOSS. </p>
<h4>Where Would the Refugees Go?</h4>
<p>Those who depend on Yahoo! BOSS, if they wanted to stay open, would have a tough choice ahead of them as there are only two (major) providers who would remain.</p>
<ol>
<li><strong>Google:</strong> Google&#8217;s API is definitely robust, as is Google&#8217;s index, but is also much more expensive than Yahoo! BOSS.</li>
<li><strong>Bing:</strong> Bing&#8217;s API is much less established and not as well regarded as Google&#8217;s but it is free for unlimited queries, just as Yahoo! BOSS was. However, <a href="http://www.bing.com/developers/tou.aspx">the API&#8217;s TOU</a> may pose challenges in some cases, <a href="http://joerussbowman.tumblr.com/post/121174263/the-bing-api-is-not-free">specifically related to advertising requirements</a>.</li>
</ol>
<p>In short, for developers it&#8217;s a choice between an established and robust API that is more expensive and a newer one that comes with limitations on how the results can be used. </p>
<p>Most would likely go with Bing as it is the most natural replacement (especially since Yahoo! results come from Bing) but it remains to be seen if Bing&#8217;s results can compare with Yahoo!&#8217;s or Google&#8217;s for this purpose.</p>
<p>That would be something very interesting to test in the future.</p>
<h4>Bottom Line</h4>
<p>To reiterate the good news, there&#8217;s no immediate threat to Yahoo! BOSS at this time so all of the above is merely hypothetical. There has been no talk of closing Yahoo! BOSS and, given that it currently is a revenue generator for Yahoo!, it isn&#8217;t likely to be first on the chopping block.</p>
<p>That being said, the turmoil at Yahoo! should give cause for concern to those who rely on Yahoo! BOSS, and the time may well be now to start looking at alternatives.</p>
<p>After all, if the end of Yahoo! BOSS does come, it will likely be sudden and it may be difficult for companies that rely on it to quickly reconfigure their products.</p>
<p>Even if it seems unnecessary at this time, preparing for the possibility may be the best move these services can make.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismtoday.com/2011/09/08/what-yahoos-downfall-might-mean-for-plagiarism-detection/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Why Fighting Plagiarism is Important</title>
		<link>http://www.plagiarismtoday.com/2011/08/15/why-fighting-plagiarism-is-important/</link>
		<comments>http://www.plagiarismtoday.com/2011/08/15/why-fighting-plagiarism-is-important/#comments</comments>
		<pubDate>Mon, 15 Aug 2011 17:47:55 +0000</pubDate>
		<dc:creator>Jonathan Bailey</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[academia]]></category>
		<category><![CDATA[academic]]></category>
		<category><![CDATA[Content-Theft]]></category>
		<category><![CDATA[Copyright]]></category>
		<category><![CDATA[Copyright-Infringement]]></category>
		<category><![CDATA[Copyright-Law]]></category>
		<category><![CDATA[detection]]></category>
		<category><![CDATA[Plagiarism]]></category>
		<category><![CDATA[plagiarism-detection]]></category>
		<category><![CDATA[punishment]]></category>

		<guid isPermaLink="false">http://www.plagiarismtoday.com/?p=10638</guid>
		<description><![CDATA[For a long time now, schools have been avoiding the issue of plagiarism. However, the time to deal with plagiarism is today and the reasons are clear.]]></description>
			<content:encoded><![CDATA[<p><img style=' float: left; padding: 4px; margin: 0 7px 2px 0;'  src="http://www.plagiarismtoday.com/wp-content/uploads/2011/08/punishment-jar-sample-165x250.jpg" alt="Swear Jar" title="Swear Jar Sample" width="165" height="250" class="alignleft size-medium wp-image-10711" />It&#8217;s a story that I hear all-too-often. A professor, in this case Panagiotis G. Ipeirotis, an Associate Professor at Stern School of Business of New York University, cracks down on plagiarism in his classroom and makes a push to catch and report cheaters.</p>
<p>Ipeirotis&#8217; efforts definitely produced results. Over the semester, he found that some 20 percent of his class had plagiarized to one degree or another and began taking action against them. His reward, however, wasn&#8217;t a promotion or praise, but rather <a href="http://www.insidehighered.com/news/2011/07/22/nyu_professor_s_blog_post_sets_off_debate_on_plagiarism">having his raise reduced to the lowest amount he&#8217;d seen</a>.</p>
<p>The reason, according to both Ipeirotis and the justification he received for the small raise, was that his students, many of whom he had caught and reported for plagiarism, had rated him poorly.</p>
<p>What made Ipeirotis&#8217; case unique was not that he fought plagiarism and was punished, <a href="http://behind-the-enemy-lines.blogspot.com/2011/07/why-i-will-never-pursue-cheating-again.html">but that he spoke out about it in a now-removed blog post</a>. Behind the scenes, teachers have long been boiling over with concerns that their schools are not taking plagiarism seriously and, sometimes, are actively discouraging them from addressing the problem.</p>
<p>For that to change, schools need to take plagiarism seriously and begin rewarding teachers, the ones on the front lines, for addressing this issue. This means both taking the detection and discipline side of fighting plagiarism seriously as well as looking to alternative solutions that could render the problem moot.<span id="more-10638"></span></p>
<h3>Why Schools Turn a Blind Eye</h3>
<p>To be blunt, it&#8217;s a difficult time for schools, especially in the U.S. At all levels and both public and private, money is tight and resources are very limited. Dollars for plagiarism fighting are a low priority in the big scheme of things, especially as issues that could impact the safety of students and faculty are growing in number and priority.</p>
<p>The truth is that fighting plagiarism doesn&#8217;t help test scores, improve graduation rates, bring in new students or improve the school&#8217;s reputation. As important as it is, a school can turn a blind eye to plagiarism and still function.</p>
<p>To make matters worse, fighting plagiarism often hurts schools in meeting benchmarks. Disciplined students often drop out, lowering graduation rates, and students who fail classes due to plagiarism lower the overall GPA.</p>
<p>Image-conscious schools have also become wary of the reputation issues that come from actively pursuing plagiarists. Dealing with a large amount of it can earn a school a reputation for being a plagiarism haven, even though the amount found actually proves the opposite.</p>
<p>This is then compounded by plagiarists who use social media to bash schools online. Smaller, lesser-known schools are especially vulnerable to these kinds of attacks.</p>
<p>These challenges have led to an atmosphere where many instructors feel that students are treated more like customers, to be pleased and cared for, rather than students who need to be educated and graded.</p>
<p>That problem doesn&#8217;t just impact plagiarism, but all areas of academic unpleasantness. From homework, to grade curves and more, the relationship between teacher and student is changing, likely not for the better.</p>
<h3>Getting Serious About Plagiarism</h3>
<p>If schools want to provide the best education they can to their students, this attitude must change and soon.</p>
<p>For one, if there is to be any merit to the idea that college is meant to prepare students for later occupations, plagiarism must be dealt with and strongly.</p>
<p>While it&#8217;s true that those who plagiarize an assignment successfully located the needed information, which is a part of any assignment (both in and out of school), it&#8217;s a part that is almost trivial with the birth of the Internet, and the student still skipped many of the most important elements.</p>
<p>Academic assignments, at least good ones, do far more than teach students how to find and spit back information. They teach critical thinking, including how to challenge ideas. They teach students how to spot connections and trends among bits of data they have and they even help improve writing skills, a necessary tool just about anywhere one goes.</p>
<p>Students who plagiarize an assignment miss most of the education that could have come from it. That, in the long run, means a lower quality education, which means a lower-quality graduate if they go that far.</p>
<p>But before one walks away thinking plagiarists only cheat themselves, consider the following issues:</p>
<ol>
<li>Good students, sensing or knowing that their peers are cheating to get good grades, often better than theirs, will either start cheating as well, reduce their efforts or simply leave.</li>
<li>Good instructors, detecting plagiarism but unable to effectively respond to it, will often reduce their efforts or leave, once again reducing the quality of education for all students, cheaters or not.</li>
<li>Students who cheat are, generally, less dedicated to their education. They make poorer graduates that not only are less likely to become active alumni, but also will reflect badly on the school in other ways after graduation.</li>
</ol>
<p>This isn&#8217;t to say that every plagiarist is a doomed failure that will sink your school, but plagiarism as an epidemic will, over time, erode the quality of education for everyone there and hurt the school&#8217;s reputation.</p>
<p>However, since most of the dire impacts take years to show up, many schools are happy to kick the can down the road and hope for a better solution to the plagiarism problem later.</p>
<h3>Today is the Day</h3>
<p>The problem with kicking the can down the road is that now is, most likely, the ideal time to address these issues.</p>
<p>First off, the technology to detect plagiarism is the best it has ever been and the cheapest it has ever been. It&#8217;s less expensive, easier to use and more powerful than ever. Unfortunately, new plagiarism techniques may soon shift the balance, making it critical to address these issues now, while instructors have the upper hand.</p>
<p>Second, the Internet generation is just now truly coming of age. Students who have never known research without the Internet are just now reaching the higher levels of education. Sadly, these are the ones perceived to be at the greatest risk of plagiarizing, rightly or wrongly, but if they are reached now, then those behind them will see the shift in culture.</p>
<p>Finally, we have ways of dealing with plagiarism other than punishment. If we&#8217;re going to shift the academic culture away from the trend toward plagiarism, we can&#8217;t simply punish our way out of it. Education is critical and the tools above make it easier to do just that. However, in a few years, education might not be possible, or at least not as easy, as the plagiarists will be the ones who have done nearly all the teaching.</p>
<p>In short, now is the time to strike and waiting until tomorrow just makes the battle harder and even less winnable.</p>
<h4>Bottom Line</h4>
<p>It&#8217;s time for schools to reward teachers like Professor Ipeirotis for their hard work fighting plagiarism and, more importantly, to start openly addressing the issue. Though it&#8217;s tempting to sweep the matter under the rug, doing so misses a valuable opportunity to deal with the issue and risks lowering the quality of education for everyone.</p>
<p>Plagiarism is certainly not a pleasant business, I know that well because it&#8217;s my 9-to-5, but it&#8217;s an important one.</p>
<p>Schools need to address it. Not just so that they can make better students, but better creatives and better workers. After all, when cheating becomes a way of life, its impact is felt well beyond the classroom.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismtoday.com/2011/08/15/why-fighting-plagiarism-is-important/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Why Plagiarism Detection Alone Isn&#8217;t Enough</title>
		<link>http://www.plagiarismtoday.com/2011/08/04/why-plagiarism-detection-alone-isnt-enough/</link>
		<comments>http://www.plagiarismtoday.com/2011/08/04/why-plagiarism-detection-alone-isnt-enough/#comments</comments>
		<pubDate>Thu, 04 Aug 2011 19:24:10 +0000</pubDate>
		<dc:creator>Jonathan Bailey</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[Content-Theft]]></category>
		<category><![CDATA[Copyright]]></category>
		<category><![CDATA[Copyright-Infringement]]></category>
		<category><![CDATA[Copyright-Law]]></category>
		<category><![CDATA[Plagiarism]]></category>
		<category><![CDATA[plagiarism analysis]]></category>
		<category><![CDATA[plagiarism-detection]]></category>

		<guid isPermaLink="false">http://www.plagiarismtoday.com/?p=10614</guid>
		<description><![CDATA[Many, both in and out of academia, have put near-total faith in plagiarism detection tools. Here's what it takes to do a full plagiarism analysis.]]></description>
			<content:encoded><![CDATA[<p><img style=' float: left; padding: 4px; margin: 0 7px 2px 0;'  src="http://www.plagiarismtoday.com/wp-content/uploads/2011/08/chart-sample-image-300x197.jpg" alt="Plagiarism Detection Image" title="Plagiarism Analysis Image" width="300" height="197" class="alignleft size-medium wp-image-10616" />One of the most common things I do as part of my <a href="http://copybyte.com/">copyright and plagiarism consulting service</a> is a plagiarism analysis. However, these are not simple tasks nor are they small projects. Even an analysis of a short work can take several hours to complete, many more if a formal report is needed.</p>
<p>However, when I give proposals on doing a plagiarism analysis, many are taken aback by the time and costs involved. I often get the question, &#8220;Can&#8217;t you just Google it?&#8221; or something to that effect.</p>
<p>The problem is that, while plagiarism detection technology certainly has made it easier than ever to detect copied text, it doesn&#8217;t actually make a determination about what is or is not plagiarism. In fact, leaning too heavily on automated plagiarism detection is one of the most common mistakes people make, creating a very real issue with false positives and false negatives that only a human can sort out.</p>
<p>So what is involved in a thorough plagiarism analysis? I&#8217;ll explain below.<span id="more-10614"></span></p>
<h4>The Three Types of Analysis</h4>
<p>Typically, there are three different scenarios for which a plagiarism analysis is used.</p>
<ol>
<li>A Believed Original Work is Checked for Suspicion</li>
<li>A Suspicious Work is Checked Against Unknown Sources</li>
<li>A Suspicious Work is Checked Against Known Sources</li>
</ol>
<p>It&#8217;s easy to see how a single case of plagiarism can actually need all three types of analysis. For example, if a work without suspicion receives a quick analysis through an automated plagiarism detection system, it might call for additional attention and then move into a case where it&#8217;s compared against a suspiciously similar source.</p>
<p>However, all three analyses are slightly different. The first, for example, is usually just a quick automated check. The second is a more thorough one, but is against all known and available sources. The final one is a very specific, hands-on search that compares two or more works side by side.</p>
<p>That being said, of the three types, only the third is where the question, &#8220;Is this plagiarism?&#8221; is answered with any meaning. </p>
<p>So, with that in mind, here&#8217;s a look at how those analyses are done by myself and, to my understanding, most others who do them.</p>
<h4>Doing the Analysis</h4>
<p>The reason that one cannot simply Google an analysis or punch a work into a machine and get a definite answer is that machines can only detect copying, not plagiarism. However, they don&#8217;t even do a good job detecting copying as they can only spot exact copies, not paraphrases or other alterations.</p>
<p>Also, machines can&#8217;t distinguish between copying that is likely plagiarized and copying that is mere coincidence. Likewise, they can&#8217;t easily check for citations to find unattributed lifting.</p>
<p>As such, there&#8217;s much more to a plagiarism analysis, including, typically, the following steps.</p>
<ol>
<li><strong>Automated Analysis:</strong> The work is put through some form of automated plagiarism detection, possibly first a broad one against all known sources and then a more specific one against the one or two suspected sources. This provides general information on suspect areas, percent of work copied and so forth.</li>
<li><strong>Citation Analysis:</strong> Citations are then checked for the matching content and any properly attributed work, generally, is removed from consideration. Exceptions include cases of citation plagiarism (lifting citations from an earlier work) and situations where it seems the author copied from a source that used the citation correctly first.</li>
<li><strong>Common Phrases Discarded:</strong> Next, since most plagiarism detection systems, especially sensitive ones, report short, common phrases as being matches, sentences that are likely mere coincidence need to be removed. </li>
<li><strong>Paraphrasing/Rewording Detected:</strong> The opposite problem is that plagiarism detection systems won&#8217;t detect paraphrased or otherwise altered plagiarism. This makes it critical to read the works involved carefully, using the automated analysis as a guide, to find areas of likely paraphrasing and rewording.</li>
<li><strong>Other Parts Added/Removed From Consideration:</strong> Depending on the nature of the works, other passages may be added or removed from consideration. For example, in cases of law school plagiarism, certain passages might be non-common but required. Those need to be removed. Other signs are also considered such as odd grammar mistakes the works have in common, etc.</li>
<li><strong>New Tallies Are Made:</strong> With all additions/removals made, a new tally is made to decide how much of the work is suspected of being copied without attribution.</li>
<li><strong>Analysis Made:</strong> The determined amount of unattributed copying is then weighed against the ethical standards the work had to live up to in order to decide if it would likely be considered plagiarism.</li>
</ol>
<p>At this point, the findings are usually presented and, if needed, the formal report is drafted. </p>
<p>Needless to say, this process is time-consuming, tedious and requires an intimate level of knowledge of the works. I typically do these analyses with multiple note files, spreadsheets and other documents. Larger projects often require the help of one or more other people.</p>
<p>Even in cases where the plagiarism is obvious the analysis needs to be done both to prove that it is as obvious as it seems and to show just how deep it goes. Very few cases don&#8217;t require this level of work and those usually don&#8217;t require an analysis at all.</p>
<p>This, in turn, is why even plagiarism analyses on shorter works can take several hours and longer ones can often take weeks.</p>
<h4>Bottom Line</h4>
<p>Accusing someone of plagiarism is almost never a simple matter. It can have dire consequences for the accused&#8217;s future, either academically or in their career. It is not something that should be taken lightly or simply handed off to machines, no matter how good their &#8220;plagiarism detection&#8221; is.</p>
<p>Machines don&#8217;t understand the complexities of paraphrasing, citation and the general ethics of plagiarism. Those are decisions humans have to make.</p>
<p>This is why true plagiarism analysis requires so much work and takes so long to do properly.</p>
<p>Without that time and effort, the risk of false positives or false negatives is simply too great, particularly with an accusation for which there should never be any mistake.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismtoday.com/2011/08/04/why-plagiarism-detection-alone-isnt-enough/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Catching Dead Plagiarists: Is it Worth It?</title>
		<link>http://www.plagiarismtoday.com/2011/08/02/catching-dead-plagiarists-is-it-worth-it/</link>
		<comments>http://www.plagiarismtoday.com/2011/08/02/catching-dead-plagiarists-is-it-worth-it/#comments</comments>
		<pubDate>Tue, 02 Aug 2011 19:05:35 +0000</pubDate>
		<dc:creator>Jonathan Bailey</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[Content-Theft]]></category>
		<category><![CDATA[Copyright]]></category>
		<category><![CDATA[Copyright-Infringement]]></category>
		<category><![CDATA[Copyright-Law]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[google book search]]></category>
		<category><![CDATA[Plagiarism]]></category>
		<category><![CDATA[plagiarism-detection]]></category>

		<guid isPermaLink="false">http://www.plagiarismtoday.com/?p=10586</guid>
		<description><![CDATA[Book digitization could unleash a torrent of new plagiarism discoveries. But will it and is it worth it if it does?]]></description>
			<content:encoded><![CDATA[<p><img style=' float: left; padding: 4px; margin: 0 7px 2px 0;'  src="http://www.plagiarismtoday.com/wp-content/uploads/2011/08/cemetery-image-sample-300x225.jpg" alt="Cemetery Image " title="Cemetery Image" width="300" height="225" class="alignleft size-medium wp-image-10592" />Back in 2006, Slate ran an article entitled &#8220;<a href="http://www.slate.com/id/2153313/">Dead Plagiarists Society: Will Google Book Search uncover long-buried literary crimes?</a>&#8221;</p>
<p>The premise is simple: Google Book Search and other book digitization efforts are making it incredibly easy to search through the text of books that are centuries old. But, as a large part of human literary and research history comes under the digital microscope, many are expecting accusations of text lifting to be leveled against long-dead authors and researchers.</p>
<p>However, that hasn&#8217;t exactly happened. Though there has been no shortage of plagiarism scandals over the past five years, many of which I&#8217;ve chronicled here, they have all, more or less, involved recent plagiarists. Simply put, the interest in locating and reporting on plagiarism from decades or centuries ago is not there.</p>
<p>But should this be something we do? The answer is complicated but it seems there are good reasons to investigate, even if there is a strong temptation to let dead plagiarists rest in peace.</p>
<h4>The Problem With Tracking Dead Plagiarists</h4>
<p>When it comes to tracking dead plagiarists, there&#8217;s one simple problem: Finding them is very hard.</p>
<p>Going through a book or research paper for plagiarism is easy from a technical standpoint, but doing the human analysis to piece together exactly how much of the work is infringed and how much of it is not can be very difficult.</p>
<p><a href="http://copybyte.com/for-content-users/">As someone who does plagiarism analysis as part of his consulting work</a>, I can say safely that even a thorough analysis of a short work can take several hours. Longer works, such as novels, can literally take days of working time and often require multiple people.</p>
<p>Most cases of detected plagiarism start with a reader noting something familiar and/or odd and then investigating. Most cases where software first spots plagiarism stem from situations where it is run over a wide array of work, such as using a plagiarism checker in a newsroom or classroom, and then analyzed by humans later.</p>
<p>There is simply little motivation to go through large bodies of historical work and detect plagiarism as the rewards for doing so are slim. Though it might be tempting to go through and try to find dishonesty on important political figures/authors, without an ulterior motive there&#8217;s little to gain by catching a long-dead plagiarist as the deed has been done and all rewards/punishments long since reaped.</p>
<p>But is it something we should be doing and, if so, who should do it?</p>
<h4>The Merits of Our Plagiarism History</h4>
<p>While there isn&#8217;t a great deal to motivate an individual plagiarism hunter to go through most older works, society could benefit greatly from an understanding of plagiarism and how it has changed over the centuries.</p>
<p>A plagiarism analysis could show much more than who stole from whom, but also show hidden influences of authors, how the ideals of authorship have changed over the centuries and even reveal possible collaborations. In short, we could get a very unique and very rich understanding of our literary history, all in a way that we lack today.</p>
<p>However, we also risk dragging great literary and scientific names through the mud needlessly. It&#8217;s possible we could accuse great thinkers of plagiarism when their actions were completely acceptable in their time period. But even if they were acting unethically, we have to ask ourselves tough questions about if and how that changes our views of their works.</p>
<p>The other risk is that such research, especially if it found rampant plagiarism historically (even more so than already known), could undermine current plagiarism education and work. If so many famous authors built careers in part on the back of plagiarism, why should a college student writing a term paper give it a second thought?</p>
<p>Still, the benefits seem to far outweigh the risks, especially considering how much plagiarism is already known. This is a part of our literary and scientific history we need to understand in greater detail.</p>
<p>So who should do it? The answer seems simple. Colleges and universities are best poised to tackle this kind of research. With the tools, knowledge, personnel and interest in research, they have the most to gain and are in the best position.</p>
<p>Sadly though, few schools are interested in doing any research on plagiarism, largely because of the feared stigma of being &#8220;plagiarism college&#8221;.</p>
<h4>Bottom Line</h4>
<p>In the end, it&#8217;s a shame that there isn&#8217;t more research done on the issue of plagiarism but, in this case, the issue may be moot. <a href="http://www.plagiarismtoday.com/2009/03/31/famous-plagiarists-could-it-happen-today/">As we discussed previously</a>, we are already aware of a lot of plagiarist authors from the past, including many famous ones. In fact, many were actually caught during their lives but, one way or another, managed to continue with their careers.</p>
<p>So, if this isn&#8217;t an issue worth researching, it&#8217;s likely not because there isn&#8217;t valuable information to glean, but because so much of the work has already been done.</p>
<p>Still, it would be nice to know how many others might be out there and what they could teach us about our written works. However, it&#8217;s a lesson we probably will never really learn.</p>
<p><em><strong>Hat Tip:</strong> A big thanks to <a href="http://www.mired.org/">Mike Meyer from Meyer Consulting</a> for the heads up on the Slate article.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismtoday.com/2011/08/02/catching-dead-plagiarists-is-it-worth-it/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Plagium Introduces Deep Search</title>
		<link>http://www.plagiarismtoday.com/2011/04/13/plagium-introduces-deep-search/</link>
		<comments>http://www.plagiarismtoday.com/2011/04/13/plagium-introduces-deep-search/#comments</comments>
		<pubDate>Wed, 13 Apr 2011 18:32:59 +0000</pubDate>
		<dc:creator>Jonathan Bailey</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[Content-Theft]]></category>
		<category><![CDATA[Copyright]]></category>
		<category><![CDATA[Copyright-Infringement]]></category>
		<category><![CDATA[Copyright-Law]]></category>
		<category><![CDATA[Plagiarism]]></category>
		<category><![CDATA[plagiarism-detection]]></category>
		<category><![CDATA[plagium]]></category>

		<guid isPermaLink="false">http://www.plagiarismtoday.com/?p=9463</guid>
		<description><![CDATA[Plagiarism detection service Plagium has introduced a "Deep Search" tool to help you find more matches and better search through longer works.]]></description>
			<content:encoded><![CDATA[<p><img style=' float: left; padding: 4px; margin: 0 7px 2px 0;'  src="http://www.plagiarismtoday.com/wp-content/uploads/2009/05/plagium-logo-300x71.jpg" alt="" title="plagium-logo" width="300" height="71" class="alignleft size-medium wp-image-3419" /></p>
<p>Earlier this week, <a href="http://plagium.com">Plagium</a> <a href="http://blog.plagium.com/2011/04/plagium-announces-deep-search.html">announced its new &#8220;Deep Search&#8221; feature</a>, which it hopes will make it easier to spot duplicates and more subtle plagiarisms/copies in longer works.</p>
<p>The new feature works by separating a longer work into multiple sections, each roughly a paragraph in length, locating duplicated content within each section and displaying the matching content contained within each detected page. </p>
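<p>To make the idea concrete, here is a minimal sketch of how a long text could be divided into roughly paragraph-sized sections before each one is searched. This is not Plagium&#8217;s actual code, which has not been published; the function name <code>split_into_sections</code> and the 600-character target are my own assumptions, chosen only for illustration.</p>
<pre>
```python
def split_into_sections(text, target_chars=600):
    """Group paragraphs into sections of roughly target_chars characters.

    Hypothetical helper: the name and the default size are assumptions,
    not anything Plagium has documented.
    """
    sections = []
    current = ""
    # Treat blank lines as paragraph boundaries.
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        # Close the current section once it has reached the target size.
        if current and len(current) >= target_chars:
            sections.append(current)
            current = ""
        current = paragraph if not current else current + "\n\n" + paragraph
    if current:
        sections.append(current)
    return sections
```
</pre>
<p>Each section returned this way could then be checked on its own, which is what would allow results to be reported per section rather than per document.</p>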
<p>The idea is to make it easier to go through longer documents, to more quickly understand which copies are the most important, the content they are using and how much matching material there is.</p>
<p>The question, however, is how well does the system work and is it worth the money that Plagium is charging? To find out, I ran Plagium through a series of quick tests to see how well it performed.<span id="more-9463"></span></p>
<h4>How Plagium Deep Search Works</h4>
<p>Previously I talked about Plagium and <a href="http://www.plagiarismtoday.com/2009/05/07/plagium-a-copyscape-alternative/">compared it favorably to Copyscape</a> and other, similar plagiarism checkers for the purpose of finding plagiarisms and other copies of your work online. I even mentioned it in a case study showing how it was useful in <a href="http://www.plagiarismtoday.com/2010/10/05/case-study-tracking-a-sneaky-plagiarist-poet/">catching a plagiarizing poet</a>.</p>
<p>However, one of my gripes about Plagium was that it has always been difficult to parse the results. Plagium has always provided good information about the infringing pages, but not necessarily about what was being copied. This was especially problematic with longer documents where the copied text might be buried deep within the page. </p>
<p>Plagium&#8217;s Deep Search attempts to fix that. By breaking lengthy documents into sections and showing match results for each part of the document, it makes it easy to get both a general overview of the entire document, via its &#8220;summary&#8221; feature, and results for each individual section.</p>
<p><img src="http://www.plagiarismtoday.com/wp-content/uploads/2011/04/plagium-sample-500x304.jpg" alt="" title="plagium-sample" width="500" height="304" class="alignnone size-large wp-image-9464" /></p>
<p>This deep searching, however, comes at a cost. Unlike Plagium&#8217;s &#8220;Quick Search&#8221; feature, which is the equivalent of its previous service, deep searching is not free. Deep searches from Plagium cost $1 for 100,000 characters (approximately 20,000 words), $2 for 200,000 characters (approximately 40,000 words) and $10 for 1,100,000 characters (approximately 220,000 words). </p>
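<p>To put those tiers side by side, here is a quick Python sketch of the arithmetic. The prices and character counts are the ones quoted above; the five-characters-per-word ratio is only an assumption, taken from the word estimates above, and the function names are made up for illustration.</p>

```python
# Pricing tiers quoted in the review: (price in dollars, characters included).
TIERS = [(1, 100_000), (2, 200_000), (10, 1_100_000)]

# Assumption: about 5 characters per word, the ratio implied by the
# review's own estimates (100,000 characters is roughly 20,000 words).
CHARS_PER_WORD = 5


def approx_words(characters):
    """Rough word count for a given number of characters."""
    return characters // CHARS_PER_WORD


def dollars_per_100k(price, characters):
    """Effective price per 100,000 characters, rounded to the cent."""
    return round(price * 100_000 / characters, 2)


for price, characters in TIERS:
    print(price, characters, approx_words(characters),
          dollars_per_100k(price, characters))
```

<p>On these numbers the two smaller tiers cost the same per character, and only the $10 tier works out slightly cheaper per 100,000 characters.</p>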
<p>So is Deep Search worth the money? I put the process through a few tests to find out.</p>
<h4>Usability and Interface</h4>
<p>Right off the bat, there were a few things that annoyed me about Plagium&#8217;s Deep Search feature. First and foremost was that these searches were far from quick. </p>
<p>Even for medium-length documents, these searches routinely took longer than 40 seconds. While that might not seem long, bear in mind that other services, including Plagium&#8217;s Quick Search tool, usually take less than four seconds. Basically, if you perform a Deep Search, be prepared to wait.</p>
<p>The interface itself was functional but not exactly attractive or impressive. The results are broken up into a summary and a detailed report. The first provides just an overview of the pages detected and the latter does the section-by-section breakdown.</p>
<p>One useful feature is the ability to delete sites from the results. This is useful both for hiding domains that you aren&#8217;t interested in, such as a permitted use or even your own site, and for removing sites you&#8217;ve already processed.</p>
<p>Still, it&#8217;s frustrating that there was no way to do a full side-by-side comparison of the original and the duplicate. Though hovering your mouse over each result in the detailed report would show you the matching text in that section, getting a complete view of the suspected copying is something that&#8217;s impossible with Plagium but easy with Copyscape.</p>
<p>That being said, I do like the way Plagium breaks down the similarities by words, sentences and highest search engine rank. It makes it very easy to get an &#8220;at a glance&#8221; understanding of how serious the copying really is and lets you prioritize matches easily.</p>
<p>However, all of these features are meaningless without good matching.  To find out how well Plagium&#8217;s Deep Search performed, I ran it through a series of tests designed to compare it to similar services.</p>
<h4>Matching Tests</h4>
<p>To better understand how well Plagium&#8217;s Deep Search tool did at detecting plagiarism, I decided to do several side-by-side tests comparing it against both their free offering and Copyscape&#8217;s Premium offering. </p>
<p>The results are below:</p>
<p><strong>Client Page 1</strong></p>
<p>First I decided to run <a href="http://www.ravensrants.com/loner/">an old prose work</a> of mine that had relatively limited copying to see how well the various engines did when dealing with older works in a more traditional format.</p>
<table cellspacing=15>
<tr>
<td><strong>Plagium Deep Search</strong></td>
<td><strong>Plagium Quick Search</strong></td>
<td><strong>Copyscape Premium</strong></td>
</tr>
<tr align="center">
<td>5</td>
<td>2</td>
<td>1</td>
</tr>
</table>
<p><strong>Press Release</strong></p>
<p>Second, I tested <a href="http://www.businesswire.com/news/home/20110412005305/en/Copyright-Clearance-Center-Launches-%E2%80%98Get-Now%E2%80%99-Academic">a recent press release by the Copyright Clearance Center</a> to see how well it detected copying that had taken place very recently. </p>
<table cellspacing=15>
<tr>
<td><strong>Plagium Deep Search</strong></td>
<td><strong>Plagium Quick Search</strong></td>
<td><strong>Copyscape Premium</strong></td>
</tr>
<tr align="center">
<td>95</td>
<td>20</td>
<td>25</td>
</tr>
</table>
<p><strong>Poem</strong></p>
<p>Finally, I tested <a href="http://www.ravensrants.com/friends-or-lovers/">an old poem of mine</a> that I knew had widespread copying and reuse, both legitimate and illegitimate, to determine how well it handled poetry and works without traditional paragraph breaks. </p>
<table cellspacing=15>
<tr>
<td><strong>Plagium Deep Search</strong></td>
<td><strong>Plagium Quick Search</strong></td>
<td><strong>Copyscape Premium</strong></td>
</tr>
<tr align="center">
<td>15</td>
<td>11</td>
<td>12</td>
</tr>
</table>
<p>In every case, Plagium Deep Search came out on top, finding more results. However, going through the results, especially with the second case, I found that many of them were false positives, sharing fewer than 50 words. With no way to set the threshold for what is considered a match, I would have had to eliminate about half of the results by hand.</p>
<p><img style=' float: right; padding: 4px; margin: 0 0 2px 7px;'  src="http://www.plagiarismtoday.com/wp-content/uploads/2011/04/plagium-error.jpg" alt="" title="plagium-error" width="258" height="259" class="alignright size-full wp-image-9471" />That being said, even with the false positives removed, Plagium Deep Search outperformed both its free offering and Copyscape in finding matches. This is likely due to the fact that Plagium Deep Search seems to poll a wider range of sources, including Yahoo! News, Bing and Bing News. Though Plagium also polls Yahoo! search, that is now powered by Bing, making that search of limited usefulness.</p>
<p>One minor issue I did have with Plagium&#8217;s match detection was that, in its detailed report, perfect matches often had gaps in the highlighting, indicating that parts of the match were not detected. This didn&#8217;t seem to affect the overall accuracy in terms of finding pages, but it could limit Plagium&#8217;s usefulness for certain types of plagiarism analyses where greater precision is needed.</p>
<p>All in all, from a matches-found perspective, Plagium seems to have a very compelling product on its hands and one that others may wish to start making broader use of.</p>
<h4>Bottom Line</h4>
<p>Despite some interface issues, Plagium&#8217;s Deep Search tool is a pretty compelling service offering and the $10 account for 1.1 million characters will likely last most searchers a full year, which is as long as the credits are good for.</p>
<p>Even with all of the tests that I did, I have only gone through about 50,000 characters on my account.</p>
<p>Personally, I&#8217;ll be using the Deep Search tool in lieu of the free offering, which I was already using to supplement Copyscape, Google Alerts and other search tools. </p>
<p>However, for most I would recommend testing your search with the free offering before deciding if there is any cause to use the Deep Search. If you find no results on the free offering, there isn&#8217;t any reason to spend the money, even if it is only a dollar or two.</p>
<p>That being said, if you do find a cause to dig deeper, you&#8217;ll likely find Plagium Deep Search to be well worth the cost. </p>
<p><em><strong>Disclosure:</strong> I was given 1.1 million characters of deep searches for free for the purpose of performing this review. This is valued at $10.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismtoday.com/2011/04/13/plagium-introduces-deep-search/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>My Secret Plagiarism Detection Weapon</title>
		<link>http://www.plagiarismtoday.com/2011/03/02/my-secret-plagiarism-detection-weapon/</link>
		<comments>http://www.plagiarismtoday.com/2011/03/02/my-secret-plagiarism-detection-weapon/#comments</comments>
		<pubDate>Wed, 02 Mar 2011 21:22:13 +0000</pubDate>
		<dc:creator>Jonathan Bailey</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[collusion]]></category>
		<category><![CDATA[Content-Theft]]></category>
		<category><![CDATA[Copyright]]></category>
		<category><![CDATA[Copyright-Infringement]]></category>
		<category><![CDATA[Copyright-Law]]></category>
		<category><![CDATA[Plagiarism]]></category>
		<category><![CDATA[plagiarism-detection]]></category>
		<category><![CDATA[wcopyfind]]></category>

		<guid isPermaLink="false">http://www.plagiarismtoday.com/?p=9108</guid>
		<description><![CDATA[Over the years I've talked about many of my favorite plagiarism detection tools but have been remiss in talking about one of my favorite, and most powerful, weapons.]]></description>
			<content:encoded><![CDATA[<p><img style=' float: left; padding: 4px; margin: 0 7px 2px 0;'  src="http://www.plagiarismtoday.com/wp-content/uploads/2011/03/wcopyfind-sized1.jpg" alt="" title="wcopyfind-sized" width="255" height="178" class="alignleft size-full wp-image-9111" /></p>
<p>One of the most common questions I get asked is &#8220;What is your favorite plagiarism checker?&#8221; The problem with the question is that there is no one correct answer. There are many different scenarios where one would want to use a plagiarism checker and no one plagiarism checker is right for every situation.</p>
<p>Consider, for example, the different challenges that a college professor and a blogger face. A professor needs to find only one similar piece to prove a work was plagiarized, whereas a blogger wants to find as many copies of their work as they can. So, a checker that is geared toward returning a large number of results, such as Copyscape or Plagium, might be good for a blogger but less beneficial in an academic environment.</p>
<p>The other problem is that my opinions tend to change. There are at least half a dozen different checkers I use and I find that they each have their strengths and weaknesses. Sometimes, when approaching a new project, I just have to try several of them to find the one that&#8217;s the right fit for the content I&#8217;m looking at.</p>
<p>However, there is one tool that I have used almost since day one and have greatly enjoyed. It&#8217;s not a Copyscape or a Turnitin replacement, but rather, a very different kind of plagiarism checker. I&#8217;ve used it in nearly every one of my most famous reports, <a href="http://www.usatoday.com/news/education/2009-04-23-university-plagiarism_N.htm">including the one in the Meehan plagiarism case</a>, and its power and flexibility means I&#8217;ll be using it for a long time to come.</p>
<p>Best of all, it&#8217;s completely free and it&#8217;s even open source software.</p>
<p>The application? <a href="http://plagiarism.phys.virginia.edu/Wsoftware.html">WCopyFind</a>, a standalone application from <a href="http://plagiarism.phys.virginia.edu/biography.html">Louis Bloomfield</a>, a physics professor at the University of Virginia. <span id="more-9108"></span></p>
<h4>When I Use WCopyFind</h4>
<p>Copyscape, Plagium, Turnitin, etc. are all great tools for when you don&#8217;t know if a document is plagiarized or where to find copies of it elsewhere on the Web. With large internal databases and advanced matching techniques, they can look through a large amount of work and find suspect passages.</p>
<p>But there are times when you don&#8217;t need to search the entire Web for suspected plagiarism. You already have a suspicion that Document A is plagiarized from Document B and you want to show the similarities between them.</p>
<p>Though some academic plagiarism checkers can do that fairly well, their one-size-fits-all approach may leave gaps or false positives. Plus, depending on how many sources the papers pull from, you might be getting overlap from other similar phrases.</p>
<p>This is where WCopyFind comes in handy. If you suspect that two or more documents are extremely similar, you can use it to drill down and focus on just those works. </p>
<p>But the reason WCopyFind does this so well has less to do with its limitations and more to do with its features, which are very powerful and very useful.</p>
<h4>Why WCopyFind</h4>
<p>The reason I choose to use WCopyFind for these situations is simple: Flexibility.</p>
<p>With every check you run, you can set almost a dozen different options including the minimum match string length, the ability to ignore punctuation, the number of imperfections to allow and so forth. This makes it easy to set WCopyFind as sensitive or as insensitive as you want.</p>
<p>This is especially useful when dealing with cases of translated or edited plagiarism. If you set the rules too strict, you may miss clearly plagiarized strings; set them too sensitive and you may get strings that match but are pure coincidence.</p>
<p>Having the options there allows you to play with the settings and determine what is the right combination of options for the exact case you&#8217;re looking at, striking the right balance between finding nearly all strings and getting too many false positives.</p>
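<p>To make that trade-off concrete, here is a rough Python sketch of how a minimum match length and a punctuation option can change what counts as a match. To be clear, this is not WCopyFind&#8217;s actual algorithm or code. It is a simple shared-word-run comparison, and the function names (words, shared_runs) and parameters (min_words, ignore_punctuation) are made up for illustration.</p>

```python
import string


def words(text, ignore_punctuation=True):
    """Lowercase the text and split it into words, optionally dropping punctuation."""
    if ignore_punctuation:
        text = text.translate(str.maketrans("", "", string.punctuation))
    return text.lower().split()


def shared_runs(doc_a, doc_b, min_words=6, ignore_punctuation=True):
    """Return every run of min_words consecutive words in doc_a that also appears in doc_b.

    Hypothetical illustration only: exact matching of fixed-length word runs,
    with no allowance for imperfections inside a run.
    """
    a = words(doc_a, ignore_punctuation)
    b = words(doc_b, ignore_punctuation)
    runs_in_b = {tuple(b[i:i + min_words]) for i in range(len(b) - min_words + 1)}
    return [
        " ".join(a[i:i + min_words])
        for i in range(len(a) - min_words + 1)
        if tuple(a[i:i + min_words]) in runs_in_b
    ]


original = "The quick brown fox jumps over the lazy dog, said the author."
suspect = "He wrote that the quick brown fox jumps over the lazy dog."

# A short minimum length reports several overlapping runs.
print(shared_runs(original, suspect, min_words=6))
# A longer minimum length reports nothing for the same pair.
print(shared_runs(original, suspect, min_words=10))
```

<p>Even in a toy like this, raising the minimum length or leaving punctuation in makes matches disappear, while lowering the minimum brings back shorter overlaps that may be nothing more than coincidence.</p>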
<p>Best of all, WCopyFind, since it is entirely localized on your computer, is blazingly fast. Even comparing dissertations only takes a few seconds. This makes it fast to get results, look at the outcome and make a decision if tweaks are needed.</p>
<p>In short, not only do you get greater flexibility with this tool, but you can run multiple checks with it in less time than it would take to do just one on a different system.</p>
<h4>Limitations</h4>
<p>All of this being said, there are many limitations of WCopyFind and many reasons why it isn&#8217;t my main plagiarism checker. </p>
<ol>
<li><strong>No Internet Searches:</strong> WCopyFind can only look at documents on your hard drive that you tell it to look at. Nothing else.</li>
<li><strong>Difficult to Read Reports:</strong> The reports generated are in HTML format and, though useful, are ugly and require a lot of time to go through.</li>
<li><strong>Windows Only:</strong> Though it&#8217;s open source and can feasibly be run on other OSes, currently the only executable is for Windows, which is very sad for me as a Mac user.</li>
</ol>
<p>However, the biggest problem many will have is that WCopyFind is a very bare-bones application with little documentation. There are a lot of settings, but little guidance on how to use them, making the learning curve fairly steep for those not used to the system.</p>
<p>Still, for situations that call for it, WCopyFind is probably the best tool, or at least one of the best tools, of its kind.</p>
<h4>Bottom Line</h4>
<p>Should every content creator or professor have a copy of WCopyFind on their computer? Probably not.</p>
<p>WCopyFind is a fairly specialized tool that, generally, is only used after other plagiarism checkers have hinted at the possibility of wrongdoing and there&#8217;s a need to drill deeper. Most people won&#8217;t have to do this and, in truth, most cases don&#8217;t require it either.</p>
<p>Also, it is worth noting that, as powerful as it is, WCopyFind is still no substitute for human judgment. As with other plagiarism detection systems, all it can do is find matching strings of text and it is up to a human to decide which of those strings are plagiarized and which are coincidence, properly attributed or otherwise legitimate.</p>
<p>In short, WCopyFind is merely a tool, albeit a powerful one, for detecting and finding plagiarism between two or more documents. However, that alone makes it more than worth learning, especially if you want to do a detailed analysis. </p>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismtoday.com/2011/03/02/my-secret-plagiarism-detection-weapon/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
	</channel>
</rss>

