<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>search | Plagiarism Today</title>
	<atom:link href="http://www.plagiarismtoday.com/tag/search/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.plagiarismtoday.com</link>
	<description>Content Theft, Plagiarism, Copyright Infringement</description>
	<lastBuildDate>Mon, 13 Feb 2012 06:51:37 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Is Bing Plagiarizing Google Search?</title>
		<link>http://www.plagiarismtoday.com/2011/02/03/is-bing-plagiarizing-google-search/</link>
		<comments>http://www.plagiarismtoday.com/2011/02/03/is-bing-plagiarizing-google-search/#comments</comments>
		<pubDate>Thu, 03 Feb 2011 19:29:10 +0000</pubDate>
		<dc:creator>Jonathan Bailey</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[bing]]></category>
		<category><![CDATA[Content-Theft]]></category>
		<category><![CDATA[Copyright]]></category>
		<category><![CDATA[Copyright-Infringement]]></category>
		<category><![CDATA[Copyright-Law]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Plagiarism]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[search results]]></category>
		<category><![CDATA[seo]]></category>

		<guid isPermaLink="false">http://www.plagiarismtoday.com/?p=8869</guid>
		<description><![CDATA[Google recently showed that Bing's results, on certain queries, seem to match their own a little too closely. Is Bing stealing from Google?]]></description>
			<content:encoded><![CDATA[<p><img style=' float: left; padding: 4px; margin: 0 7px 2px 0;'  src="http://www.plagiarismtoday.com/wp-content/uploads/2011/02/bing-logo-cropped-300x129.jpg" alt="Bing Logo Image" title="Bing Logo Cropped" width="300" height="129" class="alignleft size-medium wp-image-8882" />One of the stories that has been making the rounds lately has been involving allegations that the second-largest search engine, Bing, has been copying results from the first, Google. </p>
<p>The story, <a href="http://searchengineland.com/google-bing-is-cheating-copying-our-search-results-62914">which originally broke on Search Engine Land</a>, has since garnered public responses from both <a href="http://googleblog.blogspot.com/2011/02/microsofts-bing-uses-google-search.html">Google</a> and <a href="http://www.bing.com/community/site_blogs/b/search/archive/2011/02/02/setting-the-record-straight.aspx">Microsoft</a>, the owner of Bing.</p>
<p>The result has been <a href="http://techcrunch.com/2011/02/01/bing-google-fight/">a pretty ugly back and forth between the two companies</a>, one that has ended with each side accusing the other of being unethical and/or dishonest and <a href="http://www.telegraph.co.uk/technology/google/8297337/Google-Microsoft-Bing-copies-our-search-results.html">a lot of negative press hurled at Microsoft</a>.</p>
<p>So what is really going on? It&#8217;s not a simple question to answer, partly because we only have a brief glimpse of what either side is doing or has done behind the scenes. The most we can try to do is make heads or tails of it and see what it means both legally and ethically for the two companies.<span id="more-8869"></span></p>
<h4>What Happened: Does Bing Copy Google?</h4>
<p>According to Google, after some recent relevancy updates to Bing, they began to notice a startling similarity between their results and Bing&#8217;s, especially on extremely unusual keywords such as &#8220;torsoraphy&#8221;, which is a misspelling of &#8220;tarsorrhaphy&#8221;, a rare surgery on the eyelids. According to Google, Bing wouldn&#8217;t correct the spelling, but would know to direct people to the same first result as Google.</p>
<p><img src="http://www.plagiarismtoday.com/wp-content/uploads/2011/02/bingtarsorr-500x236.jpg" alt="Bing Google" title="Bing Steal Google" width="500" height="236" class="alignnone size-large wp-image-8878" /></p>
<p>This, in turn, prompted Google to run a &#8220;sting&#8221; operation where they manipulated the results of one hundred random, nonsensical search terms, such as &#8220;hiybbprqag&#8221;, to add random pages to the top of the results. Google then sent 20 of its engineers home with new Windows laptops and had them perform test queries on Google from home in Internet Explorer 8 with both Suggested Sites, a feature in IE8 that tracks user browsing to recommend related websites, and the Bing Toolbar enabled.</p>
<p>The result was that, in about eight of the searches, Bing&#8217;s results were changed to match Google&#8217;s, even though the pages chosen by Google didn&#8217;t make sense for the query.</p>
<p>This was enough to convince Google that something was afoot and the story started making the rounds.</p>
<p>Bing later responded saying that they don&#8217;t copy from Google directly but that they do use the anonymous click and surfing data from their users as one of over 1,000 points of data to determine results. They went on to say that Google&#8217;s experiment was a &#8220;click fraud&#8221; attack similar to what spammers do and was trying to manipulate Bing&#8217;s results on ultra-long tail keywords where it was most vulnerable.</p>
<p>This, in turn, has raised serious questions about Microsoft&#8217;s monitoring of user activity, which seemingly wasn&#8217;t widespread knowledge before this incident. Google has said that it doesn&#8217;t use data from Chrome or its Google Toolbar to build its search results, other than possibly tracking site loading times, but Microsoft pointed to <a href="http://www.android.com/privacy.html">Android&#8217;s privacy policy</a>, which Microsoft claims signals that Google may be doing so with its mobile operating system.</p>
<p>But all of this brings us back to our original question: Is Bing plagiarizing from Google?</p>
<h4>No Easy Answers</h4>
<p>As I read through the various points/counterpoints while researching this article, there were three facts that seemed to be largely overlooked:</p>
<ol>
<li><strong>Google&#8217;s Honeypot Worked Less Than 10% of the Time:</strong> Google attempted the honeypot with some 100 keywords but only 7-9 actually worked. That means that, in over 90 cases, it didn&#8217;t work.</li>
<li><strong>The Keywords Involved Were Extreme Long Tail:</strong> The keywords in the honeypot were ones that showed no results or no relevant results. The keywords that aroused suspicion were primarily typos of strange, rarely-used words. Major searches, it seems, are unaffected by this as there is a lot of variance.</li>
<li><strong>Engineers Had to Take Active Action:</strong> The engineers in the test didn&#8217;t simply alter Google&#8217;s results and wait for Bing to scrape them; they loaded up laptops, enabled tracking, performed the searches and clicked the desired result. </li>
</ol>
<p>Clearly, this isn&#8217;t a case of Bing scraping Google&#8217;s results (which is what <a href="http://www.scroogle.org/cgi-bin/scraper.htm">Scroogle</a> does by design and with attribution). Instead, it&#8217;s a case of Bing&#8217;s underlying technology giving weight to actions by Microsoft IE users who visit Google. In short, what most seem to agree happened is:</p>
<ol>
<li>Users submit surfing data via IE and the Bing toolbar. </li>
<li>Those users choose to use Google instead of Bing, so Bing tracks those clicks.</li>
<li>On search terms where Bing has nothing or very little, those clicks sometimes get a lot of weight.</li>
</ol>
<p>So, this brings us to the question I&#8217;ve been dreading: Is this plagiarism or otherwise unethical or illegal? </p>
<p>Legally, it seems dubious that Google&#8217;s search results could be considered copyrightable. Considering that <a href="http://whatisfairuse.blogspot.com/2008/06/unadorned-digital-models-that-can-be.html">wire frame models based off of cars were recently deemed to lack sufficient creativity for copyrightability</a>, it seems likely that results generated solely by an algorithm, with no human involvement, would too. However, I wasn&#8217;t able to find a ruling directly aimed at this issue so, if anyone has one, please send it my way.</p>
<p>Furthermore, Bing isn&#8217;t actually copying directly from Google, but looking at user data and drawing its own conclusions so, in the eyes of the law, it&#8217;s unlikely that there would be much in the way of a claim Google could make. It seems, largely, to be a matter between Microsoft and the users of its products.</p>
<p>That being said, the ethics are a much more complicated question. </p>
<p>Bing, when it set about introducing clicktracking as a factor in its search results, had to know that many of those it tracked would use Google and, therefore, it seems logical they knew that they would be getting information about Google results and they did nothing to prevent that. However, I&#8217;m not completely sure they should have.</p>
<p>For one, you can learn from your competitor&#8217;s results and product without copying them. By tracking clicks, Bing might be able to see that some sites that rank well in Google aren&#8217;t worth ranking well in their engine. This seems to be mostly what Bing does, though, as it was only non-competitive search terms that appear to have been copied.</p>
<p>That being said, Bing, as Google&#8217;s main competitor, should be trying to create its own unique search experience, not merely trying to recreate Google&#8217;s with slightly better results. Though it may only be one factor that Bing considers, considering Google&#8217;s results for your own makes it look like you&#8217;re trying to build on Google&#8217;s back.</p>
<p>So is it unethical? I would consider it a gray area. At least as far as the relationship between Google and Bing goes, I don&#8217;t feel completely right giving Bing&#8217;s actions the OK, but I can&#8217;t outright condemn them either. Bing is walking a thin line here, and which side of it they fall on depends largely on information we don&#8217;t have, such as exactly what information is collected and how it is used in Bing&#8217;s algorithm.</p>
<p>Sadly though, there are bigger questions to look at and, with those, even fewer good answers.</p>
<h4>Bigger Questions</h4>
<p>As important as the ethical and legal considerations are, there are other questions to ask, including:</p>
<ol>
<li><strong>How Aware Are Microsoft&#8217;s Users of the Tracking and Its Use?</strong> Many seem to be surprised by what Microsoft is doing. How well was this use of private info disclosed and how clear was it made? </li>
<li><strong>How Easy is it to Game Bing?</strong> Considering that the false Google results were irrelevant to their searches, it seems like &#8220;click fraud&#8221; as Microsoft calls it might be an easy way to game Bing, especially for long tail terms.</li>
<li><strong>How, Exactly, is This Info Used?</strong> Though Microsoft makes it clear that it is just one of a thousand factors, it appears that click monitoring really has a sharp impact on the results. Exactly how much weight is this info given?</li>
</ol>
<p>There aren&#8217;t any easy answers to these questions right now but I suspect we&#8217;ll hear more about the first one as this story spreads. </p>
<h4>Bottom Line</h4>
<p>The ethics of what Bing is doing (at least in regards to Google) are debatable and even I don&#8217;t have any solid answers, largely because this concept is still very new and the ethics haven&#8217;t been hashed out fully. However, there are bigger, probably more important questions being raised about Bing thanks to this revelation.</p>
<p>Though I don&#8217;t believe Bing is plagiarizing Google, at least not in the traditional sense of the word, Google may have exposed even greater mistakes and misdeeds of Bing by releasing the results of this test. In short, they make Bing look lazy, sloppy and easy-to-game, something that for a search engine may be even worse than being a plagiarist. </p>
<p>Clearly, this is a PR disaster for Bing and Microsoft, but I think the worst might be yet to come for them. </p>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismtoday.com/2011/02/03/is-bing-plagiarizing-google-search/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>3 Count: Tax This</title>
		<link>http://www.plagiarismtoday.com/2010/01/08/3-count-tax-this/</link>
		<comments>http://www.plagiarismtoday.com/2010/01/08/3-count-tax-this/#comments</comments>
		<pubDate>Fri, 08 Jan 2010 16:54:34 +0000</pubDate>
		<dc:creator>Jonathan Bailey</dc:creator>
				<category><![CDATA[Copyright News]]></category>
		<category><![CDATA[acta]]></category>
		<category><![CDATA[Content-Theft]]></category>
		<category><![CDATA[Copyright]]></category>
		<category><![CDATA[Copyright-Infringement]]></category>
		<category><![CDATA[Copyright-Law]]></category>
		<category><![CDATA[France]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Lawsuit]]></category>
		<category><![CDATA[Plagiarism]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[tax]]></category>
		<category><![CDATA[viacom]]></category>
		<category><![CDATA[Yahoo]]></category>
		<category><![CDATA[YouTube]]></category>

		<guid isPermaLink="false">http://www.plagiarismtoday.com/?p=5267</guid>
		<description><![CDATA[Have any suggestions for the 3 Count? Let me know via Twitter @plagiarismtoday. 1: French Solution to Illegal Download and Copyright Infringement &#8211; Tax Google and Yahoo First off today France, who is no stranger to controversial proposals for dealing with piracy on the Web, has recommended the taxing of search engines such as Google...]]></description>
			<content:encoded><![CDATA[<p><em>Have any suggestions for the 3 Count? Let me know via Twitter <a href="http://twitter.com/plagiarismtoday">@plagiarismtoday</a>.</em></p>
<h4>1: <a href="http://government.zdnet.com/?p=6738">French Solution to Illegal Download and Copyright Infringement &#8211; Tax Google and Yahoo</a></h4>
<p>First off today, France, which is no stranger to controversial proposals for dealing with piracy on the Web, has recommended taxing search engines such as Google and Yahoo! to help fund legal alternatives for obtaining copyrighted works. This comes from a panel that was commissioned to study the issue of online piracy and devise solutions for it.</p>
<p>The tax would be similar to taxes in other countries on VCR tapes, blank CDs and other media. Surprisingly, according to the article, Google seems fairly comfortable with the idea though it is unclear what the record and movie studios will say about it.</p>
<p>In addition to this proposal, France is also at the forefront of the &#8220;3 strikes&#8221; debate, recently having passed its second law that would disconnect file sharers repeatedly accused of infringement. The first one was stricken for constitutional reasons and the second includes judicial oversight.</p>
<h4>2: <a href="http://www.billboard.biz/bbbiz/content_display/industry/e3i3a9920d504eb965461fc996203b800ef">Report: Google/Viacom Case Set For Ruling</a></h4>
<p>Next up today, there are rumors that the Viacom v. Google case is possibly heading for an early conclusion. There are reports that both sides are requesting a meeting about their motions for a summary judgement, hinting that the judge may issue such a judgement, avoiding a trial completely.</p>
<p>Viacom sued Google over YouTube, citing over 60,000 clips that it accused of infringing its copyright, and later claimed to have found evidence that YouTube employees were aware of the infringement and/or had uploaded the clips themselves, possibly eroding YouTube&#8217;s safe harbor status under the DMCA and exposing it to potential liability.</p>
<p>But while the case is possibly heading to an end soon, what is unclear is which side is the likely victor as both have elements of their case that are pretty strong. Needless to say, we will be following this closely over the next few days/weeks to see what happens.</p>
<h4>3: <a href="http://www.wired.com/threatlevel/2010/01/senator-demands-details">Senator Demands IP Treaty Details</a></h4>
<p>Finally today, Sen. Ron Wyden (D-Oregon), has sent a letter to U.S. Trade Representative Ron Kirk asking him to confirm or deny leaks about the ACTA treaty, which is being negotiated by the U.S., EU and other major copyright nations. </p>
<p>Those leaks have included concerns that the treaty may force ISPs to disconnect alleged file sharers, similar to France&#8217;s three strikes regime, as well as other restrictions on copyright. </p>
<p>The treaty, however, does not need congressional approval, meaning its impact on U.S. law should be almost nil (changes to law must have congressional approval) but the Senator still wants to know what is in the treaty, especially considering it has been shared with private citizens and corporations, including those on all sides of the copyright debate.</p>
<p>Kirk&#8217;s office has said that they are looking forward to responding to Sen. Wyden&#8217;s letter and will do so shortly.</p>
<h4>Suggestions</h4>
<p>That&#8217;s it for the three count today. We will be back tomorrow with three more copyright links. If you have a link that you want to suggest for the column or have any proposals to make it better, feel free to leave a comment or send me an email. I hope to hear from you. </p>
<h4>Want the Full Story?</h4>
<p>Tune in <a href="http://www.talkshoe.com/tc/22590">every Saturday morning for the live recording of the Copyright 2.0 Show</a> or wait and get the edited version <a href="http://www.plagiarismtoday.com/category/podcast/">Monday morning right here on Plagiarism Today</a>. </p>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismtoday.com/2010/01/08/3-count-tax-this/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Review: The Plagiarism Checker</title>
		<link>http://www.plagiarismtoday.com/2008/12/16/review-the-plagiarism-checker/</link>
		<comments>http://www.plagiarismtoday.com/2008/12/16/review-the-plagiarism-checker/#comments</comments>
		<pubDate>Tue, 16 Dec 2008 17:45:07 +0000</pubDate>
		<dc:creator>Jonathan Bailey</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[Products]]></category>
		<category><![CDATA[Content-Theft]]></category>
		<category><![CDATA[Copyright-Infringement]]></category>
		<category><![CDATA[Digg]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Plagiarism]]></category>
		<category><![CDATA[plagiarism checking]]></category>
		<category><![CDATA[Reddit]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[Social-News]]></category>
		<category><![CDATA[twitter]]></category>

		<guid isPermaLink="false">http://www.plagiarismtoday.com/?p=2283</guid>
		<description><![CDATA[The rebirth of "The Plagiarism Checker" has made waves throughout social news sites and Twitter alike, but is the site worth the attention it has been getting?]]></description>
			<content:encoded><![CDATA[<p><img style=' float: left; padding: 4px; margin: 0 7px 2px 0;'  src="http://www.plagiarismtoday.com/wp-content/uploads/2008/12/plagiarism-checker-logo-300x39.png" alt="plagiarism-checker-logo" title="plagiarism-checker-logo" width="300" height="39" class="alignleft size-medium wp-image-2298" />Late last week, <a href="http://www.reddit.com/r/reddit.com/comments/7j4e6/the_plagiarism_checker_i_made_this_site_in_2002/">a post reached the front page of Reddit</a> that piqued the curiosity of copyright holders, teachers and professors alike. It was about a service called &#8220;<a href="http://www.dustball.com/cs/plagiarism.checker/">The Plagiarism Checker</a>&#8221; (dubbed by me the &#8220;Dustball&#8221; checker due to its domain), created by <a href="http://www.linkedin.com/in/brianklug">Brian Klug</a> in 2002, when he was a student at the University of Maryland at College Park, and abandoned until recently this year.</p>
<p>The site, according to Klug, was getting about 2,000 visits per day when it was forgotten but is almost certainly doing much better now as it has taken off, attracting countless Twitter Tweets and other social news attention. Librarians and teachers are especially captivated by this site.</p>
<p>But is &#8220;The Plagiarism Checker&#8221; worth using? Is it as powerful a tool as some, although not the site itself, have made it out to be? The sad answer is no, but it could, with a few simple tweaks, become a much more useful service for teachers and bloggers alike.<span id="more-2283"></span></p>
<h4>How it Works</h4>
<p>The basic premise of the minimalist site can be summed up by its instructions:</p>
<blockquote><p>Cut &#038; paste your students paper or homework assignment into the box below, and click the &#8220;check&#8221; button.  This free plagiarism detector will find plagiarized text in homework and other essays/reports.</p></blockquote>
<p>In short, you take an essay, article or other lengthy prose work, paste it into a textbox and hit &#8220;check&#8221;. From there, the site extracts several strings of text, runs them through Google and compiles the result, determining whether plagiarism is probable.</p>
<p>In that regard, the idea is actually very similar to Copyscape, which also uses Google, via their API, to process results. However, where Copyscape keeps the &#8220;magic&#8221; hidden from the user, the &#8220;Dustball&#8221; plagiarism checker includes links to the Google results, encouraging users to click through and research the case for themselves.</p>
<p>That alone is a big part of the problem Webmasters, and many teachers, will have with the service. Where Copyscape, as well as academic tools such as TurnItIn, provide very simple and colorful results, The Plagiarism Checker is a very bare-bones approach, requiring the user to perform a large amount of research on their own.</p>
<p>Still, a bit of research would be welcome if the service produced great results. Unfortunately, it seems that the service performs only lukewarmly, at best.</p>
<h4>My Tests</h4>
<p>To test the service, I decided to run it through a <a href="http://www.plagiarismtoday.com/2005/06/28/copyscape-not-ready-for-prime-time/">similar battery of tests</a> that I had run Copyscape through and then watched as they <a href="http://www.plagiarismtoday.com/2007/10/02/copyscape-improved-again/">improved upon the initial results</a>. </p>
<p>The first test was to run <a href="http://www.ravensrants.com/in-the-dark/print/">an old poem of mine</a> through the system, one that allegedly has over 300 matches in Google. However, that test was thwarted as The Plagiarism Checker refused to even look at the work, saying that it could not function with such short text strings.</p>
<p><a href="http://www.plagiarismtoday.com/wp-content/uploads/2008/12/plagiarism-checker-error.png"><img src="http://www.plagiarismtoday.com/wp-content/uploads/2008/12/plagiarism-checker-error-300x83.png" alt="plagiarism-checker-error" title="plagiarism-checker-error" width="300" height="83" class="alignnone size-medium wp-image-2284" /></a></p>
<p>I then shifted gears and started using prose works, <a href="http://www.ravensrants.com/loner/print/">the first being one</a> that had <a href="http://www.google.com/search?q=%22there+is+always+a+person+sitting+alone+in+a+corner+not+engaging+in+conversation%22&#038;hl=en&#038;client=firefox-a&#038;rls=org.mozilla:en-US:official&#038;hs=0aI&#038;filter=0">36 matches in Google</a> at the time I did the search. The result was stunning. </p>
<p><a href="http://www.plagiarismtoday.com/wp-content/uploads/2008/12/plagiarism-checker-none-found.png"><img src="http://www.plagiarismtoday.com/wp-content/uploads/2008/12/plagiarism-checker-none-found-300x138.png" alt="plagiarism-checker-none-found" title="plagiarism-checker-none-found" width="300" height="138" class="alignnone size-medium wp-image-2287" /></a></p>
<p>Despite the fact Google had reported three dozen matches on test snippets from the work itself, the &#8220;Dustball&#8221; checker was unable to find anything. To make matters worse, using some of the sample quotes from the test, I was able to locate other copies of the work, <a href="http://www.google.com/search?q=%22every+crowd+big+or+small+there+is+always+a+person%22&#038;ie=utf-8&#038;oe=utf-8&#038;aq=t&#038;rls=org.mozilla:en-US:official&#038;client=firefox-a">such as with the first quote</a>.</p>
<p>Clearly, The Plagiarism Checker was missing results that Google was finding, meaning it was discarding them for whatever reason.</p>
<p>A similar test for <a href="http://www.ravensrants.com/trees/print/">another prose work</a> only returned one sentence that was matched against anything and the results for it were all false positives. This work, in Google, <a href="http://www.google.com/search?q=%22trees+of+nature+that+I+hold+so+dear+will+soon%22&#038;hl=en&#038;client=firefox-a&#038;rls=org.mozilla:en-US:official&#038;hs=tLw&#038;filter=0">has six results</a>.</p>
<p><a href="http://www.plagiarismtoday.com/wp-content/uploads/2008/12/plagiarism-checker-none-found4.png"><img src="http://www.plagiarismtoday.com/wp-content/uploads/2008/12/plagiarism-checker-none-found4-300x146.png" alt="plagiarism-checker-none-found4" title="plagiarism-checker-none-found4" width="300" height="146" class="alignnone size-medium wp-image-2289" /></a></p>
<p>The only search using the service that seemed to work remotely well was when I ran the <a href="http://www.ushistory.org/Declaration/document/index.htm">Declaration of Independence</a> through it. Every search term, in this test, came back positive. </p>
<p><a href="http://www.plagiarismtoday.com/wp-content/uploads/2008/12/plagiarism-found.png"><img src="http://www.plagiarismtoday.com/wp-content/uploads/2008/12/plagiarism-found-300x175.png" alt="plagiarism-found" title="plagiarism-found" width="300" height="175" class="alignnone size-medium wp-image-2293" /></a></p>
<p>It appears that text that is not widely distributed around the Web may or may not show up as plagiarized in this tool, something that has me very worried as many are starting to rely on this plagiarism checker as their main tool for detecting both copyright infringement and the plagiarism of students.</p>
<h4>The Sad Truth</h4>
<p>Simply put, any and all of these search results should have come back as being plagiarized. Even if there were no other matches of the content, these works existed on my site and are available through Google there. There is no reason that any of these works should have come back as anything short of 100% plagiarized since this site can not know I was the one submitting them.</p>
<p>For teachers, this is not good news. If a student plagiarizes material from obscure sources, they are likely to escape detection. Likewise, Webmasters and those who might want to use this tool to track their own content will likely be disappointed that it doesn&#8217;t seem to pick up infringement that appears on only a few dozen sites. </p>
<p>This can most likely be fixed through tweaks in the algorithm, but as it sits right now, it doesn&#8217;t appear that it has much to offer teachers or Webmasters, especially when <a href="http://www.copyscape.com">Copyscape</a> is relatively effective and cheap to use.</p>
<p>Simply put, at this moment, Copyscape is easier, more effective and faster than The Plagiarism Checker and, at only five cents a search, is affordable too.</p>
<p>However, the best technique still appears to be taking the time to select good phrases from a work and manually searching for those. It returns the most results and seems to work well nearly all of the time.</p>
<h4>The Big Picture</h4>
<p>My issue with The Plagiarism Checker has less to do with the service itself and more to do with how others have been promoting it. The site itself is actually fairly humble about what it can do, but bloggers and Twitter users have been advertising it as if it were a silver bullet to detect plagiarism. Clearly, that is not the case.</p>
<p>With a few tweaks and fixes to the algorithm, I don&#8217;t doubt that this service, much like Copyscape, could become a very powerful tool. However, even if the results were on par with Copyscape, the latter remains faster and easier to use, meaning that there will not be much reason to use the &#8220;Dustball&#8221; checker.</p>
<p>To make matters worse, most teachers and professors have access to services such as TurnItIn that are far more accurate and cover a much larger breadth of sources than &#8220;The Plagiarism Checker&#8221;. Considering the ease of use and added features, there is not much that can be gleaned from a Google-only search that can&#8217;t be gleaned from the more automated service (though <a href="http://www.plagiarismtoday.com/2008/11/04/copyscape-tops-plagiarism-checker-testing/">Copyscape did top Turnitin in a recent plagiarism detection study</a>). </p>
<p>In short, I don&#8217;t see much usefulness for this tool, even if its accuracy improves, and I am more than a little confused as to why so many seem to have promoted it so heavily.</p>
<h4>Conclusions</h4>
<p>More than anything, this is a case against the reliance on any one plagiarism checking service. Even the best services will let results slip through the cracks. Furthermore, just because a service is popular does not mean that it should be trusted above all.</p>
<p>However, I find it very difficult to fault The Plagiarism Checker for this confusion and these problems. It is clear that the service was as much an experiment as anything, it is promoted humbly and was actually abandoned for approximately six years. It was others, perhaps desperate for some way to more effectively detect plagiarism, that gave it an unjustified reputation.</p>
<p>If anything, this case shows the need and the potential market for such services and illustrates why some companies have made millions in this field. People are eager for a solution and are excited by any promise of one.</p>
<p>Sadly though, this site is not the one people are looking for.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismtoday.com/2008/12/16/review-the-plagiarism-checker/feed/</wfw:commentRss>
		<slash:comments>22</slash:comments>
		</item>
		<item>
		<title>Google Alerts to add RSS</title>
		<link>http://www.plagiarismtoday.com/2008/10/10/google-alerts-to-add-rss/</link>
		<comments>http://www.plagiarismtoday.com/2008/10/10/google-alerts-to-add-rss/#comments</comments>
		<pubDate>Fri, 10 Oct 2008 15:21:40 +0000</pubDate>
		<dc:creator>Jonathan Bailey</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[News]]></category>
		<category><![CDATA[Products]]></category>
		<category><![CDATA[content detection]]></category>
		<category><![CDATA[Content-Theft]]></category>
		<category><![CDATA[Copyright-Infringement]]></category>
		<category><![CDATA[digital fingerprints]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Google Alerts]]></category>
		<category><![CDATA[Plagiarism]]></category>
		<category><![CDATA[RSS]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[Search-Engines]]></category>

		<guid isPermaLink="false">http://www.plagiarismtoday.com/?p=1935</guid>
		<description><![CDATA[A recent article in the Wall Street Journal has given reason for many Google Alerts users to rejoice, the famous email alert service will soon be getting RSS support. ]]></description>
			<content:encoded><![CDATA[<p><IMG SRC="http://www.plagiarismtoday.com/images/google-alerts-20081010-100845.png" alt="Google Alerts Logo" align="left" class="picleft">A recent article in the Wall Street Journal by Walter Mossberg about <a href="http://online.wsj.com/article/SB122281243658792073.html">how to use alerts to keep track of the Web</a> dropped something of a bombshell for those of us who use <a href="http://www.google.com/alerts">Google Alerts</a> every day. According to Mossberg, Google Alerts will begin adding RSS alerts in addition to email ones &#8220;in about a month&#8221;.</p>
<p>Google Alerts, which is a service that sends out notices when content carrying the alert search term appears on the Web, currently only sends out its alerts via email. It is commonly used for vanity searches, for keeping on top of who mentions a person or site, and for keeping track of content, either through searches for <a href="http://www.plagiarismtoday.com/2005/11/07/tips-for-using-google-alerts/">statistically improbable phrases</a> or <a href="http://www.plagiarismtoday.com/2006/10/04/digital-fingerprints-to-detect-rss-scraping/">digital fingerprints</a>. </p>
<p>What this means to you will probably depend on how heavily you use RSS and how much use you make of Google Alerts. If you are not currently using Google Alerts and want to get started, I&#8217;ve <a href="http://www.plagiarismtoday.com/2008/01/24/video-how-to-use-google-alerts/">created a screencast to help you understand the basics</a>.</p>
<p>Obviously, I&#8217;ll have more to say on this once the new feature is made public. </p>
<p>However, at this time, I don&#8217;t see myself making heavy use of the RSS feature. I literally have years of experience meshing Google Alerts with email filters and creating a workflow around it. Though such a system could be moved to RSS easily, I don&#8217;t see how much is gained in my case.</p>
<p>Clearly though, this feature is not for people like myself and other current heavy users of Google Alerts. Instead, it is for those who don&#8217;t use the service because they can&#8217;t get the alerts in the format they want. This will change that and let them receive their alerts in a variety of places, including their RSS reader, their Google home page and a variety of mashup services.</p>
<p>Needless to say, this opens up a lot of new doors for Google Alerts but, personally, I&#8217;m just happy to hear that the service is still receiving some attention. After being so long without a significant upgrade, it is nice to see that Google is still working on their Google Alerts product. </p>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismtoday.com/2008/10/10/google-alerts-to-add-rss/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>An Inside Look at iCopyright Discovery</title>
		<link>http://www.plagiarismtoday.com/2008/09/30/inside-look-at-icopyright-discovery/</link>
		<comments>http://www.plagiarismtoday.com/2008/09/30/inside-look-at-icopyright-discovery/#comments</comments>
		<pubDate>Tue, 30 Sep 2008 17:22:15 +0000</pubDate>
		<dc:creator>Jonathan Bailey</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[Products]]></category>
		<category><![CDATA[content detection]]></category>
		<category><![CDATA[Content-Theft]]></category>
		<category><![CDATA[Copyright-Infringement]]></category>
		<category><![CDATA[creators]]></category>
		<category><![CDATA[discovery]]></category>
		<category><![CDATA[icopyright]]></category>
		<category><![CDATA[Plagiarism]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[Search-Engines]]></category>
		<category><![CDATA[tracking]]></category>

		<guid isPermaLink="false">http://www.plagiarismtoday.com/?p=1824</guid>
		<description><![CDATA[The iCopyright Discovery system promises to revolutionize the way copyright holders track and protect their work. Now we get an inside look at what the system has to offer copyright holders. ]]></description>
			<content:encoded><![CDATA[<p><img src="http://www.plagiarismtoday.com/wp-content/uploads/2008/09/icopyright-logo1.png" alt="icopyright-logo.png" border="0" width="174" height="59" align="left" class="picleft" />Earlier this month, <a href="http://www.plagiarismtoday.com/2008/09/16/icopyright-announces-content-tracking-tool/">I reported on iCopyright&#8217;s new content tracking tool Discovery</a>. At that point, I only had the information provided in the press release for the service.</p>
<p>However, last week, Mike O&#8217;Donnell, the President and CEO of iCopyright, was kind enough to give me a guided tour of the backend. I wasn&#8217;t able to access anything hands on or experiment with the technology using my own content (that will have to wait until the service is available for <a href="http://creators.icopyright.com/">iCopyright for Creators</a> users), but I was able to see what the service does and how it works.</p>
<p>So here is a brief look at what the iCopyright Discovery system can do and how it will likely look when it is available for Creators users shortly. Please bear in mind that this is not a review, just a tour of the key features of the service. <span id="more-1824"></span><br />
<h4>The Basic Premise</h4>
<p>The big idea of Discovery is this: Discovery parses your content as you put it up on the Web, accessing either a created XML file or your RSS feed, and then searches for copies of it on the Web. </p>
<p>The service then highlights the matches that it determines to be the most important and gives you options for remedying the situation. Among the actions it can perform are removal requests, which are fundamentally DMCA notices, license requests, which go through iCopyright&#8217;s existing licensing system, and forwarding to legal counsel.</p>
<p>This idea is fundamentally very similar to <a href="http://attributor.com">Attributor</a> and <a href="http://www.blogwerx.com/">Blogwerx</a>, both of which are still in private testing. However, the execution of the system is going to be what is important. On that front, iCopyright has devised an interesting workflow system that seems to string the process together very well.</p>
<h4>Setting Up Discovery</h4>
<p>When a user first signs in to Discovery, the first page they&#8217;re likely going to head to is, oddly enough, the &#8220;Settings&#8221; page. The reason for this is that, without visiting the settings page, you have little control over the matches you see and you can&#8217;t use several of the remedy options. </p>
<p><a href="http://www.plagiarismtoday.com/wp-content/uploads/2008/09/settings.jpg"><img style=' float: left; padding: 4px; margin: 0 7px 2px 0;'  src="http://www.plagiarismtoday.com/wp-content/uploads/2008/09/settings-300x220.jpg" alt="" title="settings" width="300" height="220" class="alignleft size-medium wp-image-1830" /></a></p>
<p>From this page, you can set your enforcement agency, useful if you are part of a group that handles your copyright enforcement, and the email address of your legal counsel. This will let you enable additional redress steps down the road. However, the most important settings are the search sensitivity and risk assessment, as they determine the matches you see down the road.</p>
<p>The search sensitivity feature allows users to tell Discovery how many matches they want. They can set it so that only the worst matches appear in the system or so that they see almost everything. This is done by tweaking the minimum match ratio, meaning how much of the original work must appear in the copy, the minimum risk factor, discussed below, the minimum site activity and the minimum number of copied words that must appear in the match, useful for sites with short posts.</p>
<p>The Risk Assessment tool is easily one of the most interesting features in iCopyright Discovery. It lets users set the criteria for determining how much of a risk a match site is. You do that by setting sliders for Unique Visitors, which looks at the estimated traffic of the site, the number of inbound links, whether the site displays ads and how much of the content it copies.</p>
<p>These sliders are intended to be abstract in nature and are used to indicate which attributes are more important than others. For example, if you set all to 10, they would be weighted equally. However, if you put one at 5 and the others at 10, the first one would be weighted much less. </p>
<p>These attributes, when combined with the site&#8217;s actual use of the content, are used to determine the risk level of the site itself. This, in turn, plays a major role in determining the priority the site is given when analyzing suspect pages. </p>
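<p>To make the weighting idea concrete, here is a minimal sketch of how slider values could act as relative weights. This is only my assumption for illustration: iCopyright has not published its formula, and the function name, the attribute names and the 0-to-1 attribute scores are all hypothetical.</p>

```python
def weighted_risk(scores, sliders):
    """Combine per-attribute scores (0.0 to 1.0) using slider values as relative weights.

    Hypothetical illustration only; this is not iCopyright's actual formula.
    """
    total_weight = sum(sliders.values())
    if total_weight == 0:
        # No attribute matters at all, so report no risk.
        return 0.0
    weighted_sum = sum(scores[name] * weight for name, weight in sliders.items())
    return weighted_sum / total_weight


# Hypothetical scores for one suspect site: heavy traffic, nothing else.
scores = {"visitors": 1.0, "inbound_links": 0.0, "ads": 0.0, "copied": 0.0}

# All sliders at 10: every attribute counts equally.
equal = {"visitors": 10, "inbound_links": 10, "ads": 10, "copied": 10}
# One slider lowered to 5: that attribute counts for less.
lowered = {"visitors": 5, "inbound_links": 10, "ads": 10, "copied": 10}

print(weighted_risk(scores, equal))
print(weighted_risk(scores, lowered))
```

<p>In this sketch, only the ratio between sliders matters. With every slider at 10, traffic accounts for a quarter of the score; dropping its slider to 5 cuts its share to 5 out of 35.</p>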
<h4>Sorting Matches</h4>
<p>Once you are done telling Discovery what matches you want to see, the system then does a refresh, which takes about an hour according to O&#8217;Donnell, and you can then view your matches or &#8220;suspects&#8221;.</p>
<p><a href="http://www.plagiarismtoday.com/wp-content/uploads/2008/09/suspect_list.jpg"><img style=' float: left; padding: 4px; margin: 0 7px 2px 0;'  src="http://www.plagiarismtoday.com/wp-content/uploads/2008/09/suspect_list-300x213.jpg" alt="" title="suspect_list" width="300" height="213" class="alignleft size-medium wp-image-1831" /></a></p>
<p>The match list is sorted by a combination of variables, focusing heavily on suspect pages with the highest risk. For each suspect, the system displays the URL of the work, whether it displays ads, whether it links back to your site, roughly how many visitors it gets, the number of inbound links to the site, the match percentage and the risk.</p>
<p>From this page, you can go through the matches and either archive the match, which functions similarly to Gmail&#8217;s archive function and takes no action, move it to the Whitelist, either pending or approved, or send it to the redress list.</p>
<p>If a site is moved to the whitelist, that means that the use is licensed and future matches from the site will be ignored. You have the option of telling the system to either ignore matches on the URL, the subdomain or the entire domain.</p>
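<p>The three whitelist scopes can be sketched roughly as follows. This is an illustration under my own assumptions, not Discovery&#8217;s code: the function name and scope labels are hypothetical, and the &#8220;domain&#8221; check naively treats the last two host labels as the registered domain.</p>

```python
from urllib.parse import urlparse


def is_whitelisted(url, entry, scope):
    """Return True if url is covered by a whitelist entry at the given scope.

    scope is one of "url", "subdomain" or "domain" (hypothetical labels).
    """
    target = urlparse(url)
    allowed = urlparse(entry)
    target_host = (target.hostname or "").lower()
    allowed_host = (allowed.hostname or "").lower()
    if scope == "url":
        # Only this exact page is ignored.
        return (target_host, target.path) == (allowed_host, allowed.path)
    if scope == "subdomain":
        # Every page on the same host is ignored.
        return target_host == allowed_host
    if scope == "domain":
        # Naive guess at the registered domain: the last two host labels.
        # Real code would need the Public Suffix List (for example, for .co.uk).
        base = ".".join(allowed_host.split(".")[-2:])
        return target_host == base or target_host.endswith("." + base)
    raise ValueError("unknown scope: " + scope)


entry = "http://blog.example.com/post-1"
print(is_whitelisted("http://blog.example.com/post-2", entry, "url"))
print(is_whitelisted("http://blog.example.com/post-2", entry, "subdomain"))
print(is_whitelisted("http://www.example.com/", entry, "domain"))
```

<p>The wider the scope, the fewer future matches you will see from that site, so the entire-domain option makes the most sense for a partner you have licensed across the board.</p>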
<p>If you move it to the redress list, you can then take further action on the match, including licensing the work or filing a removal demand.</p>
<h4>Taking Action</h4>
<p>The redress list, as you see below, looks very similar to the suspect list and contains much of the same information. However, the options for what one can do with a suspect are different on this page.</p>
<p><a href="http://www.plagiarismtoday.com/wp-content/uploads/2008/09/redress_list.jpg"><img style=' float: left; padding: 4px; margin: 0 7px 2px 0;'  src="http://www.plagiarismtoday.com/wp-content/uploads/2008/09/redress_list-300x205.jpg" alt="" title="redress_list" width="300" height="205" class="alignleft size-medium wp-image-1829" /></a></p>
<p>From this page, you can then either offer the site a license, which will send out an email encouraging the site admin to go through the existing iCopyright system, file a link request or send a removal notice.</p>
<p>Removal notices, fundamentally, are DMCA notices though they are written so that, at this stage, they can be sent to Webmasters directly. Link requests are more like informal license offers, but ones where the only stipulation is a link back.</p>
<p>All of the letter types are fully customizable and Discovery offers a templating system that lets you build your own letter that automatically inserts the necessary information.</p>
<p>Once you file a redress, you can then track its status on the Redress Offers Status page. The page lets you know whether the redress has been completed and, if it hasn&#8217;t, makes the match available for escalation. </p>
<p>If a suspect match is moved to the escalation list, then the user has a whole new series of options for how to deal with the site. </p>
<p><a href="http://www.plagiarismtoday.com/wp-content/uploads/2008/09/escalation_list.jpg"><img style=' float: left; padding: 4px; margin: 0 7px 2px 0;'  src="http://www.plagiarismtoday.com/wp-content/uploads/2008/09/escalation_list-300x196.jpg" alt="" title="escalation_list" width="300" height="196" class="alignleft size-medium wp-image-1828" /></a></p>
<p>The options include the ability to forward the situation to your legal counsel (if set up), notify the ISP, which sends a more traditional DMCA notice, notify the enforcement agency (if set up), send a notice to the ad network or demand removal from the search engines. </p>
<p>All in all, the initial Redress List can be looked at as the cease and desist/licensing phase, whereas the Escalation List deals more with the DMCA/lawyer phase. </p>
<p>However, no matter what redress steps you take, Discovery offers a powerful means to track and monitor the progress of the steps that you took. </p>
<h4>Tracking and Monitoring</h4>
<p>Once you&#8217;ve taken a redress action against a suspect site, you can then track and monitor everything that has to do with that particular match. </p>
<p><a href="http://www.plagiarismtoday.com/wp-content/uploads/2008/09/action_audit_trail.jpg"><img style=' float: left; padding: 4px; margin: 0 7px 2px 0;'  src="http://www.plagiarismtoday.com/wp-content/uploads/2008/09/action_audit_trail-300x219.jpg" alt="" title="action_audit_trail" width="300" height="219" class="alignleft size-medium wp-image-1826" /></a></p>
<p>It provides much more than just a brief history of what has taken place, giving a detailed history of every email sent, comments left in the system, both automatic ones and ones left by the user, as well as other information about the site.</p>
<p>The idea is to maintain a record of every action, including emails, phone calls and other steps, for the purpose of aiding in any potential legal case. </p>
<p>Once the matter is resolved, escalated outside of the system or the match is whitelisted, the case can be archived and thus removed from the suspect pool, allowing you to move on to other matches.</p>
<h4>Some personal thoughts</h4>
<p>It is very hard for me to offer any real review of the service. Without actually being hands on with the service and using it against my own content, there is not much that I can do.</p>
<p>Right now there are many unknowns for me, including the following: </p>
<ol>
<li><strong>Match Detection:</strong> O&#8217;Donnell has said they are partnering with a major search provider to perform the detection but it remains to be seen how effective it is. Match detection is not easy, even with a big search partner, <a href="http://www.plagiarismtoday.com/2007/10/02/copyscape-improved-again/">as Copyscape showed</a>. The system will not be of much use if its match detection is not the best in its class.</li>
<li><strong>Resolution Assistance:</strong> The hardest part about stopping a plagiarist is not composing the letter, but finding who to send it to. It is easily the biggest time sink in most of my cases and is the number one reason people approach me for help. It remains to be seen how effectively Discovery helps with this process.</li>
<li><strong>Speed/Usability:</strong> Obviously, without actually using the system, I can&#8217;t tell how fast it moves and how much time it will save you. If the system is sluggish or error-prone, it could greatly hurt its usefulness.</li>
</ol>
<p>This is not to say that these things are wrong with the current system, just that I don&#8217;t know right now and won&#8217;t until I can do a full review, likely later this year.</p>
<p>However, judging from what I can see, the system is very impressive. It looks very good, has a solid workflow built into it, though I somewhat disagree with having the ISP step be only available in the escalation section, and seems to be built with the user in mind.</p>
<p>What I like best about Discovery is how the user customizes the system to fit their needs, with their own definitions of what matches to worry about, their own letters and their own general strategy. Any such system should focus on automating what can be automated, but leaving the big decisions to the copyright holder.</p>
<p>What does worry me some is that the system is clearly geared toward larger clients. Discovery is designed to allow for multiple users to access an account and to work with attorneys as well as other rights enforcers. While those are great features for those that need them, it remains to be seen how the system will strip down for smaller copyright holders.</p>
<p>The other downside is that, according to O&#8217;Donnell, the version of Discovery for Creators will come with some kind of fee. Though pricing structure has not been discussed, he seemed confident that it would not be available for free.</p>
<p>Still, as these screenshots show, there is a lot to like in the Discovery system and the solution it promises.</p>
<p>It has a great deal of potential and Webmasters who are worried about tracking how their content is used should definitely take a serious look at what iCopyright has to offer.</p>
<h4>Conclusions</h4>
<p>There&#8217;s a lot of reason for me to be excited about the upcoming Discovery system. However, I have to restrain that excitement until I can use the system first hand and see both how effective it is and how smooth the process is.</p>
<p>No matter what though, I am happy to see that people are thinking about these issues and coming up with solutions. This has been a booming industry over the past few years, and a lot of very smart companies are already involved. I am happy to be working in this field.</p>
<p>No matter what Discovery itself brings, it can only signal great things for copyright holders and Webmasters. Hopefully, this will help content creators not just enforce their rights, but understand how their work is being reused and encourage the kind of sharing that helps all involved.</p>
<p>Knowledge and tools can only help improve things, so long as those who use them do so wisely.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismtoday.com/2008/09/30/inside-look-at-icopyright-discovery/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Tineye: Protecting Images, Preventing Orphans</title>
		<link>http://www.plagiarismtoday.com/2008/08/19/tineye-protecting-images-preventing-orphans/</link>
		<comments>http://www.plagiarismtoday.com/2008/08/19/tineye-protecting-images-preventing-orphans/#comments</comments>
		<pubDate>Tue, 19 Aug 2008 17:03:46 +0000</pubDate>
		<dc:creator>Jonathan Bailey</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[News]]></category>
		<category><![CDATA[Products]]></category>
		<category><![CDATA[artists]]></category>
		<category><![CDATA[Content-Theft]]></category>
		<category><![CDATA[Copyright-Infringement]]></category>
		<category><![CDATA[images]]></category>
		<category><![CDATA[Orphan Works]]></category>
		<category><![CDATA[orphans]]></category>
		<category><![CDATA[Photos]]></category>
		<category><![CDATA[Plagiarism]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[tineye]]></category>

		<guid isPermaLink="false">http://www.plagiarismtoday.com/?p=1560</guid>
		<description><![CDATA[New image search engine Tineye hopes to change the way artists and photographers track their work across the Web. In essence, they hope to do for the visual world what Google did for text.]]></description>
			<content:encoded><![CDATA[<p><img class="picleft" title="tineye" src="http://www.plagiarismtoday.com/wp-content/uploads/2008/08/tineye.png" alt="" width="300" height="64" align="left" />One of the greatest challenges facing artists when it comes to protecting their work is finding infringements.</p>
<p>This is difficult because search engines, including image search engines, are designed to look for text, not pixels. Though you can look up the title of an image, the filename or even metadata within the image, if that information was changed by a site reusing your work, the copy has traditionally escaped detection.</p>
<p>Though the technology has existed in various forms, there has never been a search engine available to the public that could take an image and look for other ones like it. That is, until <a href="http://tineye.com/">Tineye</a>.</p>
<p>Tineye works differently than any other image search engine. It doesn&#8217;t ask you for words or even a description. Instead, you upload an image and it returns results similar to that picture. It is fast, easy to use and, most importantly, effective.</p>
<p>However, there are limitations to Tineye, especially in its current form. Though artists have many reasons to celebrate, the dancing likely won&#8217;t commence for some time.<span id="more-1560"></span></p>
<h4>How Tineye Works</h4>
<p>For the purpose of this demonstration, I am going to use a standard Google Logo, specifically, this image:</p>
<p><center><img style=' display: block; margin-right: auto; margin-left: auto;'  class="aligncenter size-full wp-image-1570" title="google_logo-test2" src="http://www.plagiarismtoday.com/wp-content/uploads/2008/08/google_logo-test2.jpg" alt="" width="360" height="150" /></center></p>
<p>First, after accessing your Tineye account, you upload the image from your computer to the service.</p>
<p><center><img style=' display: block; margin-right: auto; margin-left: auto;'  class="aligncenter size-full wp-image-1567" title="tineye-search" src="http://www.plagiarismtoday.com/wp-content/uploads/2008/08/tineye-search.png" alt="" width="455" height="45" align="center" /></center></p>
<p>Tineye then converts the image into a fingerprint and begins matching that fingerprint against others in its database, which currently has over 700 million images.</p>
<p>After it is done, Tineye returns the results, starting with the images most similar to the one you submitted, for example, the image to the left. In this case, Tineye found over 3000 matching images, the first one being an exact copy of the image I had used.</p>
<p><center><img title="tineye-screen1" src="http://www.plagiarismtoday.com/wp-content/uploads/2008/08/tineye-screen1.png" alt="Tineye Results" width="220" height="243" /></center></p>
<p>However, the real magic of Tineye is not in its ability to detect images that are identical, but to detect those that are similar, but altered. This includes images that have been resized, cropped, edited or otherwise changed. As long as enough of the original work is left behind for Tineye to understand what it is, it can report the altered version.</p>
<p>As you can see below, in a screen capture from page 23 of the results, that often includes very heavily altered versions of the original work.</p>
<p><center><img style=' display: block; margin-right: auto; margin-left: auto;'  class="aligncenter size-full wp-image-1576" title="tineye-diff2" src="http://www.plagiarismtoday.com/wp-content/uploads/2008/08/tineye-diff2.png" alt="" width="258" height="181" align="center" /></center></p>
<p>In addition to helping you find altered versions of your original image, Tineye also helps you see what was changed. For each image you see, you&#8217;re able to do a comparison where you can flip back and forth between your image and the one on the Web, noting both similarities and differences easily.</p>
<p>Also, from the search results, you can visit the URL the image is located on, making it easy to follow through and, if appropriate, take action against any infringement.</p>
<p>The site also offers a Firefox/IE plugin that allows users to perform Tineye searches from any page on the Web, thus eliminating the need to download the image first.</p>
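<p>Tineye has not said how its fingerprints are built, so the following is not its algorithm. It is a minimal sketch of one well-known stand-in technique, the &#8220;average hash&#8221;, to show how a fingerprint can survive an edit such as brightening. The tiny 4x4 grayscale grids and the function names are my own assumptions.</p>

```python
def average_hash(pixels):
    """Return a list of bits: 1 where a pixel is brighter than the image mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return [1 if value > mean else 0 for value in flat]


def hamming_distance(a, b):
    """Count the positions at which two equal-length fingerprints differ."""
    return sum(x != y for x, y in zip(a, b))


# A tiny grayscale "image": dark left half, bright right half.
original = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]
# The same picture, uniformly brightened by 30.
brightened = [[value + 30 for value in row] for row in original]
# An unrelated picture: dark top half, bright bottom half.
different = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [200, 200, 200, 200],
    [200, 200, 200, 200],
]

# A small distance suggests the same picture; a large one suggests a different picture.
print(hamming_distance(average_hash(original), average_hash(brightened)))
print(hamming_distance(average_hash(original), average_hash(different)))
```

<p>Because each bit only records whether a pixel is brighter than the image&#8217;s own average, the brightened copy produces the same fingerprint as the original, while the unrelated picture differs in half of its bits. A production system would need something far more robust to handle cropping and heavy edits.</p>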
<h4>Why this is Important</h4>
<p>To be fair, Tineye is not the first to attempt and succeed at this kind of matching. Other companies, including both <a href="http://www.digimarc.com/mypicturemarc/">Digimarc</a> and <a href="http://www.picscout.com/home/index.aspx">Picscout</a>, have long offered similar matching services that work without text.</p>
<p>However, Tineye is the first to offer a robust image matching service that is free for everyone (at least as of this writing) and is simple enough to use so that artists can take advantage of it on a whim. There is no watermarking, no technology to apply to your images, just a simple upload and search.</p>
<p>As I see it, this has three potential implications that are both very large and very welcome:</p>
<ol>
<li><strong>Copyright Protection:</strong> The clearest use is for artists to punch their images into the service and receive results, thus enabling them to track down potential infringements of their work. They can then take action to secure removal of the images or request attribution.</li>
<li><strong>Image Tracking:</strong> Some images, including buttons and banners, are put on the Web with the intention of them being shared and passed around. Tineye can track the effectiveness of such a campaign and determine how many sites are displaying the image in question.</li>
<li><strong>Orphan Works Protection:</strong> Assuming that the current orphan works legislation gets passed either as is or with only a few modifications, finding a way to search for visual work is critical. Tineye can do that. If one found a work that they thought might be an orphan, they could run it through Tineye, even scanning it in if necessary, and search for copies of it on the Web, letting them track down the copyright holder. If such a tool were effective, any qualifying search would almost certainly require such an effort be made.</li>
</ol>
<p>In short, Tineye can help bring visual artists up on par with writers in tracking their content and being able to have their work easily searched. For this reason, Tineye has already garnered several big name clients, including the Associated Press, Digg and more.</p>
<h4>Limitations</h4>
<p><img class="picright" title="tineye-size" src="http://www.plagiarismtoday.com/wp-content/uploads/2008/08/tineye-size.png" alt="" width="419" height="70" align="right" />Of course, as with any new service, there are limitations to how effective it is. However, in Tineye&#8217;s case, those limitations appear to only be temporary and should be fixed as the service grows in size and adds features.</p>
<ol>
<li><strong>Limited Index Size:</strong> Currently, the Tineye database is at about 700 million images. While that is an impressive number, one has to remember that Photobucket alone has over 5 billion images according to their numbers. The site does not seem to detect duplications on Photobucket, Flickr or other popular image sharing sites, focusing instead on blogs. Thus, many images that are known to have many copies return no results. Though Tineye has stated that they are growing their database, the number in the index has not moved in the weeks I have been using the service and no indication was given as to when they would start indexing new images.</li>
<li><strong>No Case Tracking:</strong> Currently, with Tineye, there is no way to track cases of plagiarism or copying so that they are not acted upon a second time. Though the site does a respectable job finding duplicate images, it does little to help the artist sort through the mess. The good news is that this is a feature Tineye has expressed a willingness to implement later.</li>
<li><strong>No Alerts System:</strong> Where writers have <a href="http://www.google.com/alerts">Google Alerts</a> and even <a href="http://www.copyalerts.com/">CopyAlerts</a>, there is currently no system in Tineye that will alert artists to new copies of their work being posted. Once again, this is a feature Tineye has expressed an interest in adding later.</li>
</ol>
<p>In short, Tineye is not the system artists have been waiting for today, but it definitely has the potential to be that system in the near future.</p>
<p>If Tineye can continue growing and improving its service, it can easily solve a problem that has had artists struggling to protect their work for well over a decade.</p>
<h4>Conclusion</h4>
<p>Even though Tineye is a great service with tons of potential, in its current format with the existing limitations, it is little more than a preview of what is to come.</p>
<p>Though you should definitely consider registering for the Tineye beta, if for no other reason than to pass along your thoughts to the creators, you should realize that the searches you perform will, for the most part, be ineffective. That will hopefully change soon though.</p>
<p>Tineye, right now, is not intended to be the solution to the problem, but rather, a preview of the solution. So if you want to search for your images and immediately find out who has copied all of your work, Tineye, right now, is not for you.</p>
<p>But if you want to see what might be coming down the pipe, definitely check it out.</p>
<h4>Related Links</h4>
<p><a href="http://arstechnica.com/news.ars/post/20080819-tineye-image-search-helps-ferret-out-copyright-ripoffs.html">Arstechnica</a> &#8211; Another test case<br />
<a href="http://www.inquisitr.com/2466/tineye-tracking-your-images-pixel-by-pixel/">The Inquisitr</a> &#8211; An overview of Tineye<br />
<a href="http://anniebee.posterous.com/thank-you-tineye">Anniebee’s Posterous</a> &#8211; An example where Tineye worked<br />
<a href="http://daily-tech-report.com/2008/08/19/tineye-is-looking-to-become-the-google-of-image-based-searches/">Daily Tech Report</a> &#8211; Another Tineye overview</p>
<h4>Further Discussion</h4>
<ol>
<li>How will you use Tineye?</li>
<li>What features would you like to see added?</li>
<li>How do you think image rippers will respond to this kind of search?</li>
</ol>
<h4>Video</h4>
<p><embed src="http://blip.tv/play/1nG2lGaL_jE" type="application/x-shockwave-flash" width="640" height="390" allowscriptaccess="always" allowfullscreen="true"></embed> </p>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismtoday.com/2008/08/19/tineye-protecting-images-preventing-orphans/feed/</wfw:commentRss>
		<slash:comments>25</slash:comments>
		</item>
		<item>
		<title>Is Flickr Letting Down its Users?</title>
		<link>http://www.plagiarismtoday.com/2008/07/10/is-flickr-letting-down-its-users/</link>
		<comments>http://www.plagiarismtoday.com/2008/07/10/is-flickr-letting-down-its-users/#comments</comments>
		<pubDate>Thu, 10 Jul 2008 17:31:58 +0000</pubDate>
		<dc:creator>Jonathan Bailey</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[Legal Issues]]></category>
		<category><![CDATA[News]]></category>
		<category><![CDATA[api]]></category>
		<category><![CDATA[Content-Theft]]></category>
		<category><![CDATA[Copyright-Infringement]]></category>
		<category><![CDATA[Copyright-Law]]></category>
		<category><![CDATA[developers]]></category>
		<category><![CDATA[Flickr]]></category>
		<category><![CDATA[images]]></category>
		<category><![CDATA[Photography]]></category>
		<category><![CDATA[Plagiarism]]></category>
		<category><![CDATA[RSS scraping]]></category>
		<category><![CDATA[Scraping]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[Spam]]></category>
		<category><![CDATA[Spam-Blogs]]></category>

		<guid isPermaLink="false">http://www.plagiarismtoday.com/?p=1296</guid>
		<description><![CDATA[Photo-sharing site Flickr has come under fire as developers have used its API to violate the rights of its users, seemingly unchecked by Flickr itself. ]]></description>
			<content:encoded><![CDATA[<p><img src="http://www.plagiarismtoday.comwp-content/uploads/2008/07/skitched-20080710-123426.png" alt="Flickr Logo" class="picleft" align="left" /><a href="http://www.jmg-galleries.com/blog/2008/07/07/how-every-flickr-photo-ended-up-on-sale-this-weekend/" title="Flickr API">A recent post by photographer J.M. Goldstein</a> raised a very interesting question about Flickr and its API, namely whether or not Flickr was policing its API well enough and doing an adequate job protecting the rights of photographers and artists that post to the service.</p>
<p>Goldstein took special issue with a series of recent cases where copyright licenses were being ignored by users of the Flickr API, the latest of which involved making all Flickr images, regardless of license terms, available for download as cell phone wallpapers on the site Myxer (the article mistakenly reports the images as being for sale, though the download, according to comment 40, was free).</p>
<p>It is very clear that many services and companies that have used the Flickr API have violated copyright holders&#8217; rights, either intentionally or accidentally, and that this is an ongoing issue as new services come online almost every day.</p>
<p>So what can be done to fix this problem? What responsibilities does Flickr have in this? The answers, unfortunately, are neither simple nor easy.<br />
<span id="more-1296"></span></p>
<h4>The Power of the API</h4>
<p>There is little doubt that Flickr&#8217;s API is a very powerful tool. It allows third parties to build services and tools that access Flickr and use the images there in new and exciting ways. It is behind many of my personal favorite tools, <a href="http://www.plagiarismtoday.com/2008/04/09/photodropper-creative-commons-made-easy/" title="Photodropper">including Photodropper</a>. </p>
<p>Also, most applications that use the API do so in a way that is fair to the rights of the artists that use Flickr. It is, fortunately, only a small minority that do not. This is because the API makes it simple to interpret the licensing of the images and <a href="http://www.flickr.com/services/api/tos/" title="Flickr API TOS">Flickr&#8217;s terms of service for the API</a> require developers to respect user intellectual property. </p>
<p>However, some have not and those cases pose a great deal of risk to photographers. Since the infringers are using the API, much like an RSS scraper, they have the ability to take almost everything on the site and do with it as they please. This includes, theoretically, selling the works, creating new, high-resolution galleries and using the works in advertising or promotion.</p>
<p>This has many photographers worried and, judging from the comments on the original article, at least some are abandoning Flickr due to these issues.</p>
<h4>Flickr&#8217;s Role</h4>
<p><IMG SRC="http://www.plagiarismtoday.com/images/Flickr__Your_Account-20080710-120946.png" alt="Flickr Account Settings" align="right" class="picright">Flickr, for its part, is in a bad position here. Its powerful API is one of the critical reasons that both developers and users enjoy the site as much as they do. Flickr&#8217;s ability to interact with other services has been critical to its success and removing functionality from the API could be very costly to the company.</p>
<p>Despite that, Flickr does have both terms of use that forbid developers from abusing users&#8217; rights and the ability to revoke API keys, thus shutting down services that might be infringing.</p>
<p>However, Flickr has been slow to use this tool against developers, especially those that create products with legitimate uses. This has not stopped Flickr from shutting down some services trying to access the site, <a href="http://www.plagiarismtoday.com/2007/08/14/feelimage-no-longer-displaying-flickr-photos/" title="FeelImage Stops Indexing Flickr">such as it did with the image search engine FeelImage</a> (though FeelImage was not using the API, just a tag search, and has since resumed indexing only CC-licensed material), but such cases usually only take place after a user uproar or if the service is clearly abusive in nature.</p>
<p>The simple truth is that the vast majority of the responsibility is on users to license their photos correctly and developers to respect those licenses.</p>
<p>Flickr, though it is the middle man, has very little it can do in many cases.</p>
<h4>What Flickr Should Do</h4>
<p>This is not to say that the site is immune from all responsibility or criticism in this matter. There are several things the site can and should do to reduce the number of such incidents.</p>
<p>If I were to make suggestions, I would include the following:</p>
<p><OL><LI><strong>Clearer User Licensing Terms:</strong> The image above and to the right is what I see when I log into Flickr&#8217;s privacy options. The options are confusing and overlapping. &#8220;Sharing&#8221; a photo, for example, allows users to embed or &#8220;blog&#8221; a photo, which is yet another option. There is also no clear way to remove an image from the API (Note: You have to disable public searching on &#8220;3rd Party Sites&#8221;) and it is unclear how any of this meshes with Creative Commons Licensing. If this is confusing to me, I can imagine many users feel overwhelmed.</LI><br />
<LI><strong>Quicker Disabling of API Keys:</strong> If a developer is infringing on the rights of Flickr users, their API key needs to be disabled, at least until a fix can be made to their system. Though Flickr is understandably uneasy about banning developers for a coding mistake, they could allow such sites 72 hours to correct the problem before disabling them.</LI><br />
<LI><strong>Licensing Trumps API Permissions:</strong> Under the current system, the API settings in Flickr trump the licensing settings on the photograph. It should either work the other way around or the user should be given the option to decide which is more important. Otherwise, the copyright licensing is fairly meaningless.</LI></OL></p>
<p>As always, I am seeking other suggestions as to what Flickr can do so please feel free to add your ideas in the comments below. Since I am not a heavy Flickr user, I realize my input is limited. </p>
<h4>Conclusions</h4>
<p>In the end, the responsibility to respect licenses will always fall on the developers. As with any API, the developer will have the ability to disregard both the terms of use and the rule of law, but has a duty to respect both their own agreements and the users&#8217; wishes.</p>
<p>While there are steps that Flickr can and should take to reduce this problem, the issue of Flickr-based tools ignoring licensing terms falls squarely on the shoulders of the developers that made them.</p>
<p>If developers do not bear responsibility, legally and ethically, for the works they create, then there is absolutely nothing to stop them from abusing the system even more. They, as well as the users who abuse the tools they create (in some cases), need to be held accountable first and foremost.</p>
<p>Though the frustration with Flickr is understandable and there certainly are some grounds for it, they are not the ones who abused the system nor are they the ones who made the mistake.</p>
<p>They just paved the road for those who did. </p>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismtoday.com/2008/07/10/is-flickr-letting-down-its-users/feed/</wfw:commentRss>
		<slash:comments>18</slash:comments>
		</item>
		<item>
		<title>The Popularity of Plagiarism</title>
		<link>http://www.plagiarismtoday.com/2008/07/02/the-popularity-of-plagiarism/</link>
		<comments>http://www.plagiarismtoday.com/2008/07/02/the-popularity-of-plagiarism/#comments</comments>
		<pubDate>Wed, 02 Jul 2008 15:44:15 +0000</pubDate>
		<dc:creator>Jonathan Bailey</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[Personal Experiences]]></category>
		<category><![CDATA[Content-Theft]]></category>
		<category><![CDATA[Copyright-Infringement]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[google trends]]></category>
		<category><![CDATA[MPAA]]></category>
		<category><![CDATA[plagiarim]]></category>
		<category><![CDATA[RIAA]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[search spam]]></category>
		<category><![CDATA[Search-Engines]]></category>
		<category><![CDATA[Splogging]]></category>
		<category><![CDATA[Splogs]]></category>

		<guid isPermaLink="false">http://www.plagiarismtoday.com/?p=1290</guid>
		<description><![CDATA[Inspired by recent posts, I decided to take a look at Google Trends and see how search terms relative to content theft were doing. ]]></description>
			<content:encoded><![CDATA[<p><img src="http://www.plagiarismtoday.com/wp-content/uploads/2008/07/skitched-20080702-110241.png" alt="Google Trends Logo" align="left" class="picleft"/>A pair of recent articles, <a href="http://www.louisgray.com/live/2008/06/on-web-if-youre-not-growing-youre-dying.html" title="If You're Not Growing You're Dying">one by Louis Gray</a> and <a href="http://codingexperiments.com/archives/149" title="">another by possible248</a> (who co-authors the blog along with, among others, Voyagerfan5761; both are regulars here), showcased public interest in relevant search terms, namely company names and Linux distributions respectively, using <a href="http://trends.google.com/trends?hl=en" title="Google Trends">Google Trends</a>.</p>
<p>This, in turn, inspired me to do my own keyword analysis to gauge if and how public interest in topics relevant to this site has changed over the years. </p>
<p>What I found was surprising and seemed to run counter to what I was seeing with my own traffic but was interesting nonetheless.<br />
<span id="more-1290"></span></p>
<h4>Plagiarism</h4>
<p><img src="http://www.plagiarismtoday.com/wp-content/uploads/2008/07/skitched-20080702-105214.png" alt="Google Trends for Plagiarism"></p>
<p>Perhaps the most obvious keyword and definitely the most common one that leads visitors to this site, this keyword has <a href="http://trends.google.com/trends?q=plagiarism&#038;ctab=0&#038;hl=en&#038;geo=all&#038;date=all&#038;sort=0" title="Google Trends Plagiarism">seen surprisingly little change over the past few years</a>. </p>
<p>Overall, the graph for it is flat with a few &#8220;ticks&#8221; upward when news stories, such as the Obama controversy and the Kaavya Viswanathan scandal, broke. There are also seasonal downward ticks at the end of every year, likely due to the holidays.</p>
<p>In general, it appears that the overall interest in plagiarism, both academically and artistically, has remained consistent and unchanged.</p>
<h4>Content Theft</h4>
<p><img src="http://www.plagiarismtoday.com/wp-content/uploads/2008/07/content-theft-google-trends-20080702-103956.png" alt="Google Trends for Content Theft"></p>
<p>Probably the most unusual graph, <a href="http://trends.google.com/trends?q=content+theft&#038;ctab=0&#038;hl=en&#038;geo=all&#038;date=all&#038;sort=0" title="Content Theft on Google Trends">content theft as a search term</a> spiked in mid-2005, around the time this site was founded, and then leveled off, only to become a regular search term again in recent months.</p>
<p>It is unclear to me what has caused these specific spikes but the latest one seems to be holding and showing some sustained interest in the topic, something that could indicate greater public interest in the issue and in the term itself.</p>
<h4>Copyright</h4>
<p><img src="http://www.plagiarismtoday.com/wp-content/uploads/2008/07/skitched-20080702-105332.png" alt="Google Trends for Copyright"></p>
<p>Copyright, on the other hand, <a href="http://trends.google.com/trends?q=Copyright&#038;ctab=0&#038;hl=en&#038;geo=all&#038;date=all&#038;sort=0" title="Google Trends Copyright">has seen a marked decrease over the past few years</a>, at least as a search term.</p>
<p>While this seems counter-intuitive, considering that stories about copyright, especially as it pertains to the RIAA/MPAA, seem to dominate social news sites, people are clearly not searching for copyright information as much as they used to.</p>
<p>This is reflected even more strongly in the <a href="http://trends.google.com/trends?q=RIAA&#038;ctab=0&#038;hl=en&#038;geo=all&#038;date=all&#038;sort=0" title="Google Trends RIAA">related graphs for the RIAA</a> and <a href="http://trends.google.com/trends?q=DMCA&#038;ctab=0&#038;hl=en&#038;geo=all&#038;date=all&#038;sort=0">the DMCA</a>, where the downward slope is even more pronounced and, in the case of the RIAA, interest seems to almost disappear completely.</p>
<p>Though it doesn&#8217;t appear that people have lost interest in copyright issues, it is clear that they are not searching for them as much as they once were.</p>
<h4>Duplicate Content</h4>
<p><img src="http://www.plagiarismtoday.com/wp-content/uploads/2008/07/skitched-20080702-105447.png" alt="Google Trends for Duplicate Content"></p>
<p>One of the greater concerns people have about plagiarism is the issue of duplicate content. As we can see on the graph above, the term <a href="http://trends.google.com/trends?q=duplicate+content&#038;ctab=0&#038;hl=en&#038;geo=all&#038;date=all" title="Google Trends Duplicate Content">rocketed onto the chart in early 2007</a>, stabilized and seems to be slowly marching upward. </p>
<p>Duplicate content, of course, covers not just plagiarism and scraping but a wide variety of SEO concerns. It is clear that this is a topic being talked about more and more, though it is unclear in what capacity this term is being searched for. </p>
<h4>Plagiarism Detection Tools</h4>
<p><img src="http://www.plagiarismtoday.com/wp-content/uploads/2008/07/skitched-20080702-100727.png" alt="Google Trends for Copyscape"></p>
<p>Looking at the chart for <a href="http://www.copyscape.com">Copyscape</a> (shown above) shows a steady increase in the number of searches over the past year and a half. This seems to mesh with my own experience, which has shown a great increase in interest in content protection over the past 18 months. </p>
<p>Other plagiarism detection tools, such as <a href="http://www.bitscan.com">Bitscan</a> and <a href="http://www.attributor.com">Attributor</a>, did not have enough information for Google Trends to draw any conclusions. Academic plagiarism detection tools, such as Turnitin, <a href="http://trends.google.com/trends?q=Turnitin&#038;ctab=0&#038;hl=en&#038;geo=all&#038;date=all&#038;sort=0" title="Turnitin on Google Trends">have shown a steady increase with seasonal dips as school lets out</a>. </p>
<h4>Long Tail Keywords</h4>
<p>Unfortunately, a lot of the keywords most specific to this site such as &#8220;spam blogs&#8221;, &#8220;splogs&#8221;, &#8220;RSS scraping&#8221;, etc. did not have enough data to produce results. Many of these terms are fairly new, created since I started Plagiarism Today, and are not widely used. </p>
<p>It will be interesting to see in a year or two if these keywords start to register then.</p>
<h4>Caveats</h4>
<p>In doing this &#8220;study&#8221; I realize that Google Trends is both a limited and a largely invalid source of data. Not only is the data proprietary, meaning it cannot be vetted, but the information is relative and contains little hard data. </p>
<p>Also, many of the keywords looked at are not keywords that are searched for by typical searchers and instead would only be searched for by bloggers. Others, however, were likely searched by both. This means that we may not have an accurate picture of how just content creators feel about these issues.</p>
<p>The goal of this check was just to get a quick idea of what was going on and what the potential attitudes were.</p>
<h4>Conclusions</h4>
<p>When I personally look at these charts, I draw three conclusions.</p>
<p>First, I see that there is a sharp decrease in the interest of searchers in the legal aspects of copyright. This could be due to greater understanding about copyright, and thus less need to search about it, or just that users have moved on from the early copyright controversies of the late nineties.</p>
<p>Second, there is a clear, if slow, increase in interest in tracking one&#8217;s own content and the non-legal penalties that come from infringing or being infringed. This could be a sign that creators are not thinking about these issues in the light of a legal paradigm, but rather, in a more practical framework.</p>
<p>Finally, it is clear that the interest in plagiarism, both academically and artistically, remains fairly steady and that it remains an issue of interest even after the scandals fade from the headlines.</p>
<p>Personally, this site has seen explosive growth over the past year, both doubling in traffic and enabling me to leave my day job to work full-time as a consultant. Clearly, things are changing in this area. </p>
<p>I look forward to following these changes closely over the coming years.</p>
<p><strong>Note:</strong> All of the graphs in this post are <a href="http://www.google.com/intl/en/trends/about.html#18" title="Google Trends Terms of Use">used with permission from Google</a>. </p>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismtoday.com/2008/07/02/the-popularity-of-plagiarism/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Finding the Age of a Page</title>
		<link>http://www.plagiarismtoday.com/2008/06/06/finding-the-age-of-a-page/</link>
		<comments>http://www.plagiarismtoday.com/2008/06/06/finding-the-age-of-a-page/#comments</comments>
		<pubDate>Fri, 06 Jun 2008 15:52:16 +0000</pubDate>
		<dc:creator>Jonathan Bailey</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[Products]]></category>
		<category><![CDATA[Content-Theft]]></category>
		<category><![CDATA[Copyright-Infringement]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[google blog search]]></category>
		<category><![CDATA[Plagiarism]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[search spam]]></category>
		<category><![CDATA[Search-Engines]]></category>
		<category><![CDATA[seo]]></category>
		<category><![CDATA[Spam-Blogs]]></category>
		<category><![CDATA[Splogs]]></category>

		<guid isPermaLink="false">http://www.plagiarismtoday.com/?p=1254</guid>
		<description><![CDATA[If you need a quick and easy way to get an idea of when a post went live, there is a Firefox plugin that uses Google to put that information just a click away.]]></description>
			<content:encoded><![CDATA[<p><IMG SRC="http://www.plagiarismtoday.com/images/linkdiagnosis-logo-20080606-104242.png" alt="Link Diagnosis Logo" align="left" class="picleft">One of the more difficult challenges on the Web is determining when a page was created. We simply cannot trust the date and time stamps provided with the content we read as both good guys and bad guys alike <a href="http://www.plagiarismtoday.com/2008/05/27/spam-bloggers-who-backdate/" title="Spam Bloggers who Backdate">change the date of their posts as necessary</a>.</p>
<p>Search engines, however, can provide a much better set of statistics than a site&#8217;s own timestamps. The only issue is that gleaning the needed information can be difficult. Fortunately, a relatively new Firefox plugin entitled <a href="http://www.linkdiagnosis.com" title="Link Diagnosis">Link Diagnosis</a> helps with that by taking the dirty work out of determining when a page was indexed by Google.</p>
<p>The tool, while not perfect, can be a valuable asset when trying to determine approximately when a page appeared on the Web.<br />
<span id="more-1254"></span></p>
<h4>How it Works</h4>
<p><IMG SRC="http://www.plagiarismtoday.com/images/get-page-age-20080606-104402.png" alt="Get Page Age Screenshot" align="right" class="picright">Link Diagnosis is actually a robust plugin designed to analyze incoming links to a URL for SEO purposes. However, as one of its &#8220;hidden features&#8221; it is able to determine, approximately, <a href="http://blog.linkdiagnosis.com/?p=19" title="http://blog.linkdiagnosis.com/?p=19">the day the URL appeared in Google</a>.</p>
<p>Using it is simple: the user right-clicks the page they want to check, selects the &#8220;Get Page Age&#8221; option and, after a few seconds, is greeted with a JavaScript popup containing the date on which the script detected that the site appeared.</p>
<p>It works by using <a href="http://www.googletutor.com/2006/08/22/more-google-hacking-using-the-inurl-operator/" title="Google INURL">Google&#8217;s INURL command</a> which, when used in conjunction with a date filter, causes Google to display a date by each resulting URL. What the plugin does is take the URL you wish to check, create the search query and then automatically extract the applicable date, thus turning a multi-step process into a one-click solution.</p>
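<p>The query-building step can be sketched roughly as follows. This is a minimal Python illustration, not Link Diagnosis&#8217;s actual code: the function name <code>build_page_age_query</code> is hypothetical, and using Google&#8217;s <code>as_qdr</code> URL parameter as the date filter is an assumption, since the exact filter the plugin relies on is not documented here.</p>
<pre>
```python
from urllib.parse import urlencode, urlsplit


def build_page_age_query(page_url, date_window="y15"):
    """Return a Google search URL that pairs inurl: with a date filter.

    date_window is an assumed value for the as_qdr parameter
    (here, results from the past 15 years).
    """
    parts = urlsplit(page_url)
    # The inurl: operator is given the address without its scheme.
    target = parts.netloc + parts.path
    params = {"q": "inurl:" + target, "as_qdr": date_window}
    return "https://www.google.com/search?" + urlencode(params)
```
</pre>
<p>Calling the function with a permalink returns a search address that can be opened in a browser. If the date filter behaves as described above, the matching result should carry a date beside it, which is the value the plugin is said to extract.</p>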
<p>For anyone seeking to find out the date of a site, this could prove to be both a powerful tool and a good time saver as well.</p>
<h4>Why to Use It</h4>
<p>There are many reasons why you might want to check out the age of a particular page. </p>
<p>For one, you can use it to check if a spam blog or a plagiarist was indexed by Google before or after your original post (provided it was indexed at all). This can help determine what action you should take against the site. </p>
<p>However, many will also find its non-repudiation services to be very useful. If there ever is a dispute about who posted an article or an image first, this tool can help resolve it by providing an independent view on which went up first.</p>
<p>Though certainly not as accurate as <a href="http://www.numly.com">Numly</a> or <a href="http://www.myfreecopyright.com">MyFreeCopyright</a>, using Google is far more accurate than looking at the <a href="http://www.archive.org">Web Archive</a>, especially considering that the latter can take over six months to display any information about a URL.</p>
<p>Still, Link Diagnosis is far from perfect in this area. There are many issues one will have if one tries to rely upon this for non-repudiation.</p>
<h4>Limitations</h4>
<p><IMG SRC="http://www.plagiarismtoday.com/images/page-age-capture-20080606-104544.png" alt="Get Page Age Error" align="left" class="picleft">Before you begin to make heavy use of this service bear in mind the following caveats:</p>
<p><OL><LI><strong>Google&#8217;s Limitations:</strong> The biggest issue of using the INURL method is that Google does not always index a site or a page immediately after it goes up. There are often delays. Also, the service can only work with pages already in the Google database; anything that has been blacklisted, either by the creator or by Google, will return no results.</LI><br />
<LI><strong>URLs and Not Content:</strong> The function will tell you when the URL appeared in Google, not the content on the page. For permalinks that may be acceptable but for dynamic pages, such as the front page of Plagiarism Today, it can create a problem.</LI><br />
<LI><strong>Different Owners:</strong> Also, the system detects when a URL was first indexed by Google, not who owned it at the time. If a site changes ownership, even if it is taken out of Google during the transition, the date shown for the home page will belong to the original owner. </LI></OL></p>
<p>In short, the tool is subject to the exact same gaming and manipulation that Google and the other search engines are. As such, it can provide some quick and dirty information, especially on permalinks, but should never be taken as the ultimate gospel on the age of a page.</p>
<p>Link Diagnosis is no substitute for a true non-repudiation service and it does not claim to be.</p>
<h4>Conclusions</h4>
<p>Personally, I find the other features of Link Diagnosis much more compelling than its &#8220;page age&#8221; feature. Though it is great for a quick analysis, especially of a spam blog permalink, it may not always tell the complete truth or have the information you are seeking.</p>
<p>It is a great analysis tool but it should not be assumed to be the plain truth. There are plenty of ways that it could be wrong.</p>
<p>So, as with every tool, be sure to use it in conjunction with common sense and logic. Have it available, use it if needed, but don&#8217;t use it as a replacement for your own judgment.</p>
<p>No tool is that powerful.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismtoday.com/2008/06/06/finding-the-age-of-a-page/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Takedown FAQ</title>
		<link>http://www.plagiarismtoday.com/2008/05/15/takedown-faq/</link>
		<comments>http://www.plagiarismtoday.com/2008/05/15/takedown-faq/#comments</comments>
		<pubDate>Thu, 15 May 2008 16:10:55 +0000</pubDate>
		<dc:creator>Jonathan Bailey</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[DMCA]]></category>
		<category><![CDATA[Legal Issues]]></category>
		<category><![CDATA[Content-Theft]]></category>
		<category><![CDATA[copyright basics]]></category>
		<category><![CDATA[Copyright-Infringement]]></category>
		<category><![CDATA[Hosting]]></category>
		<category><![CDATA[hosts]]></category>
		<category><![CDATA[notice-and-takedown]]></category>
		<category><![CDATA[Plagiarism]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[Search-Engines]]></category>
		<category><![CDATA[Splogging]]></category>
		<category><![CDATA[Splogs]]></category>
		<category><![CDATA[takedown]]></category>

		<guid isPermaLink="false">http://www.plagiarismtoday.com/?p=1051</guid>
		<description><![CDATA[Whenever copyright holders are first introduced to the idea of a DMCA takedown, they inevitably have many questions about it. Here are ten of the more common questions answered. ]]></description>
			<content:encoded><![CDATA[<table align="left" cellspacing=15>
<tr>
<td><a href="http://www.flickr.com/photos/37996580417@N01/2475936101/" title="DMCA painter's van, London, UK.JPG" target="_blank"><img src="http://farm3.static.flickr.com/2143/2475936101_8e5f2651c3_m.jpg" alt="DMCA painter's van, London, UK.JPG" border="0" /></a><br /><small><a href="http://creativecommons.org/licenses/by-sa/2.0/" title="Attribution-ShareAlike License" target="_blank"><img src="http://www.plagiarismtoday.com/wp-content/uploads/2008/05/cc1.png" alt="Creative Commons License" border="0" width="16" height="16" align="absmiddle" /></a> <a href="http://www.photodropper.com/photos/" target="_blank">photo</a> credit: <a href="http://www.flickr.com/photos/37996580417@N01/2475936101/" title="gruntzooki" target="_blank">gruntzooki</a></small></td>
</tr>
</table>
<p>Whenever I work with Webmasters and bloggers to help them file DMCA notices to get their content removed from copycat sites, they inevitably have a lot of questions about the law and how to use it. Though I am not a lawyer, I do my best to answer them.</p>
<p>However, to save time and effort, as well as help those who didn&#8217;t want to ask, I&#8217;ve compiled a collection of FAQs about the process with my answers to them. </p>
<p>Hopefully this FAQ collection will answer most of your questions about the DMCA process and, if it doesn&#8217;t, please feel free to ask your question in the comments below so it can be added.<br />
<span id="more-1051"></span></p>
<h4>What is a DMCA Takedown?</h4>
<p>The Digital Millennium Copyright Act of 1998, among its many parts, granted a &#8220;safe harbor&#8221; to Web hosts and search engines for infringement perpetrated by their customers. This means that hosts cannot be held liable for any copyright infringement that their customers perform so long as they meet certain criteria.</p>
<p>One of the criteria is that they have to &#8220;expeditiously&#8221; remove allegedly infringing material when properly notified. A DMCA notice, also known as a DMCA takedown, is simply a letter that fulfills the requirements of the DMCA and demands the removal of the work either from the search engine or the host.</p>
<p>Hosts, in order to preserve that safe harbor, need to comply with properly-filed DMCA notices.</p>
<h4>Should I file with Search Engines or Hosts?</h4>
<p>There are different schools of thought here. Some feel that, by filing with the search engines and waiting to file with the hosts until the search removal is complete, you can more completely wipe out an infringing site and prevent it from coming back.</p>
<p>However, search engines are slow to respond. Google can take several weeks and has complicated notification requirements. The quickest route is almost always to file directly with the host and have the work removed directly. It is also by far the easiest way, requiring just one notice, as opposed to five or more via the other route.</p>
<p>Still, in most cases the decision is up to the filer. However, if the site is hosted in a country that does not have a takedown provision, search engine removal may be the only option.</p>
<h4>I Am From Another Country, Can I Use the DMCA?</h4>
<p>Yes. The DMCA allows all copyright holders, no matter where they are located, to use the takedown process. The jurisdiction of the law is based upon where the site or search engine is hosted and the vast majority of both are within the U.S.</p>
<p>I have seen cases where a British man used the DMCA against an Australian plagiarist simply because the plagiarist hosted the infringing site in the U.S.</p>
<h4>What if the Site is Hosted in Another Country?</h4>
<p>The procedure for requesting a takedown was created by a WIPO treaty that most countries are signatories to. However, many have not fully implemented the treaty and, as such, have no such procedure.</p>
<p>However, most countries that host a large number of sites, including the whole of the EU and Australia, have a process in place that functions very similarly to the DMCA.</p>
<p>In many cases, sending a DMCA notice will work, even if the host is foreign. However, even hosts in countries without takedown procedures consider copyright infringement to be a violation of their terms of service. Therefore, even in those cases, you can often file an abuse report and secure removal of the site.</p>
<h4>Can they Get the Work Restored?</h4>
<p>One who has a DMCA notice filed against them has two choices for restoring the site. First, they can move to a different host and restore the site that way. Second, they can file what is known as a counter-notice and secure the return of the work. </p>
<p>A counter-notice is much like a DMCA notice but in reverse. Where a DMCA notice claims that the work is infringing, the counter-notice claims that it is not and demands that it be restored. Unless the person who filed the original notice files suit and secures an injunction, the work will be reposted after a waiting period.</p>
<p>Counter-notices, however, are extremely rare, especially if the DMCA notice was clearly justified. Such notices open up the person filing them to a slew of legal problems and, in general, it is easier to just move on. </p>
<p>Most cases where a site is restored involve moving the content to a new host.</p>
<h4>How Many Should I Report?</h4>
<p>There is no set answer to this. You can send as many or as few as you want. Just remember that you can always file another notice later for any items you don&#8217;t include, in the event that the entire site isn&#8217;t taken down.</p>
<p>Still, most DMCA notices include between 5 and 10 items whenever a large list is involved. Some will include far fewer and others will do one item per notice. However, for the most part, it&#8217;s better to find a balance.</p>
<h4>How Long Does it Take for a Response?</h4>
<p>The answer varies. Some hosts will act in less than 24 hours, others will take over a week. The more typical timeframe is between 48 and 96 hours.</p>
<p>However, it is important to note that hosts will typically secure removal of a work and then wait a day or two before sending an email to confirm the takedown. The reason is that they want to ensure that all cached copies of the work are cleared to avoid any confusion about the work still being up.</p>
<h4>Will the Host Shut Down the Whole Site?</h4>
<p>This depends on the circumstance and on the host.</p>
<p>If the site is largely composed of infringing material, such as with a spam blog, the host is likely going to just cancel the whole account. If the infringing work is just one or two items in a larger site, they will likely have the owner of the site take down the specific items or, in some cases, surgically remove the works themselves. </p>
<p>Some hosts, such as Myspace, are known for surgically removing infringing works and almost never shut down an account on a DMCA complaint. Others, such as iPowerWeb, <a href="http://www.plagiarismtoday.com/2005/12/07/ipowerwebcom-the-nuclear-option/" title="iPowerWeb and the DMCA">frequently shut down whole sites</a>.</p>
<p>To improve the odds of a whole site being shut down, include every infringing item you have found in your DMCA notice. If you reported the site once and the works were surgically removed while other infringing items remain, report the other works in a second notice.</p>
<p>Under the DMCA, hosts must have a policy of terminating repeat infringers in order to keep their safe harbor protection.</p>
<h4>Can I Get Into Trouble?</h4>
<p>In order to file a DMCA notice you need to swear under penalty of perjury that you have a &#8220;good faith&#8221; belief that the work is infringing and that you are the copyright holder or an authorized agent.</p>
<p>If you file a knowingly false DMCA notice there are many potential legal consequences, <a href="http://www.plagiarismtoday.com/2007/03/15/michael-crook-case-settled/" title="Michael Crook DMCA Case Settled">some of them very dire</a>. However, there is <a href="http://www.plagiarismtoday.com/2008/02/07/the-dangers-of-the-dmca/" title="">much debate and even conflicting rulings</a> about what constitutes a &#8220;good faith&#8221; belief and where the bar is placed for meeting that test.</p>
<p>Generally speaking, though I am not an attorney, if you stick to cases of clear-cut copyright infringement such as scraping, plagiarism, etc. and avoid cases that raise fair use issues, the risk of trouble is relatively low.</p>
<p>Sadly, even cases where the DMCA notice was clearly false rarely result in as much as a counter-notice due to the legal uncertainties. This has enabled much of the DMCA abuse we see and has contributed to the reputation of the law as being one used to silence critics or stop fair use.</p>
<h4>Is There Anything Else?</h4>
<p>The DMCA is a powerful tool and should be used carefully. Be responsible with your plagiarism fighting and be cooperative with hosts as much as possible.</p>
<p>If you have any questions, feel free to write me either via the <a href="http://www.plagiarismtoday.com/contact-pt/" title="Contact Plagiarism Today">contact form</a> or by sending an email to jonathan at plagiarismtoday dot com.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.plagiarismtoday.com/2008/05/15/takedown-faq/feed/</wfw:commentRss>
		<slash:comments>18</slash:comments>
		</item>
	</channel>
</rss>

