Copyscape: Not Ready for Prime Time

By Jonathan Bailey • Jun 28th, 2005 • Category: Products

As much as I love seeing people take a stand against Internet plagiarism and work on new ways to use technology to fight the fight, Copyscape isn’t the “invaluable tool” its testimonials claim. Rather, it’s just another pay product that can’t outstrip what clever Webmasters can already do for free.

To test out the site, I first entered my other homepage, ravensrants.com, into the search box and pressed enter. After churning for a second, it spit back ten results, the most that one can get for free, supposedly alerting me to plagiarism of my home page.

However, out of the top ten results, none were actual incidents of plagiarism. Most were search engines and directories that sample my intro text for the site description, a completely legal use. Apparently, in order to see any incidents of plagiarism, I’d have to pay the ten-dollar monthly fee.

As frustrating as that was, things were about to get worse.

I decided that, maybe, searching for the home page was a bit unfair. It is widely linked and quoted. So I decided to search for a poem and, specifically, one that I knew plagiarism existed for.

The poem I tried was “Home is Not My Own”. I pasted the address into Copyscape and it returned nothing. I tried again with the old link, before the comments feature was added, thinking maybe the extra text was throwing it off. Once again, nothing.

I then did a simple Google Search for the line “but I must leave so I can continue to live”, which is a line in the poem itself, and turned up the incident I knew about.

I then decided to run one of my longer pieces through it thinking that my poems were too short for Copyscape to use effectively. For the final experiment, I chose the RavenSpeak “No ‘Fat Chicks’ Allowed”.

I entered the link into Copyscape and it turned up the one incident of plagiarism. I then went to Google and entered the quote “all you have to do is go to a movie, visit a store or simply stick your nose in a magazine” and it turned up the exact same, single result.

In the end, Copyscape missed plagiarism incidents I knew about, cluttered up limited results with search engines and other legal uses and failed, even under ideal circumstances, to turn up anything new. Since Copyscape uses Google to find its results, it can’t do anything that Google and Google Alerts can’t do already and it won’t find any new incidents. The only thing it offers is the convenience of letting you search without looking for a good phrase to search for.

If you have longer works that Copyscape can find meat on, it might be a useful addition to your search scheme. However, be prepared for lots of false positives and to spend a lot of time wading through them. For most, Google searches and Google Alerts are still the best. You’ll get fewer false positives with a clever phrase than with a blind Copyscape search.
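For anyone who wants to take some of the guesswork out of finding a "clever phrase", here is a minimal sketch in Python. It is only an illustration: the function names, the tiny stopword list and the scoring rule (favoring windows with longer, less common-looking words) are my own assumptions, not how Copyscape or Google actually pick phrases. It slides a fixed-size window over the text, ranks the windows by that crude score and builds a quoted, exact-phrase Google search URL for each candidate.

```python
import re
from urllib.parse import quote_plus

# A small illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {
    "a", "an", "the", "and", "or", "but", "so", "to", "of", "in", "on",
    "is", "it", "i", "you", "can", "all", "do", "go", "my", "your",
    "have", "that", "at", "for", "with",
}


def candidate_phrases(text, n_words=8, top=3):
    """Return up to `top` windows of `n_words` words, most distinctive first.

    "Distinctive" here is a crude heuristic: the total length of the
    non-stopwords in the window. Overlapping windows are not filtered out.
    """
    words = re.findall(r"[A-Za-z']+", text)
    scored = []
    for start in range(max(len(words) - n_words + 1, 0)):
        window = words[start:start + n_words]
        score = sum(len(w) for w in window if w.lower() not in STOPWORDS)
        scored.append((score, start, " ".join(window)))
    # Highest score first; earlier position breaks ties.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [phrase for _, _, phrase in scored[:top]]


def exact_phrase_url(phrase):
    """Build a Google search URL that wraps the phrase in double quotes."""
    return "https://www.google.com/search?q=" + quote_plus(f'"{phrase}"')


if __name__ == "__main__":
    sample = ("all you have to do is go to a movie, visit a store "
              "or simply stick your nose in a magazine")
    for phrase in candidate_phrases(sample, n_words=6, top=3):
        print(exact_phrase_url(phrase))
```

The results still need a human eye: a heuristic like this can land on a phrase that is long but common, so it is best treated as a way to shortlist lines to search for rather than a replacement for picking one by hand.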

The one thing that Copyscape is doing well is bringing attention to the issue of plagiarism online through the media and public attention. They’re also launching a banner/button campaign to help spread the word over the Web. These are all great things. It’s just a shame that the technology doesn’t live up to the promise.

Besides, paying ten dollars or more a month for automatic searches and unlimited results doesn’t make a lot of sense, not when Google does it for free already.

Short URL to this Post: http://copybyte.com/z/j8

Jonathan Bailey is The Webmaster and author of Plagiarism Today, which he founded in 2005 as a way to help Webmasters going through content theft problems get accurate information and stay up to date on the rapidly-changing field. He is also a consultant to Webmasters and companies to help them devise practical content protection strategies and develop good copyright policies.

  • Jest: You might want to see some of the updates to this article. The latest one is here: http://www.plagiarismtoday.com/2007/10/02/copys...

    They've made a lot of improvements to their service and much of this article is really out of date. It was written in 2005 after all.

    Hope that helps!
  • Nothing new here, besides that Copyscape USES the Google API, so they are very dependent on it. Plus, they don't do much beyond what you have done: take random rare excerpts and run them through Google.

    The reason nothing was returned on your side is that their algorithm didn't pick that part of the poem as a rare combination phrase. You didn't discover anything new. Plus, the people stealing content already know about this and are using it as an asset.
  • Thanks for the interesting review of CopyScape. It's interesting that CopyScape isn't returning the same results that Google is, since Google offers an API that allows Web developers to access its search results directly. This API is only made available under a special license to commercial developers, though, so perhaps the functionality is limited in those cases.

    Richard Hamming gave an interesting speech about how scientists can turn problems into research opportunities:

    http://www.paulgraham.com/hamming.html

    Maybe the problem of plagiarism is also a research opportunity for people like us, research to help curb a problem that is both immoral and illegal.