Update: Copyscape Drastically Improved

Jonathan Bailey, September 18, 2007

Since my previous review of Copyscape Premium, I’ve been communicating with Gideon Greenspan, one of the co-founders of Copyscape, about the issues I experienced with the service. He, and the others who work on Copyscape, have been very interested in my results and in improving their service.

They repeated my tests with roughly the same results and discovered that there is an issue with the Google API, which they use as their backend, that was limiting its usefulness at detecting plagiarism in works that had been lifted many different times.

I received an email from Greenspan yesterday telling me that they’ve made adjustments on their end and they have greatly improved Copyscape’s ability to detect widely plagiarized works.

Eager to see if it worked, I opened up my Copyscape Premium account and decided to give it another try. The differences were immediately clear.

A Definite Change

To retest Copyscape Premium, I decided to try the service again with two of the poems I used in my original experiment.

In my first run, neither poem produced any results, and subsequent tests turned up only one or two copies. This is despite the fact that both works are widely available on the Web, both as legitimately licensed copies and as plagiarized works.

When I analyzed the first poem, using the print page to avoid scanning comments (at Greenspan’s suggestion), Copyscape picked up ten results. The second one, using the same method, picked up fifteen.

To call the difference noticeable would be an understatement. Copyscape had gone from being completely ineffective in my experiments to producing some very powerful results.

[Screenshots: Copyscape results from the first run and the second run]

Limitations

As great as the improvements to Copyscape are, the results do have a few caveats.

First, a text search for one of the poems produced only four results. That is still far more than before, but far fewer than a scan of the printable page turned up. The reasons for this are unclear.

The second, and largest, caveat is that the results were still only a fraction of the actual copies available. When I did a Google search for the two poems, the first produced 25 results and the second produced 36.

All in all, that leaves over half of all the copies of the test works undetected by Copyscape.
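
For readers who want to check the arithmetic, here is a minimal sketch using the counts reported above. It assumes the Google result counts represent every copy available and that every Copyscape result is among them. Neither assumption was verified in the test, so treat the percentages as rough.

```python
# Counts reported in this article: (Copyscape results, Google results).
# Assumption: Google's count is the full set of copies for each poem.
counts = {"first poem": (10, 25), "second poem": (15, 36)}


def undetected_share(found, total):
    """Fraction of known copies that Copyscape did not report."""
    return (total - found) / total


for name, (found, total) in counts.items():
    print(f"{name}: {undetected_share(found, total):.0%} undetected")

# Combine both poems for an overall figure.
total_found = sum(found for found, _ in counts.values())
total_known = sum(total for _, total in counts.values())
overall = undetected_share(total_found, total_known)
print(f"overall: {overall:.0%} undetected")
```

Under those assumptions, roughly three out of every five copies went undetected for each poem and overall.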

However, according to Greenspan, the fact that my works were so heavily plagiarized and copied was, and still is, part of the problem. Google tends to put duplicates of a work in its duplicate content filter, which in turn seems to hide them from the Google API, so works that are copied less often or are paired with other content may fare much better.

Unfortunately, this appears to be an issue on Google’s end and is not something Copyscape can easily correct.

Conclusions

It is clear that the steps Copyscape took to tweak its service had a drastic impact. Though it is still somewhat limited in how it handles widely plagiarized works, it may be effective enough for many to consider using.

After all, even though Copyscape does not seem to detect every single use of a work, it does offer greater convenience and tools to help monitor and track plagiarism cases. Whether this trade-off is worth the cost of the premium service is a decision best left to individual Webmasters.

Personally, I’m going to continue to watch the service and test it. Greenspan has said there may be future adjustments that could improve it even more. For me, Copyscape is reaching the tipping point where the benefits start to outweigh the drawbacks. However, I still have to hold back from giving an unrestrained recommendation since so many copies went undetected.

The best thing to do right now is to consider the content you post and the nature of the plagiarism you’ve faced, then see if Copyscape Premium is a good fit.

It may not be right for everyone, but there are now many people for whom it may be.