Plagium Introduces Deep Search

Earlier this week, Plagium announced its new “Deep Search” feature, which it hopes will make it easier to spot duplicates and more subtle plagiarisms/copies in longer works.

The new feature works by separating a longer work into multiple sections, each roughly a paragraph in length, locating duplicated content within each section and displaying the matching content contained within each detected page.

The idea is to make it easier to go through longer documents and to more quickly understand which copies are the most important, what content they are using and how much matching material there is.

The question, however, is how well the system works and whether it is worth the money that Plagium is charging. To find out, I ran Plagium through a series of quick tests to see how well it performed.

How Plagium Deep Search Works

Previously I talked about Plagium and compared it favorably to Copyscape and other, similar plagiarism checkers for the purpose of finding plagiarisms and other copies of your work online. I even mentioned it in a case study showing how it was useful in catching a plagiarizing poet.

However, one of my gripes about Plagium was that it has always been difficult to parse the results. Plagium has always provided good information about the infringing pages, but not necessarily about what was being copied. This was especially problematic with longer documents where the copied text might be buried deep within the page.

Plagium’s Deep Search attempts to fix that. By breaking lengthy documents into sections and showing match results for each part of the document, it makes it easy to get both a general overview of the entire document, via its “summary” feature, and results for each individual part.
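To make the sectioning idea concrete, here is a minimal sketch of how a long work could be broken into roughly paragraph-length sections before each one is searched. This is not Plagium's code, which isn't public; the function name split_into_sections, the max_chars cap and the sample text are all assumptions made purely for illustration.

```python
import re


def split_into_sections(text: str, max_chars: int = 1000) -> list[str]:
    """Split text into roughly paragraph-length sections.

    Hypothetical helper: paragraphs are separated by blank lines, and any
    paragraph longer than max_chars is cut into max_chars-sized pieces.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    sections = []
    for paragraph in paragraphs:
        for start in range(0, len(paragraph), max_chars):
            sections.append(paragraph[start:start + max_chars])
    return sections


# Made-up sample document with three short paragraphs.
document = "First paragraph of the work.\n\nSecond paragraph.\n\nThird."
for number, section in enumerate(split_into_sections(document), start=1):
    print(number, section)
```

Each section could then be searched on its own, which is what would allow results to be reported section by section rather than only for the page as a whole.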

This deep searching, however, comes at a cost. Unlike Plagium’s “Quick Search” feature, which is the equivalent of its previous service, deep searching is not free. Deep searches from Plagium cost $1 for 100,000 characters (approximately 20,000 words), $2 for 200,000 characters (approximately 40,000 words) and $10 for 1,100,000 characters (approximately 220,000 words).
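For anyone who wants to check the arithmetic on those tiers, the short sketch below works out the approximate word count and the characters bought per dollar. The prices and character counts come from the figures above; the five-characters-per-word ratio is simply what those figures imply, and the names TIERS and describe_tier are my own.

```python
# (price in USD, characters included), as quoted in the pricing above.
TIERS = [(1, 100_000), (2, 200_000), (10, 1_100_000)]

# Implied by the quoted figures: 100,000 characters is about 20,000 words.
CHARS_PER_WORD = 5


def describe_tier(price_usd: int, characters: int) -> dict:
    """Return the approximate word count and characters per dollar for a tier."""
    return {
        "price_usd": price_usd,
        "characters": characters,
        "approx_words": characters // CHARS_PER_WORD,
        "chars_per_dollar": characters // price_usd,
    }


for price_usd, characters in TIERS:
    print(describe_tier(price_usd, characters))
```

The $1 and $2 tiers both work out to 100,000 characters per dollar, while the $10 tier works out to 110,000, so the largest package is about 10 percent cheaper per character.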

So is Deep Search worth the money? I put the process through a few tests to find out.

Usability and Interface

Right off the bat, there were a few things that annoyed me about Plagium’s Deep Search feature. First and foremost was that these searches were far from quick.

Even for medium-length documents, these searches routinely took longer than 40 seconds. While that might not seem long, bear in mind that other services, including Plagium’s Quick Search tool, usually take less than four seconds. Basically, if you perform a Deep Search, be prepared to wait.

The interface itself was functional but not exactly attractive or impressive. The results are broken up into a summary and a detailed report. The first provides just an overview of the pages detected and the latter does the section-by-section breakdown.

One useful feature is the ability to delete sites from the results. This is useful both for excluding domains that you aren’t interested in, such as a permitted use or even your own site, and for removing sites you’ve already processed.

Still, it’s frustrating that there was no way to do a full side-by-side comparison of the original and the duplicate. Though hovering your mouse over each result in the detailed report would show you the matching text in that section, getting a complete view of the suspected copying is something that’s impossible with Plagium but easy with Copyscape.

That being said, I do like the way Plagium breaks down the similarities by words, sentences and highest search engine rank. It makes it very easy to get an “at a glance” understanding of how serious the copying really is and lets you prioritize matches easily.

However, all of these features are meaningless without good matching. To find out how well Plagium’s Deep Search performed, I ran it through a series of tests designed to compare it to similar services.

Matching Tests

To better understand how well Plagium’s Deep Search tool did at detecting plagiarism, I decided to do several side-by-side tests comparing it against both its free offering and Copyscape’s Premium offering.

The results are below:

Client Page 1

First I decided to run an old prose work of mine that had relatively limited copying to see how well the various engines did when dealing with older works in a more traditional format.

Plagium Deep Scan | Plagium Quick Scan | Copyscape Premium
5 | 2 | 1

Press Release

Second, I tested a recent press release by the Copyright Clearance Center to see how well it detected copying that had taken place very recently.

Plagium Deep Scan | Plagium Quick Scan | Copyscape Premium
95 | 20 | 25

Poem

Finally, I tested an old poem of mine that I knew had widespread copying and reuse, both legitimate and illegitimate, to determine how well it handled poetry and works without traditional paragraph breaks.

Plagium Deep Scan | Plagium Quick Scan | Copyscape Premium
15 | 11 | 12

In every case, Plagium Deep Search came out on top, finding more results. However, going through the results, especially in the second case, I found that many of them were false positives, sharing fewer than 50 words. With no way to set the threshold for what is considered a match, I would have had to discard about half of the results as not being actual matches.
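Since Plagium offers no such setting, the weeding out has to happen by hand or after the fact. As a rough sketch of what a threshold would do, the snippet below drops any result sharing fewer than 50 words. The Match class, the filter_matches function and the sample results are all hypothetical; Plagium does not expose its results in this form as far as I know.

```python
from dataclasses import dataclass


@dataclass
class Match:
    """Hypothetical record for one detected page."""
    url: str
    shared_words: int


def filter_matches(matches: list[Match], min_shared_words: int = 50) -> list[Match]:
    """Keep only the matches that share at least min_shared_words words."""
    return [m for m in matches if m.shared_words >= min_shared_words]


# Made-up sample results, for illustration only.
results = [
    Match("https://example.com/a", 320),
    Match("https://example.com/b", 12),
    Match("https://example.com/c", 75),
]

for match in filter_matches(results):
    print(match.url, match.shared_words)
```

Raising or lowering min_shared_words trades missed copies against false positives, which is exactly the control that is missing from the current interface.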

That being said, even with the false positives removed, Plagium Deep Search outperformed both its free offering and Copyscape in finding matches. This is likely due to the fact that Plagium Deep Search seems to poll a wider range of sources, including Yahoo! News, Bing and Bing News. Though Plagium also polls Yahoo! search, that is now powered by Bing, making that search of limited usefulness.

One minor issue I did have with Plagium’s match detection was that, in its detailed report, perfect matches often had gaps in the highlighting, indicating that parts of the match were not detected. This didn’t seem to affect the overall accuracy in terms of finding pages, but it could limit Plagium’s usefulness for certain types of plagiarism analyses where greater precision is needed.

All in all, from a matches-found perspective, Plagium seems to have a very compelling product on its hands, and one that others may wish to start making broader use of.

Bottom Line

Despite some interface issues, Plagium’s Deep Search tool is a pretty compelling service offering and the $10 account for 1.1 million characters will likely last most searchers a full year, which is as long as the credits are good for.

Even with all of the tests that I did, I have only gone through about 50,000 characters on my account.

Personally, I’ll be using the Deep Search tool in lieu of the free offering, which I was already using to supplement Copyscape, Google Alerts and other search tools.

However, for most users, I would recommend testing your search with the free offering before deciding if there is any cause to use the Deep Search. If you find no results on the free offering, there isn’t any reason to spend the money, even if it is only a dollar or two.

That being said, if you do find a cause to dig deeper, you’ll likely find Plagium Deep Search to be well worth the cost.

Disclosure: I was given 1.1 million characters of deep searches for free for the purpose of performing this review. This is valued at $10.
