Turnitin Analyzes the Spectrum of Plagiarism


Recently, plagiarism detection service Turnitin performed a survey of some 879 educators in a bid to understand what kinds of plagiarism were the most common in academia and, equally importantly, which were viewed as being the most problematic.

In a recent whitepaper posted to its site, Turnitin reported on the findings of its study and laid out what the educators said along with some analysis of their own.

The results were not shocking, but still provided a great deal of insight into both how educators view and treat plagiarism as well as what students are actually doing.

So what did the survey find? I’ve included some of the highlights below.

What the Survey Says

The survey first attempted to identify the different types of plagiarism that are common in academic settings. Specifically, they broke it apart into 10 categories, defined below in order from the greatest intent to the lowest.

  1. Clone: Verbatim copying without additions/subtractions.
  2. CTRL+C: Largely verbatim copying from a single source with minor changes.
  3. Find-Replace: Verbatim copying with key words/phrases changed, often automatically.
  4. Remix: Paraphrasing content so that it flows seamlessly with other work.
  5. Recycle: Plagiarizing from older works of your own (self-plagiarism).
  6. Hybrid: Combining correctly cited material with non-cited material in the same passage.
  7. Mashup: A mix of copied and original content from various sources without attribution.
  8. 404 Error: Including citations that do not exist or are inaccurate.
  9. Aggregator: Properly cited material that contains little original content.
  10. Re-Tweet: Includes proper citation but uses too much of the original wording, content that should have been quoted but was paraphrased.

The instructors were then asked to rate the various types of plagiarism in terms of both how often they see the plagiarism and how problematic it is (both in terms of discipline and effort required to locate).

The results were pretty straightforward. Clone plagiarism easily took the highest score on both the problematic and the frequency scales. Mashup plagiarism was nearly as common as Clone plagiarism but ranked third in terms of being problematic. CTRL+C, by contrast, ranked second in problematic and third in frequency.

From there, both the frequency and the problematic numbers drop quickly. In terms of frequency, Remix, Recycle, Re-Tweet, Find-Replace, Aggregator, 404 Error and Hybrid rounded out the list. In terms of problematic, Aggregator, Recycle, 404 Error, Find-Replace, Hybrid, Remix and Re-Tweet finished out the list.
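For anyone who wants to line the two rankings up side by side, here is a minimal Python sketch. It assumes the lists above are given from highest to lowest, which the whitepaper summary implies but does not state outright, and it uses only the rank order, not Turnitin's actual scores. The names `FREQUENCY`, `PROBLEMATIC` and `rank_table` are my own, purely for illustration.

```python
# Rank orderings as described in this post.
# Assumption: each list runs from highest to lowest.
FREQUENCY = [
    "Clone", "Mashup", "CTRL+C", "Remix", "Recycle",
    "Re-Tweet", "Find-Replace", "Aggregator", "404 Error", "Hybrid",
]
PROBLEMATIC = [
    "Clone", "CTRL+C", "Mashup", "Aggregator", "Recycle",
    "404 Error", "Find-Replace", "Hybrid", "Remix", "Re-Tweet",
]


def rank_table(frequency, problematic):
    """Return (type, frequency rank, problematic rank, gap) rows, 1-indexed.

    A positive gap means the type is seen more often than its
    "problematic" rank would suggest; a negative gap means the opposite.
    """
    prob_rank = {name: i for i, name in enumerate(problematic, start=1)}
    rows = []
    for freq_rank, name in enumerate(frequency, start=1):
        gap = prob_rank[name] - freq_rank
        rows.append((name, freq_rank, prob_rank[name], gap))
    return rows


if __name__ == "__main__":
    for name, freq, prob, gap in rank_table(FREQUENCY, PROBLEMATIC):
        print(f"{name:<13} frequency #{freq:<2} problematic #{prob:<2} gap {gap:+d}")
```

Under that ordering assumption, the biggest mismatches are Remix and Re-Tweet (common, but ranked near the bottom for being problematic) and Aggregator (less common, but ranked fourth for being problematic).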

From this information, Turnitin wrapped up the report by making three recommendations for educators. Those are:

  1. Intent Matters: The intent of the alleged plagiarist matters and should be weighed when deciding what, if any, disciplinary action should be taken.
  2. Guide Students: Help students avoid unintentional plagiarism and make them aware that professors know what is going on.
  3. Use OriginalityCheck: Finally, they encouraged educators to give students access to their originality reports so they can see their mistakes and correct them.

All in all, the results and the recommendations probably won’t come as a surprise to many, but there are still a few things worth looking at more deeply.

My Thoughts on the Survey

In general, as I said above, I didn’t find too much surprising. Clone plagiarism, as the survey calls it, is definitely both the most problematic and the most common. At the end of the day, most cases of plagiarism are still just matters of copy and paste.

The prevalence of “Mashup” plagiarism was interesting and shows that there are a large number of students who are not simply copying and pasting what they take. This kind of plagiarism lends itself both to cheaters who are trying to circumvent plagiarism detection and to students who simply don’t understand how and when to attribute the works they use. It can be very hard to tell the difference in these cases.

Also interesting was the prevalence of “Recycle” plagiarism (fifth on both lists) as it is thought by many to be a rare phenomenon for a student to simply reuse content they had created previously. This hints at the idea that many educators may want to get more creative with their assignments and focus on crafting tasks that are more plagiarism-resistant.

I also largely agreed with the conclusions of the whitepaper. I’ve written before that the way many schools treat plagiarism as a one-size-fits-all offense is hurting the fight against it.

Also, guiding students and bringing them into the plagiarism checking process are both common-sense ideas to reduce plagiarism, especially accidental plagiarism.

The only issue I have with the report is with many of the names chosen for the types of plagiarism. While I know firsthand the complexities in trying to distinguish between different types of plagiarism and explain those differences to others, some of the names seemed confusing.

For example, “Clone” and “CTRL+C”, to me, mean the same thing since CTRL+C is the command to copy (exact words). Likewise, “Mashup” and “Remix” mean similar things online though, in this study, they cover two very different types of plagiarism. Finally, I don’t know if “404 Error” and “Re-Tweet”, both of which are Web terms, are great analogies for the types of plagiarism they describe (or would be widely understood by many educators).

However, that’s a very minor quibble, as the breakdown of the actual types of plagiarism was still very good and the results very interesting and useful. For what was a very small survey, it answered some tough questions and produced some very interesting data.

Bottom Line

In the end, the study didn’t have many surprises but the few it had were interesting and its breakdown does provide a good jumping-off point for future conversations about plagiarism and academic dishonesty.

In short, it’s an important survey and whitepaper for educators to look at and consider. As school lets out for the year and educators begin to prepare for the next semester, now’s a great time to reconsider the approaches schools take to plagiarism and if, maybe, their policies don’t mesh with the realities of what is going on.


Turnitin Spectrum of Plagiarism Infographic

  1. “Find/Replace” is way more problematic than a 1.2 — it should be more like a 10. This is a pretty disturbing result. Educators think it’s OK to take credit for somebody else’s words if you simply substitute a few of them? Ugh!!!

    •  @PeteForsyth I have to agree with you. However, if I had to offer a justification for the discrepancy, I would say that it’s because find/replace plagiarism shows at least an attempt to paraphrase in some people’s minds. The difference between find/replace plagiarism and CTRL+C is shades of gray in many ways. I have to wonder if educators were looking at extremely mild cases of Find/Replace when making their judgments.

      •  @plagiarismtoday It reflects an attempt to do *something* — but is that something adding value, or is it obfuscating the source? Maybe a grey area exists, but from the “find/replace” description I see much more black than white in what’s being identified.
        If the change is mechanical, that pretty much means that there is no original thought being added, by definition. If value is being added, I’d say it’s something different from “find/replace.”

  2. We are interested in using this white paper in a tutorial about plagiarism, but I’m concerned about the lack of attribution. Have you seen anything about who actually conducted this study? I don’t like using an anonymous source from a commercial website.

    • Turnitin conducted the survey, as per the footer on the full infographic. They used their own data to come up with the classifications. So this is an internal project for them completely.