The Current State of Detecting AI Writing

Earlier this week, Turnitin issued a press release saying that, since April, the company had reviewed more than 65 million student papers for AI writing.

Of those papers, more than 2 million, or 3.3%, were flagged as containing 80% or more AI-written text, and more than 6 million, or 10.3%, were flagged as containing at least 20% AI-written material.

In a vacuum, this announcement would be huge. Not only is the number of papers examined almost unfathomable, but the high percentage of papers found to have contained AI writing points to a serious and likely growing problem.

But this announcement is not in a vacuum.

Back in April, the Washington Post performed a test of Turnitin’s AI detection. That test found that Turnitin correctly identified only six of the sixteen papers presented to it, and its errors included a false positive on a human-written essay.

Just last month, Turnitin acknowledged the fallibility of its AI detection system, as other computer scientists warned that truly reliable detection of AI writing may never be feasible.

And it’s not just Turnitin facing these challenges. Yesterday, OpenAI shut down its ChatGPT plagiarism detector, citing a “low rate of accuracy.”

Though the company says it has “made a commitment to develop and deploy mechanisms that enable users to understand if audio or visual content is AI-generated,” there are no immediate plans for a replacement.

This means that even the creator of ChatGPT cannot reliably tell if ChatGPT wrote a particular work. 

As such, it’s difficult to take Turnitin’s numbers seriously. The company may have flagged more than 6 million papers as containing significant AI content, but that means little if we can’t be reasonably certain the number is accurate.

How many of those 6 million papers actually contained AI text? How many of the roughly 58 million that weren’t flagged did? We don’t know. We have no way of knowing. The unreliability of AI detection means that statistics about AI usage are impossible to verify.

That, in turn, puts us in a difficult position as we prepare for the 2023 school year. There is functionally no reliable detection of AI works, and there’s no clear solution on the horizon.

Where We’re At Today

Earlier this year, things looked halfway decent on this front. Though generative AIs were clearly a major challenge for schools, AI detection seemed to be working very well.

In a January 2023 post, I even said that AI detection systems “are generally very reliable at detecting AI writing.” That’s because initial evidence seemed to indicate that they were.

However, as additional evidence came in, so did reports of significant false positives, along with proof that these services were not as reliable as they initially appeared.

To be clear, we’re still waiting for larger-scale studies that may give us a clearer picture of the accuracy of AI detection systems. But, as we’ve moved from anecdotal evidence to having some limited data, the findings have not been encouraging.

This has been worsened by a flood of new companies entering the space, many of which are offering wildly inaccurate products but aggressively selling a false sense of confidence in their results.

It’s no wonder that we’ve seen professors accuse large portions of their students of plagiarism on the basis of bad information, and students forced to fight inaccurate allegations.

Right now, this space is fully enveloped in a fog of war. We don’t know whether AI detection is even feasible, how accurate current services are if it is, or whether there is any way to act on existing detections of AI writing.

Needless to say, this is far less than ideal. But there are potential solutions.

Focusing on Prevention, Not Detection

Recently, Tyton Partners performed a survey of some 1,748 instructors and found that “preventing student cheating” had become the most significant “institutional challenge” for teachers. Some 43% of respondents marked it as a concern.

That represents a tremendous jump from the previous year. In that survey, it was marked as a concern by only 15% of respondents and was the tenth-most cited concern. In short, the issue had leapfrogged nine other issues and nearly tripled the share of instructors concerned about it in just one year.

The reason, clearly, is AI, the rise of which has largely happened over the past 6–8 months.  

But for the instructors who are appropriately concerned, there’s simply not much hope coming from the detection side.

Even if a perfect and infallible AI detector were released tomorrow, we would not be in a position to trust it for quite some time. That’s because these tools, much like AI itself, are black boxes, and we would have no way of knowing whether they could be trusted without extensive research.

With a traditional plagiarism analysis, a human examines the automated findings and makes a determination. An AI checker doesn’t afford any opportunity for such an analysis.

While there are things that instructors can do to increase certainty in such detections, most notably using multiple services to confirm results, focusing on the detection side doesn’t make sense.

Instead, the focus needs to be put on prevention. As we discussed in March 2022, there are myriad ways that schools can work to prevent or reduce plagiarism.

Some can be done in the classroom, such as using plagiarism-resistant assignments and offering alternative forms of assessment, while others require institutional pushes, such as offering better student support and having programs to spot and help students who are struggling.

But even small things can help. Having students submit assignments through Google Docs or another platform that tracks version history can show whether a paper was written organically or pasted in wholesale. This provides concrete evidence that can support or refute any suspicion of plagiarism.

One can also ask students suspected of using AI to complete an additional task or assignment that can’t be done with an AI, in order to demonstrate understanding of the material.

But, while such approaches can be used to target specific cases of suspected plagiarism, they are likely better used as general prevention. Students who know that they will face other challenges to their understanding are less likely to try to take shortcuts to that understanding, AI or not.

That, in turn, is likely the best outcome for educators and students alike.

Bottom Line

In the end, the future of AI detection is uncertain. This article is being written in July 2023, and it’s impossible to say what the future holds in this space.

Personally, I still believe that it’s possible and even likely that AI detection will catch up, as plagiarism detection did during the nascent years of the internet. There’s simply too much money and too much interest in solving this problem for me to write it off as impossible right now.

But getting to that point will likely take many years. Even if that prediction is correct, it doesn’t do much to change things for instructors or students in the upcoming school year.

As such, now is the perfect time to put the focus on prevention. While this may mean making major changes to the way we teach and assess students, those changes could have benefits well beyond just reducing cheating.

There’s no doubt that AI is a significant challenge. But, as the cliche goes, challenges are also opportunities, and this is one that educators can seize.

It may feel like cold comfort, but it could lead to better education and better assessment for students down the road. 
