New tool can spot ChatGPT-generated academic science writing with over 99% accuracy

Publicly released:
International
CC-0

US scientists have developed a tool that can identify academic science writing composed by ChatGPT with over 99% accuracy. To start, the team collected 64 'perspective' articles from scientific journals, which provide an overview of specific research topics, and had ChatGPT compose 128 articles on the same topics; these paired texts were used to train the detection model. The team says ChatGPT gives itself away because it's so predictable, using simpler paragraph structures than human authors, more uniform sentence lengths, and a more consistent number of words per paragraph. Vocabulary also differed: humans were more likely to use words such as "however", "but" and "although", while ChatGPT was more likely to use "others" and "researchers". Feeding this information into their tool, the team was able to spot ChatGPT-written articles with 100% accuracy, and individual paragraphs with 92% accuracy.

Media release

From: Cell Press

AI-generated academic science writing can be identified with over 99% accuracy

The debut of artificial intelligence chatbot ChatGPT has set the world abuzz with its ability to churn out human-like text and conversations. Still, many telltale signs can help us distinguish AI chatbots from humans, according to a study published on June 7 in the journal Cell Reports Physical Science. Based on the signs, the researchers developed a tool to identify AI-generated academic science writing with over 99% accuracy.

“We tried hard to create an accessible method so that with little guidance, even high school students could build an AI detector for different types of writing,” says first author Heather Desaire, a professor at the University of Kansas. “There is a need to address AI writing, and people don’t need a computer science degree to contribute to this field.”

“Right now, there are some pretty glaring problems with AI writing," says Desaire. "One of the biggest problems is that it assembles text from many sources and there isn't any kind of accuracy check — it's kind of like the game Two Truths and a Lie."

Although many AI text detectors are available online and perform fairly well, they weren't built specifically for academic writing. To fill the gap, the team aimed to build a tool with better performance precisely for this purpose. They focused on a type of article called perspectives: overviews of specific research topics written by scientists. The team selected 64 perspectives and created 128 ChatGPT-generated articles on the same research topics to train the model. When they compared the articles, they found an indicator of AI writing — predictability.

Contrary to AI, humans have more complex paragraph structures, varying in the number of sentences and total words per paragraph, as well as fluctuating sentence length. Preferences in punctuation marks and vocabulary are also a giveaway. For example, scientists gravitate towards words like "however," "but" and "although," while ChatGPT often uses "others" and "researchers" in writing. The team tallied 20 characteristics for the model to look out for.
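The kinds of characteristics described above — fluctuation in sentence and paragraph length, plus the rate of marker words — are straightforward to compute. The sketch below is a minimal illustration of such stylometric features, not the authors' actual 20 features or their model; the marker-word sets are taken from the examples quoted in this article.

```python
import re
import statistics

# Marker words quoted in the article: human scientists gravitate towards
# contrastive connectives, while ChatGPT leans on "others" and "researchers".
HUMAN_MARKERS = {"however", "but", "although"}
AI_MARKERS = {"others", "researchers"}

def stylometric_features(text: str) -> dict:
    """Compute a few simple per-document features of the kind the study
    describes (illustrative only; not the authors' exact feature set)."""
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s.strip()]
    words = re.findall(r"[a-z']+", text.lower())

    sent_lens = [len(s.split()) for s in sentences]
    para_lens = [len(p.split()) for p in paragraphs]
    return {
        # Variability measures: human writing tends to fluctuate more.
        "sentence_len_stdev": statistics.pstdev(sent_lens) if sent_lens else 0.0,
        "paragraph_len_stdev": statistics.pstdev(para_lens) if para_lens else 0.0,
        # Marker-word rates per 1,000 words.
        "human_marker_rate": 1000 * sum(w in HUMAN_MARKERS for w in words) / max(len(words), 1),
        "ai_marker_rate": 1000 * sum(w in AI_MARKERS for w in words) / max(len(words), 1),
    }
```

Feature vectors like this, computed over a labelled set of human- and AI-written texts, could then be fed to any off-the-shelf classifier — the article does not specify which model the team used.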

When tested, the model achieved 100% accuracy at distinguishing AI-generated full perspective articles from those written by humans. For identifying individual paragraphs within an article, the model had an accuracy rate of 92%. The research team's model also outperformed a commercially available AI text detector by a wide margin on similar tests.

Next, the team plans to determine the scope of the model's applicability. They want to test it on more extensive datasets and across different types of academic science writing. As AI chatbots advance and become more sophisticated, the researchers also want to know whether their model will hold up.

"The first thing people want to know when they hear about the research is 'Can I use this to tell if my students actually wrote their paper?'" says Desaire. While the model is highly skilled at distinguishing between AI and scientists, Desaire says it was not designed to catch AI-generated student essays for educators. However, she notes that people can easily replicate her team's methods to build models for their own purposes.

Multimedia

ChatGPT vs Human

Attachments

Note: Not all attachments are visible to the general public. Research URLs will go live after the embargo ends.

Research: Cell Press, Web page (The URL will go live after the embargo ends)
Journal/conference: Cell Reports Physical Science
Research: Paper
Organisation/s: University of Kansas, USA
Funder: This work was supported by NIH grant R35GM130354 (to H.D.) and by funding from the Madison and Lila Self Graduate Fellowship, University of Kansas (to A.E.C. and M.I.).
Media Contact/s
Contact details are only visible to registered journalists.