Detecting AI-Generated Research Increasingly Difficult for Scientific Journals
June 12, 2024
This essay is the work of a dinobaby. Unlike some folks, no smart software improved my native ineptness.
Reputable scientific journals would like to publish only papers written by humans, but they are finding it harder and harder to enforce that standard. Researchers at the University of Chicago Medical Center examined the issue, and their results are summarized in “Detecting Machine-Written Content in Scientific Articles,” published at Medical Xpress. The study itself was published in the Journal of Clinical Oncology Clinical Cancer Informatics on June 1. We presume it was written by humans.
The team used commercial AI detectors to evaluate over 15,000 oncology abstracts from 2021-2023. We learn:
“They found that there were approximately twice as many abstracts characterized as containing AI content in 2023 as compared to 2021 and 2022—indicating a clear signal that researchers are utilizing AI tools in scientific writing. Interestingly, the content detectors were much better at distinguishing text generated by older versions of AI chatbots from human-written text, but were less accurate in identifying text from the newer, more accurate AI models or mixtures of human-written and AI-generated text.”
Yes, that tracks. We wonder if it is even harder to detect AI-generated research that is, hypothetically, run through two or three different smart rewrite systems. Oh, who would do that? Maybe the former president of Stanford University?
The researchers predict:
“As the use of AI in scientific writing will likely increase with the development of more effective AI language models in the coming years, Howard and colleagues warn that it is important that safeguards are instituted to ensure only factually accurate information is included in scientific work, given the propensity of AI models to write plausible but incorrect statements. They also concluded that although AI content detectors will never reach perfect accuracy, they could be used as a screening tool to indicate that the presented content requires additional scrutiny from reviewers, but should not be used as the sole means to assess AI content in scientific writing.”
That makes sense, we suppose. Humans are not perfect at spotting AI text either, though there are ways to train oneself. Perhaps if journals combine savvy humans with detection software, they can catch most AI submissions. At least until the next generation of ChatGPT comes out.
Cynthia Murrell, June 12, 2024