AI Checks Professors’ Work: Who Is Hallucinating?

March 19, 2025

This blog post is the work of a humanoid dino baby. If you don’t know what a dinobaby is, you are not missing anything. Ask any 80-year-old, why don’t you?

I read an amusing write up in Nature Magazine, a publication that does not often veer into MAD Magazine territory. The write up “AI Tools Are Spotting Errors in Research Papers: Inside a Growing Movement” has a wild subtitle as well: “Study that hyped the toxicity of black plastic utensils inspires projects that use large language models to check papers.”

Outputs from large language models are known to make up information. I have included references in my writings to Google’s cheese errors and to lawyers submitting court documents with fabricated legal references. The main point of this Nature article is that presumably rock-solid smart software will check the work of college professors, pals in the research industry, and precocious doctoral students laboring for love and not much money.

Interesting, but will hallucinating smart software find mistakes in the work of people like the former president of Stanford University and Harvard’s former ethics star? Well, sure. Peers and co-authors cannot be counted on to do the work and present it without a bit of Photoshop magic or data recycling.

The article reports that there are two efforts underway to get those wily professors to run their “work” or science fiction through systems developed by the Black Spatula Project and YesNoError. The Black Spatula Project emerged from a tweaked study that claimed, “Your black kitchen spatula will kill you.” YesNoError is similar but with a crypto twist. Yep, crypto.

Nature adds:

Both the Black Spatula Project and YesNoError use large language models (LLMs) to spot a range of errors in papers, including ones of fact as well as in calculations, methodology and referencing.

Assertions and claims are good. Black Spatula markets itself with the assurance that its system “is wrong about an error around 10 percent of the time.” The YesNoError crypto wizards “quantified the false positives in only around 100 mathematical errors.” Ah, sure, low error rates.

I loved the last paragraph of this MAD-inspired report:

these efforts could reveal some uncomfortable truths. “Let’s say somebody actually made a really good one of these… in some fields, I think it would be like turning on the light in a room full of cockroaches…”

Hallucinating smart software. Professors who make stuff up. Nature Magazine channeling important developments in research. Hey, has Nature Magazine ever reported bogus research? Has Nature Magazine run its stories through these systems?

Good questions. Might be a good idea.

Stephen E Arnold, March 19, 2025
