AI Checks Professors' Work: Who Is Hallucinating?

March 19, 2025

This blog post is the work of a humanoid dino baby. If you don’t know what a dinobaby is, you are not missing anything. Ask any 80-year-old, why don’t you?

I read an amusing write up in Nature Magazine, a publication that does not often veer into MAD Magazine territory. The write up “AI Tools Are Spotting Errors in Research Papers: Inside a Growing Movement” has a wild subtitle as well: “Study that hyped the toxicity of black plastic utensils inspires projects that use large language models to check papers.”

Some have found that outputs from large language models often make up information. I have included references in my writings to Google’s cheese errors and to lawyers submitting court documents with fabricated legal references. The main point of this Nature article is that presumably rock-solid smart software will check the work of college professors, pals in the research industry, and precocious doctoral students laboring for love and not much money.

Interesting, but will hallucinating smart software find mistakes in the work of people like the former president of Stanford University and Harvard’s former ethics star? Well, sure, peers and co-authors cannot be counted on to do work and present it without a bit of Photoshop magic or data recycling.

The article reports that there are two efforts underway to get those wily professors to run their “work” or science fiction through systems developed by Black Spatula and YesNoError. The Black Spatula project emerged from tweaked research that said, “Your black kitchen spatula will kill you.” YesNoError is similar but with a crypto twist. Yep, crypto.

Nature adds:

Both the Black Spatula Project and YesNoError use large language models (LLMs) to spot a range of errors in papers, including ones of fact as well as in calculations, methodology and referencing.

Assertions and claims are good. Black Spatula markets with the assurance its system “is wrong about an error around 10 percent of the time.” The YesNoError crypto wizards “quantified the false positives in only around 100 mathematical errors.” Ah, sure, low error rates.
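Neither outfit publishes its pipeline, but the basic loop Nature describes — feed each claim from a paper to a model, collect the flags — can be sketched in a few lines. To be clear, `call_llm` below is a stub standing in for whatever model API these projects actually use; the names and prompts are illustrative, not theirs.

```python
# A toy sketch of an LLM-based paper checker of the kind described above.
# `call_llm` is a stand-in for a real model API; it is stubbed with a
# trivial rule here so the example runs on its own.

def call_llm(prompt: str) -> str:
    # Stub: flag any claim containing an obviously wrong arithmetic fact.
    return "ERROR" if "2 + 2 = 5" in prompt else "OK"

def check_paper(claims: list[str]) -> list[str]:
    """Return the claims the 'model' flags as suspect."""
    flagged = []
    for claim in claims:
        verdict = call_llm(f"Check this claim for errors: {claim}")
        if verdict == "ERROR":
            flagged.append(claim)
    return flagged

claims = [
    "Black plastic utensils contain flame retardants.",
    "Therefore 2 + 2 = 5 grams per kilogram.",
]
print(check_paper(claims))  # → ['Therefore 2 + 2 = 5 grams per kilogram.']
```

Note the catch the article itself concedes: with a checker that is wrong roughly 10 percent of the time, every flag still needs a human to confirm it.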

I loved the last paragraph of the MAD inspired effort and report:

these efforts could reveal some uncomfortable truths. “Let’s say somebody actually made a really good one of these… in some fields, I think it would be like turning on the light in a room full of cockroaches…”

Hallucinating smart software. Professors who make stuff up. Nature Magazine channeling important developments in research. Hey, has Nature Magazine ever reported bogus research? Has Nature Magazine run its stories through these systems?

Good question. Might be a good idea.

Stephen E Arnold, March 19, 2025

An Econ Paper Designed to Make Most People Complacent about AI

March 19, 2025

Yep, another dinobaby original.

I zipped through — and I mean zipped — a 60-page working paper called “Artificial Intelligence and the Labor Market.” I have to be upfront. I detested economics, and I still do. I used to take notes when Econ Talk actually discussed economics. My notes were points that struck me as wildly unjustifiable. That podcast has changed. My view of economics has not. At 80 years of age, do you believe that I will adopt a different analytical stance? Wow, I hope not. You may have to take care of your parents someday and learn that certain types of discourse do not compute.

This paper has multiple authors. In my experience, the more authors, the more complicated the language. Here’s an example:

“Labor demand decreases in the average exposure of workers’ tasks to AI technologies; second, holding the average exposure constant, labor demand increases in the dispersion of task exposures to AI, as workers shift effort to tasks that are not displaced by AI.”

The idea is that the impact of smart software will not affect workers equally. As AI gets better at jobs humans do, humans will learn more and get a better job or integrate AI into their work. In some jobs, the humans are going to be out of luck. The good news is that these people can take other jobs or maybe start their own business.
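For readers who prefer the quoted sentence in symbols: write labor demand as $L(\mu, \sigma)$, where $\mu$ is the average exposure of a worker’s tasks to AI and $\sigma$ is the dispersion of those exposures. The notation is mine, not necessarily the paper’s, but the claim amounts to:

```latex
% Restating the quoted claim in assumed notation (L, mu, sigma are mine):
\[
  \frac{\partial L}{\partial \mu} < 0
  \qquad \text{and} \qquad
  \left. \frac{\partial L}{\partial \sigma} \right|_{\mu\ \text{fixed}} > 0 .
\]
% Demand falls as average exposure rises; holding the average fixed,
% demand rises with dispersion, as workers shift effort to the tasks
% AI does not displace.
```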

The problem with the document I reviewed is that there are several fundamental “facts of life” that make the paper look a bit wobbly.

First, the minute it is cheaper for smart software to do a job that a human does, the human gets terminated. Software does not require touchy-feely interactions, vacations, pay raises, and health care. Software can work as long as the plumbing is working. Humans sleep, which is not productive from an employer’s point of view.

Second, government policies won’t work. Why? Government bureaucracies are reactive. By the time a policy arrives, the trend or the smart software revolution is off to the races. One cannot put spilled radioactive waste back into its containment vessel quickly, easily, or cheaply. How’s that Fukushima remediation going?

Third, the reskilling idea is baloney. Most people are not skilled in reskilling themselves. Lifelong learning is not a core capability of most people. Sure, in theory anyone can learn. The problem is that most people are happier planning a vacation, doom scrolling, or watching TikTok-type videos. Figuring out how to make use of smart software capabilities is not as popular as watching the Super Bowl.

Net net: The AI services are getting better. That means that most people will be faced with a re-employment challenge. I don’t think LinkedIn posts will do the job.

Stephen E Arnold, March 19, 2025

AI: Meh.

March 19, 2025

It seems consumers can see right through the AI hype. TechRadar reports, “New Survey Suggests the Vast Majority of iPhone and Samsung Galaxy Users Find AI Useless—and I’m Not Surprised.” Both Apple and Samsung have been pushing AI onto their users. But, according to a recent survey, 73% of iPhone users and 87% of Galaxy users respond to the innovations with a resounding “meh.” Even more would refuse to pay for continued access to the AI tools. Furthermore, very few would switch platforms to get better AI features: 16.8% of iPhone users and 9.7% of Galaxy users. In fact, notes writer Jamie Richards, fewer than half of users report even trying the AI features. He writes:

“I have some theories about what could be driving this apathy. The first centers on ethical concerns about AI. It’s no secret that AI is an environmental catastrophe in motion, consuming massive amounts of water and emitting huge levels of CO2, so greener folks may opt to give it a miss. There’s also the issue of AI and human creativity – TechRadar’s Editorial Associate Rowan Davies recently wrote of a nascent ‘cultural genocide‘ as a result of generative AI, which I think is a compelling reason to avoid it. … Ultimately, though, I think AI just isn’t interesting to the everyday person. Even as someone who’s making a career of being excited about phones, I’ve yet to see an AI feature announced that doesn’t look like a chore to use or an overbearing generative tool. I don’t use any AI features day-to-day, and as such I don’t expect much more excitement from the general public.”

No, neither do we. If only investors would catch on. The research was performed by phone-reselling marketplace SellCell, which surveyed over 2,000 smartphone users.

Cynthia Murrell, March 19, 2025

What Sells Books? Publicity, Sizzle, and Mouth-Watering Titbits

March 18, 2025

Another dinobaby blog post.

Editor note: This post was written on March 13, 2025. Availability of the articles and the book cited may change when this appears in Mr. Arnold’s public blog.

I have heard that books are making a comeback. In rural Kentucky, where I labor in an underground nook, books are good for getting a fire started. The closest bookstore is filled with toys and odd stuff one places on a desk. I am rarely motivated to read a whatchamacallit like a book. I must admit that I read one of those emergence books from a geezer named Stuart A. Kauffman at the Santa Fe Institute, and it was pretty good. Not much buzz in the jazzy world of social media, but it was a good use of my time.

I now have another book I want to read. I think it is a slice of reality TV encapsulated in a form of communication less popular than TikTok- or Telegram Messenger-type media. The bundle of information is called Careless People: A Cautionary Tale of Power, Greed, and Lost Idealism. Many journalists and pundits have grabbed the story of a dispute between everyone’s favorite social media company and an authoress named Sarah Wynn-Williams.

There is nothing like some good old legal action, a former employee, and a very defensive company.

The main idea is that a memoir published on March 11, 2025, and available via Amazon at https://shorturl.at/Q077l is not supposed to be sold. Like any good dinobaby who actually read a dead tree thing this year, I bought the book. I have no idea if it has been delivered to my Kindle. I know one thing. Good old Amazon will be able to reach out and kill that puppy when the news reaches the equally sensitive leadership at that outstanding online service.


A festive group ready to cook dinner over a small fire of burning books. Thanks, You.com. Good enough.

According to The Verge, CNBC, and the Emergency International Arbitral Tribunal, an arbitrator (Nicholas Gowen) decided that the book has to be put in the information freezer. According to the Economic Times:

…violated her contract… In addition to halting book promotions and sales, Wynn-Williams must refrain from engaging in or ‘amplifying any further disparaging, critical or otherwise detrimental comments’… She also must retract all previous disparaging comments ‘to the extent within her control.’

My favorite green poohbah publication The Verge offered:

…it’s unclear how much authority the arbitrator has to do so.

Such a bold statement: It’s unclear, we say.

The Verge added:

In the decision, the arbitrator said Wynn-Williams must stop making disparaging remarks against Meta and its employees and, to the extent that she can control, cease further promoting the book, further publishing the book, and further repetition of previous disparaging remarks. The decision also says she must retract disparaging remarks from where they have appeared.

Now I have written a number of books and monographs. These have been published by outfits no longer in business. I had a publisher in Scandinavia. I had a publisher in the UK. I had a publisher in the United States. A couple of these actually made revenue and one of them snagged a positive review in a British newspaper.

But in all honesty, no one really cared about my Google, search and retrieval, and electronic publishing work.

Why?

I did not have a giant company chasing me to the Emergency International Arbitral Tribunal and making headlines for the prestigious outfit CNBC.

Well, in my opinion Sarah Wynn-Williams has hit a book publicity home run. Imagine, non-readers like me buying a book about a firm to which I pay very little attention. Instead of writing about the Zuckbook, I am finishing a book (gasp!) about Telegram Messenger and that sporty baby maker Pavel Durov. Will his “core” engineering team chase me down? I wish. Sarah Wynn-Williams is in the news.

Will Ms. Wynn-Williams “win” a guest spot on the Joe Rogan podcast or possibly the MeidasTouch network? I assume that her publisher, agent, and she have their fingers crossed. I heard somewhere that any publicity is good publicity.

I hope Mr. Beast picks up this story. Imagine what he would do with forced arbitration and possibly a million-dollar payoff for the PR firm that can top the publicity Meta has apparently delivered to Ms. Wynn-Williams.

Net net: Win, Wynn!

Stephen E Arnold, March 18, 2025

A Swelling Wave: Internet Shutdowns in Africa

March 18, 2025

Another dinobaby blog post. No AI involved, which could be good or bad depending on one’s point of view.

How does a government deal with information it does not like, want, or believe? The question is a pragmatic one. Not long ago, Russia suggested to Telegram that it cut the flow of Messenger content to Chechnya. Telegram has been somewhat more responsive to government requests since Pavel Durov’s detainment in France, but it dragged its digital feet. The fix? The Kremlin worked with service providers to kill off the content flow or at least as much of it as was possible. Similar methods have been used in other semi-enlightened countries.

“Internet Shutdowns at Record High in Africa as Access Weaponised” reports:

A report released by the internet rights group Access Now and #KeepItOn, a coalition of hundreds of civil society organisations worldwide, found there were 21 shutdowns in 15 African countries, surpassing the existing record of 19 shutdowns in 2020 and 2021.

There are workarounds, but some of these are expensive and impractical for the people in Comoros, Guinea-Bissau, Mauritius, Burundi, Ethiopia, Equatorial Guinea, and Kenya. I am not sure the list is complete, but the idea of killing Internet access seems to be an accepted response in some countries.

Several observations:

  1. Recent announcements about Google making explicit its access to users’ browser histories provide a rich and actionable pool of information. Will these types of data be used to pinpoint a dissident or a problematic individual? Based on my visits to Africa, including the thrilling Zimbabwe, I would suggest that the answer could be, “Absolutely.”
  2. Online is now pervasive, and due to a lack of meaningful regulation, the idea of going online and sharing information is a negative. In the late 1980s, I gave a lecture for ASIS at Rutgers University. I pointed out that flows of information work like silica grit in a sand blasting device to remove rust in an autobody shop. I can say from personal experience that no one knew what I was talking about. In 40 years, people and governments have figured out how online flows erode structures and social conventions.
  3. The trend of shutdowns is now in the playbook of outfits around the world. Commercial companies can play the game of killing a service too. Certain large US high technology companies have made it clear that their services would be summarily blocked if certain countries did not play ball the US way.

As a dinobaby who has worked in online for decades, I find it interesting that the pigeons are coming home to roost. A failure years ago to recognize and establish rules and regulations for online is the same as having those lovable birds loose in the halls of government. What do pigeons produce? Yep, that’s right. A mess, a potentially deadly one too.

Stephen E Arnold, March 18, 2025

Management Insights Circa Spring 2025

March 18, 2025

Another dinobaby blog post. Eight decades and still thrilled when I point out foibles.

On a call today, one of the people asked, “Did you see that excellent leadership comes from ambivalence?” No, sorry. After my years at the blue chip consulting firm, I ignore those insights. Ambivalence. The motivated leader cares about money, the lawyers, the vacations, the big customer, and money. I think I have these in the correct order.

Imagine my surprise when I read another management breakthrough. Navigate to “Why Your ‘Harmonious’ Team Is Actually Failing.” The insight is that happy teams are in coffee shop mode. If one is not motivated by one of the factors I identified in the first paragraph of this essay, life will be like a drive-through smoothie shop. Kick back, let someone else do the work, and lap up that banana and tangerine goodie.

The write up describes a management concept: one should strive for a roughie, maybe with a dollop of chocolate and some salted nuts. Get that blood pressure rising. Here’s a passage I noted:

… real psychological safety isn’t about avoiding conflict. It’s about creating an environment where challenging ideas makes the team stronger, not weaker.

The idea is interesting. I have learned that many workers, like helicopter parents, want to watch and avoid unnecessary conflicts, interactions, and dust-ups. The write up slaps some psychobabble on this management insight. That’s perfect for academics on the tenure track and for talking to quite sensitive, big-spending clients. But often a more dynamic approach is necessary. If it is absent, there is a problem with the company. Hello, General Motors, Intel, and Boeing.

Stifle much?

The write up adds:

I’ve seen plenty of “nice” teams where everyone was polite, nobody rocked the boat, and meetings were painless. And almost all of those teams produced ok work. Why? Because critical thinking requires friction. Those teams weren’t actually harmonious—they were conflict-avoidant. The disagreements still existed; they just went underground. Engineers would nod in meetings then go back to their desks and code something completely different. Design flaws that everyone privately recognized would sail through reviews untouched. The real dysfunction wasn’t the lack of conflict—it was the lack of honest communication. Those teams weren’t failing because they disagreed too little; they were failing because they couldn’t disagree productively.

Who knew? Hello, General Motors, Intel, and Boeing.

Here’s the insight:

Here’s the weird thing I’ve found: teams that feel safe enough to hash things out actually have less nasty conflict over time. When small disagreements can be addressed head-on, they don’t turn into silent resentment or passive-aggressive BS. My best engineering teams were never the quiet ones—they were the ones where technical debates got spirited, where different perspectives were welcomed, and where we could disagree while still respecting each other.

The challenge is to avoid creating complacency.

Stephen E Arnold, March 18, 2025

AI May Be Discovering Kurt Gödel Just as Einstein and von Neumann Did

March 17, 2025

This blog post is the work of a humanoid dino baby. If you don’t know what a dinobaby is, you are not missing anything.

AI re-thinking is becoming more widespread. I published a snippet of an essay about AI and its impact in socialist societies on March 10, 2025. I noticed “A Bear Case: My Predictions Regarding AI Progress.” The write up is interesting, and I think it represents thinking which is becoming more prevalent among individuals who have racked up what I call AI mileage.

The main theme of the write up is a modern day application of Kurt Gödel’s annoying incompleteness theorem. I am no mathematician like my great uncle Vladimir Arnold, who worked for years with the somewhat quirky Dr. Kolmogorov. (Family tip: Going winter camping with the wizard Dr. Kolmogorov was not a good idea. Well, you know…)

The main idea is that a formal axiomatic system satisfying certain technical conditions cannot decide the truth value of all statements about natural numbers. In a nutshell, a system cannot fully step outside itself. Smart software is not able to go outside of its training boundaries as far as I know.
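For the record, the formal statement behind that shorthand runs roughly as follows. This is a standard textbook phrasing, not a quote from the essay:

```latex
% Gödel's first incompleteness theorem, stated informally in LaTeX.
\textbf{Theorem (Gödel, 1931).} Let $T$ be a consistent, effectively
axiomatized formal system strong enough to express elementary arithmetic.
Then there is a sentence $G_T$ in the language of $T$ such that neither
$G_T$ nor its negation is provable in $T$:
\[
  T \nvdash G_T \qquad \text{and} \qquad T \nvdash \neg G_T .
\]
```

Whether the analogy to a model trapped inside its training distribution is rigorous or merely suggestive is, of course, a separate question.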

Back to the essay. The author points out that AI delivers something useful:

There will be a ton of innovative applications of Deep Learning, perhaps chiefly in the field of biotech, see GPT-4b and Evo 2. Those are, I must stress, human-made innovative applications of the paradigm of automated continuous program search. Not AI models autonomously producing innovations.

The essay does contain a question I found interesting:

Because what else are they [AI companies and developers] to do? If they admit to themselves they’re not closing their fingers around godhood after all, what will they have left?

Let me offer several general thoughts. I admit that I am not able to answer the question, but some ideas crossed my mind when I was thinking about the sporty Kolmogorov, my uncle’s advice about camping in the winter, and this essay:

  1. Something else will come along. There is a myth that technology progresses. I think technology is like the fictional tribble on Star Trek. The products and services are destined to produce more products and services. As the Santa Fe Institute crowd likes to say, order emerges. Will the next big thing be AI? Probably AI will be in the DNA of the next big thing. So one answer to the question is, “Something will emerge.” Money will flow, and the next big thing cycle begins again.
  2. The innovators and the AI companies will pivot. This is a fancy way of saying, “Try to come up with something else.” Even in the age of monopolies and oligopolies, change is relentless. Some of the changes will be recognized as the next big thing or at least the thing a person can do to survive. Does this mean Sam AI-Man will manage the robots at the local McDonald’s? Probably not, but he will come up with something.
  3. The AI hot pot will cool. Life will regress to the mean or to a behavior that is not hell-bent on becoming a super human like the guy who gets transfusions from his kid, the wonky “have my baby” thinking of a couple of high-profile technologists, or the money lust of some 25-year-old financial geniuses on Wall Street. A digitized organization man living out the theory of the leisure class will return. (Tip: Buy a dark grey suit. Lose the T-shirt.)

As an 80 year old dinobaby, I find the angst of AI interesting. If Kurt Gödel were alive, he might agree to comment, “Sonny, you can’t get outside the set.” My uncle would probably say, “Those costs. Are they crazy?”

Stephen E Arnold, March 17, 2025

An Intel Blind Spot in Australia: Could an October-Type Event Occur?

March 17, 2025

Yep, another dinobaby original.

I read a “real” news article (I think) in the UK Telegraph. The story “How Chinese Warships Encircled Australia without Canberra Noticing” surprised me. The write up reports:

In a highly unusual move, three Chinese naval vessels dubbed Task Group 107 – including a Jiangkai-class frigate, a Renhai-class cruiser and a Fuchi-class replenishment vessel – were conducting exercises in Australia’s exclusive economic zone.

The date was February 21, 2025. The ships were 300 miles from Australia. What’s the big deal?

According to the write up:

Anthony Albanese, Australia’s prime minister, downplayed the situation, while both the Australian Defence Force and the New Zealand Navy initially missed that the exercise was even happening.

Let me offer several observations based on what may be a mostly accurate “real” news report:

  1. Australia, like Israel, is well equipped with home grown and third-party intelware. If the write up’s content is accurate, none of these intelware systems provided signals about the operation before, during, or after the report of the live fire drill.
  2. As a member of Five Eyes, a group about which I know essentially nothing, Australia has access to assorted intelligence systems, including satellites. Obviously the data were incomplete, ignored, or not available to analysts or to Preligens-type systems. (Note: Preligens is now owned by Safran.)
  3. What remediating actions are underway in Australia? To be fair, the “real” news outfit probably did not ask this question, but it seems a reasonable one to address. Someone was responsible, so what’s the fix?

Net net: Countries with sophisticated intelligence systems are getting some indications that these systems may not live up to the marketing hyperbole nor the procurement officials’ expectations of these systems. Israel suffered in many ways because of its 9/11 in October. One hopes that Australia can take this allegedly true incident involving China to heart and make changes.

Stephen E Arnold, March 17, 2025

What is the Difference Between Agentic and Generative AI? A Handy Chart

March 17, 2025

Agentic is the new AI buzzword. But what does it mean? Data-platform and AI firm Domo offers clarity in, "Agentic AI Explained: Definition, Benefits, and Use Cases." Writer Haziqa Sajid defines the term:

"Agentic AI is an advanced AI system that can act independently, make decisions, and adapt to changing situations. These AI systems can handle complex tasks such as strategic planning, multi-step automation, and dynamic problem-solving with minimal human oversight. This makes them more capable than traditional rule-based AI. … Agentic AI is designed to work like a human employee performing tasks that comprehend natural language input, set objectives, reason through a task, and modify actions based on updated input. It employs advanced machine learning, generative AI, and adaptive decision-making to learn from the data, refine its approach, and improve performance over time."

Wow, that sounds a lot like what we were promised with generative AI. Perhaps this version will meet expectations. AI agents are still full of potential, poised on the edge of infiltrating real-world tools. The post describes what Domo sees as the tech’s advantages and gives the basics of how it works.
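The observe, decide, act, adapt loop Sajid describes can be sketched in a few lines. Everything here — the `Agent` class, the toy `plan` rule standing in for an LLM reasoning call — is illustrative, not Domo's implementation:

```python
# A minimal sketch of an agentic loop: pursue an objective, decide on an
# action each step, act, and keep a history so the agent can adapt.
# The "reasoning" is a toy rule standing in for a real LLM call.

class Agent:
    def __init__(self, objective: int):
        self.objective = objective
        self.history = []  # record of actions taken, available for adaptation

    def plan(self, state: int) -> int:
        # A real agent would ask an LLM to reason about the next step;
        # this toy just nudges the state toward the objective by one unit.
        return 1 if state < self.objective else -1

    def run(self, state: int, max_steps: int = 100):
        for _ in range(max_steps):
            if state == self.objective:
                return state, self.history  # objective met, stop acting
            action = self.plan(state)       # decide
            state += action                 # act
            self.history.append(action)     # remember, to adapt later
        return state, self.history

agent = Agent(objective=5)
final_state, actions = agent.run(state=0)
print(final_state)  # → 5
```

The contrast with generative AI is visible even in this toy: a generative model returns one output per prompt, while the agent keeps looping, checking its state against the objective, until the task is done.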

The most useful part is the handy chart comparing agentic and generative AI. For example, while the (actual) purpose of generative AI is mainly to generate text, image, and audio content, agentic AI is for executing tasks and making decisions in changing environments. The chart’s other measures of comparison include autonomy, interactivity, use cases, learning processes, and integration methods. See the post for that bookmark-worthy chart.

Founded back in 2010, Domo is based in Utah. The publicly traded firm boasts over 2,600 clients across diverse industries.

Cynthia Murrell, March 17, 2025

A NewPlay for Google and Huawei: A Give and Go?

March 14, 2025

Time for Google to get smarmy in court. We learn from TorrentFreak, "Google Must Testify as LaLiga Demands Criminal Liability for ‘Piracy Profits’." Writer Andy Maxwell summarizes:

"A court in Murcia, Spain, has ordered Google to testify in a criminal case concerning IPTV app, NewPlay. Football league LaLiga, whose matches were allegedly offered illegally through the app, previously called for the directors of Google, Apple, and Huawei to face criminal charges. LaLiga criticized the companies for failing to disable copies of NewPlay already installed on users’ devices. Google and Huawei must now testify as ‘profit-making participants’ in an alleged piracy scheme."

See the write-up for the twists and turns of the case thus far. The key point: Removing NewPlay from app stores was not enough for LaLiga. The league demands they reach in and remove the player from devices it already inhabits. The court agrees. We learn:

"The court order required Google, Apple, and Huawei to disable or delete NewPlay to prevent future use on users’ mobile devices. That apparently didn’t happen. Unhappy with the lack of compliance, in 2024 LaLiga called on the investigating judge to punish the tech companies’ directors. LaLiga says that the installed NewPlay apps still haven’t been remotely disabled but given the precedent that may set, LaLiga seems unlikely to let the matter go without a fight. With support from Telefónica, Mediapro and rights group EGEDA, LaLiga wants to hold Google and Huawei responsible for pirated content reportedly made available via the NewPlay app."

So now the court is requiring reps from Google and Huawei to appear and address this partial compliance. Why not Apple, too? That is a mystery, reports Maxwell. He also wonders about the bigger picture. The NewPlay website is still up, he notes, though both its internal and external links are currently disabled. Besides, at least one other app exists that appears to do the same thing. Will the court become embroiled in a long-term game of IPTV whack-a-mole? Google is a magnet for the courts, it seems.

Cynthia Murrell, March 14, 2025
