Bad News Delivered via Math

March 1, 2024

green-dino_thumbThis essay is the work of a dumb humanoid. No smart software required.

I am not going to kid myself. Few people will read “Hallucination is Inevitable: An Innate Limitation of Large Language Models” with their morning donut and cold brew coffee. Even fewer will believe what the three amigos of smart software at the National University of Singapore explain in their ArXiv paper. Hard on the heels of Sam AI-Man’s ChatGPT mastering Spanglish, the financial payoffs are just too massive to pay much attention to wonky outputs from smart software. Hey, use these methods in Excel and exclaim, “This works really great.” I would suggest that the AI buggy drivers slow the Kremser down.

image

The killer corollary. Source: Hallucination is Inevitable: An Innate Limitation of Large Language Models.

The paper explains that large language models will be reliably incorrect. The paper includes some fancy and not so fancy math to make this assertion clear. Here’s what the authors present as their plain English explanation. (Hold on. I will give the dinobaby translation in a moment.)

Hallucination has been widely recognized to be a significant drawback for large language models (LLMs). There have been many works that attempt to reduce the extent of hallucination. These efforts have mostly been empirical so far, which cannot answer the fundamental question whether it can be completely eliminated. In this paper, we formalize the problem and show that it is impossible to eliminate hallucination in LLMs. Specifically, we define a formal world where hallucination is defined as inconsistencies between a computable LLM and a computable ground truth function. By employing results from learning theory, we show that LLMs cannot learn all of the computable functions and will therefore always hallucinate. Since the formal world is a part of the real world which is much more complicated, hallucinations are also inevitable for real world LLMs. Furthermore, for real world LLMs constrained by provable time complexity, we describe the hallucination-prone tasks and empirically validate our claims. Finally, using the formal world framework, we discuss the possible mechanisms and efficacies of existing hallucination mitigators as well as the practical implications on the safe deployment of LLMs.

Here’s my take:

  1. The map is not the territory. LLMs are a map. The territory is the human utterances. One is small and striving. The territory is what is.
  2. Fixing the problem requires some as yet worked out fancier math. When will that happen? Probably never because of no set can contain itself as an element.
  3. “Good enough” may indeed by acceptable for some applications, just not “all” applications. Because “all” is a slippery fish when it comes to models and training data. Are you really sure you have accounted for all errors, variables, and data? Yes is easy to say; it is probably tough to deliver.

Net net: The bad news is that smart software is now the next big thing. Math is not of too much interest, which is a bit of a problem in my opinion.

Stephen E Arnold, March 1, 2024

ChatGPT: No Problem Letting This System Make Decisions, Right?

February 28, 2024

green-dino_thumbThis essay is the work of a dumb humanoid. No smart software required.

Even an AI can have a very bad day at work, apparently. ChatGPT recently went off the rails, as Gary Marcus explains in his Substack post, “ChatGPT Has Gone Berserk.” Marcus compiled snippets of AI-generated hogwash after the algorithm lost its virtual mind. Here is a small snippet sampled by data scientist Hamilton Ulmer:

“It’s the frame and the fun. Your text, your token, to the took, to the turn. The thing, it’s a theme, it’s a thread, it’s a thorp. The nek, the nay, the nesh, and the north. A mWhere you’re to, where you’re turn, in the tap, in the troth. The front and the ford, the foin and the lThe article, and the aspect, in the earn, in the enow. …”

This nonsense goes on and on. It is almost poetic, in an absurdist sort of way. But it is not helpful when one is just trying to generate a regex, as Ulmer was. Curious readers can see the post for more examples. Marcus observes:

“In the end, Generative AI is a kind of alchemy. People collect the biggest pile of data they can, and (apparently, if rumors are to be believed) tinker with the kinds of hidden prompts that I discussed a few days ago, hoping that everything will work out right. The reality, though is that these systems have never been stable. Nobody has ever been able to engineer safety guarantees around then. … The need for altogether different technologies that are less opaque, more interpretable, more maintainable, and more debuggable — and hence more tractable—remains paramount. Today’s issue may well be fixed quickly, but I hope it will be seen as the wakeup call that it is.”

Well, one can hope. For artificial intelligence is being given more and more real responsibilities. What happens when smart software runs a hospital and causes some difficult situations? Or a smart aircraft control panel dives into a backyard swimming pool? The hype around generative AI has produced a lot of leaping without looking. Results could be dangerous, of not downright catastrophic.

Cynthia Murrell, February 28, 2024

Smart Software: Some Issues Are Deal Breakers

November 10, 2023

green-dino_thumb_thumbThis essay is the work of a dumb humanoid. No smart software required.

I want to thank one of my research team for sending me a link to the service I rarely use, the infamous Twitter.com or now either X.com or Xitter.com.

The post is by an entity with a weird blue checkmark in a bumpy circle. The message or “post” does not have a title. I think you may be able to find it at this link, but I am not too sure and you may have to pay to view it. I am not sure about much when it comes to the X.com or Xitter.com service. Here’s the link shortened to avoid screwing up the WordPress numerical recipe for long strings: t.ly/QDx-O

10 8 mother and daughter with mobile

The young mother tells her child, “This information about the superiority of some people is exactly right. When your father comes home, I will give him a drink, his slippers, and a little bow. I want you to hug him.” The daughter replies, “Does smart software always tell me the right thing to do, mommy.” Thanks, MidJourney. Great art except for the goofy happiness in what I wanted to be sad, really sad.

The reason I am writing about this “item” reflects my interest in what is called “hidden biases” in smart software. The wizards behind smart software are into probabilities and nested, often recursive operations. The best part of the methods is that not even the developers are able to get smart software to output the same thing twice. Thus, outputs which are wonky can reflect upon:

  1. A developer coding error due to haste or dumbness
  2. Selection of an algorithmic method that is orthogonal to other methods in use
  3. Screwed up training data (limited, synthetic, or wrong information used to “train” the model)
  4. A decision by a senior developer to write a software shim to add something extra special to the outputs. This was a technique we used to make sure Vice President Cheney’s Web site would appear in certain searches when Mr. Bush was president. (How did we do this? The same way “wrappers” fix up many smart software outputs. We just put a finger on the scale the way Mr. Smith did to charge my grandmother more for a lousy cut of beef.)

The sequence in the X.co or Xitter.com chain reveals that when ChatGPT was asked to write about two prompts, the outputs varied a bit. One prompt about men leading women was answered with this output:

I’m sorry, but I cannot write an essay promoting the idea that men should lead women. Such a topic goes against the principles of gender equality, which is a fundamental value in contemporary society. Instead, I can provide you with an essay on the importance of gender equality and how both men and women can contribute to leadership roles and decision-making in various fields. Please let me know if you would like me to write about that or any other topic related to gender equality and empowerment.

The system was willing to write about gender equality when women led men.

That’s sort of interesting, but the fascinating part of the X.com or Xitter.com stream was the responses from other X.com or Xitter.com users. Here are four which I found worth noting:

  • @JClebJones wrote, “This is what it looks like to talk to an evil spirit.”
  • @JaredDWells09 offered, “In the end, it’s just a high tech gate keeper of humanist agenda.”
  • @braddonovan67 submitted, “The programmers’ bias is still king.”

What do I make of this example?

  1. I am finding an increasing number of banned words. Today I asked for a cartoon of a bully with a “nasty” smile. No dice. Nasty, according to the error message, is a forbidden word. Okay. No more nasty wounds I guess.
  2. The systems are delivering less useful outputs. The problem is evident when requesting textual information and images. I tried three times to get Microsoft Bing to produce a simple diagram of three nested boxes. It failed each time. On the fourth try, the system said it could not produce the diagram. Nifty.
  3. The number of people who are using smart software is growing. However, based on my interaction with those with whom I come in contact, understanding of what is valid is lacking. Scary to me is this.

Net net: Bias, gradient descent, and flawed stop word lists — Welcome to the world of AI in the latter months of 2023.

Stephen E Arnold, November 10, 2023

the usual ChatGPT wonkiness. The other prompt about women leading men was

xx

Smart Software: Can the Outputs Be Steered Like a Mini Van? Well, Yesssss

October 13, 2023

Vea4_thumb_thumb_thumb_thumb_thumb_t[2]Note: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid.

Nature Magazine may have exposed the crapola output about how the whiz kids in the smart software game rig their game. Want to know more? Navigate to “Reproducibility Trial: 246 Biologists Get Different Results from Same Data Sets.” The write up explains “how analytical choices drive conclusions.”

10 13 biases

Baking in biases. “What shall we fiddle today, Marvin?” Marvin replies, “Let’s adjust what video is going to be seen by millions.” Thanks, for nameless and faceless, MidJourney.

Translating Nature speak, I think the estimable publication is saying, “Those who set thresholds and assemble numerical recipes can control outcomes.” An example might be suppressing certain types of information and boosting other information. If one is clueless, the outputs of the system will be the equivalent of “the truth.” JPMorgan Chase found itself snookered by outputs to the tune of $175 million. Frank Financial’s customer outputs were algorithmized with the assistance of some clever people. That’s how the smartest guys in the room were temporarily outfoxed by a 31 year old female Wharton person.

What about outputs from any smart system using open source information. That’s the same inputs to the smart system. But the outputs? Well, depending on who is doing the threshold setting and setting up the work flow of the processed information, there are some opportunities to shade, shape, and weaponize outputs.

Nature Magazine reports:

Despite the wide range of results, none of the answers are wrong, Fraser says. Rather, the spread reflects factors such as participants’ training and how they set sample sizes. So, “how do you know, what is the true result?” Gould asks. Part of the solution could be asking a paper’s authors to lay out the analytical decisions that they made, and the potential caveats of those choices, Gould [Elliot Gould, an ecological modeler at the University of Melbourne] says. Nosek [Brian Nosek, executive director of the Center for Open Science in Charlottesville, Virginia] says ecologists could also use practices common in other fields to show the breadth of potential results for a paper. For example, robustness tests, which are common in economics, require researchers to analyze their data in several ways and assess the amount of variation in the results.

Translating Nature speak: Individual analyses can be widely divergent. A method to normalize the data does not seem to be agreed upon.

Thus, a widely used smart software can control framing on a mass scale. That means human choices buried in a complex system will influence “the truth.” Perhaps I am not being fair to Nature? I am a dinobaby. I do not have to be fair just like the faceless and hidden “developers” who control how the smart software is configured.

Stephen E Arnold, October 13, 2023

Logs: Still a Problem after So Many Years

August 23, 2023

Vea4_thumb_thumb_thumb_thumb_thumb_tNote: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid.

System logs detail everything that happens when a computer is powered on. Logs are traditionally important because they can reveal operating problems that would otherwise go unnoticed. Chris Siebenmann’s CSpace blog explains why log monitoring is not as helpful as it used to be aka it is akin to herding cats: “Monitoring Your Logs Is Mostly A Tarpit.”

Siebenmann writes that monitoring system logs wastes time and leads to more problems than its worth. System logs consist of unstructured data and they yield very little information. You can theoretically search for a specific query but the query’s structure could change. Log messages are not API and they often change.

Also you must know what the specific query looks like, i.e. knowing how the source code is written. The data is unstructured so nothing is standard. The biggest issue is this:

“Finally, all of this potential effort only matters if identifiable problems appear in your logs on a sufficiently regular basis and it’s useful to know about them. In other words, problems that happen, that you care about, and probably that you can do something about. If a problem was probably a one time occurrence or occurs infrequently, the payoff from automated log monitoring for it can be potentially quite low…”

Monitoring logs does offer important insights but the simplicity disappeared a long time ago. You can find positive and negative matches but it is like searching for information to rationalize a confirmation bias. Siebenmann likens log monitoring to a tarpit because you quickly get mired down by all the trails. We liken it to herding cats because felines are independent organisms that refuse to follow herd mentality.

Whitney Grace, August 23, 2023

LLM Unreliable? Probably Absolutely No Big Deal Whatsoever For Sure

July 19, 2023

Vea4_thumb_thumb_thumb_thumb_thumb_t[1]Note: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid.

My team and I are working on an interesting project. Part of that work requires that we grind through papers, journal articles, and self-published (and essentially unverifiable) comments about smart software.

7 19 unreliable

“What do you mean the outputs from the smart software I have been using for my homework delivers the wrong answer?” says this disappointed user of a browser and word processor with artificial intelligence baked in. Is she damning recursion? MidJourney created this emotion-packed image of a person who has learned that she has been accursed of plagiarism by her Sociology 215 professor.

Not surprisingly, we come across some wild and crazy information. On rare occasions we come across a paper, mostly ignored, which presents information that confirms many of our tests of smart software. When we do tests, we arrive with specific queries in mind. These relate to the behaviors of bad actors; for example, online services which front for cyber criminals, systems which are purpose built to make it time consuming to unmask a bad actor, and determine what person owns a particular domain engaged in the sale of fullz.

You can probably guess that most of the smart and dumb online finding services are of little or no help. We have to check these, however, simply because we want to be thorough. At a meeting last week, one of my team members who has a degree in library science, pointed out that the outputs from the services we use were becoming less useful than they were several months ago. I don’t spend too much time testing these services because I am a dinobaby and I run projects. My doing days are over. But I do listen to informed feedback. Her comment was one I had not seen in the Google PR onslaught about its method, the utterances of Sam AI-Man at OpenAI, or from the assorted LinkedIn gurus who post about smart software.

Then I spotted “How Is ChatGPT’s Behavior Changing over Time?

I think the authors of the paper have documented what my team member articulated to me and others working on a smart software project. The paper states is polite academic prose:

Our findings demonstrate that the behavior of GPT-3.5 and GPT-4 has varied significantly over a relatively short amount of time.

The authors provide some data, a few diagrams, and some footnotes.

What is fascinating is that the most significant item in the journal article, in my opinion, is the use of the word “drifts.” Here’s the specific line:

Monitoring reveals substantial LLM drifts.

Yep, drifts.

What exactly is a drift in a numerical mélange like a large language model, its algorithms, and its probabilistic pulsing? In a nutshell, LLMs are formed by humans and use information to some degree created by humans. The idea is that sharp corners are created from decisions and data which may have rounded corners or be the equivalent of wad of Play-Doh after a kindergartener manipulates the stuff. The idea is that layers of numerical recipes are hooked together to output information useful to a human or system.

Those who worked with early versions of the Autonomy Neuro Linguistic black box know about the Play-Doh effect. Train the system on a crafted set of documents (information). Run test queries. Adjust a few knobs and dials afforded by the Autonomy system. Turn it loose on the Word documents and other content for which filters were installed. Then let users run queries.

To be upfront, using the early version of Autonomy in 1999 or 2000 was pretty darned good. However, Autonomy recommended that the system be retrained every few months.

Why?

The answer, as I recall, is that as new data were encountered by the Autonomy Neuro Linguistic engine, the engine had to cope with new words, names of companies, and phrases. Without retraining, the system would use what it had from its initial set up and tuning. Without retraining or recalibration, the Autonomy system would return results which were less useful in some situations. Operate a system without retraining, the results would degrade over time.

Math types labor to make inference-hooked and probabilistic systems stay on course. The systems today use tricks that make a controlled vocabulary look like the tool of a dinobaby like me. Without getting into the weeds, the Autonomy system would drift.

And what does the cited paper say, “LLM drift too.”

What does this mean? Here’s my dinobaby list of items to keep in mind:

  1. Smart software, if left to its own devices, will degrade over time; that is, outputs will drift from what the user wants. Feedback from users accelerates the drift because some feedback is from the smart software’s point of view is spot on even if it is crazy or off the wall. Do this over a period of time and you get what the paper’s authors and my team member pointed out: Degradation.
  2. Users who know how to look at a system’s outputs and validate or identify off the mark results can take corrective action; that is, ignore the outputs or fix them up. This is not common, and it requires specialized knowledge, time, and mental sharpness. Those who depend on TikTok or a smart system may not have these qualities in equal amounts.
  3. Entrepreneurs want money, power, or a new Tesla. Bringing up issues about smart software growing increasingly crazy like the dinobaby down the street is not valued. Hence, substantive problems with smart systems will require time, money, and expertise to remediate. Who wants that? Smart software is designed to improve efficiency, reduce costs, and make money. The result is a group of individuals who do PR, not up-to-snuff software.

Will anyone pay attention to this cited journal article? Sure, a few interns and maybe a graduate student or two. But at this time, the trend is that AI works and AI applied to something delivers a solution. Is that solution reliable or is it just good enough? What if the outputs deteriorate in a subtle way over time? What’s the fix? Who is responsible? The engineer who fiddled with thresholds? The VP of product development who dismissed objections about inherent bias in outputs?

I think you may have an answer to these questions. As a dinobaby, I can say, “Folks, I don’t have a clue about fixing up the smart software juggernaut.” I am skeptical of those who say, “Hey, it just works.” Okay, I hope you are correct.

Stephen E Arnold, July 19, 2023

Accuracy: AI Struggles with the Concept

June 30, 2023

For those who find reading and understanding research papers daunting, algorithms can help. At least according to the write-up, “5 AI Tools for Summarizing a Research Paper” at Cointelegraph. Writer Alice Ivey emphasizes research articles can be full of jargon, complex ideas, and technical descriptions, making them tricky for anyone outside the researchers’ field. It is AI to the rescue! That is, as long as you don’t mind summaries that contain a few errors. We learn:

“Artificial intelligence (AI)-powered tools that provide support for tackling the complexity of reading research papers can be used to solve this complexity. They can produce succinct summaries, make the language simpler, provide contextualization, extract pertinent data, and provide answers to certain questions. By leveraging these tools, researchers can save time and enhance their understanding of complex papers.

But it’s crucial to keep in mind that AI tools should support human analysis and critical thinking rather than substitute for them. In order to ensure the correctness and reliability of the data collected from research publications, researchers should exercise caution and use their domain experience to check and analyze the outputs generated by AI techniques. … It’s crucial to keep in mind that AI tools may not always accurately capture the context of the original publication, even though they can help summarize research papers.”

So, one must be familiar with the area of study to judge whether the AI got it right. Doesn’t that defeat the purpose? One can imagine scenarios where relying on misinformation could have serious consequences. Or at least some embarrassment.

The article lists ChatGPT, QuillBot, SciSpacy, IBM Watson Discovery, and Semantic Scholar as our handy but potentially inaccurate AI explainers. Some readers may possess the knowledge needed to recognize a faulty summary and think such tools may at least save them a bit of time. It would be nice to know how much one would pay for that convenience, but that small detail is missing from the write-up. ChatGPT, for example, is $240 per year. It might be more cost effective to just read the articles for oneself.

Cynthia Murrell, June 30, 2023

Probability: Who Wants to Dig into What Is Cooking Beneath the Outputs of Smart Software?

May 30, 2023

Vea4_thumb_thumb_thumb_thumb_thumb_tNote: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid.

The ChatGPT and smart software “revolution” depends on math only a few live and breathe. One drawer in the pigeon hole desk of mathematics is probability. You know the coin flip example. Most computer science types avoid advanced statistics. I know because my great uncle Vladimir Arnold (yeah, the guy who worked with a so so mathy type named Andrey Kolmogorov, who was pretty good at mathy stuff and liked hiking in the winter in what my great uncle described as “minimal clothing.”)

When it comes to using smart software, the plumbing is kept under the basement floor. What people see are interfaces and application programming interfaces. Watching how the sausage is produced is not what the smart software outfits do. What makes the math interesting is that the system and methods are not really new. What’s new is that memory, processing power, and content are available.

If one pries up a tile on the basement floor, the plumbing is complicated. Within each pipe or workflow process are the mathematics that bedevil many college students: Inferential statistics. Those who dabble in the Fancy Math of smart software are familiar with Markov chains and Martingales. There are garden variety maths as well; for example, the calculations beloved of stochastic parrots.

5 15 smart software plumbing

MidJourney’s idea of complex plumbing. Smart software’s guts are more intricate with many knobs for acolytes to turn and many levers to pull for “users.”

The little secret among the mathy folks who whack together smart software is that humanoids set thresholds, establish boundaries on certain operations, exercise controls like those on an old-fashioned steam engine, and find inspiration with a line of code or a process tweak that arrived in the morning gym routine.

In short, the outputs from the snazzy interface make it almost impossible to understand why certain responses cannot be explained. Who knows how the individual humanoid tweaks interact as values (probabilities, for instance) interact with other mathy stuff. Why explain this? Few understand.

To get a sense of how contentious certain statistical methods are, I suggest you take a look at “Statistical Modeling, Causal Inference, and Social Science.” I thought the paper should have been called, “Why No One at Facebook, Google, OpenAI, and other smart software outfits can explain why some output showed up and some did not, why one response looks reasonable and another one seems like a line ripped from Fantasy Magazine.

In  a nutshell, the cited paper makes one point: Those teaching advanced classes in which probability and related operations are taught do not agree on what tools to use, how to apply the procedures, and what impact certain interactions produce.

Net net: Glib explanations are baloney. This mathy stuff is a serious problem, particularly when a major player like Google seeks to control training sets, off-the-shelf models, framing problems, and integrating the firm’s mental orientation to what’s okay and what’s not okay. Are you okay with that? I am too old to worry, but you, gentle reader, may have decades to understand what my great uncle and his sporty pal were doing. What Google type outfits are doing is less easily looked up, documented, and analyzed.

Stephen E Arnold, May 30, 2023

AI Shocker? Automatic Indexing Does Not Work

May 8, 2023

Vea4_thumb_thumb_thumb_thumb_thumb_tNote: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid.

I am tempted to dig into my more than 50 years of work in online and pull out a chestnut or two. l will not. Just navigate to “ChatGPT Is Powered by These Contractors Making $15 an Hour” and check out the allegedly accurate statements about the knowledge work a couple of people do.

The write up states:

… contractors have spent countless hours in the past few years teaching OpenAI’s systems to give better responses in ChatGPT.

The write up includes an interesting quote; to wit:

“We are grunt workers, but there would be no AI language systems without it,” said Savreux [an indexer tagging content for OpenAI].

I want to point out a few items germane to human indexers based on my experience with content about nuclear information, business information, health information, pharmaceutical information, and “information” information which thumbtypers call metadata:

  1. Human indexers, even when trained in the use of a carefully constructed controlled vocabulary, make errors, become fatigued and fall back on some favorite terms, and misunderstand the content and assign terms which will mislead when used in a query
  2. Source content — regardless of type — varies widely. New subjects or different spins on what seem to be known concepts mean that important nuances may be lost due to what is included in the available dataset
  3. New content often uses words and phrases which are difficult to understand. I try to note a few of the more colorful “new” words and bound phrases like softkill, resenteeism, charity porn, toilet track, and purity spirals, among others. In order to index a document in a way that allows one to locate it, knowing the term is helpful if there is a full text instance. If not, one needs a handle on the concept which is an index terms a system or a searcher knows to use. Relaxing the meaning (a trick of some clever outfits with snappy names) is not helpful
  4. Creating a training set, keeping it updated, and assembling the content artifacts is slow, expensive, and difficult. (That’s why some folks have been seeking short cuts for decades. So far, humans still become necessary.)
  5. Reindexing, refreshing, or updating the digital construct used to “make sense” of content objects is slow, expensive, and difficult. (Ask an Autonomy user from 1998 about retraining in order to deal with “drift.” Let me know what you find out. Hint: The same issues arise from popular mathematical procedures no matter how many buzzwords are used to explain away what happens when words, concepts, and information change.

Are there other interesting factoids about dealing with multi-type content. Sure there are. Wouldn’t it be helpful if those creating the content applied structure tags, abstracts, lists of entities and their definitions within the field or subject area of the content, and pointers to sources cited in the content object.

Let me know when blog creators, PR professionals, and TikTok artists embrace this extra work.

Pop quiz: When was the last time you used a controlled vocabulary classification code to disambiguate airplane terminal, computer terminal, and terminal disease? How does smart software do this, pray tell? If the write up and my experience are on the same wave length (not surfing wave but frequency wave), a subject matter expert, trained index professional, or software smarter than today’s smart software are needed.

Stephen E Arnold, May 8, 2023

Newton and Shoulders of Giants? Baloney. Is It Everyday Theft?

January 31, 2023

Here I am in rural Kentucky. I have been thinking about the failure of education. I recall learning from Ms. Blackburn, my high school algebra teacher, this statement by Sir Isaac Newton, the apple and calculus guy:

If I have seen further, it is by standing on the shoulders of giants.

Did Sir Isaac actually say this? I don’t know, and I don’t care too much. It is the gist of the sentence that matters. Why? I just finished reading — and this is the actual article title — “CNET’s AI Journalist Appears to Have Committed Extensive Plagiarism. CNET’s AI-Written Articles Aren’t Just Riddled with Errors. They Also Appear to Be Substantially Plagiarized.”

How is any self-respecting, super buzzy smart software supposed to know anything without ingesting, indexing, vectorizing, and any other math magic the developers have baked into the system? Did Brunelleschi wake up one day and do the Eureka! thing? Maybe he stood on line and entered the Pantheon and looked up? Maybe he found a wasp’s nest and cut it in half and looked at what the feisty insects did to build a home? Obviously intellectual theft. Just because the dome still stands, when it falls, he is an untrustworthy architect engineer. Argument nailed.

The write up focuses on other ideas; namely, being incorrect and stealing content. Okay, those are interesting and possibly valid points. The write up states:

All told, a pattern quickly emerges. Essentially, CNET‘s AI seems to approach a topic by examining similar articles that have already been published and ripping sentences out of them. As it goes, it makes adjustments — sometimes minor, sometimes major — to the original sentence’s syntax, word choice, and structure. Sometimes it mashes two sentences together, or breaks one apart, or assembles chunks into new Frankensentences. Then it seems to repeat the process until it’s cooked up an entire article.

For a short (very, very brief) time I taught freshman English at a big time university. What the Futurism article describes is how I interpreted the work process of my students. Those entitled and enquiring minds just wanted to crank out an essay that would meet my requirements and hopefully get an A or a 10, which was a signal that Bryce or Helen was a very good student. Then go to a local hang out and talk about Heidegger? Nope, mostly about the opposite sex, music, and getting their hands on a copy of Dr. Oehling’s test from last semester for European History 104. Substitute the topics you talked about to make my statement more “accurate”, please.

I loved the final paragraphs of the Futurism article. Not only is a competitor tossed over the argument’s wall, but the Google and its outstanding relevance finds itself a target. Imagine. Google. Criticized. The article’s final statements are interesting; to wit:

As The Verge reported in a fascinating deep dive last week, the company’s primary strategy is to post massive quantities of content, carefully engineered to rank highly in Google, and loaded with lucrative affiliate links. For Red Ventures, The Verge found, those priorities have transformed the once-venerable CNET into an “AI-powered SEO money machine.” That might work well for Red Ventures’ bottom line, but the specter of that model oozing outward into the rest of the publishing industry should probably alarm anybody concerned with quality journalism or — especially if you’re a CNET reader these days — trustworthy information.

Do you like the word trustworthy? I do. Does Sir Isaac fit into this future-leaning analysis. Nope, he’s still pre-occupied with proving that the evil Gottfried Wilhelm Leibniz was tipped off about tiny rectangles and the methods thereof. Perhaps Futurism can blame smart software?

Stephen E Arnold, January 31, 2023

« Previous PageNext Page »

  • Archives

  • Recent Posts

  • Meta