Content Injection Can Have Unanticipated Consequences

February 24, 2025

The work of a real, live dinobaby. Sorry, no smart software involved. Whuff, whuff. That’s the sound of my swishing dino tail. Whuff.

Years ago I gave a lecture to a group of Swedish government specialists affiliated with the Forestry Unit. My topic was how to cause certain common text-processing algorithms to increase the noise in their outputs. The idea was to input certain types of text and numeric data in a specific way. (No, I will not disclose the methods in this free blog post, but if you have a certain profile, perhaps something can be arranged by writing benkent2020 at yahoo dot com. If not, well, that’s life.)

We focused on a handful of methods widely used in what now is called “artificial intelligence.” Keep in mind that most of the procedures are not new. There are some flips and fancy dancing introduced by individual teams, but the math is not invented by TikTok teens.

In my lecture, the forestry professionals wondered if these methods could be used to achieve specific objectives or “ends.” The answer was, and remains, “Yes.” The idea is simple. Once the methods are put in place, the algorithms chug along; some are brute force, others probabilistic. Either way, content and data injections can be shaped, just like the gizmos required to make kinetic events occur.

The point of this forestry excursion is to make clear that a group of people, operating in a loosely coordinated manner, can create data or content. Those data or content can be weaponized. When ingested by or injected into a content processing flow, the outputs of the larger system can be fiddled: More emphasis here, a little less accuracy there, and an erosion of whatever “accuracy” calculations are used to keep the system within the engineers’ and designers’ parameters. A plebeian way to describe the goal: Disinformation or accuracy erosion.
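To make the mechanism concrete without disclosing anything from the lecture, here is a minimal sketch of the general idea using a toy word-count classifier. Everything in it is hypothetical: the labels (“pine” and “oak,” a nod to the forestry audience), the documents, and the injection ratio are all invented for illustration. A handful of loosely coordinated, mislabeled documents is enough to flip the probabilistic output.

```python
from collections import Counter

def train(docs):
    # Count word frequencies per label from (text, label) pairs.
    counts = {"pine": Counter(), "oak": Counter()}
    for text, label in docs:
        counts[label].update(text.split())
    return counts

def classify(text, counts):
    # Naive add-one-smoothed likelihood per label; highest score wins.
    scores = {}
    for label, c in counts.items():
        total = sum(c.values())
        score = 1.0
        for w in text.split():
            score *= (c[w] + 1) / (total + len(c) + 1)
        scores[label] = score
    return max(scores, key=scores.get)

# Clean, correctly labeled training data (entirely made up).
clean = [("tall pine needles resin", "pine"),
         ("pine cone pine bark", "pine"),
         ("oak acorn broad leaves", "oak"),
         ("oak bark hardwood", "oak")]

counts = train(clean)
print(classify("pine resin needles", counts))   # -> pine

# Injection: a few mislabeled documents pairing pine vocabulary
# with the oak label, shaped to erode the model's accuracy.
poisoned = clean + [("pine resin needles pine", "oak")] * 5
counts2 = train(poisoned)
print(classify("pine resin needles", counts2))  # -> oak
```

Nothing here is exotic: the math is decades-old smoothed frequency counting, which is exactly why shaped injections work — the system dutifully recomputes its probabilities from whatever it ingests.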

I read “Meet the Journalists Training AI Models for Meta and OpenAI.” The write up explains that journalists without jobs or in search of extra income are creating “content” for smart software companies. The idea is that if one just does the Silicon Valley thing and sucks down any and all content, lawyers might come calling. Therefore, paying for “real” information is a better path.

Please, read the original article to get a sense of who is doing the writing and what baggage or mindset these people might bring to their work.

If the content is distorted — either intentionally or unintentionally — these content objects might have some interesting consequences for the larger smart software system. I just wanted to point out that weaponized information can have an impact. Those running smart software and buying content on the assumption that it is just fine might find some surprises in the outputs.

Stephen E Arnold, February 24, 2025
