IBM Embraces a Younger Hot Number. Tough Luck, Watson, You Old Dog, You

May 12, 2023

Note: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid.

That outstanding newspaper, The New York Post, published “IBM Pauses Hiring for 7,800 Jobs Because They Could Be Performed by AI.” The story picks up where the dinobaby tale ends. As you may recall, IBM decided that old timers could train contractors and then head to the old age home. The evictees were dubbed “dinobabies.” As a former supplier to IBM, I eagerly adopted the moniker and use an anigif to illustrate how spritely a dinobaby can be.

The new approach to work at IBM, according to the estimable newspaper, is smart software, not smart software elder uncle. The article states:

Krishna said that the company will either slow down or altogether suspend hiring for so-called “back office” functions such as human resources.

Back office functions is not defined. Perhaps it will include [a] junior and mid level programmers, [b] customer facing engineers who do Zoom type calls demonstrating sympathy and technical skills in looking up information in Big Blue’s proprietary technical databases, [c] some annoying MBAs who churn out slide decks and viewpoints about how to make IBM young again, and [d] non essential personnel like expensive old lawyers, assorted strategic planners working on the old money machines like the mainframes, and annoying design professionals who want to add L.E.D.s to IBM’s once speed champion super computers.

But whose AI will Big Blue embrace? My hunch is that it will be a combination of the forward forward technology employed by a few renegade researchers who embraced Google methods and open source software which could be dressed up with a RedHat business model. You may have a different idea. I am sticking with mine, thank you, until IBM reveals its new, rejuvenated self after a weekend in the Bahamas with its new bestie or is it best-ai?

Who says you can teach an old dog how to do an old trick with a new bone? Not me. And Watson? Who?

Stephen E Arnold, May 12, 2023

The Big Show from the Google: Meh

May 11, 2023

Note: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid.

I ran a query on You.com, asking where I could view the Google Big Show* (no Tallulah Bankhead, just Sundar and friends). You replied as the show was airing on YouTube Live, “I don’t know where the program is.” Love that smart software, right? I clicked off because it was not as good as what Microsoft hit the slopes with in Davos. After Paris, I figured the Googlers would enlist its industry leading smart software and the really thrilled merged Google Brain and DeepMind wizards and roll out a killer program. I was thinking a digital Steve Jobs explaining killer innovations and an ending with “one more thing.” Alas, no reality distortion field, just me too, me too, me too.


A sad amateur vaudeville performer holds a tomato thrown at him when his song and dance act flopped. The art was created by the helpful and available MidJourney system. I wanted to use Bing, but I am not comfortable with the alleged surveillance characteristics of Credge.

How do I know my reaction is semi-valid? Today’s Murdochy Wall Street Journal ran the story about the Big Show on page three with the headline “Google Unveils Search Revamped for AI Era.” That’s like a vaudeville billing toward the bottom with the dog act and the phrase “exotic animals.” Page three for the company that ignores the fact that it is selling online advertising with a system that generates oodles of cash yet not enough to keep a full complement of staff? That’s amazing!

I listened — briefly — to the This Week in Google podcast. I can’t understand how a program about Google can beat up on the firm with such gentle punches. I recall the phrase “a lack of strategic vision.” That was it. Navigate away to Lawfare, a program which actually discusses topics with some intellectual body blows.

I spoke with one of my research team. That person’s comment was:

I think Sundar is hitting the applause button and nothing is happening.

I thought Google smart music could generate an applause track. Failing that, why not snip an applause track from one of Steve Jobs’s presentations? I like the one with the computer in the envelope or the roll out of the iPhone. I wonder if the AI infused Google search could not locate the video. You.com couldn’t locate the Google in out or off on program, but that is understandable. It was definitely a “don’t fail to miss it” event.

And where was Prabhakar Raghavan, the head of search? Where was Danny Sullivan, Google’s “we deliver relevant results” person? Where was the charming head of DeepMind, an executive beloved by his team? Where was Dr. Jeff Dean, the inventor of Chubby and champion of recipes?

I know that OpenAI has been enjoying the Google wizard who explained that Google cannot keep up. See this allegedly accurate report called “Google and OpenAI Will Lose the AI Arms Race to Open-Source Engineers, a Googler Said in a Leaked Document.” Microsoft is probably high fiving and holding Teams meetings with happy faces on the Microsofties who are logged in.

* The Big Show was a big flop for NBC when it aired in the early 1950s. Ah, Tallulah and the endless recycling of Jimmy Durante, snippets of stage plays, and truly memorable performers whose talent is different from today’s rap and pop stars. Here’s a famous quote from Tallulah which may be appropriate for Google’s hurry and catch up approach to innovation:

“There’s less here than meets the eye.”

I love that Tallulah quote.

Stephen E Arnold, May 11, 2023

Publishers: Why Not Replace Authors with ChatGPT and Raise Subscription Rates?

May 11, 2023

Note: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid.

I read another article about professional publishers. Nature Magazine reports that 40 editors have bailed from medical journals due to the fees Elsevier imposes on authors. The individuals who publish peer reviewed journal articles are often desperate to get their names in what one supposes is a prestigious journal. I remember hearing at the Cornell Theory Center years ago that online “free” publications would not be considered for tenure evaluation or for certain grant applications. Why? Hey, that’s what universities want: Old school scholarship, thank you. Professional publishers cheerfully support the scheme. Libraries have to pay big subscription fees; commercial database producers are hamstrung due to restrictions on certain content; and the aspiring PhD student or starving adjunct professor is supposed to pay hundreds of dollars for output to proof. Yeah, that’s a great approach.

Now some professors (presumably with tenure) are doing a bit of the crawfish thing; that is, backing up and getting away from what is now viewed as a bit of a scam. I used to review articles for publishers. Guess what? I did not get paid. I was improving the quality of the publication. Yeah, right. As soon as I rejected papers written in incomprehensible English with statistics which actually did not add up, I learned via a friendly chat that I should not reject so many papers.

Oh, right. I quit. What baloney.

If you want to read about Elsevier’s explanation of the fees in today’s Word to typeset page fees, check out the original. I am not an academic, a fact I happily share with crazy publishers who want me to write for their “prestigious” journals. I write stuff and have for decades. Now I post information in my blog and I write monographs which I make available to those in my lectures.

Publishers are not for me. Most are dead tree types, snared in the craziness of slicing and dicing non reproducible research results, specious cross references to legal and accounting content, and pretending that their industry is essential to the smooth running of the knowledge centric world.

Nope. Too bad it has taken decades for a handful of editors to wake up and smell the ersatz which passes for real coffee.

Stephen E Arnold, May 11, 2023

Google Wobblies: Are Falling Behind and Falling Off Buildings Linked?

May 11, 2023

Note: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid.

I read “Google and OpenAI Struggling to Keep Up with Open Source AI, Senior Engineer Warns.” I understand the Google falling behind because big technology outfits are not exactly known for their agile footwork or blazing speed. Let’s face it. Google is not a digital Vinícius Júnior of Real Madrid fame. But OpenAI? The write up states:

Open-source models are faster, more customizable, more private, and pound-for-pound more capable.

Open source? I thought open source had been sucked into the business strategies of Amazon AWS, the Google Cloud, and Microsoft Azure and GitHub. Apparently not.

I think the idea is not “open source,” however. Open source is a phrase which means in my view a heck of a lot of people fooling around with whatever free and low cost generative software is available. What happens when many cooks crowd into a big kitchen? The output is going to be voluminous with some lousy, some okay, and a few dishes spectacular. The more cooks, the greater the chances that something spectacular will emerge. Probability low but a Bocuse d’Or-grade entrée may pop out of one’s Le Creuset.

Now what about the falling off buildings? I thought that was a Russian thing. If the New York Post’s reporting is spot on in its write up, there are some real-world consequences of Google’s falling behind.

Stephen E Arnold, May 11, 2023

The APA Zips Along Like … Like a Turtle, a Really Snappy Turtle Too

May 10, 2023

I read “American Psychology Group Issues Recommendations for Kids’ Social Media Use”. The article reports that social media is possibly, just maybe, perhaps, sort of an issue for some, a few, a handful, a teenie tiny percentage of young people. I am not sure when “social media” began. Maybe it was something called Six Degrees or Live Journal. I definitely recall the wonky weirdness of flashing MySpace pages. I do know about Orkut which if one cares to check was a big hit among a certain segment of Brazilians. The exact year is irrelevant; social media has been kicking around for about a quarter century.

Now, I learn:

The report doesn’t denounce social media, instead asserting that online social networks are “not inherently beneficial or harmful to young people,” but should be used thoughtfully. The health advisory also does not address specific social platforms, instead tackling a broad set of concerns around kids’ online lives with commonsense advice and insights compiled from broader research.

What are the data about teen suicides? What about teen depression? What about falling test scores? What about trend oddities among impressionable young people? Those data are available and easy to spot. In June 2023, another Federal agency will provide information about yet another clever way to exploit young people on social media.

Now the APA is taking a stand? Well, not really a stand, more of a general statement about what I think is one of the most destructive online application spaces available to young and old today.

How about this statement?

The APA recommends a reasonable, age-appropriate degree of “adult monitoring” through parental controls at the device and app level and urges parents to model their own healthy relationships with social media.

How many young people grow up with one parent and minimal adult monitoring? Yeah, how many? Do parents or a parent know what to monitor? Does a parent know about social media apps? Does a parent know the names of social media apps?

Impressive, APA. Now I remember why I thought Psych 101 was a total, absolute waste of my time when I was a 17 year old fresh person at a third rate college for losers like me. My classmates, also losers, struggled to suppress laughter during the professor’s lectures. Now I am giggling at this APA position.

Sorry. Your paper and recommendations are late. You get an F.

Stephen E Arnold, May 10, 2023

Am I a Moron Because I Use You.com?

May 10, 2023

Note: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid.

“Only Morons Use ChatGPT As a Substitute for Google” is a declarative statement. Three words strike me as important in the title of the Lifehacker (an online publication).

First, “morons.” A moron according to TheFreedictionary.com citation is: A city in Eastern Argentina although it has the accented ó. On to the next definition which is “A person who is considered foolish or stupid.” I think this is closer to the mark. I am not comfortable invoking the third definition because it aims a denotative punch at a person having a mental age of from seven to 12. I am 78, so let’s go with “foolish or stupid.” I am in that set.

Second, “ChatGPT.” I think the moniker can apply specifically to the for-fee service of OpenAI. It is possible that “ChatGPT” stands for an entire class of generative software. I tried to make a list of a who’s who in generative software and abandoned the task. Quite a few companies are in the game either directly like the aforementioned OpenAI or a bandwagon of companies joyfully tallied by ProductWatch.com and a few LinkedIn contributors. I think the idea is that ChatGPT outputs content which is either derivative (a characteristic of a machine eating other people’s words and images) or hallucinatory (a feature of software which can go off the rails and output like a digital Lewis Carroll galumphing around a park in which young females frolic).

Third, “Google.” My hunch is that the author is an expert online searcher who, like many open source intelligence professionals, relies on the advertising-supported Google search for objective, on-point answers. Oh, my, that’s quite a reliable source of information. I want to point out that Google focuses on revenue-generation from advertising. Accuracy of results often has little connection to the user’s query. My interpretation of the word “Google” is that Google is good, probably better than “ChatGPT” in providing answers designed to meet the needs of users who may not read above the 9th grade level, struggle with derivatives, and cannot name the capital of Tasmania. (It is Hobart, by the way.)

I am on the fence with the word “only.” I am not comfortable with categorical affirmatives. Given the context of the article and the fact that Google is the Web search engine of choice (conscious or manipulated) for 90 percent of people in North America and Western Europe, I can understand why the field of view is narrow. An expert with regard to Google knows more and more about less and less.

Why is ChatGPT presented as the yin to Google’s yang? The write up says:

If you can’t fact-check the bot, the answers it gives are useless.

Okay, so people fact-check Google outputs. What does Google present with regard to Larry Page and his alleged interactions with Mr. Epstein, the late Mr. Epstein I should say? Run this query Google Page Epstein and this is what the system presented to me on May 6, 2023:

image

Are these stories accurate? Is CNBC the most relevant story about a lawsuit involving JPMorgan Chase and Jeffrey Epstein? What about the New York Post article “Virgin Islands Struggles to Subpoena Google’s Larry Page in Jeffrey Epstein Lawsuit against JPMorgan.” Or the tech insider outfit Gizmodo which offers “Google’s Larry Page Could Be Served in Jeffrey Epstein Case.” There are citations to the British newspaper Telegraph and one from Law & Crime. There is a Yahoo story and a Daily Beast article.

Now what does You.com offer me when I query Google Page Epstein?

There have been recent news articles reporting that the US Virgin Islands has issued subpoenas to Google co-founder Larry Page in connection with a lawsuit against JPMorgan Chase over the Jeffrey Epstein sex trafficking case 1. The lawsuit alleges that JPMorgan Chase facilitated Epstein’s trafficking scheme by continuously offering financial support to Epstein despite knowing about the illegal activities conducted at Little St. James, a private island owned by Epstein. The subpoenas seek documents and communications related to JPMorgan’s banking relationship with Epstein. Sergey Brin, another Google co-founder, has also reportedly been subpoenaed. Despite the subpoenas, it appears that the US Virgin Islands is struggling to serve Page with the legal documents. A recent ruling allows Page to be served with the subpoenas via his attorney in California. cnbc1

I would suggest that the Google citations provide a list, with no indication of which source is more or less highly regarded for accuracy. Google wants me to click on one or more of the links, ingest the content of each article, and then synthesize the items of information which strike me as on the money. You.com on the other hand provides me with the bare bones of the alleged involvement with a person who like Lewis Carroll may have had an interest in hanging out around a park on a sunny Saturday afternoon. Catching some rays and perhaps coming up with new ideas are interpretations of such an action by a lawyer hired to explain the late and much lamented Mr. Epstein.

So which is it? The harvesting of buckwheat the old-fashioned way or the pellet of information spat out in a second or two?

I think the idea is that morons are going to go the ChatGPT-like route. Wizards and authors of online “real” news articles want to swing that sickle and relive the thrill of the workers in Vincent van Gogh’s “The Harvest.”

The article says:

you can’t tell whether an AI-generated fact is true or not by the way the text looks; it’s designed to look plausible and correct. You have to fact-check it.

Does one need to fact-check what Google spits out? What about the people who follow Google Maps’s instructions and drive off a cliff? What about the links in Google Scholar to papers with non-reproducible results?

Here’s the conclusion to the write up:

So if you want to use ChatGPT to get ideas or brainstorm places to look for more information, fine. But don’t expect it to base its answers on reality. Even for something as innocuous as recommending books based on your favorites, it’s likely to make up books that don’t even exist.

I like that “don’t even exist.” Google Bard would never do that. Google management would never fire a smart software executive who points out that Google’s smart software is biased. Google would never provide search results that explain how to steal copyright protected software. Well, maybe just one time like this:

image

Oh, no. Wonky software would never ever do that but for Google’s results via YouTube for the query “Magix Vegas crack.” Now who is a moron? Perhaps an apologist for Google?

Stephen E Arnold, May 10, 2023

Microsoft Bing Causes the Google Lights to Flicker

May 10, 2023

Note: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid.

The article “The Updated Bing Chat Leapfrogs ChatGPT in 6 Important New Ways” shakes the synapses of Googzilla. The Sundar & Prabhakar Comedy Show has been updating its scripts and practicing fancy dancing. Now the Redmond software, security, and strategy outfit has dragged fingernails across the chalk board in Google World. Annoying? Yes, indeed.

The write up does not mention Google directly, but the eerie light from the L.E.D.s illuminating the online ad vendor’s logo shines between the words in the article. Here’s an example:

opening up access to all.

None of this “to be” stuff from the GOOG. The Microsofties are making their version of ChatGPT available to “all.” (Obviously the categorical “all” is crazy marketing logic, but the main idea is “here and now,” not a progressive or future tense fantasy land.)

Also, the write up uses jargon to explain what’s new from the skilled professionals who crafted Windows 3.11. Microsoft has focused on the image generation feature and hooking more people who want smart software into the Edge world of a browser.

But between the spaces in the article, one message flickers. Microsoft is pushing product. Google is reorganizing, watching Dr. Jeff Dean with side glances, and running queries to find out what Dr. Hinton is saying about the online ad outfit’s sense of ethical behavior. In short, the Google is passive with synapses jarred by Microsoft marketing plus actual applications of smart software.

Fascinating. Is the flickering of the Google L.E.D.s a sign that power is failing or flawed electrical engineering is causing wobbles?

Stephen E Arnold, May 10, 2023

Vint Cerf: Explaining Why Google Is Scrambling

May 9, 2023

Note: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid.

One thing OpenAI’s ChatGPT legions of cheerleaders cannot do is use Dr. Vint Cerf as the pointy end of a PR stick. I recall the first time I met Dr. Cerf. He was the keynote at an obscure conference about search and retrieval. Indeed he took off his jacket. He then unbuttoned his shirt to display a white T shirt with “I TCP on everything.” The crowd laughed — not a Jack Benny 30 second blast of ebullience — but a warm sound.


Midjourney output this illustration capturing Googzilla in a rocking chair in the midst of the snow storm after the Microsoft asteroid strike at Davos. Does the Google look aged? Does the Google look angry? Does the Google do anything but talk in the future and progressive tenses? Of course not. Google is not an old dinosaur. The Google is the king of online advertising which is the apex of technology.

I thought about that moment when I read “Vint Cerf on the Exhilarating Mix of Thrill and Hazard at the Frontiers of Tech: ‘That’s Always an Exciting Place to Be — A Place Where Nobody’s Ever Been Before.’” The interview is a peculiar mix of ignoring the fact that the Google is elegantly managing wizards (some who then terminate themselves by allegedly falling or jumping off buildings), trapped in a conveyor belt of increasing expenses related to its plumbing and the maintenance thereof, and watching the fireworks ignited by the ChatGPT emulators. And Google is watching from a back alley, not the front row as I write this. The Google may push its way into the prime viewing zone, but it is OpenAI and a handful of other folks who are assembling the sky rockets and aerial bombs, igniting the fuses, and capturing attention.

Yes, that’s an exciting place to be, but at the moment that is not where Google is. Google is doing big time public relations as outfits like Microsoft expand the zing of smart Outlook, PowerPoint, and, believe it or not, Excel. Google is close enough to see the bright lights and hear the applause directed at lesser outfits. Google knows it is not the focus of attention. That’s where Vint Cerf comes into play on the occasion of winning an award for advancing technology (in general, not just online advertising).

Here are a handful of statements I noticed in the TechMeme “Featured Article” conversation with Dr. Cerf. Note, please, that my personal observations are in italic type in a color similar to that used for Alphabet’s Code Red emergency.

Snip 1: “Sergey has come back to do a little bit more on the artificial intelligence side of things…” Interesting. I interpret this as a college student getting a call to come back home to help out an ailing mom in what some health care workers call “sunset mode.” And Mr. Page? Maintaining a lower profile for non-Googley reasons? See the allegedly accurate report “Virgin Islands issued subpoena to Google co-founder Larry Page in lawsuit against JPMorgan Chase over Jeffrey Epstein.”

Snip 2: “a place where nobody’s ever been before.” I interpret this to mean that the Google is behind the eight ball or between an agile athlete and a team composed of yesterday’s champions or a helicopter pilot vaguely aware that the opposition is flying a nimble, smart rocket equipped fighter jet. Dinosaurs in rocking chairs watch the snow fall; they do not move to Nice, France.

Snip 3: “Be cautious about going too fast and trying to apply it without figuring out how to put guardrails in place.” How slow did Google go when it was inspired by the GoTo, Overture, and Yahoo ad model, settling for about $1 billion before the IPO? I don’t recall picking up the scent of ceramic brakes applied to the young, frisky, and devil-may-care baby Google. Step on the gas and go faster are the mantras I recall hearing.

Snip 4: “I will say that whenever something gets monetized, you should anticipate there will be emergent properties and possibly unexpected behavior, all driven by greed.” I wonder if the statement is a bit of a Freudian slip. Doesn’t the remark suggest that Google itself has manifested this behavior? It sure does to me, but I am no shrink. Who knew Google’s search-and-advertising business would become the poster reptile for surveillance capitalism?

Snip 5: “I think we are going to have to invest more in provenance and identity in order to evaluate the quality of that which we are experiencing.” Has Mr. Cerf again identified one of the conscious choices made by Google decades ago; that is, ignore date and time stamps for when the content was first spidered, when it was created, and when it was updated? What is the quality associated with the obfuscation of URLs for certain content types and the removal of a user’s ability to display the “content” the user wants; for example, a query for a bound phrase for an entity like “Amanda Rosenberg”? I also wonder about advertisements which link to certain types of content; for example, health care products or apps with gotcha functionalities.

Several observations:

  1. Google’s attempt to explain that going slow is a mature business method is amusing. I would recommend that the gag be included in the Sundar and Prabhakar comedy routine.
  2. The crafted phrases about guardrails and emergent behaviors do not explain why Google is talking and not doing. Furthermore, the talking is delivered not by users of a ChatGPT infused application. The words are flowing from a person who is no expert in smart software and has a few miles on his odometer as I do.
  3. The remarks ignore the raw fact that Microsoft dominated headlines with its Davos rocket launch. Google’s search wizards were thinking about cost control, legal hassles, and the embarrassing personnel actions related to smart software and intra-company guerilla skirmishes.

Net net: Read the interview and ask, “Where’s Googzilla now?” My answer is, “Prepping for retirement?”

Stephen E Arnold, May 9, 2023

Good Enough AI: Decimating Bit-Blasted Wretches

May 9, 2023

Note: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid.

I think the writers’ strike will make it possible for certain Hollywood producer types to cozy up with smart software. What works in the cinema wasteland are sequels and remakes of what has sold. My hunch is that purpose-built smart software will be able to output good enough scripts quickly. A few humanoids, maybe even on set actors, can add touches which elevate good enough to pretty good.

There are other humanoid writers now at risk from good-enough outputs. At a Derby Party on May 6, 2023, I whipped out my mobile and illustrated how You.com can crank out a short essay good enough to get an A or a B in a sophomore English class. One person who made a bundle of money selling automobiles said immediately, “I could have used this instead of that PR company and the part-timers who used to drive me crazy with questions.”

This person understood, and he was in his late 70s but still able to remember PR and marketing experts who were supposed to write presentations, ads, and marketing letters.

If a biz whiz heading to the old-age home grasps the concept, imagine what a rotund, confident MBA will do with good enough smart software.

What’s interesting to me is that the Washington Post, under the control of the original bulldozer driver Jeff Bezos, seems to understand what’s going to happen to many scribes, columnists, littérateurs, and scribblers. The ink stained wretches are going to become bit-blasted wretches. “He Wrote a Book on a Rare Subject. Then a ChatGPT Replica Appeared on Amazon” includes a quote from a human involved in smart software created content:

“We published a celebrity profile a month. Now we can do 10,000 a month.”

Net net: Smart software will create many opportunities for “writers” to find their future elsewhere. Fixer uppers of machine generated content may become a hot new gig along with TikTok makers of van life videos, creators of text based wall graffiti, and signs with messages such as “Will edit for food.”

Stephen E Arnold, May 9, 2023

AI Shocker? Automatic Indexing Does Not Work

May 8, 2023

Note: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid.

I am tempted to dig into my more than 50 years of work in online and pull out a chestnut or two. I will not. Just navigate to “ChatGPT Is Powered by These Contractors Making $15 an Hour” and check out the allegedly accurate statements about the knowledge work a couple of people do.

The write up states:

… contractors have spent countless hours in the past few years teaching OpenAI’s systems to give better responses in ChatGPT.

The write up includes an interesting quote; to wit:

“We are grunt workers, but there would be no AI language systems without it,” said Savreux [an indexer tagging content for OpenAI].

I want to point out a few items germane to human indexers based on my experience with content about nuclear information, business information, health information, pharmaceutical information, and “information” information which thumbtypers call metadata:

  1. Human indexers, even when trained in the use of a carefully constructed controlled vocabulary, make errors, become fatigued and fall back on some favorite terms, and misunderstand the content and assign terms which will mislead when used in a query
  2. Source content — regardless of type — varies widely. New subjects or different spins on what seem to be known concepts mean that important nuances may be lost due to what is included in the available dataset
  3. New content often uses words and phrases which are difficult to understand. I try to note a few of the more colorful “new” words and bound phrases like softkill, resenteeism, charity porn, toilet track, and purity spirals, among others. In order to index a document in a way that allows one to locate it, knowing the term is helpful if there is a full text instance. If not, one needs a handle on the concept, which is an index term a system or a searcher knows to use. Relaxing the meaning (a trick of some clever outfits with snappy names) is not helpful
  4. Creating a training set, keeping it updated, and assembling the content artifacts is slow, expensive, and difficult. (That’s why some folks have been seeking short cuts for decades. So far, humans still become necessary.)
  5. Reindexing, refreshing, or updating the digital construct used to “make sense” of content objects is slow, expensive, and difficult. (Ask an Autonomy user from 1998 about retraining in order to deal with “drift.” Let me know what you find out. Hint: The same issues arise from popular mathematical procedures no matter how many buzzwords are used to explain away what happens when words, concepts, and information change.)

Are there other interesting factoids about dealing with multi-type content? Sure there are. Wouldn’t it be helpful if those creating the content applied structure tags, abstracts, lists of entities and their definitions within the field or subject area of the content, and pointers to sources cited in the content object?

Let me know when blog creators, PR professionals, and TikTok artists embrace this extra work.

Pop quiz: When was the last time you used a controlled vocabulary classification code to disambiguate airplane terminal, computer terminal, and terminal disease? How does smart software do this, pray tell? If the write up and my experience are on the same wave length (not surfing wave but frequency wave), a subject matter expert, trained index professional, or software smarter than today’s smart software is needed.
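For readers who have never touched a controlled vocabulary, here is a minimal sketch of the "terminal" pop quiz in Python. Everything in it is an assumption made up for illustration: the classification codes, the cue-word lists, and the scoring rule are hypothetical and come from no real thesaurus or indexing product. The sketch assigns a code only when the surrounding words point clearly to one sense; otherwise it returns nothing, which is the point at which a human indexer gets the document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VocabEntry:
    code: str         # hypothetical classification code, not from any real scheme
    label: str        # preferred term in the controlled vocabulary
    cues: frozenset   # context words assumed to signal this sense


# Hypothetical controlled vocabulary entries for the ambiguous word "terminal".
TERMINAL_SENSES = (
    VocabEntry("TRN.AIR", "airplane terminal",
               frozenset({"airport", "gate", "flight", "boarding", "baggage"})),
    VocabEntry("CMP.TRM", "computer terminal",
               frozenset({"shell", "command", "keyboard", "login", "unix"})),
    VocabEntry("MED.TRM", "terminal disease",
               frozenset({"patient", "diagnosis", "hospice", "cancer", "prognosis"})),
)


def tokenize(text):
    """Lowercase the text and strip simple punctuation from each word."""
    return {word.strip(".,;:!?\"'()").lower() for word in text.split()}


def classify_terminal(text):
    """Return the best-matching VocabEntry, or None when the context is silent or tied."""
    words = tokenize(text)
    # Score each sense by how many of its cue words appear in the text.
    scored = sorted(
        ((len(entry.cues & words), entry) for entry in TERMINAL_SENSES),
        key=lambda pair: pair[0],
        reverse=True,
    )
    best_score, best_entry = scored[0]
    runner_up_score = scored[1][0]
    # No cue words, or a tie between senses: hand the document to a human indexer.
    if best_score == 0 or best_score == runner_up_score:
        return None
    return best_entry


if __name__ == "__main__":
    for sample in (
        "The flight left from gate 12 of the new terminal at the airport.",
        "Open a terminal and type the command at the shell prompt.",
        "The terminal was closed.",
    ):
        entry = classify_terminal(sample)
        print(sample, "->", entry.code if entry else "needs a human indexer")
```

The cue lists are the weak spot. Someone has to build them and keep them current as the language changes, which is the same slow, expensive, and difficult work described in the list above.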

Stephen E Arnold, May 8, 2023
