Google, Smart Software, and Prime Mover for Hyperbole

May 17, 2022

In my experience, the cost of training smart software is a very big problem. The bigness does not become evident until the licensee of a smart system realizes that training the smart software must take place on a regular schedule. Why is this a big problem? The reason is that the effort required to assemble valid training sets is significant. Language, data types, and info peculiarities change over time; for example, new content is fed into a smart system, and the system cannot cope with the differences between the training set that was used and the info flowing into the system now. A gap grows, and the fix is to assemble new training data, reindex the content, and get ready to do it again. The need to keep the smart software in sync with what is processed is a tiny bit of knowledge not explained in sales pitches.
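For those who want to see the gap rather than take my word for it, here is a minimal sketch, not anyone's production method. It assumes the simplest possible yardstick: the share of words in incoming content that never appeared in the training set. The function names (`tokenize`, `oov_rate`, `needs_retraining`) and the 0.2 threshold are my own illustrative assumptions, not part of any vendor's system.

```python
import re


def tokenize(text):
    """Lowercase a string and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def oov_rate(training_docs, incoming_docs):
    """Return the share of incoming tokens never seen in the training set."""
    vocab = set()
    for doc in training_docs:
        vocab.update(tokenize(doc))

    incoming = [tok for doc in incoming_docs for tok in tokenize(doc)]
    if not incoming:
        return 0.0  # nothing flowing in, so no measurable gap

    unseen = sum(1 for tok in incoming if tok not in vocab)
    return unseen / len(incoming)


def needs_retraining(training_docs, incoming_docs, threshold=0.2):
    """Flag a retraining cycle once the gap exceeds a hypothetical threshold."""
    return oov_rate(training_docs, incoming_docs) > threshold
```

Run a check like this on a schedule and the recurring nature of the cost shows up as a number that creeps upward. A real system would need a better measure than raw vocabulary, but the accountant gets the idea.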

Accountants figure out that money must be spent on a cost not in the original price data. Search systems return increasingly lousy results. Intelligence software outputs data which make zero sense to a person working out a surveillance plan. An art history major working on a PowerPoint presentation cannot locate the version used by the president of the company for last week’s pitch to potential investors.

The accountant wants to understand overruns associated with smart software, looks into the invoices and time sheets, and discovers something new: Charges for smart software subject matter experts, indexing professionals, and interns buying third-party content from an online vendor called Elsevier. These are not line items CPAs confront unless there are smart software systems chugging along.

The big problem is handled in this way: Those selling the system don’t talk too much about how training is a recurring cost which increases over time. Yep, reindexing is a greedy pig and those training sets have to be tested to see if the smart software gets smarter.

The fix? Do PR about super duper even smarter methods of training. Think Snorkel. Think synthetic data. Think PowerPoint decks filled with jargon that causes clueless MBAs to do high fives because the approach is a slam dunk. Yes! Winner!

I read “DeepMind’s Astounding New ‘Gato’ AI Makes Me Fear Humans Will Never Achieve AGI” and realized that the cloud of unknowing has not yet yielded to blue skies. The article states:

Just like it took some time between the discovery of fire and the invention of the internal combustion engine, figuring out how to go from deep learning to AGI won’t happen overnight.

No kidding. There are gotchas beyond training, however. I have a presentation in hand which I delivered in 1997 at an online conference. Training cost is one dot point; there are five others. Can you name them? Here’s a hint for another big issue: An output that kills a patient. The accountant understands the costs of litigation when that smart AI makes a close enough for horseshoes output for a harried medical professional. Yeah, go catscan, go.

Stephen E Arnold, May 17, 2022

