Common Sense from an AI-Centric Outfit: How Refreshing
July 11, 2024
This essay is the work of a dumb dinobaby. No smart software required.
In the wild and wonderful world of smart software, common sense is often tucked beneath a stack of PowerPoint decks and vaporized by jargon-spouting experts in artificial intelligence. I want to highlight “Interview: Nvidia on AI Workloads and Their Impacts on Data Storage.” An Nvidia poohbah named Charlie Boyle output some information that is often ignored by quite a few of those riding the AI pony to the pot of gold at the end of the AI rainbow.
The King Arthur of senior executives is confident that in his domain he is the master of his information. By the way, this person has an MBA, a law degree, and a CPA certification. His name is Sir Walter Mitty of Dorksford, near Swindon. Thanks, MSFT Copilot. Good enough.
Here’s the pivotal statement in the interview:
… a big part of AI for enterprise is understanding the data you have.
Yes, the dwellers in carpetland typically operate with some King Arthur type myths galloping around the castle walls; specifically:
Myth 1: We have excellent data
Myth 2: We have a great deal of data and more arriving every minute our systems are online
Myth 3: Our data are available and in just a few formats. Processing the information is going to be pretty easy.
Myth 4: Our IT team can handle most of the data work. We may not need any outside assistance for our AI project.
Will companies map these myths to their reality? Nope.
The Nvidia expert points out:
…there’s a ton of ready-made AI applications that you just need to add your data to.
“Ready made”: Just like a Betty Crocker cake mix my grandmother thought tasted fake, not as good as home made. Granny’s comment could be applied to some of the AI tests my team have tracked; for example, the Big Apple’s chatbot outputting comments which violated city laws or the exciting McDonald’s smart ordering system. Sure, I like bacon on my on-again, off-again soft serve frozen dessert. Doesn’t everyone?
The Nvidia expert offers this comment about storage:
If it’s a large model you’re training from scratch you need very fast storage because a lot of the way AI training works is they all hit the same file at the same time because everything’s done in parallel. That requires very fast storage, very fast retrieval.
Is that a problem? Nope. Just crank up the cloud options. No big deal, except it is. There are costs and time to consider. But otherwise this is no big deal.
The article contains one gem and then wanders into marketing “don’t worry” territory.
From my point of view, the data issue is the big deal. Bad, stale, and incomplete data, plus information in oddball formats: these exist in organizations now. Forty percent or more of the mass of data may never have been accessed. Other data are backups which contain versions of files with errors, copyright-protected data, and Boy Scout trip plans. (Yep, non-work information on “work” systems.)
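For those who want to check the myths against reality, the data problems described above can be roughed out with a short script. What follows is a minimal sketch, not a real data audit: the function name `audit_tree` and the 365-day staleness threshold are my own assumptions, and file access times are unreliable on volumes mounted with options such as `noatime`, so treat the "stale" share as a rough signal only.

```python
import os
import time
from collections import Counter
from pathlib import Path


def audit_tree(root, stale_days=365, now=None):
    """Summarize file formats and staleness under `root`.

    Hypothetical helper for illustration; the threshold is an assumption.
    """
    now = time.time() if now is None else now
    cutoff = now - stale_days * 86400  # seconds in a day
    formats = Counter()
    total = 0
    stale = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                info = path.stat()
            except OSError:
                continue  # unreadable or vanished file; skip it
            total += 1
            # Count formats by lowercased extension; files without one go to "(none)".
            formats[path.suffix.lower() or "(none)"] += 1
            # Treat the later of access and modification time as "last touched".
            if max(info.st_atime, info.st_mtime) < cutoff:
                stale += 1
    share = stale / total if total else 0.0
    return {"total": total, "stale": stale, "stale_share": share, "formats": formats}
```

Pointing `audit_tree` at a shared drive and reading `stale_share` and `formats.most_common(10)` will not fix anything, but it does replace "we have excellent data in just a few formats" with numbers.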
Net net: The data issue is an important one to consider before getting into the “let’s deploy a customer support smart chatbot” phase. Will carpetland dwellers focus on the first step? Not too often. That’s why some AI projects get lost or just succumb to rising, uncontrollable costs. Moving data? No problem. Bad data? No problem. Useful AI system? Hmmm. How much does storage cost anyway? Oh, not much.
Stephen E Arnold, July 11, 2024