Google Maps: Trust in Us. Well, Mostly

December 1, 2023

This essay is the work of a dumb dinobaby. No smart software required.

Friday and December 1, 2023. I want to commemorate the beginning of the last month of what has been an exciting 2023. How exciting. How about a Google Maps story?

Navigate to “Google Maps Mistake Leaves Dozens of Families Stranded in the Desert”. Here’s the story: The outstanding and from my point of view almost unusable Google Maps directed a number of people to a “dreadful dirt path during a dust storm.”

“Mommy,” says the teenage son, “I told you exactly what the smart map system said to do. Why are we parked in a tree?” Thanks, MSFT Copilot. Good enough art.

Hey, wait up. I thought Google had developed a super duper quantum smart weather prediction system. Is Google unable to cross correlate Google Maps with potential negative weather events?

The answer, “Who are you kidding?” Google appears to be in content marketing hyperbole “we are better at high tech” mode. Let’s not forget the Google breakthrough regarding material science. Imagine. Google’s smart software identified oodles of new materials. Was this “new” news? Nope. Computational chemists have been generating potentially useful chemical substances for — what is it now? — decades. Is the Google materials science breakthrough going to solve the problem of burned food sticking to a cookie sheet? Sure, I am waiting for the news release.

What’s up with the Google Maps?

The write up says:

Google Maps apologized for the rerouting disaster and said that it had removed that route from its platform.

Hey, that’s helpful. I assume it was a quantum answer from a “we’re smart” outfit.

I wish I had kept the folder which had my collection of Google Maps news items. I do recall someone who drove off a cliff. I had my own notes about my trying to find Seymour Rubinstein’s house on a bright sunny day. The inventor of WordStar did not live in the Bay. That was the location of Mr. Rubinstein’s house, according to Google Maps. I did find the house, and I had sufficient common sense not to drive into the water. I had other examples of great mappiness, but, alas!, no longer.

Is directing a harried mother into a desert during a dust storm humorous? Maybe to some in Sillycon Valley. I am not amused. I don’t think the mother was amused because in addition to the disturbing situation, her vehicle suffered $5,000 in damage.

The question is, “Why?”

Perhaps Google’s incentive system is not aligned to move consumer products like Google Maps from “good enough” to “excellent.” And the money that could have been spent on improving Google Maps may be needed to output stories about Google’s smart software inventing new materials.

Interesting. Aren’t OpenAI and the much loved Microsoft leading the smart software mindshare race? I think so. Perhaps Maps’ missteps are a signal of management misalignment and deep issues within the Alphabet Google YouTube inferiority complex?

Stephen E Arnold, December 1, 2023

Is YouTube Marching Toward Its Waterloo?

November 28, 2023

This essay is the work of a dumb dinobaby. No smart software required.

I have limited knowledge of the craft of warfare. I do have a hazy recollection that Napoleon found himself at the wrong end of a pointy stick at the Battle of Waterloo. I do recall that Napoleon lost the battle and experienced the domino effect which knocked him down a notch or two. He ended up on the island of Saint Helena in the south Atlantic Ocean with Africa a short 1,200 miles to the east. But Nappy had no mobile phone, no yacht purchased with laundered money, and no Internet. Losing has its downsides. Bummer. No empire.

I thought about Napoleon when I read “YouTube’s Ad Blocker Crackdown Heats Up.” The question I posed to myself was, “Is the YouTube push for subscription revenue and unfettered YouTube user data collection a road to Google’s Battle of Waterloo?”

Thanks, MSFT Copilot. You have a knack for capturing the essence of a loser. I love good enough illustrations too.

The cited article from Channel News reports:

YouTube is taking a new approach to its crackdown on ad-blockers by delaying the start of videos for users attempting to avoid ads. There were also complaints by various X (formerly Twitter) users who said that YouTube would not even let a video play until the ad blocker was disabled or the user purchased a YouTube Premium subscription. Instead of an ad, some sources using Firefox and Edge browsers have reported waiting around five seconds before the video launches the content. According to users, the Chrome browser, which the streaming giant shares an owner with, remains unaffected.

If the information is accurate, Google is taking steps to damage what the firm has called the “user experience.” The idea is that users who want to watch “free” videos have a choice:

  1. Put up with delays, pop ups, and mindless appeals to pay Google to show videos from people who may or may not be compensated by the Google.
  2. Just fork over a credit card and let Google collect about $150 per year until the rates go up. (The cable TV and mobile phone billing model is alive and well in the Google ecosystem.)
  3. Experiment with advertisement blocking technology and accept the risk of being banned from Google services.
  4. Learn to love TikTok, Instagram, DailyMotion, and Bitchute, among other options available to a penny-conscious consumer of user-produced content.
  5. Quit YouTube and new-form video. Buy a book.

What happened to Napoleon before the really great decision to fight Wellington in a lovely part of Belgium? Waterloo is about nine miles south of the wonderful, diverse city of Brussels. Napoleon did not have a drone to send images of the rolling farmland, where the “enemies” were located, or the availability of something behind which to hide. Despite Nappy’s fine experience in his march to Russia, he muddled forward. Despite allegedly having said, “The right information is nine-tenths of every battle,” the Emperor entered battle, suffered 40,000 casualties, and ended up in what is today a bit of a tourist hot spot. In 1816, it was somewhat less enticing. Ordering troops to charge uphill against a septuagenarian’s forces was arguably as stupid as walking to Russia as snowflakes began to fall.

How does this Waterloo relate to the YouTube fight now underway? I see several parallels:

  1. Google’s senior managers, informed with the management lore of 25 years of unfettered operation, know that users can be knocked along a path of the firm’s choice. Think sheep. But sheep can be disorderly. One must watch sheep.
  2. The need to stem the rupturing of cash required to operate a massive “free” video service is another one of those Code Yellow and Code Red events for the company. With search known to be under threat from Sam AI-Man and the specters of “findability” AI apps, the loss of traffic could be catastrophic. Despite Google’s financial fancy dancing, costs are a bit of a challenge: New hardware costs money, options like making one’s own chips cost money, allegedly smart people cost money, marketing costs money, legal fees cost money, and maintaining the once-free SEO ad sales force costs money. Got the message? Expenses are a problem for the Google in my opinion.
  3. The threat of either TikTok or Instagram going long form remains. If these two outfits don’t make a move on YouTube, there will be some innovator who will. The price of “move fast and break things” means that the Google can be broken by an AI surfer. My team’s analysis suggests it is more brittle today than at any previous point in its history. The legal dust up with Yahoo about the Overture / GoTo issue was trivial compared to the cost control challenge and the AI threat. That’s a one-two for the Google management wizards to solve. Making sense of the Critique of Pure Reason is a much easier task in my view.

The cited article includes a statement which is likely to make some YouTube users uncomfortable. Here’s the statement:

Like other streaming giants, YouTube is raising its rates with the Premium price going up to $13.99 in the U.S., but users may have to shell out the money, and even if they do, they may not be completely free of ads.

What does this mean? My interpretation is that [a] even if a user pays, that user may see ads; that is, paying does not eliminate ads in perpetuity; and [b] the fee is not permanent; that is, Google can increase it at any time.

Several observations:

  1. Google faces high-cost issues from different points of the business compass: Legal in the US and EU, commercial from known competitors like TikTok and Instagram, and psychological from innovators who find a way to use smart software to deliver a more compelling video experience for today’s users. These costs are not measured solely in financial terms. There is also the mental stress of wondering what will percolate from the seething mass of AI entrepreneurs. Nappy did not sleep too well after Waterloo. Too much Beef Wellington, perhaps?
  2. Google’s management methods have proven appropriate for generating revenue from an ad model in which Google controls the billing touch points. When those management techniques are applied to non-controllable functions, they fail. The hallmark of the management misstep is the handling of Dr. Timnit Gebru, a squeaky wheel in the Google AI content marketing machine. There is nothing quite like stifling a dissenting voice, the squawk of a parrot, and a don’t-let-the-door-hit-you-when-you-leave moment.
  3. The post-Covid, continuous warfare, and unsteady economic environment is causing the social fabric to fray and in some cases tear. This means that users may become contentious and become receptive to a spontaneous flash mob action toward Google and YouTube. Managing a user revolt at scale is not a core competence Google has demonstrated.

Net net: I will get my microwave popcorn and watch this real-time Google Boogaloo unfold. Will a recipe become famous? How about Grilled Google en Croute?

Stephen E Arnold, November 28, 2023

Amazon Alexa Factoids: A Look Behind the Storefront Curtains

November 24, 2023

This essay is the work of a dumb dinobaby. No smart software required.

Hey, Amazon admirers, I noted some interesting (allegedly accurate) factoids in “Amazon Alexa to Lose $10 Billion This Year.” No, I was not pulled in by the interesting puddle of red ink.

Alexa loves to sidestep certain questions. Thanks, MSFT Copilot. Nice work even though you are making life difficult for Google’s senior management today.

Let me share four items which I thought interesting. Please, navigate to the original write up to get the full monte. (I support the tailor selling civvies, not the card game.)

  1. “Just about every plan to monetize Alexa has failed, with one former employee calling Alexa ‘a colossal failure of imagination,’ and ‘a wasted opportunity.’” [I noted the word colossal.]
  2. “Amazon can’t make money from Alexa telling you the weather”
  3. “I worked in the Amazon Alexa division. The level of incompetence coupled with arrogance was astounding.”
  4. “FAANG has gotten so large that the stock bump that comes from narrative outpaces actual revenue from working products.”

Now how about the management philosophy behind these allegedly accurate statements? It sounds like the consequences of doing high school science club field trip planning. Not sure how those precepts work? Just do a bit of reading about the OpenAI – Sam AI-Man hootenanny.

Stephen E Arnold, November 24, 2023

Hitting the Center Field Wall, AI Suffers an Injury!

November 15, 2023

This essay is the work of a dumb, dinobaby humanoid. No smart software required.

At a reception at a government facility in Washington, DC, last week, one of the bright young sparks told me, “Every investment deal I see gets funded if it includes the words ‘artificial intelligence.’” I smiled and moved to another conversation. Wow, AI has infused the exciting world of a city built on the swampy marge of the Potomac River.

I think that the go-go era of smart software has reached a turning point. Venture firms and consultants may not have received the email with this news. However, my research team has, and the update contains information on two separate thrusts of the AI revolution.

The heroic athlete, supported by his publicist, makes a heroic effort to catch the long fly ball. Unfortunately our star runs into the wall, drops the ball, and suffers what may be a career-ending injury to his left hand. (It looks broken, doesn’t it?) Oh, well. Thanks, MSFT Bing. The perspective is weird and there is trash on the ground, but the image is good enough.

The first signal appears in “AI Companies Are Running Out of Training Data.” The notion that online information is infinite is a quaint one. But in the fever of moving to online, reality is less interesting than the euphoria of the next gold rush or the new Industrial Revolution. Futurism reports:

Data plays a central role, if not the central role, in the AI economy. Data is a model’s vital force, both in basic function and in quality; the more natural — as in, human-made — data that an AI system has to train on, the better that system becomes. Unfortunately for AI companies, though, it turns out that natural data is a finite resource — and if that tap runs dry, researchers warn they could be in for a serious reckoning.

The information or data in question is not the smog emitted by modern automobiles’ chip stuffed boxes. Nor is the data the streams of geographic information gathered by mobile phone systems. The high value data are those which matter; for example, in a stream of securities information, which specific stock is moving because it is being manipulated by one of those bright young minds I met at the DC event.

The article “AI Companies Are Running Out of Training Data” adds:

But as data becomes increasingly valuable, it’ll certainly be interesting to see how many AI companies can actually compete for datasets — let alone how many institutions, or even individuals, will be willing to cough their data over to AI vacuums in the first place. But even then, there’s no guarantee that the data wells won’t ever run dry. As infinite as the internet seems, few things are actually endless.

The fix is synthetic or faked data; that is, fabricated data which appears to replicate real-life behavior. (Don’t you love it when Google predicts the weather or a smarty pants games the crypto market?)

The message is simple: Smart software has ground through the good stuff and may face its version of an existential crisis. That’s different from the rah rah one usually hears about AI.

The second item my team called to my attention appears in a news story called “OpenAI Pauses New ChatGPT Plus Subscriptions Due to Surge in Demand.” I read the headline as saying, “Oh, my goodness, we don’t have the money or the capacity to handle more user requests.”

The article expresses the idea in this snappy 21st century way:

The decision to pause new ChatGPT signups follows a week where OpenAI services – including ChatGPT and the API – experienced a series of outages related to high-demand and DDoS attacks.

Okay, security and capacity.

What are the implications of these two unrelated stories?

  1. The run up to AI has been boosted with system operators ignoring copyright and picking low hanging fruit. The orchard is now looking thin. Apples grow on trees, just not quickly, and overcultivation can ruin the once fertile soil. Think a digital Dust Bowl perhaps?
  2. The friction of servicing user requests is causing slowdowns. Can the heat be dissipated? Absolutely, but the fix requires money, more than high school science club management techniques, and common sense. Do AI companies exhibit common sense? Yeah, sure. Every day.
  3. The lack of high-value or sort of good information is a bummer. Machines producing insights into the dark activities of bad actors and the thoughts of 12-year-olds are grinding along. However, the value of the information outputs seems to be lagging behind the marketers’ promises. One telling example is the outright failure of Israel’s smart software to have utility in identifying the intent of bad actors. My goodness, if any country has smart systems, it’s Israel. Based on events in the last couple of months, the flows of data produced what appears to be a failing grade.

If we take these two cited articles’ information at face value, one can make a case that the great AI revolution may be facing some headwinds. In a winner-take-all game like AI, there will be some Sad Sacks at those fancy Washington, DC receptions. Time to innovate and renovate perhaps?

Stephen E Arnold, November 15, 2023

ACM Kills Print Publications But Dodges the Money Issue

November 6, 2023

This essay is the work of a dumb humanoid. No smart software required.

In January 2024, the Association for Computing Machinery will kill off its print publications. “Ceasing Print Publication of ACM Journals and Transactions” says good bye to the hard copy instances of Communications of the ACM, ACM Inroads, and a couple of other publications. It is possible that ACM will continue to produce print versions of material for students. (I thought students were accustomed to digital content. Guess the ACM knows something I don’t. That’s not too difficult. I am a dinobaby, who read ACM publications for the stories, not the pictures.)

The perspiring clerk asks, “But what about saving the whales?” The CFO, carrying the burden of talking to auditors, replies, “It’s money, stupid, not that PR baloney.” Thanks, Microsoft Bing. You understand accountants perspiring. Do you have experience answering IRS questions about some calculations related to Puerto Rico?

Why would a professional trade outfit dismiss paper? My immediate and uninformed answer to this question is, “Cost. Stuff like printing, storage, fulfillment, and design cost money.” I would be wrong, of course. The ACM gives these reasons:

  • Be environmentally friendly. (Don’t ACM supporters use power sucking data centers often powered by coal?)
  • Electronic publications have more features. (One example is a way to charge a person who wants to read an article and cut off at the bud the daring soul pumping money into a photocopy machine to have an article to read whilst taking a break from the coffee and mobile phone habit.)
  • Subscriptions are tanking.

I think the “subscriptions” bit is a way to say, “Print stuff is very expensive to produce and more expensive to sell.”

With the New York Times allegedly poised to use smart software to write its articles, when will the ACM dispense with member contributions?

Stephen E Arnold, November 6, 2023

Knowledge Workers, AI Software Is Cheaper and Does Not Take Vacations. Worried Yet?

November 2, 2023

This essay is the work of a dumb humanoid. No smart software required.

I believe the 21st century is the era of good enough or close enough for horseshoes products and services. Excellence is a surprise, not a goal. At a talk I gave at CeBIT years ago, I explained that certain information centric technologies had reached the “let’s give up” stage of development. Fresh in my mind were the lessons I learned writing a compendium of information access systems published as “The Enterprise Search Report” by a company lost to me in the mists of time.

“I just learned that our department will be replaced by smart software,” says the MBA from Harvard. The female MBA from Stanford emits a scream just like the one she let loose after scuffing her new Manuel Blahnik (Rodríguez) shoes. Thanks, MidJourney, you delivered an image with a bit of perspective. Good enough work.

I identified the flaws in implementations of knowledge management, information governance, and enterprise search products. The “good enough” comment was made to me during the Q-and-A session. The younger person pointed out that systems for finding information — regardless of the words I used to describe what most knowledge workers did — were “good enough.” I recall the simile the intense young person offered as I was leaving the lecture hall. Vivid now years later was the comment that improving information access was like making catalytic converters deliver zero emissions. Thus, information access can’t get where it should be. The technology is good enough.

I wonder if that person has read “AI Anxiety As Computers Get Super Smart.” Probably not. I believe that young person knew more than I did. As a dinobaby, I just smiled and listened. I am a smart dinobaby in some situations. I noted this passage in the cited article:

Generative AI, however, can take aim at white-collar jobs such as lawyers, doctors, teachers, journalists, and even computer programmers. A report from the McKinsey consulting firm estimates that by the end of this decade, as much as 30 percent of the hours worked in the United States could be automated in a trend accelerated by generative AI.

Executive orders and government proclamations are unlikely to have much effect on some people. The write up points out:

Generative AI makes it easier for scammers to create convincing phishing emails, perhaps even learning enough about targets to personalize approaches. Technology lets them copy a face or a voice, and thus trick people into falling for deceptions such as claims a loved one is in danger, for example.

What’s the fix? One that is good enough probably won’t have much effect.

Stephen E Arnold, November 2, 2023

By Golly, the Gray Lady Will Not Miss This AI Tech Revolution!

November 2, 2023

This essay is the work of a dumb humanoid. No smart software required.

The technology beacon of the “real” newspaper is shining like a high-technology beacon. Flash, the New York Times Online. Flash, terminating the exclusive with LexisNexis. Flash. The shift to a — wait for it — a Web site. Flash. The in-house indexing system. Flash. Buying About.com. Flash. Doing podcasts. My goodness, the flashes have impaired my vision. And where are we today after labor strife, newsroom craziness, and a list of bestsellers that gets data from…? I don’t really know, and I just haven’t bothered to do some online poking around.

A real journalist of today uses smart software to write listicles for Buzzfeed, essays for high school students, and feature stories for certain high profile newspapers. Thanks for the drawing Microsoft Bing. Trite but okay.

I thought about the technology flashes from the Gray Lady’s beacon high atop its building sort of close to Times Square. Nice branding. I wonder if mobile phone users know why the tourist destination is called Times Square. Since I no longer work in New York, I have forgotten. I do remember the high intensity pinks and greens of a certain type of retail establishment. In fact, I used to know the fellow who created this design motif. Ah, you don’t remember. My hunch is that there are other factoids you and I won’t remember.

For example, what’s the byline on a New York Times story? I thought it was the name or names of the many people who worked long hours, made phone calls, visited specific locations, and sometimes visited the morgue (no, the newspaper morgue, not the “real” morgue where the bodies of compromised sources ended up).

If the information in that estimable source Showbiz411.com is accurate, the Gray Lady may cite zeros and ones. The article is “The New York Times Help Wanted: Looking for an AI Editor to Start Publishing Stories. Six Figure Salary.” Now that’s an interesting assertion. A person like me might ask, “Why not let a recent college graduate crank out machine generated stories?” My assumption is that most people trying to meet a deadline and in sync with Taylor Swift will know about machine-generated information. But, if the story is true, here’s what’s up:

… it looks like the Times is going let bots do their journalism. They’re looking for “a senior editor to lead the newsroom’s efforts to ambitiously and responsibly make use of generative artificial intelligence.” I’m not kidding. How the mighty have fallen. It’s on their job listings.

The Showbiz411.com story allegedly quotes the Gray Lady’s help wanted ad as saying:

“This editor will be responsible for ensuring that The Times is a leader in GenAI innovation and its applications for journalism. They will lead our efforts to use GenAI tools in reader-facing ways as well as internally in the newsroom. To do so, they will shape the vision for how we approach this technology and will serve as the newsroom’s leading voice on its opportunity as well as its limits and risks.”

There are a bunch of requirements for this job. My instinct is that a few high school students could jump into this role. What’s the difference between a ChatGPT output about crossing the Delaware and writing a “real” news article about fashion trends seen at Otto’s Shrunken Head?

Several observations:

  • What does this ominous development mean to the accountants who will calculate the cost of “real” journalists versus a license to smart software? My thought is that the general reaction will be positive. Imagine: No vacays, no sick days, and no humanoid protests. The Promised Land has arrived.
  • How will the Gray Lady’s management team explain this cuddling up to smart software? Perhaps it is just one of those newsroom romances? On the other hand, what if something serious develops and the smart software moves in? Yipes.
  • What will “informed” readers think of stories crafted by the intellectual engine behind a high school student’s essay about great moments in American history? Perhaps the “informed” readers won’t care?

Exciting stuff in the world of real journalism down the street from Times Square and the furries, pickpockets, and gawkers from Ames, Iowa. I wonder if the hallucinating smart software will be as clever as the journalist who fabricates a story? Probably not. “Real” journalists do not shape, weaponize, or filter the actual factual. Is John Wiley & Sons ready to take the leap?

Stephen E Arnold, November 2, 2023

Now the AI $64 Question: Where Are the Profits?

October 26, 2023

This essay is the work of a dumb humanoid. No smart software required.

As happens with most over-hyped phenomena, AI is looking like a disappointment for investors. Gizmodo laments, “So Far, AI Is a Money Pit That Isn’t Paying Off.” Writer Lucas Ropek cites a report from the Wall Street Journal as he states tech companies are not, as of yet, profiting off AI as they had hoped. For example, Microsoft’s development automation tool GitHub Copilot lost an average of $20 a month for each $10-per-month user subscription. Even ChatGPT is seeing its user base decline while operating costs remain sky high. The write-up explains:

“The reasons why the AI business is struggling are diverse but one is quite well known: these platforms are notoriously expensive to operate. Content generators like ChatGPT and DALL-E burn through an enormous amount of computing power and companies are struggling to figure out how to reduce that footprint. At the same time, the infrastructure to run AI systems—like powerful, high-priced AI computer chips—can be quite expensive. The cloud capacity necessary to train algorithms and run AI systems, meanwhile, is also expanding at a frightening rate. All of this energy consumption also means that AI is about as environmentally unfriendly as you can get. To get around the fact that they’re hemorrhaging money, many tech platforms are experimenting with different strategies to cut down on costs and computing power while still delivering the kinds of services they’ve promised to customers. Still, it’s hard not to see this whole thing as a bit of a stumble for the tech industry. Not only is AI a solution in search of a problem, but it’s also swiftly becoming something of a problem in search of a solution.”

Ropek notes it would have been wise for companies to figure out how to turn a profit on AI before diving into the deep end. Perhaps, but leaping into the next big thing is a priority for tech firms lest they be left behind. After all, who could have predicted this result? Let’s ask Google Bard, OpenAI, or one of the numerous AI “players”? Even better perhaps will be deferring the question of costs until the AI factories go online.
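The GitHub Copilot figures cited above lend themselves to some back-of-the-envelope arithmetic. The sketch below is only an illustration: it assumes the reported $20 average loss is simply the cost to serve a user minus the $10 subscription price, with no other revenue in play. The function and variable names are hypothetical and do not come from the write-up.

```python
def monthly_cost_per_user(price, loss):
    """Implied cost to serve one subscriber: subscription revenue plus the reported loss."""
    return price + loss


# Figures from the cited write-up; everything derived from them is an assumption.
price = 10.0  # dollars per month charged to each user
loss = 20.0   # average dollars lost per user per month

cost = monthly_cost_per_user(price, loss)   # implied monthly cost to serve one user
break_even_multiple = cost / price          # how many times the price must rise to break even
annual_loss_per_user = loss * 12            # loss per user over twelve months
```

Under those assumptions, each subscriber costs about three times what he or she pays, which is one way to read “money pit.”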

Cynthia Murrell, October 26, 2023

Data Drift: Yes, It Is Real and Feeds on False Economy Methods

October 10, 2023

Note: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid.

When I mention statistical drift, most of those in my lectures groan and look at their mobile phones. I am delighted to call attention to a write up called “The ‘Model-Eat-Model World’ of Clinical AI: How Predictive Power Becomes a Pitfall.” The article focuses on medical information, but its message applies to a wide range of “smart” models. These range from the Google shortcuts of Snorkel to the Bayesian based systems in vogue in many policeware and intelware products. The behavior appears to have influenced Dr. Timnit Gebru and contributed to her invitation to find her future elsewhere from none other than the now marginalized Google Brain group. (Googlers do not appreciate being informed of their shortcomings it seems.)

The young shark of Wall Street ponders his recent failure at work. He thinks, “I used those predictive models as I did last year. How could they have gone off the rails? I am ruined.” Thanks, MidJourney. Manet you are not.

The main idea is that as numerical recipes iterate, the outputs deteriorate or wander off the desired path. The number of cycles required to output baloney depends on the specific collections of procedures. But wander these puppies do. To provide a baseline, users of the Autonomy Bayesian system found that after three months of operation, precision and recall had deteriorated. The fix was to retrain the system. Flash forward to today and systems that iterate many times faster than the Autonomy neurolinguistic programming method, and the lousy outputs can appear in a matter of hours. There are corrective steps one can take, but these are expensive when they involve humans. Thus, some predictive outfits have developed smart software to try and keep the models from jumping their railroad tracks. When the models drift, the results seem off kilter.

The write up says:

Last year, an investigation from STAT and the Massachusetts Institute of Technology captured how model performance can degrade over time by testing the performance of three predictive algorithms. Over the course of a decade, accuracy for predicting sepsis, length of hospitalization, and mortality varied significantly. The culprit? A combination of clinical changes — the use of new standards for medical coding at the hospital — and an influx of patients from new communities. When models fail like this, it’s due to a problem called data drift.

Yep, data drift.
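For those who prefer numbers to groans, here is a minimal sketch of one way to spot the kind of input drift the quoted passage describes, such as an influx of patients from new communities. It uses the Population Stability Index, a technique I am swapping in for illustration; the cited write up does not say which method anyone used. The function name, the synthetic data, and the 0.1 and 0.25 cutoffs (a commonly quoted rule of thumb) are all assumptions.

```python
import math
import random


def psi(baseline, recent, bins=10):
    """Population Stability Index between two numeric samples (illustrative sketch)."""
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the baseline is constant

    def shares(values):
        counts = [0] * bins
        for v in values:
            # Values outside the baseline range are clamped into the edge bins.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    b, r = shares(baseline), shares(recent)
    return sum((ri - bi) * math.log(ri / bi) for bi, ri in zip(b, r))


# Synthetic stand-ins for a model input, e.g. some patient measurement.
rng = random.Random(0)
last_year = [rng.gauss(50, 10) for _ in range(5000)]
same_population = [rng.gauss(50, 10) for _ in range(5000)]
new_population = [rng.gauss(65, 10) for _ in range(5000)]  # shifted inputs

stable_score = psi(last_year, same_population)   # expected to stay small
drifted_score = psi(last_year, new_population)   # expected to be large

# Rule-of-thumb reading (an assumption): below 0.1 is stable, above 0.25 calls for a look.
needs_retraining = drifted_score > 0.25
```

A check like this only says the inputs have moved. Deciding whether to retrain, and paying for it, remains a human problem.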

I need to check my mobile phone. Fixing data drift is tricky and in today’s zoom zoom world, “good enough” is the benchmark of excellence. Marketers do not want to talk about data drift. What if bad things result? Let the interns fix it next summer?

Stephen E Arnold, October 10, 2023

Cognitive Blind Spot 1: Can You Identify Synthetic Data? Better Learn.

October 5, 2023

Note: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid.

It has been a killer with the back-to-back trips to Europe and then to the intellectual hub of the old-fashioned America. In France, I visited a location allegedly the office of a company which “owns” the domain rrrrrrrrrrr.com. No luck. Fake address. I then visited a semi-sensitive area in Paris, walking around in the confused fog only a 78-year-old can generate. My goal was to spot a special type of surveillance camera designed to provide data to a smart software system. The idea is that the images can be monitored through time so a vehicle making frequent passes of a structure can be flagged, its number tag read, and a bit of thought given to answer the question, “Why?” I visited with a friend and big brain who was one of the technical keystones of an advanced search system. He gave me his most recent book and I paid for my Orangina. Exciting.
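The “frequent passes” idea can be sketched in a few lines. What follows is a toy illustration, not a description of any real surveillance product: it assumes the camera system has already turned images into (timestamp, number tag) pairs, and the function name, the one-hour window, the three-pass threshold, and the sample tags are all hypothetical.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


def flag_frequent_passes(sightings, window=timedelta(hours=1), threshold=3):
    """Return the tags seen at least `threshold` times within any span of `window`."""
    recent = defaultdict(deque)  # tag -> timestamps still inside the window
    flagged = set()
    for when, tag in sorted(sightings):
        passes = recent[tag]
        passes.append(when)
        # Drop sightings that have aged out of the window.
        while when - passes[0] > window:
            passes.popleft()
        if len(passes) >= threshold:
            flagged.add(tag)
    return flagged


# Made-up sightings for two vehicles near the same structure.
t0 = datetime(2023, 10, 1, 9, 0)
sightings = [
    (t0, "AB-123-CD"),
    (t0 + timedelta(minutes=5), "EF-456-GH"),
    (t0 + timedelta(minutes=20), "AB-123-CD"),
    (t0 + timedelta(minutes=45), "AB-123-CD"),
    (t0 + timedelta(hours=3), "EF-456-GH"),
]
suspects = flag_frequent_passes(sightings)
```

The software can count passes. Answering the “Why?” still takes a person.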


One executive tells his boss, “Sir, our team of sophisticated experts reviewed these documents. The documents passed scrutiny.” One of the “smartest people in the room” asks, “Where are we going for lunch today?” Thanks, MidJourney. You do understand executive stereotypes, don’t you?

On the flights, I did some thinking about synthetic data. I am not sure that most people can provide a definition which will embrace the Google’s efforts in the money saving land of synthetic. I don’t think too many people know about Charlie Javice’s use of synthetic data to whip up JPMC’s enthusiasm for her company Frank Financial. I don’t think most people understand that when someone types a phrase into the Twitch AI Jesus, the software will output a video and mostly crazy talk along with some Christian lingo.

The purpose of this short blog post is to present an example of synthetic data and conclude by revisiting the question, “Can You Identify Synthetic Data?” The article I want to use as a hook for this essay is from Fortune Magazine. I love that name, and I think the wolves of Wall Street find it euphonious as well. Here’s the title: “Delta Is Fourth Major U.S. Airline to Find Fake Jet Aircraft Engine Parts with Forged Airworthiness Documents from U.K. Company.”

The write up states:

Delta Air Lines Inc. has discovered unapproved components in “a small number” of its jet aircraft engines, becoming the latest carrier and fourth major US airline to disclose the use of fake parts.  The suspect components — which Delta declined to identify — were found on an unspecified number of its engines, a company spokesman said Monday. Those engines account for less than 1% of the more than 2,100 power plants on its mainline fleet, the spokesman said. 

Okay, bad parts can fail. If the failure is in a critical component of a jet engine, the aircraft could — note that I am using the word could — experience a catastrophic failure. Translating catastrophic into more colloquial lingo, the sentence means catch fire and crash or something slightly less terrible; namely, catch fire, explode, eject metal shards into the tail assembly, or make a loud noise and emit smoke. Exciting, just not terminal.

I don’t want to get into how the synthetic or fake data made its way through the UK company, the UK bureaucracy, the Delta procurement process, and into the hands of the mechanics working in the US or offshore. The fake data did elude scrutiny for some reason. With money being of paramount importance, my hunch is that saving some money played a role.

If organizations cannot spot fake data when it relates to a physical and mission-critical component, how will organizations deal with fake data generated by smart software? The smart software can get it wrong because an engineer-programmer screwed up his or her math or because the complex web of algorithms just generates unanticipated behaviors from dependencies no one knew to check and validate.

What happens when a computer, which many people believe is “always” more right than a human, says, “Here’s the answer”? Many humans will skip the hard work because they are in a hurry, have no appetite for grunt work, or are scheduled by a Microsoft calendar to do something else when the quality assurance testing is supposed to take place.

Let’s go back to the question in the title of the blog post, “Can You Identify Synthetic Data?”

I don’t want to forget this part of the title, “Better learn.”

JPMC paid out more than $100 million in November 2022 because some of the smartest guys in the room weren’t that smart. But get this. JPMC is a big, rich bank. People who could die because of synthetic data are a different kettle of fish. Yeah, that’s what I thought about as I flew Delta back to the US from Paris. At the time, I thought Delta had not fallen prey to the scam.

I was wrong. Hence, I “better learn” myself.

Stephen E Arnold, October 5, 2023
