Apple Emulates the Timnit Gebru Method

October 26, 2021

Remember Dr. Timnit Gebru? She was the researcher who apparently did not go with the flow regarding Google’s approach to snorkeling. (Don’t get the snorkel thing? Yeah, too bad.) The solution was to exit Dr. Gebru and move forward in a Googley manner.

Now the Apple “we care about privacy” outfit appears to have reached a “me too moment” in management tactics.

Two quick examples:

First, the very Silicon Valley Verge published “Apple Just Fired a Leader of the #AppleToo Movement.” I am not sure what the AppleToo thing encompasses, but it obviously sparked the Timnit option. The write up says:

Apple has fired Janneke Parrish, a leader of the #AppleToo movement, amid a broad crackdown on leaks and worker organizing. Parrish, a program manager on Apple Maps, was terminated for deleting files off of her work devices during an internal investigation — an action Apple categorized as “non-compliance,” according to people familiar with the situation.

Okay, deletes are bad. I figured that out when Apple elected to get rid of the backspace key.

Second, Gizmodo, another Silicon Valley information service, revealed “Apple Wanted Her Fired. It Settled on an Absurd Excuse.” The write up reports:

The next email said she’d been fired. Among the reasons Apple provided, she’d “failed to cooperate” with what the company called its “investigatory process.”

Hasta la vista, Ashley Gjøvik.

Observations:

  • The Timnit method appears to work well when females are involved in certain activities which run contrary to the Apple way. (Note that the Apple way includes flexibility in responding to certain requests from nation states like China.)
  • The lack of information about the incidents is apparently part of the disappearing method. Transparency? Yeah, not so much in Harrod’s Creek.
  • The one-two disappearing punch is fascinating. Instead of letting the dust settle, do the bang-bang thing.

Net net: Google’s management methods appear to be viral at least in certain management circles.

Stephen E Arnold, October 26, 2021

Google Issues Apology To Timnit Gebru

December 15, 2020

Timnit Gebru is one of the world’s leading experts on AI ethics. She formerly worked at Google, where she assembled one of the most diverse Google Brain research teams. Google decided to fire her after she refused to rescind a paper she wrote concerning the risks of deploying large language models. Venture Beat has details in the article: “Timnit Gebru: Google’s ‘Dehumanizing’ Memo Paints Me As An Angry Black Woman” and The Global Herald has an interview with Gebru: “Firing Backlash Led To Google CEO Apology: Timnit Gebru.”

Gebru states that the apology was not meant for her, but for the reactions Google received from the fallout of her firing. Gebru’s entire community of associates and friends stands behind her decision not to rescind her research. She holds her firing up as an example of corporate censorship of unflattering research as well as sexism and racism.

Google painted Gebru as a stereotypical angry black woman and used her behavior as an excuse for her termination. I believe Gebru’s firing has little to do with racism and sexism. Google’s response has more to do with getting rid of a noncompliant cog in its machine, but in order to oust Gebru the company relied on stereotypes and gaslighting.

Google’s actions are disgusting. Organizations treat all types of women and men like this so they can save face and remove unsavory minions. Gaslighting is a typical way for organizations to downplay their bad actions and make the whistleblower the villain.

Gebru’s unfortunate experience is typical for many, but she offered this advice:

“What I want these women to know is that it’s not in your head. It’s not your fault. You are amazing, and do not let the gaslighting stop you. I think with gaslighting the hardest thing is there’s repercussions for speaking up, but there’s also shame. Like a lot of times people feel shame because they feel like they brought it upon themselves somehow.”

There are better options out there for Gebru and others in similar situations. Good luck to Gebru and others like her!

Whitney Grace, December 15, 2020

Gebru-Gibberish Gives Google Gastroenteritis

February 24, 2021

At the outset, I want to address Google’s Gebru-gibberish:

Definition: Gebru-gibberish refers to official statements from Alphabet Google about the personnel issues related to the departure of two female experts in artificial intelligence working on the ethics of smart software. Gebru-gibberish is similar to statements made by those in fear of their survival.

Gastroenteritis: Watch the ads on Fox News or CNN for video explanations: Adult diapers, incontinence, etc.

Psychological impact: Fear, paranoia, flight reaction, irrational aggressiveness. Feelings of embarrassment, failure, serious injury, and lots of time in the WC.

The details of the viral problem causing discomfort among the world’s most elite online advertising organization relates to the management of Dr. Timnit Gebru. To add to the need to keep certain facilities nearby, the estimable Alphabet Google outfit apparently dismissed Dr. Margaret Mitchell. The output from the world’s most sophisticated ad sales company was Gebru-gibberish. Now those words have characterized the shallowness of the Alphabet Google thing’s approach to smart software.

In order to appreciate the problem, take a look at “Underspecification Presents Challenges for Credibility in Modern Machine Learning.” Here’s the author listing and affiliation for the people who contributed to the paper available without cost on ArXiv.org:

[Image: author listing and affiliations from the ArXiv paper]

The image is hard to read. Let me point out that the authors include more than 30 Googlers (who may become Xooglers in between dashes to the WC).

The paper is referenced in a chatty Medium write up called “Is Google’s AI Research about to Implode?” The essay raises an intriguing possibility and contains a point which suggests that Google’s smart software may have some limitations:

Underspecification presents significant challenges for the credibility of modern machine learning.

Why the apparently illogical behavior with regard to Drs. Gebru and Mitchell?

My view is that the Gebru-gibberish released from Googzilla is directly correlated with the accuracy of the information presented in the “underspecification” paper. Sure, the method works in some cases, just as the 1998 Autonomy black box worked in some cases. However, to keep the accuracy high, significant time and effort must be invested. Otherwise, smart software evidences the charming characteristic of “drift”; that is, what was relevant before new content was processed is perceived as irrelevant or just incorrect in subsequent interactions.
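To make the drift idea concrete, here is a minimal monitoring sketch (mine, not Autonomy’s or Google’s code): score each fresh batch of labeled content, compare precision and recall against the values measured at deployment, and flag the model for retraining when the drop exceeds a tolerance. The scikit-learn style classifier, the labeled batch, and the 0.05 threshold are illustrative assumptions.

    # Drift-monitoring sketch: compare precision/recall on a fresh labeled
    # batch against the baseline captured at deployment time, and signal
    # that retraining is needed when either metric sags past a tolerance.
    # The 0.05 tolerance and the scikit-learn-style model are assumptions.
    from sklearn.metrics import precision_score, recall_score

    TOLERANCE = 0.05  # allowed drop before retraining is triggered (illustrative)

    def needs_retraining(model, baseline_precision, baseline_recall, X_batch, y_batch):
        """Return (drifted, current precision, current recall) for a labeled batch."""
        y_pred = model.predict(X_batch)
        precision = precision_score(y_batch, y_pred)
        recall = recall_score(y_batch, y_pred)
        drifted = (baseline_precision - precision > TOLERANCE) or \
                  (baseline_recall - recall > TOLERANCE)
        return drifted, precision, recall

The check itself is cheap. The expensive parts are producing the fresh labeled batch and running the retraining, which is exactly the time and effort mentioned above.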

What does this mean?

Small, narrow domains work okay. Larger content domains work less okay.

Heron Systems, using a variation of the Google DeepMind approach, was able to “kill” a human in a simulated dogfight. However, the domain was small and there were some “rules.” Perfect for smart software. The human top gun was dead fast. Larger domains, such as dealing with swarms of thousands of militarized and hardened unmanned aerial vehicles plus a simultaneous series of targeted cyber attacks using sleeper software favored by some nation states, mean that smart software will be ineffective.

What will Google do?

As I have pointed out in previous blog posts, the high school science club management method employed by Backrub has become the standard operating procedure at today’s Alphabet Google.

Thus, the question, “Is Google’s AI research about to implode?” is a good one. The answer is, “No.” Google has money; it has staff who toe the line; and it has its charade of an honest, fair, and smart online advertising system.

Let me suggest a slight change to the question; to wit: “Is Google at a tipping point?” The answer to this question is, “Yes.”

Gebru-gibberish is similar to the information and other outputs of Icarus, who flew too close to the sun and flamed out in a memorable way.

Stephen E Arnold, February 24, 2021

Google Gems for the Week of 19 February, 2024

February 27, 2024

This essay is the work of a dumb humanoid. No smart software required.

This week’s edition of Google Gems focuses on a Hope Diamond and a handful of lesser stones. Let’s go.

THE HOPE DIAMOND

In the chaos of the AI Gold Rush, horses fall and wizard engineers realize that they left their common sense in the saloon. Here’s the Hope Diamond from the Google.

The world’s largest online advertising agency created smart software with a lot of math, dump trucks filled with data, and wizards who did not recall that certain historical figures in the US were not of color. “Google Says Its AI Image-Generator Would Sometimes Overcompensate for Diversity,” an Associated Press story, explains in very gentle rhetoric that its super sophisticated brain and DeepMind would get the race of historical figures wrong. I think this means that Ben Franklin could look like a Zulu prince or George Washington might have some resemblance to Rama (blue skin, bow, arrow, and snappy hat).

My favorite search and retrieval expert Prabhakar Raghavan (famous for his brilliant lecture in Paris about the now renamed Bard) indicated that Google’s image rendering system did not hit the bull’s eye. No, Dr. Raghavan, the digital arrow pierced the micrometer thin plastic wrap of Google’s super sophisticated, quantum supremacy, gee-whiz technology.

[Image: Google’s message refusing the image request]

The illustration above shows the message I received from Google when I asked for an image of John Hancock, an American historical figure. Too bad; the request apparently goes against Google’s policies. Yep, wizards infused with the high school science club management method.

More important, however, was how Google’s massive stumble complemented OpenAI’s ChatGPT wonkiness. I want to award the Hope Diamond Award for AI Ineptitude to both Google and OpenAI. But, alas, there is just one Hope Diamond. The award goes to the quantumly supreme outfit Google.

[Note: I did not quote from the AP story. Why? Years ago the outfit threatened to sue people who use their stories’ words. Okay, no problemo, even though the newspaper for which I once worked supported this outfit in the days of “real” news, not recycled blog posts. I listen, but I do not forget some things. I wonder if the AP knows that Google Chrome can finish a “real” journalist’s sentences for he/him/she/her/it/them. Read about this “feature” at this link.]

Here are my reasons:

  1. Google is in catch-up mode and like those in the old Gold Rush, some fall from their horses and get up close and personal with hooves. How do those affect the body of a wizard? I have never fallen from a horse, but I saw a fellow get trampled when I lived in Campinas, Brazil. I recall there was a lot of screaming and blood. Messy.
  2. Google’s arrogance and intellectual sophistication cannot prevent incredible gaffes. A company with a mixed record of managing diversity, equity, etc. has demonstrated why Xooglers like Dr. Timnit Gebru find the company “interesting.” I don’t think Google is interesting. I think it is disappointing, particularly in the racial sensitivity department.
  3. For years I have explained that Google operates via the high school science club management method. What’s cute when one is 14 loses its charm when those using the method have been at it for a quarter century. It’s time to put on the big boy pants.

OTHER LITTLE GEMMAS

The previous week revealed a dirt trail with some sharp stones and thorny bushes. Here’s a quick selection of the sharpest and thorniest:

  1. The Google is running webinars to inform publishers about life after their wonderful long-lived cookies. Read more at Fipp.com.
  2. Google has released a small model as open source. What about the big model with the diversity quirk? Well, no. Read more at the weird green Verge thing.
  3. Google cares about AI safety. Yeah, believe it or not. Read more about this PR move on Techcrunch.
  4. Web search competitors will fail. This is a little stone. Yep, a kidney stone for those who don’t recall Neeva. Read more at Techpolicy.
  5. Did Google really pay $60 million to get that outstanding Reddit content? Wow. Maybe Google looks at different subreddits than my research team does. Read more about it in 9 to 5 Google.
  6. What happens when an uninformed person uses the Google Cloud? Answer: Sticker shock. More about this estimable method in The Register.
  7. Some spoil sport finds traffic lights informed with Google’s smart software annoying. That’s hard to believe. Read more at this link.
  8. Google pointed out in a court filing that DuckDuckGo was a meta search system (that is, a search interface to other firm’s indexes) and Neeva was a loser crafted by Xooglers. Read more at this link.

No Google Hope Diamond report would be complete without pointing out that the online advertising giant will roll out its smart software to companies. Read more at this link. Let’s hope the wizards figure out that historical figures often have quite specific racial characteristics like Rama.

I wanted to include an image of Google’s rendering of a signer of the Declaration of Independence. What you see in the illustration above is what I got. Wow. I have more “gemmas”, but I just don’t want to present them.

Stephen E Arnold, February 27, 2024

A Decision from the High School Science Club School of Management Excellence

January 11, 2024

This essay is the work of a dumb dinobaby. No smart software required.

I can’t resist writing about Inc. Magazine and its Google management articles. These are knee slappers for me. The write up causing me to chuckle is “Google’s CEO, Sundar Pichai, Says Laying Off 12,000 Workers Was the Worst Moment in the Company’s 25-Year History.” Zowie. A personnel decision coupled with late-night, anonymous termination notices. What’s not to like? What does the “real” news write up have to say:

Google had to lay off 12,000 employees. That’s a lot of people who had been showing up to work, only to one day find out that they’re no longer getting a paycheck because the CEO made a bad bet, and they’re stuck paying for it.


“Well, that clever move worked when I was in my high school’s science club. Oh, well, I will create a word salad to distract from my decision making. Heh, heh, heh,” says the distinguished corporate leader to a “real” news publication’s writer. Thanks, MSFT Copilot Bing thing. Good enough.

I love the “had.”

The Inc. Magazine story continues:

Still, Pichai defends the layoffs as the right decision at the time, saying that the alternative would have been to put the company in a far worse position. “It became clear if we didn’t act, it would have been a worse decision down the line,” Pichai told employees. “It would have been a major overhang on the company. I think it would have made it very difficult in a year like this with such a big shift in the world to create the capacity to invest in areas.”

And Inc Magazine actually criticizes the Google! I noted:

To be clear, what Pichai is saying is that Google decided to spend money to hire employees that it later realized it needed to invest elsewhere. That’s a failure of management to plan and deliver on the right strategy. It’s an admission that the company’s top executives made a mistake, without actually acknowledging or apologizing for it.

From my point of view, let’s focus on the word “worst.” Are there other Google management decisions which might be considered in evaluating Inc. Magazine’s and Sundar Pichai’s “worst”? Yep, I have a couple of items:

  1. A lawyer making babies in the Google legal department
  2. A Google VP dying with a contract worker on the Googler’s yacht as a result of an alleged substance subject to DEA scrutiny
  3. A Googler fond of being a glasshole giving up a wife and causing a soul mate to attempt suicide
  4. Firing Dr. Timnit Gebru and kicking off the stochastic parrot thing
  5. The presentation after Microsoft announced its ChatGPT initiative and the knee jerk Red Alert
  6. Proliferating duplicative products
  7. Sunsetting services with little or no notice
  8. The Google Map / Waze thing
  9. The messy Google Brain Deep Mind shebang
  10. The Googler who thought the Google AI was alive.

Wow, I am tired mentally.

But the reality is that I am not sure if anyone in Google management is particularly connected to the problems, issues, and challenges of losing a job in the midst of a Foosball game. But that’s the Google. High school science club management delivers outstanding decisions. I was in my high school science club, and I know the fine decision making our members made. One of those cost the life of one of our brightest stars. Stars make bad decisions, chatter, and leave some behind.

Stephen E Arnold, January 11, 2024

Microsoft at Davos: Is Your Hair on Fire, Google?

November 2, 2023

This essay is the work of a dumb humanoid. No smart software required.

At the January 2023 Davos event, Microsoft said AI is the next big thing. The result? Google shifted into Code Red and delivered a wild and crazy demonstration of a deeply flawed AI system in February 2023. I think the phrase “Code Red” became associated with the state of panic within the comfy confines of Googzilla’s executive suites, real and virtual.

Sam AI-man made appearances, speaking to anyone who would listen about “billion dollar investments,” efficiency, and work processes. The result? Googzilla itself found out that, whether Microsoft’s brilliant marketing of AI worked or not, the Softies had just demonstrated that Microsoft, not the Google, was a “leader.” The new Microsoft could create revenue and credibility problems for the Versailles of technology companies.

Therefore, the Google tried to be nimble and make the myth of engineering prowess into reality, not a CGI version of Camelot. The PR Camelot featured Google as the Big Dog in the AI world. After all, Google had done the protein thing, an achievement which made absolutely no sense to 99 percent of the earth’s population. Some asked, “What the heck is a protein folder?” I want a Google Waze service that shows me where traffic cameras are.

The Google executives apparently went to meetings with their hair on fire.


A group of Google executives in a meeting with their hair on fire after Microsoft’s Davos AI announcement. Google wanted teams to manifest AI prowess everywhere, lickity split. Google reorganized. Google probed Anthropic and one Googler invested in the company. Dr. Prabhakar Raghavan demonstrated peculiar communication skills.

I had these thoughts after I read “Google Didn’t Rush Bard Chatbot to Beat Microsoft, Executive Says.” So what was this Code Red thing? Why has Google, the quantum supremacy outfit and global leader in online advertising and protein folding, been lagging behind Microsoft? What is it now? Oh, yeah. Almost a year, a reorganization of the Google’s smart software group, and one of Google’s own employees explaining that AI could have a negative impact on the world. Oh, yeah, that guy is one of the founders of Google’s DeepMind AI group. I won’t mention the Googler who thought his chatbot was alive and ended up with an opportunity to find his future elsewhere. Right. Code Red. I want to note Timnit Gebru and the stochastic parrot, the Jeff Dean lateral arabesque, and the significant investment in a competitor’s AI technology. Right. Standard operating procedure for an online advertising company with a fairly healthy self concept about its excellence and droit du seigneur.

The Bloomberg article reports what I am assuming is “real,” actual factual information:

A senior Google executive disputed suggestions that the company rushed to release its artificial intelligence-based chatbot Bard earlier this year to beat a similar offering from rival Microsoft Corp. Testifying in Google’s defense at the Justice Department’s antitrust trial against the search giant, Elizabeth Reid, a vice president of search, acknowledged that Bard gave “a wrong answer” during its public unveiling in February. But she rejected the contention by government lawyer David Dahlquist that Bard was “rushed” out after Microsoft announced it was integrating generative AI into its own Bing search engine.

The real news story pointed out:

Google’s public demonstration of Bard underwhelmed investors. In one instance, Bard was asked about new discoveries from the James Webb Space Telescope. The chatbot incorrectly stated the telescope was used to take the first pictures of a planet outside the Earth’s solar system. While the Webb telescope was the first to photograph one particular planet outside the Earth’s solar system, NASA first photographed a so-called exoplanet in 2004. The mistake led to a sharp fall in Alphabet’s stock. “It’s a very subtle language difference,” Reid said in explaining the error in her testimony Wednesday. “The amount of effort to ensure that a paragraph is correct is quite a lot of work.” “The challenges of fact-checking are hard,” she added.

Yes, facts are hard in Hallucinationville. I think the concept I take away from this statement is that PR is easier than making technology work. But today Google and similar firms are caught in what I call a “close enough for horseshoes” mind set. Smart software, in my experience, is like my dear, departed mother’s not-quite-done pineapple upside down cakes. Yikes, those were a mess. I could eat the maraschino cherries but nothing else. The rest was deposited in the trash bin.

And where are the “experts” in smart search? Prabhakar? Danny? I wonder if they are embarrassed by the loss of their thick, lustrous hair. I think some of it may have been singed after the outstanding Paris demonstration and subsequent Mountain View baloney festivals. Was Google behaving like a child frantically searching for his mom at the AI carnival? I suppose when one is swathed in entitlements, cashing huge paychecks, and obfuscating exactly how the money is extracted from advertisers, reality is distorted.

Net net: Microsoft at Davos caused Google’s February 2023 Paris presentation. That mad scramble has caused me to conclude that talking about AI is a heck of a lot easier than delivering reliable, functional, and thought-out products. Is it possible to deliver such products when one’s hair is on fire? Some data say, “Nope.”

Stephen E Arnold, November 2, 2023

Data Drift: Yes, It Is Real and Feeds on False Economy Methods

October 10, 2023

Note: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid.

When I mention statistical drift, most of those in my lectures groan and look at their mobile phones. I am delighted to call attention to a write up called “The Model-Eat-Model World of Clinical AI: How Predictive Power Becomes a Pitfall.” The article focuses on medical information, but its message applies to a wide range of “smart” models. These include Google shortcuts like Snorkel as well as the Bayesian based systems in vogue in many policeware and intelware products. The behavior appears to have influenced Dr. Timnit Gebru and contributed to her invitation to find her future elsewhere from none other than the now marginalized Google Brain group. (Googlers do not appreciate being informed of their shortcomings, it seems.)


The young shark of Wall Street ponders his recent failure at work. He thinks, “I used those predictive models as I did last year. How could they have gone off the rails? I am ruined.” Thanks, MidJourney. Manet you are not.

The main idea is that as numerical recipes iterate, the outputs deteriorate or wander off the desired path. The number of cycles required to output baloney depends on the specific collection of procedures. But wander these puppies do. To provide a baseline, users of the Autonomy Bayesian system found that after three months of operation, precision and recall had deteriorated. The fix was to retrain the system. Flash forward to today’s systems, which iterate many times faster than the Autonomy neurolinguistic programming method, and the lousy outputs can appear in a matter of hours. There are corrective steps one can take, but these are expensive when they involve humans. Thus, some purveyors of predictive systems have developed smart software to try to keep the models from jumping their railroad tracks. When the models drift, the results seem off kilter.

The write up says:

Last year, an investigation from STAT and the Massachusetts Institute of Technology captured how model performance can degrade over time by testing the performance of three predictive algorithms. Over the course of a decade, accuracy for predicting sepsis, length of hospitalization, and mortality varied significantly. The culprit? A combination of clinical changes — the use of new standards for medical coding at the hospital — and an influx of patients from new communities. When models fail like this, it’s due to a problem called data drift.

Yep, data drift.
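Detecting the drift is the easy part. Here is a minimal sketch, assuming the training data and the incoming records sit in pandas DataFrames: run a two-sample Kolmogorov-Smirnov test per numeric feature and flag the columns whose distributions no longer match. The column handling and the 0.01 cutoff are illustrative choices, not anything taken from the cited investigation.

    # Data-drift detection sketch: flag numeric features whose incoming
    # distribution differs from the training-time distribution according
    # to a two-sample Kolmogorov-Smirnov test. The cutoff is an assumption.
    import numpy as np
    from scipy.stats import ks_2samp

    def drifted_features(train_df, incoming_df, alpha=0.01):
        """Return (column, KS statistic) pairs for features that appear to have shifted."""
        flagged = []
        for col in train_df.select_dtypes(include=[np.number]).columns:
            stat, p_value = ks_2samp(train_df[col].dropna(), incoming_df[col].dropna())
            if p_value < alpha:  # shift larger than chance alone suggests
                flagged.append((col, stat))
        return flagged

Knowing which features moved costs almost nothing. Deciding who pays to re-label data and retrain the model is the part the marketers skip.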

I need to check my mobile phone. Fixing data drift is tricky and in today’s zoom zoom world, “good enough” is the benchmark of excellence. Marketers do not want to talk about data drift. What if bad things result? Let the interns fix it next summer?

Stephen E Arnold, October 10, 2023

Google: When Wizards Cannot Talk to One Another

August 1, 2023

Note: Dinobaby here: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid. Services are now ejecting my cute little dinosaur gif. Like my posts related to the Dark Web, the MidJourney art appears to offend someone’s sensibilities in the datasphere. If I were not 78, I might look into these interesting actions. But I am and I don’t really care.

Google is in the vanguard of modern management methods. As a dinobaby, I thought that employees who disagree would talk about the issue and work out a solution. Perhaps it would be a test of Option A and Option B? Maybe a small working group would dive into a tough technical point and generate a list of talking points for further discussion, testing, and possibly an opinion from a consulting firm?

How would my old-fashioned approach work?


One youthful wizard says, “Your method is not in line with the one we have selected.” The other youthful wizard replies, “Have you tested both and logged the data?” The very serious wizard with the bigger salary responds, “That’s not necessary. Your method is not in line with the one we have selected. By the way, you may find your future elsewhere.” Thanks MidJourney. You have nailed the inability of certain smart people to discuss without demeaning another. Has this happened to you MidJourney?

The answer is, “Are you crazy?”

Navigate to “Google Fails to Get AI Engineer Lawsuit Claiming Wrongful Termination Thrown Out.” As I understand the news report, Google allegedly fired a person who wrote a paper allegedly disagreeing with another Google paper. This, if true, reminded me of the Stochastic Parrot dust up which made Googler Dr. Timnit Gebru a folk hero among some. She is finding her future elsewhere now.

Navigate to the cited article to get more details.

Several points:

  1. Google appears to be unable to resolve internal discussions without creating PR instead of technical progress.
  2. The management methods strike me as illogical. I recall discussions with Googlers about the importance of logic, and it is becoming clear to me that Google logic follows its own rules. (Perhaps Google people managers should hire people who can thrive within Google logic?)
  3. The recourse to the legal system to resolve what may be a technical matter is intellectually satisfying. I am confident that judges, legal eagles, and expert witnesses are fully versed in chip engineering for complex and possibly proprietary methods. Have Google people management personnel considered just hiring such multi-faceted legal brains and eliminating wrong-thinking engineers?

Net net: A big time “real” news reporter objected to my use of the phrase “high school management methods.” Okay, perhaps “adolescent management methods” or “adolescent thought processes” are more felicitous phrases. But not for me. These fascinating Google management methods which generate news and legal precedents may render it unnecessary for the firm to use such words as “trust,” “user experience,” and other glittering generalities.

The reality is that cooperative resolution seems to be a facet of quantum supremacy that this dinobaby does not understand.

Stephen E Arnold, August 1, 2023

Is Smart Software Above Navel Gazing: Nope, and It Does Not Care

June 15, 2023

Note: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid.

Synthetic data. Statistical smoothing. Recursive methods. When we presented our lecture “OSINT Blindspots” at the 2023 National Cyber Crime Conference, the audience perked up. The terms might have been familiar, but our framing caught the more than 100 investigators’ attention. The problem my son (Erik) and I described was butt simple: Faked data will derail a prosecution if an expert witness explains that machine-generated output may be wrong.

We provided some examples, starting with a respected executive who obscures his “real” business behind a red-herring business. We profiled how information about a fervid Christian adherence to God’s precepts overshadowed a Ponzi scheme. We explained how an American living in Eastern Europe openly flouts social norms in order to distract authorities from an encrypted email business set up to allow easy, seamless communication for interesting people. And we included more examples.


An executive at a big time artificial intelligence firm looks over his domain and asks himself, “How long will it take for the boobs and boobettes to figure out that our smart software is wonky?” The illustration was spit out by the clever bits and bytes at MidJourney.

What’s the point in this blog post? Who cares besides analysts, lawyers, and investigators who have to winnow facts which are verifiable from shadow or ghost information activities?

It turns out that a handful of academics seem to have an interest in information manipulation. Their angle of vision is broader than my team’s. We focus on enforcement; the academics focus on tenure or getting grants. That’s okay. Different points of view lead to interesting conclusions.

Consider this academic and probably tough to figure out illustration from “The Curse of Recursion: Training on Generated Data Makes Models Forget”:

[Image: illustration from “The Curse of Recursion: Training on Generated Data Makes Models Forget”]

A less turgid summary of the researchers’ findings appears at this location.

The main idea is that gee-whiz methods like Snorkel and small language models have an interesting “feature”: they forget. That is, as these models ingest fake data they drift, get lost, or go off the rails. Synthetic cloth, unlike a natural cotton T shirt, looks like a shirt. But on a hot day, those super duper modern fabrics can cause a person to perspire and probably emit unusual odors.

The authors introduce and explain “model collapse.” I am no academic. My interpretation of the glorious academic prose is that the numerical recipes, systems, and methods don’t work like the nifty demonstrations. In fact, over time, the models degrade. The hapless humanoids who are dependent on these systems lack the means to figure out what’s on point and what’s incorrect. The danger, obviously, is that clueless and lazy users of smart software make more mistakes in judgment than they otherwise would.
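The decay is easy to simulate at toy scale. Here is a minimal sketch (mine, not the authors’ code): fit a one-dimensional Gaussian “model” on some real numbers, then refit each new generation on samples drawn from the previous generation instead of the original data. The fitted parameters wander away from the real values, and the tails of the original distribution get lost. The sample size, generation count, and random seed are arbitrary choices for illustration.

    # Toy "model collapse" sketch: each generation is fit only on the output
    # of the previous generation, so estimation error compounds and the model
    # drifts away from the real data it never sees again.
    import numpy as np

    rng = np.random.default_rng(0)
    real_data = rng.normal(loc=0.0, scale=1.0, size=200)  # stand-in for human-made data

    mu, sigma = real_data.mean(), real_data.std()
    print(f"generation  0: mean={mu:+.3f}, std={sigma:.3f}  (fit on real data)")

    for generation in range(1, 31):
        synthetic = rng.normal(loc=mu, scale=sigma, size=200)  # train on model output only
        mu, sigma = synthetic.mean(), synthetic.std()          # refit on synthetic data
        if generation % 5 == 0:
            print(f"generation {generation:2d}: mean={mu:+.3f}, std={sigma:.3f}")

Swap the toy Gaussian for a large language model, and the wandering becomes harder to see and much more expensive to fix.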

The paper includes fancy mathematics and more charts, which do not exactly deliver on the promise that a picture is worth a thousand words. Let me highlight one statement from the journal article:

Our evaluation suggests a “first mover advantage” when it comes to training models such as LLMs. In our work we demonstrate that training on samples from another generative model can induce a distribution shift, which over time causes Model Collapse. This in turn causes the model to mis-perceive the underlying learning task. To make sure that learning is sustained over a long time period, one needs to make sure that access to the original data source is preserved and that additional data not generated by LLMs remain available over time. The need to distinguish data generated by LLMs from other data raises questions around the provenance of content that is crawled from the Internet: it is unclear how content generated by LLMs can be tracked at scale. One option is community-wide coordination to ensure that different parties involved in LLM creation and deployment share the information needed to resolve questions of provenance. Otherwise, it may become increasingly difficult to train newer versions of LLMs without access to data that was crawled from the Internet prior to the mass adoption of the technology, or direct access to data generated by humans at scale.

Bang on.

What the academics do not point out are some “real world” business issues:

  1. Solving this problem costs money; the point of synthetic and machine-generated data is to reduce costs. Cost reduction wins.
  2. Furthermore, fixing up models takes time. In order to keep indexes fresh, delays are not part of the game plan for companies eager to dominate a market which Accenture pegs as worth trillions of dollars. (See this wild and crazy number.)
  3. Fiddling around to improve existing models is secondary to capturing the hearts and minds of those eager to worship a few big outfits’ approach to smart software. No one wants to see the problem because that takes mental effort. Those inside one of firms vying to own information framing don’t want to be the nail that sticks up. Not only do the nails get pounded down, they are forced to leave the platform. I call this the Dr. Timnit Gebru effect.

Net net: Good paper. Nothing substantive will change in the short or near term.

Stephen E Arnold, June 15, 2023

Google DeepMind Risk Paper: 60 Pages with a Few Googley Hooks

May 22, 2023

Note: This essay is the work of a real and still-alive dinobaby. No smart software involved in writing, just a dumb humanoid.

I read the long version of “Ethical and Social Risks of Harm from Language Models.” The paper is mostly statements and footnotes to individuals who created journal-type articles which prove the point of each research article. With about 25 percent of peer reviewed research including shaped, faked, or weaponized data, I am not convinced by footnotes. Obviously the DeepMinders believe that footnotes make a case for the Google way. I am not convinced because the Google has to find a way to control the future of information. Why? Advertising money and hoped-for Mississippis of cash.

The research paper dates from 2021 and is part of Google’s case for being ahead of the AI responsibility game. The “old” paper reinforces the myth that Google is ahead of everyone else in the AI game. The explanation for Sam AI-man’s and Microsoft’s marketing coup is that Google had to go slow because Google knew that there were ethical and social risks of harm from the firm’s technology. Google cares about humanity! The old days of “move fast and break things” are very 1998. Today Google is responsible. The wild and crazy dorm days are over. Today’s Google is concerned, careful, judicious, and really worried about its revenues. I think the company worries about legal actions, its management controversies, and its interdigital duel with the Softies of Redmond.


A young researcher desperately seeking footnotes to support a specious argument. With enough footnotes, one can move the world it seems. Art generated by the smart software MidJourney.

I want to highlight four facets of the 60 page risks paper which are unlikely to get much, if any, attention from today’s “real” journalists.

Googley hook 1: Google wants to frame the discussion. Google is well positioned to “guide mitigation work.” The examples in the paper are selected for “guiding action to resolve any issues that can be identified in advance.” My comment: How magnanimous of Google. Framing stakes out the Googley territory. Why? Google wants to be Googzilla and reap revenue from its users, licensees, models, synthetic data, applications, and advertisers. You can find the relevant text in the paper on page 6 in the paragraph beginning “Responsible innovation.”

Googley hook 2: Google’s risks paper references fuzzy concepts like “acceptability” and “fair.” Like love, truth, and ethics, the notion of “acceptability” is difficult to define. Some might suggest that it is impossible to define. But Google is up to the task, particularly for application spaces unknown at this time. What happens when you apply “acceptability” to “poor quality information”? One just accepts the judgment of the outfit doing the framing. That’s Google. Game. Set. Match. You can find the discussion of “acceptability” on page 9.

Googley hook 3: Google is not going to make the mistake of Microsoft and its racist bot Tay. No way, José. What’s interesting is that the only company mentioned in the text of the 60 page paper is Microsoft. Furthermore, the toxic aspects of large language models are hard for technologies to detect (Page 18). Plus large language models can infer a person’s private data, so “providing true information is not always beneficial” (Page 21). What’s the fix? Use smaller sets of training data… maybe (Page 22). But one can fall back on trust — for instance, trust in Google the good — to deal with these challenges. In fact, trust Google to choose training data to deal with some of the downsides of large language models (Page 24).

Googley hook 4: Making smart software dependent on large language models that mitigates risk is expensive. Money, smart people who are in short supply, and computing resources are expensive.  Therefore, one need not focus on the origin point (large language model training and configuration). Direct attention at those downstream. Those users can deal with the identified 21 problems. The Google method puts Google out of the primary line of fire. There are more targets for the aggrieved to seek and shoot at (Page 37).

When I step back from the article, which is two years old, it is obvious Google was aware of some potential issues with its approach. Dr. Timnit Gebru was sacrificed on a pyre of spite. (She does warrant a couple of references and a footnote or two, but she’s now a Xoogler.) One side effect was that Dr. Jeff Dean, who was not amused by the stochastic parrot, has been kicked upstairs, and the UK “leader” is now herding the little wizards of Google AI.

The conclusion of the paper echoes the Google knows best argument. Google wants a methodological toolkit because that will keep other people busy. Google wants others to figure out “fair,” an approach similar to that of Sam Altman (OpenAI), who begs for regulation of a sector about which much is unknown.

The answer, according to the risk analysis, is “responsible innovation.” I would suggest that this paper, the television interviews, and the PR efforts to get the Google story in as many places as possible are designed to make the sluggish Google a player in the AI game.

Who will be fooled? Will Google catch up in this Silicon Valley venture invigorating hill climb? For me the paper with the footnotes is just part of Google’s PR and marketing effort. Your mileage may vary. May relevance be with you, gentle reader.

Stephen E Arnold, May 22, 2023
