Amazon Data Sets
February 21, 2023
Do you want to obtain data sets for analysis or for making smart software even more crafty? Navigate to the AWS Marketplace. This Web page makes it easy to search through the more than 350 data products on offer. There is a Pricing Model check box. Click it if you want to see the no-cost data sets. There are some interesting options in the left side Refine Results area. For example, there are 366 open data licenses available. I find this interesting because when I examined the page, there were 362 data products. What are the missing four? I noted that there are 2,340 “standard data subscription agreements.” Again the difference between the 366 on offer and the 2,340 is interesting. A more comprehensive listing of data sources appears in the PrivacyRights’ listing. With some sleuthing, you may be able to identify other, lower profile ways to obtain data too. I am not willing to add color about these sources in this free blog post.
Stephen E Arnold, February 21, 2023
When Dumping an Employee Yields a Conference: Unexpected Consequence? Yep
February 20, 2023
The saga of Google’s management of smart people has taken a surprising twist. Dr. Timnit Gebru and some colleagues have declared Friday, March 17, 2023, “Stochastic Parrots Day.” The conference is named after the journal article/research paper about some of the risks certain approaches to smart software generate.
Stochastic parrots created by the smart software Craiyon.com. I assume that Craiyon is the owner of these images and that image rights trolls will be on the prowl for violations of the software’s intellectual property. But I enhanced these stochastic parrots, and I wrote this essay. No smart software writing aids for this dinobaby.
You can download the paper “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?” The paywalled ACM version is at this link. The authors of the paper that allowed Dr. Gebru to find her future elsewhere are Emily Bender, Angelina McMillan-Major, and another Xoogler purged from the online ad outfit, Margaret Mitchell. However, there is a useful summary prepared by Tushar Chandra at this link. According to the conference announcement, the co-authors and “various guests” will “reflect on what has happened in the last two years, what the large language model landscape currently looks like, and where we are headed versus where we should be headed.”
In my experience, employees who have the opportunity to find their future elsewhere start poking around for work. A few start companies or non-profits. Very few set up a new conference named after the paper which [a] blew the whistle on some of the AI craziness reported endlessly in TechMeme and other online information services and [b] put a US Army De Oppresso Liber laser focus on Google’s personnel management methods.
Yep, a conference. A free conference, although a registrant can donate to the organizers.
What’s the unexpected consequence or, I should say, consequences? Let me do a little speculation:
- Google amps up the Sundar and Prabhakar routine about how Google wants to be careful, to earn trust, and, of course, demonstrate that Microsoft’s brilliant marketing play is just stupid. (Who is hallucinating? Microsoft’s OpenAI demonstrations or the Google?)
- The conference attracts the attention of a major conference organizer. I am not sure the ACM will have the moxie to create a conference that appeals to those who are not members. Imagine a Stochastic Parrot program held twice a year. I think it might work.
- This event strikes me as similar to one of those quantum moments. Is the parrot dead or alive? It is difficult to predict how the conference will interact with the real world and which systems and methods will find themselves under the parrot’s confocal-type differential interference contrast microscope. What will emerge? Recursive methods fed synthetic data? Higher level abstractions shaped by engineers’ biases? Misinformation ingested so that results don’t match other sources and findings? Carelessness infused with cost cutting in the content training process? Sail and Snorkel perhaps?
Net net: What happens if a stochastic parrot conference gets too big? Answer: Perhaps Jeff Dean will become a speaker and set the record straight? Yikes! Code Super Red?
Stephen E Arnold, February 20, 2023
Another Grousing Xoogler: A Case Study Under Construction?
February 20, 2023
Say “Google” to me, and I think of:
[a] Philandering in the Google legal unit. See this story.
[b] A senior manager dead on a yacht with a “special” contractor and alleged concoctions not included in a bright child’s chemistry set. See this story.
[c] Solving death. See this story.
[d] An alleged suicide attempt by a high profile Alphabet professional fond of wearing Google Glass at parties and who suffered post traumatic stress when the love boat crashed. See this story.
[e] Google’s click fraud matter. See this story.
[f] Pundits “forgetting” that Google’s pay-to-play was an idea for which Google’s pre-IPO management paid about $1 billion to avoid an expensive legal hassle over alleged improper use of Yahoo, GoTo, and Overture technology. See this story.
I am not sure what you think about when you hear the word “Google.”
Image of trustworthy people generated by Craiyon.com. A dinobaby wrote this Beyond Search story and the caption for the AI generated image, which I assume is now in for-fee image banks with PicRights’ software protecting everyone’s rights.
“Former Googler Pulls Back the Curtain on a Bureaucratic Maze and Lambastes Bosses and Employees for Losing Sight of What’s Important” suggests that my associations are not comprehensive. A Xoogler wizard named Praveen Seshadri suggested, according to Fortune Magazine:
Google employees don’t go to work each day thinking they serve users or customers. Instead, they serve something internal to Google, be it a process, a technology, a manager, or other employees.
What about promotions, bonuses, and increasing advertising revenue? Not top of mind for Praveen it seems.
Googlers, he allegedly says, according to Fortune:
Instead, the focus is on potential risk, which is seen in “every line of code you change” and “anything you launch,” resulting in layer upon layer of processes, reviews, and approvals.
Ah, ha. Parkinson’s Law applied to high school science club management methods, perhaps?
The Fortune write up states:
… today, Seshadri argues in his essay, there is a “collective delusion” within Google that the company is still exceptional, when in fact most people quietly complain about the overall inefficiency. As a Google employee, “you don’t wake up everyday thinking about how you should be doing better and how your customers deserve better and how you could be working better,” he writes. “Instead, you believe that things you are doing already are so perfect that they are the only way to do it.”
I suppose I should add one more item to my list of associations:
[g] Googlers struggle to perceive the reality their actions have created. See this story.
What happened to Foundem, the French tax forms, and Timnit Gebru? A certain blindness?
Each week appears to bring another installment of the Sundar and Prabhakar team’s comedy act. I look forward to a few laughs from the group now laboring in Code Red mode.
Stephen E Arnold, February 20, 2023
Apple and Google: Money Buys Happiness
February 20, 2023
I read a story published by something called NBC Bay Area. My hunch is that it belongs to NBCUniversal, which is a property of the fantastic Comcast outfit. The article is “A Student Used ChatGPT to Cheat in an AI Ethics Class.” My first thought is that whoever pulled off the cheat is an ideal candidate for the super trustworthy pair of Apple and Google.
Why trust?
Consider the allegedly accurate information in “Report: Apple Gets a Cut of Search Revenue from Chrome As Part of Secret Google Deal.” No, this is not the money Google pays Apple to be the search engine in Safari. This “secret” is the alleged kickback from searches “made through some of Google’s own apps.” Who cares? My hunch is that the European Union will show an interest in this type of deal if the report is accurate.
Next consider “Google Continued to Ramp Up Federal Lobbying Spending before DOJ Filed Second Antitrust Lawsuit.” Am I surprised? Not really. The write up says:
In the last two years, Google’s parent company ramped up annual lobbying expenditures by nearly 50% — spending more than $13 million on federal lobbying in 2022 alone.
I wonder if Google is trying to exert some influence. I don’t know, but with Google cutting costs and telling people in Europe that ChatGPT is not thinking clearly, I wonder if the lobbying money might be put into other projects.
Now back to the ethics of using smart software to cheat in an ethics course about smart software.
Perfect for work at Apple and Google. A few may become lobbyists.
Stephen E Arnold, February 20, 2023
Fixing Bard with a Moma Badge As a Reward
February 17, 2023
I read an interesting news item from CNBC. Yep, CNBC. The story is “Google Asks Employees to Rewrite Bard’s Bad Responses, Says the A.I. Learns Best by Example.” The passage which caught my attention immediately was:
Prabhakar Raghavan, Google’s vice president for search, asked staffers in an email on Wednesday to help the company make sure its new ChatGPT competitor gets answers right. The email, which CNBC viewed, included a link to a do’s and don’ts page with instructions on how employees should fix responses as they test Bard internally.
Hypothetical Moma buttons for right fixes to Google Bard’s off-the-mark answers. Collect them all!
I don’t know much about Googlers, but from what I have observed, the concept “answers right” is fascinating. From my point of view, Googlers must know what is “right.” Therefore, Google can recognize what is wrong. The premise, if the sentence accurately reflects the wisdom of Sundar and Prabhakar, is that Google is all knowing.
Let’s look at one definition of all knowing. The source is the ever popular scribe, disabled, and so-so poet John Milton, who described the Google approach to fixing up its smart software by Google wizards, poobahs, and wonder makers. Milton pointed out his God’s approach to addressing a small problem:
What pleasure I from such obedience paid,
When will and reason (reason also is choice)
Useless and vain, of freedom both despoiled,
Made passive both, had served necessity,
Not me. (3.103-111) [Emphasis added, Editor]
Serving necessity? Question: When the software and systems are flawed, humans must intervene … of necessity?
Will Googlers try to identify wrong responses and remediate them? Yes.
Can Googlers determine “right” and “bad” information? Consider this: If these Googlers could, how does one explain the flawed software and systems which must be fixed by “necessity”?
I know Google’s senior managers are bright, but this intervention by the lesser angels strikes me as [a] expensive, [b] an engineering mess, and [c] demonstrating some darned wacky reasoning. But the task is hard. In fact, it is a journey:
… CEO Sundar Pichai asked employees to spend two to four hours of their time on Bard, acknowledging that “this will be a long journey for everyone, across the field.”
But the weirdness of the “field” metaphor is nothing compared to this stunning comment, which is allegedly dead accurate:
To incentivize people in his organization to test Bard and provide feedback, Raghavan said contributors will earn a “Moma badge…”
A Moma badge? A Moma badge? Like an “Also Participated” ribbon or a scouting patch for helping an elderly person across Shoreline Drive?
If the CNBC write up is accurately relating what a senior Googler said, Google’s approach manifests arrogance and a bit of mental neuropathy. My view is that the “Moma badge” thing smacks of a group of adolescents in a high school science club deciding to create buttons to award to themselves for setting the chem lab on fire. Good work, kids. Is the Moma badge an example of Google management insight?
I know one thing: I want a Moma badge… now.
Stephen E Arnold, February 17, 2023
Video: The Path to Non Understanding?
February 17, 2023
I try to believe “everything” I read on the Internet. I have learned that software can hallucinate because a Google wizard says so. I understand that Sam Bankman-Fried tried to do “good” as he steered his company to business school case study fame. I embrace the idea that movie stars find synthetic versions of themselves scary. Plus, I really believe the information in “Study: TikTok Increasingly Popular among Kids.” But do we need a study to “prove” what can be observed in a pizza joint, at the gym, or sitting at an interminable traffic light?
Here are some startling findings which are interesting and deeply concerning to me:
- From all app categories, children spent the most time on social media daily, averaging 56 mins/day, followed by online video apps (45 mins/day), and gaming (38 mins/day). [That adds up to about two and one-third hours per day, the same amount of time spent exercising, reading books about nuclear physics, and working on calculations about Hopf fibrations. A quick tally appears after this list.]
- While children increasingly spent more time on social media and video streaming apps, time on communications apps fell, with time on Zoom dipping by 21 per cent, and Skype by 37 per cent. [Who needs to interact when there are injections of content which can be consumed passively? Will consumers of digital media develop sheep-like characteristics and move away from a yapping Blue Heeler?]
- 70 per cent of parents assert that screens and technology are now a distraction from family time, and device use causes weekly or daily arguments in over 49 per cent of households. [Togetherness updated to 2023 norms is essential for a smoothly functioning society of thumbtypers.]
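For readers who want to check the bracketed tally above, here is a minimal sketch of the arithmetic using the study’s per-category figures; the category labels are my shorthand, not the study’s exact wording.

```python
# Minutes per day per category, as reported in the study cited above.
minutes_per_day = {"social media": 56, "online video": 45, "gaming": 38}

total_minutes = sum(minutes_per_day.values())   # 56 + 45 + 38 = 139
hours, minutes = divmod(total_minutes, 60)      # 2 hours, 19 minutes

print(f"Total: {total_minutes} minutes, about {hours} hours {minutes} minutes per day")
```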
The numbers seem to understate the problem; for example, people of any age can be observed magnetized to their digital devices in these settings:
- Standing on line anywhere
- Sitting on an exercise machine at 7 am absorbing magnetizing digital content
- Attending a Super Bowl party, a bar, or in a lecture hall
- Lying on a gurney waiting for a medical procedure
- Watching a live performance.
What do the data suggest? A fast track to non comprehension. Why understand when one can watch a video about cutting shuffle dance shapes? Who controls which targets see specific content? Is the framing of an issue important? What if an entity or an AI routine controls content injection directly into an individual’s brain? Control of content suggests control of certain behaviors in my opinion.
Stephen E Arnold, February 17, 2023
Google Pushback: Malik Aforethought?
February 16, 2023
High school reunions will be interesting this year — particularly in a country where youthful relationships persist for life. I read “A Well Known Tech Blogger and Venture Capitalist Says It Might Be Time for Google to Find a New CEO.” The write up includes a sentence I found intriguing about Sundar Pichai, the Google digital leader:
“Google’s board, including the founders, must ask: is Pichai the right guy to run the company, or is it time for Sundar to go? Does the company need a more offense minded CEO? Someone who is not satisfied with status quo, and willing to break some eggs?”
The Microsoft ChatGPT marketing thunderbolt may well put asunder Sundar.
The write up quotes the pundit Om Malik again:
“Google seems to have dragged its feet. The botched demo and lack of action around AI are symptoms of a bigger disease — a company entrapped in its past, inaction, and missed opportunities.”
Imagine. Attending a high school graduation hoedown in Mumbai and having to explain:
- Microsoft’s smart software scorched earth method
- Missing an “opportunity”
- Criticism from one of Silicon Valley’s most loved insiders.
Yep, long evening.
Stephen E Arnold, February 16, 2023
Google Points Out That ChatGPT Has a Core Neural Disorder: LSD or Spoiled Baloney?
February 16, 2023
I am an old-fashioned dinobaby. I have a reasonably good memory for great moments in search and retrieval. I recall when Danny Sullivan told me that search engine optimization improves relevance. In 2006, Prabhakar Raghavan on a conference call with a Managing Director of a so-so financial outfit explained that Yahoo had semantic technology that made Google’s pathetic effort look like outdated technology.
Hallucinating pizza courtesy of the super smart AI app Craiyon.com. The art, not the write up it accompanies, was created by smart software. The article is the work of the dinobaby, Stephen E Arnold. Looks like pizza to me. Close enough for horseshoes, like so many zippy technologies.
Now that SEO and its spawn are scrambling to find a way to fiddle with increasingly weird methods for making software return the results the search engine optimization crowd’s customers demand, Google’s head of search Prabhakar Raghavan is opining about the oh, so miserable work of OpenAI and its ChatGPT, now a TikTok-scale trend. May I remind you, gentle reader, that OpenAI availed itself of some Googley open source smart software and consulted with some Googlers as it ramped up to the tsunami of PR ripples? May I remind you that Microsoft said, “Yo, we’re putting some OpenAI goodies in PowerPoint”? The world rejoiced, and Reddit plus Twitter kicked into rave mode.
Google responded with a nifty roll out in Paris. February is not April, but maybe the roll out should have happened in April 2023, not in les temps d’hiver (the dead of winter)?
I read with considerable amusement “Google Vice President Warns That AI Chatbots Are Hallucinating.” The write up states, as rock solid “George Washington I cannot tell a lie” truth, the following:
Speaking to German newspaper Welt am Sonntag, Raghavan warned that users may be delivered complete nonsense by chatbots, despite answers seeming coherent. “This type of artificial intelligence we’re talking about can sometimes lead to something we call hallucination,” Raghavan told Welt Am Sonntag. “This is then expressed in such a way that a machine delivers a convincing but completely fictitious answer.”
LSD or just the Google code relied upon? Was it the Googlers of whom OpenAI asked questions? Was it reading the gems of wisdom in Google patent documents? Was it coincidence?
I recall that Dr. Timnit Gebru and her co-authors of the Stochastic Parrot paper suggest that life on the Google island was not palm trees and friendly natives. Nope. Disagree with the Google and your future elsewhere awaits.
Now we have the hallucination issue. The implication is that smart software like Google-infused OpenAI is addled. It imagines things. It hallucinates. It is living in a fantasy land with bean bag chairs, Foosball tables, and memories of Odwalla juice.
I wrote about the after-the-fact yip yap from Google’s Chairperson of the Board. I mentioned the Father of the Darned Internet’s post-ChatGPT PR blasts. Now we have the head of search’s observation about screwed up neural networks.
Yep, someone from Verity should know about flawed software. Yep, someone from Yahoo should be familiar with using PR to mask spectacular failure in search. Yep, someone from Google is definitely in a position to suggest that smart software may be somewhat unreliable because of fundamental flaws in the systems and methods implemented at Google and probably other outfits loving the Tensor T shirts.
Stephen E Arnold, February 16, 2023
Bing Gaffes: Errors Made by the Softies
February 16, 2023
Microsoft does a bang up job of marketing. I am not sure the security of the firm’s systems and the ability to allow users to print are comparable. But Microsoft has mindshare.
The images of commuters who missed their search express train are the brilliant work of Craiyon.com smart software. This write up is the work of the dinobaby Stephen E Arnold.
The word “google” now evokes pushback. MSFT’s mission is accomplished. To remind people of the missteps in smart Bing, Simon Willison has compiled an interesting list of flubs. These range from errors about vacuum cleaners to a few threats to users. If you enjoy looking at what happens when smart software demonstrates the thrill of recursive learning, you will want to read “Bing: ‘I Will Not Harm You Unless You Harm Me First’.” I quite liked the essay. I am confident that the Google smart software team will distribute the examples to the wizardly Googlers. Being forewarned of embedded and poorly understood error generation is a good thing. Too bad neither Google nor Microsoft considered the issue before the PR tsunami and the new arms race.
Stephen E Arnold, February 16, 2023
Google Wizards: Hey, We Knew But Did Not Intervene. Very Bard Like
February 15, 2023
I read two stories. Each offers a glimpse into what I call backing away and distancing. I think each reveals the failure of Google governance. You may disagree. That’s okay, particularly if the stories are horse feathers. My hunch is that there is a genetically warped turkey under the plumage.
The first item is from the increasingly sensational Insider. The story is “Google Didn’t Think Its Bard AI Was Really Ready for a Product Yet, Says Alphabet Chairman, Days after Its Stock Fell Following the Chatbot’s Very Public Mistake.” The write up pivots on information (allegedly 100 percent dead solid in the bull’s eye) provided by John Hennessy, the chairman of Alphabet. The chair person! What did this captain of the digital titan say? I quote from the write up:
“I think Google was hesitant to productize this because it didn’t think it was really ready for a product yet, but, I think, as a demonstration vehicle, it’s a great piece of technology.” He added Google was slow to introduce Bard because it was still giving wrong answers.
From my point of view, isn’t the Board of Directors, and specifically the Chair, supposed to provide what might be called governance guidance? Since this admission of “giving wrong answers” was made public after the disaster in a city where a great lunch is easy to obtain, I would suggest that the bowl of soupe à l’oignon was prepared from a bag of instant convenience food: not particularly good but perfect for a high school science club snack.
The second item is from CNet, which has some experience with smart software. The article is “Computing Guru Criticizes ChatGPT AI Tech for Making Things Up.” And who is the computing guru? None other than Vint Cerf, one of the fathers of the Internet, if I remember something I heard at a conference.
The CNet article reported as actual factual:
But, speaking Monday [February 13, 2023] at Celesta Capital’s TechSurge Summit, he did warn about ethical issues of a technology that can generate plausible sounding but incorrect information even when trained on a foundation of factual material. If an executive tried to get him to apply ChatGPT to some business problem, his response would be to call it snake oil, referring to bogus medicines that quacks sold in the 1800s, he said. Another ChatGPT metaphor involved kitchen appliances.
Then this allegedly accurate quotation from the father of the Internet and Google guru:
“It’s like a salad shooter — you know how the lettuce goes all over everywhere,” Cerf said. “The facts are all over everywhere, and it mixes them together because it doesn’t know any better.”
Did the Googlers crafting Bard run the demonstration by Mr. Cerf? Nope. The write up says:
Cerf said he was surprised to learn that ChatGPT could fabricate bogus information from a factual foundation. “I asked it, ‘Write me a biography of Vint Cerf.’ It got a bunch of things wrong,” Cerf said. That’s when he learned the technology’s inner workings — that it uses statistical patterns spotted from huge amounts of training data to construct its response. “It knows how to string a sentence together that’s grammatically likely to be correct,” but it has no true knowledge of what it’s saying, Cerf said. “We are a long way away from the self-awareness we want.”
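To make the “statistical patterns” point concrete, here is a toy sketch in Python: a tiny next-word table that chains together statistically likely words with no model of meaning. It illustrates the principle Cerf describes, not OpenAI’s or Google’s actual method, and the miniature corpus is invented for the example.

```python
import random
from collections import defaultdict

# A tiny corpus standing in for "huge amounts of training data."
corpus = (
    "the parrot repeats the phrase because the phrase is likely "
    "the parrot does not know what the phrase means"
).split()

# Record which word tends to follow which: the "statistical pattern."
follows = defaultdict(list)
for current_word, next_word in zip(corpus, corpus[1:]):
    follows[current_word].append(next_word)

def babble(start: str, length: int = 10) -> str:
    """Chain statistically likely next words together, with no notion of truth."""
    words = [start]
    for _ in range(length):
        candidates = follows.get(words[-1])
        if not candidates:
            break
        words.append(random.choice(candidates))
    return " ".join(words)

print(babble("the"))  # Grammatically plausible output, no knowledge behind it
```

Scale that table up to billions of parameters and a web-sized corpus and the output becomes fluent prose; the lack of grounding in fact is what Raghavan and Cerf are calling hallucination.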
It seems to me that if the father of the Internet is on staff, it would make sense to get some inputs.
Let’s recap:
- After the fact, the Chair of the Board points out known problems but takes no action to provide the governance that product performance requires. Seems like something slipped betwixt the cup and the lip.
- After the fact, the father of the Internet points out that he was “surprised” that ChatGPT-style smart software generated misinformation. Again … after the fact.
Is the company managed by responsible adults or individuals who believe themselves to be in a high school science club? Are Googlers indifferent to the need to get their act together before they take the show on the road?
I think the French could label either Googler’s comment as an observation offered in l’esprit de l’escalier (staircase wit). Accurate but not management.
Stephen E Arnold, February 15, 2023