Problematic Smart Algorithms

December 12, 2023

This essay is the work of a dumb dinobaby. No smart software required.

We already know that AI is fundamentally biased if it is trained on bad or polluted data. Most of these biases are unintentional, due to ignorance on the part of the developers; i.e., a lack of diversity or vetted information. To improve the quality of AI, developers are relying on educated humans to help shape the data models. Not all AI projects are looking to fix their polluted data, and ZDNet says that is going to be a huge problem: “Algorithms Soon Will Run Your Life-And Ruin It, If Trained Incorrectly.”

Our lives are saturated with technology that has incorporated AI. Everything from a smartphone application to a digital assistant like Alexa or Siri uses AI. The article tells us about another type of biased data, and it is due to an ironic problem. The science team of Aparna Balagopalan, David Madras, David H. Yang, Dylan Hadfield-Menell, Gillian Hadfield, and Marzyeh Ghassemi worked on an AI project that studied how AI algorithms justified their predictions. The data model contained information from human respondents who provided different responses when asked to give descriptive or normative labels for data.

Descriptive data concentrates on hard facts, while normative data focuses on value judgments. The team noticed the pattern, so they conducted another experiment with four data sets to test different policies. The study asked respondents to judge an apartment complex’s policy about aggressive dogs against images of canines with normative or descriptive tags. The results were astounding and scary:

"The descriptive labelers were asked to decide whether certain factual features were present or not – such as whether the dog was aggressive or unkempt. If the answer was "yes," then the rule was essentially violated — but the participants had no idea that this rule existed when weighing in and therefore weren’t aware that their answer would eject a hapless canine from the apartment.

Meanwhile, another group of normative labelers were told about the policy prohibiting aggressive dogs, and then asked to stand judgment on each image.

It turns out that humans are far less likely to label an object as a violation when aware of a rule and much more likely to register a dog as aggressive (albeit unknowingly) when asked to label things descriptively.

The difference wasn’t by a small margin either. Descriptive labelers (those who didn’t know the apartment rule but were asked to weigh in on aggressiveness) had unwittingly condemned 20% more dogs to doggy jail than those who were asked if the same image of the pooch broke the apartment rule or not.”

The conclusion is that AI developers need to spread the word about this problem and find solutions. Then again, this could be another fear-mongering tactic like the Y2K panic. What happened with that? Nothing. Yes, this is a problem, but it will probably be solved before society meets its end.

Whitney Grace, December 12, 2023

Amazon Offers AI-Powered Review Consolidation for Busy Shoppers

September 6, 2023

I read the reviews for a product. I bought the product. Reality was — how shall I frame it — different from the word pictures. Trust those reviews? Hmmm. So far, Amazon’s generative AI focus has been on supplying services to developers on its AWS platform. Now, reports ABC News, “Amazon Is Rolling Out a Generative AI Feature that Summarizes Product Reviews.” Writer Haleluya Hadero tells us:

“The feature, which the company began testing earlier this year, is designed to help shoppers determine at a glance what other customers said about a product before they spend time reading through individual reviews. It will pick out common themes and summarize them in a short paragraph on the product detail page.”
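As a toy illustration only — a crude word-frequency stand-in, not Amazon’s actual generative pipeline — “picking out common themes” from a pile of reviews can be sketched like this:

```python
from collections import Counter
import re

def common_themes(reviews, top_n=3):
    """Count recurring content words across reviews -- a crude
    stand-in for the theme extraction the article describes."""
    stopwords = {"the", "a", "is", "it", "and", "this", "was", "i", "in", "still"}
    words = []
    for review in reviews:
        words += [w for w in re.findall(r"[a-z]+", review.lower())
                  if w not in stopwords]
    # Most frequent surviving words stand in for "themes"
    return [word for word, _ in Counter(words).most_common(top_n)]

reviews = [
    "Battery life is great but the screen scratches easily",
    "Great battery, mediocre screen",
    "The screen cracked in a week; battery still great",
]
print(common_themes(reviews))  # battery, great, and screen dominate
```

Of course, this sketch makes the article’s point for it: if the input reviews are fake or shill-written, the “themes” faithfully summarize the fakery.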

A few mobile shoppers have early access to the algorithmic summaries while Amazon tweaks the tool with user feedback. Eventually, the company said, shoppers will be able to surface common themes in reviews. Sounds nifty, but there is one problem: Consolidating reviews that are fake, generated by paid shills, or just plain wrong does nothing to improve their accuracy. But Amazon is more eager to jump on the AI bandwagon than to perform quality control on its reviews system. We learn:

“The Seattle-based company has been looking for ways to integrate more artificial intelligence into its product offerings as the generative AI race heats up among tech companies. Amazon hasn’t released its own high-profile AI chatbot or imaging tool. Instead, it’s been focusing on services that will allow developers to build their own generative AI tools on its cloud infrastructure AWS. Earlier this year, Amazon CEO Andy Jassy said in his letter to shareholders that generative AI will be a ‘big deal’ for the company. He also said during an earnings call with investors last week that ‘every single one’ of Amazon’s businesses currently has multiple generative AI initiatives underway, including its devices unit, which works on products like the voice assistant Alexa.”

Perhaps one day Alexa will recite custom poetry or paint family portraits for us based on the eavesdropping she’s done over the years. Heartwarming. One day, sure.

Cynthia Murrell, September 19, 2023

Wanna Be an AI Entrepreneur? Part 2

August 17, 2023

MIT digital-learning dean Cynthia Breazeal and Yohana founder Yoky Matsuoka have a message for their entrepreneurship juniors. Forbes shares “Why These 50 Over 50 Founders Say Beware of AI ‘Hallucination’.” It is easy to get caught up in the hype around AI and leap into the fray before looking. But would-be AI entrepreneurs must approach their projects with careful consideration.


An entrepreneur “listens” to the AI experts. The AI machine spews money to the entrepreneur. How wonderful new technology is! Thanks, MidJourney, for not asking me to appeal this image.

Contributor Zoya Hansan introduces these AI authorities:

“‘I’ve been watching generative AI develop in the last several years,’ says Yoky Matsuoka, the founder of a family concierge service called Yohana, and formerly a cofounder at Google X and CTO at Google Nest. ‘I knew this would blow up at some point, but that whole ‘up’ part is far bigger than I ever imagined.’

Matsuoka, who is 51, is one of the 20 AI maestros, entrepreneurs and science experts on the third annual Forbes 50 Over 50 list who’ve been early adopters of the technology. We asked these experts for their best advice to younger entrepreneurs leveraging the power of artificial intelligence for their businesses, and each one had the same warning: we need to keep talking about how to use AI responsibly.”

The pair have four basic cautions. First, keep humans on board. AI can often offer up false information, a problem known as “hallucinations.” Living, breathing workers are required to catch and correct these mistakes before they lead to embarrassment or even real harm. The founders also suggest putting guardrails on algorithmic behavior; in other words, impose a moral (literal) code on one’s AI products. For example, eliminate racial and other biases, or refuse to make videos of real people saying or doing things they never said or did.

In terms of launching a business, resist pressure to start an AI company just to attract venture funding. Yes, AI is the hot thing right now, but there is no point if one is in a field where it won’t actually help operations. The final warning may be the most important: “Do the work to build a business model, not just flashy technology.” The need for this basic foundation of a business does not evaporate in the face of hot tech. Learn from Breazeal’s mistake:

“In 2012, she founded Jibo, a company that created the first social robot that could interact with humans on a social and emotional level. Competition with Amazon’s Alexa—which takes commands in a way that Jibo, created as a mini robot that could talk and provide something like companionship, wasn’t designed to do—was an impediment. So too was the ability to secure funding. Jibo did not survive. ‘It’s not the most advanced, best product that wins,’ says Breazeal. ‘Sometimes it’s the company who came up with the right business model and figured out how to make a profit.'”

So would-be entrepreneurs must proceed with caution, refusing to let the pull of the bleeding edge drag them ahead of themselves. But not too much caution.

Cynthia Murrell, August 17, 2023

Need Research Assistance, Skip the Special Librarian. Go to Elicit

July 17, 2023

Note: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid.

Academic databases are the bedrock of research. Unfortunately, most of them are hidden behind paywalls. If researchers get past the paywalls, they encounter other problems with result accuracy and access to texts. Databases have improved over the years, but AI algorithms make things better. Elicit is a new database marketed as a digital assistant with less intelligence than Alexa, Siri, and Google, but one that can comprehend simple questions.


“This is indeed the research library. The shelves are filled with books. You know what a book is, don’t you? Also, you will find that this research library is not used too much any more. Professors just make up data. Students pay others to do their work. If you wish, I will show you how to use the card catalog. Our online public access terminal and library automation system does not work. The university’s IT department is busy moonlighting for a professor who is a consultant to a social media company,” says the senior research librarian.

What exactly is Elicit?

“Elicit is a research assistant using language models like GPT-3 to automate parts of researchers’ workflows. Currently, the main workflow in Elicit is Literature Review. If you ask a question, Elicit will show relevant papers and summaries of key information about those papers in an easy-to-use table.”

Researchers use Elicit to guide their research and discover papers to cite. In feedback, researchers said they use Elicit to answer questions, find paper leads, and get better exam scores.

Elicit proves its intuitiveness with its AI-powered research tools. Search results contain papers that do not match the keywords but semantically match the query meaning. Keyword matching also allows researchers to narrow or expand specific queries with filters. The summarization tool creates a custom summary based on the research query and simplifies complex abstracts. The citation graph semantically searches citations and returns more relevant papers. Results can be organized and more information added without creating new queries.
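The distinction between keyword matching and semantic matching can be shown with a toy sketch. The two-dimensional “embeddings” below are hand-assigned so that related words land near each other; real systems like the language models behind Elicit learn high-dimensional vectors from data, so this is an illustration of the principle, not of Elicit’s implementation:

```python
import math

# Toy hand-assigned "embeddings": "dog" and "canine" are placed close
# together, "tax" far away. Real systems learn these vectors.
EMBEDDINGS = {
    "dog":    [0.90, 0.10],
    "canine": [0.85, 0.15],
    "tax":    [0.10, 0.90],
}

def cosine(a, b):
    """Cosine similarity between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def keyword_match(query, doc):
    """Literal keyword matching: the exact token must appear."""
    return query in doc.split()

# A keyword search for "dog" misses a document about a "canine" ...
print(keyword_match("dog", "aggressive canine behavior"))            # False
# ... but the embeddings reveal that the two terms are near-synonyms.
print(round(cosine(EMBEDDINGS["dog"], EMBEDDINGS["canine"]), 3))     # ~0.998
print(round(cosine(EMBEDDINGS["dog"], EMBEDDINGS["tax"]), 3))        # ~0.220
```

This is why a semantic search can return papers that never contain the query’s exact keywords yet still match its meaning.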

Elicit does have limitations, such as the inability to evaluate information quality. Also, Elicit is still a new tool, so mistakes will be made as development continues. Elicit does warn users about mistakes and advises them to use tried-and-true, old-fashioned research methods of evaluation.

Whitney Grace, July 16, 2023

The Seven Wonders of the Google AI World

May 12, 2023

Note: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid.

I read the content at this Google Web page: https://ai.google/responsibility/principles/. I found it darned amazing. In fact, I thought of the original seven wonders of the world. Let’s see how Google’s statements compare with the down-through-time achievements of mere mortals from ancient times.

Let’s imagine two comedians explaining the difference between the two important sets of landmarks in human achievement. Here are the entertainers. These impressive individuals are a product of MidJourney’s smart software. The drawing illustrates the possibilities of artificial intelligence applied to regular intelligence and a certain big ad company’s capabilities. (That’s humor, gentle reader.)


Here are the seven wonders of the world according to the semi-reliable National Geographic (I loved those old Nat Geos when I was in the seventh grade in 1956-1957!):

  1. The pyramids of Giza (tombs or alien machinery, take your pick)
  2. The hanging gardens of Babylon (a building with a flower show)
  3. The temple of Artemis (goddess of the hunt for maybe relevant advertising?)
  4. The statue of Zeus (the thunder god like Googzilla?)
  5. The mausoleum at Halicarnassus (a tomb)
  6. The colossus of Rhodes (Greek sun god who inspired Louis XIV and his just-so-hoity toity pals)
  7. The lighthouse of Alexandria (bright light which baffles some who doubt a fire can cast a bright light to ships at sea)

Now the seven wonders of the Google AI world:

  1. Socially beneficial AI (how does AI help those who are not advertisers?)
  2. Avoid creating or reinforcing unfair bias (What’s Dr. Timnit Gebru say about this?)
  3. Be built and tested for safety? (Will AI address videos on YouTube which provide links to cracked software; e.g., this one?)
  4. Be accountable to people? (Maybe people who call for Google customer support?)
  5. Incorporate privacy design principles? (Will the European Commission embrace the Google, not litigate it?)
  6. Uphold high standards of scientific excellence? (Interesting. What’s “high” mean? What’s scientific about threshold fiddling? What’s “excellence”?)
  7. AI will be made available for uses that “accord with these principles”. (Is this another “Don’t be evil” moment?)

Now let’s evaluate in broad strokes the two sets of seven wonders. My initial impression is that the ancient seven wonders were tangible, not based on the future tense, the progressive tense, and breathing the exhaust fumes of OpenAI and others in the AI game. After a bit of thought, I am not sure Google’s management will be able to convince me that its personnel policies, its management of its high school science club, and its knee-jerk reaction to the Microsoft Davos slam dunk are more than bloviating. Finally, the original seven wonders are either ruins or lost to all but a MidJourney reconstruction or a Bing output. Google is in the “careful” business. Translating: Google is Googley. OpenAI and ChatGPT are delivering blocks and stones for a real wonder of the world.

Net net: The ancient seven wonders represent something to which humans aspired or honored. The Google seven wonders of AI are, in my opinion, marketing via uncoordinated demos. However, Google will make more money than any of the ancient attractions did. The Google list may be perfect for the next Sundar and Prabhakar Comedy Show. Will it play in Paris? The last one there flopped.

Stephen E Arnold, May 12, 2023

The Chivalric Ideal: Social Media Companies as Jousters or Is It Jesters?

April 12, 2023

Note: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid.

As a dinobaby, my grade school education included some biased, incorrect, yet colorful information about the chivalric ideal. The basic idea was that knights were governed by chivalric social codes. And what were these, pray tell, squire? As I recall from Miss Soapes, my seventh grade teacher, the guts included honor, honesty, valor, and loyalty. Scrape away the glittering generalities from the disease-riddled, classist, and violent Middle Ages, and the recipe for knightly success was this: follow the precepts of the much-beloved Church, open doors for ladies, and embody the characters of Sir Gawain, Lancelot, and King Arthur, with a heaping dose of Hector of Troy, Alexander the Great (who, by the way, figured out pretty quickly that what is today Afghanistan would be tough to conquer), and baloney gathered by Ramon Llull.

Flash forward to 2023, and it appears that the chivalric ideals are back in vogue. “Google, Meta, Other Social Media Platforms Propose Alliance to Combat Misinformation” explains that social media companies have written a five-page “proposal.” The recipient is the Indian Ministry of Electronics and IT. (India is a juicy market for social media outfits not owned by Chinese interests… in theory.)

The article explains that a proposed alliance of outfits like Meta and Google:

will act as a “certification body” that will verify who a “trusted” fact-checker is.

Obviously these social media companies will embrace the chivalric ideals to slay the evils of weaponized, inaccurate, false, and impure information. These companies mount their bejeweled hobby horses and gallop across the digital landscape. The actions evidence honor, loyalty, justice, generosity, prowess, and good manners. Thrilling. Cinematic in scope.

The article says:

Social media platforms already rely on a number of fact checkers. For instance, Meta works with fact-checkers certified by the International Fact-Checking Network (IFCN), which was established in 2015 at the US-based Poynter Institute. Members of IFCN review and rate the accuracy of stories through original reporting, which may include interviewing primary sources, consulting public data and conducting analyses of media, including photos and video. Even though a number of Indian outlets are part of the IFCN network, the government, it is learnt, does not want a network based elsewhere in the world to act on content emanating in the country. It instead wants to build a homegrown network of fact-checkers.

Will these white knights defeat the blackguards who would distort information? But what if the companies slaying the inaccurate factoids are implementing a hidden agenda? What if the companies are themselves manipulating information to gain an unfair advantage over any entity not part of the alliance?

Impossible. These are outfits which uphold the chivalric ideals. Truth, honor, etc., etc.

The historical reality is that chivalry was cooked up by nervous “rulers” in order to control the knights. Remember the phrase “knight errant”?

My hunch is that the alliance may manifest some of the less desirable characteristics of the knights of old; namely, weapons, big horses, and a desire to do what was necessary to win.

Knights, mount your steeds. To battle in a far off land redolent with exotic spices and revenue opportunities. Toot toot.

Stephen E Arnold, April 2023

The Gray Lady: Calling the Winner of the AI Race

March 17, 2023

Editor’s Note: Written by a genuine dinobaby with some editorial inputs from Stephen E Arnold’s tech elves.

I love it when “real journalists” predict winners. Does anyone remember the Dewey thing? No, that’s okay. I read “How Siri, Alexa and Google Assistant Lost the AI Race.” The title reminds me of English 101 “how to” essays. (A publisher once told me that “how to” books were the most popular nonfiction book type. Today the TikTok video may do the trick.)

The write up makes a case for OpenAI and ChatGPT winning the smart software race. Here’s a quote I circled:

The excitement around chatbots illustrates how Siri, Alexa and other voice assistants — which once elicited similar enthusiasm — have squandered their lead in the A.I. race.

Squandering a lead is not exactly losing a race, at least here in Kentucky. Races can be subject to skepticism, but in the races I have watched, a horse wins, gets a ribbon, the owner receives hugs and handshakes, and publicity. Yep, publicity. Good stuff.

The write up reports or opines:

Many of the big tech companies are now racing to come up with responses to ChatGPT.

Is this me-too innovation? My thought is that the article is not a how-to; it’s an editorial opinion.

My reaction to the story is that the “winner” is the use of OpenAI type technology with a dialogue-type interface. The companies criticized for squandering a lead are not dead in their stable stall. Furthermore, smart software is not new. The methods have been known for years. What’s new is that computational resources are more readily available. Content is available without briar patches like negotiating permissions and licenses to recycle someone else’s data. Code libraries are available. Engineers and programmers are interested in doing something with the AI Lego blocks. People with money want to jump on the high speed train even if the reliability and the destination are not yet known.

I would suggest that the Gray Lady’s analysis is a somewhat skewed way to point out that some big tech outfits have bungled and stumbled.

The race, at least here in Harrod’s Creek, is not yet over. I am not sure the nags are out of their horse carriers yet. Why not criticize in the context of detailed, quite specific technical, financial, and tactical factors? I will answer my own question, “The Gray Lady has not gotten over how technology disrupted the era of big newspapers as gatekeepers.”

How quickly will the Gray Lady replace “real journalists” (often with agendas) with cheaper, faster software?

I will answer my own question, “Faster than some of the horses running in the Kentucky Derby this year.”

Stephen E Arnold, March 17, 2023

Search and Retrieval: A Sub Sub Assembly

January 2, 2023

What’s happening with search and retrieval? Google’s results irritate some; others are happy with Google’s shaping of information. Web competitors exist; for example, Kagi.com and Neeva.com. Both are subscription services. Others provide search results “for free”; examples include Swisscows.com and Yandex.com. You can find metasearch systems (minimal original spidering, just recycling results from other services like Bing.com); for instance, StartPage.com (formerly Ixquick.com) and DuckDuckGo.com. Then there are open source search options. The flagship or flagships are Solr and Lucene. Proprietary systems exist too. These include the ageing X1.com and the even age-ier Coveo system. Remnants of long-gone systems are kicking around too; to wit, BRS and Fulcrum from OpenText, Fast Search, now a Microsoft property, and Endeca, owned by Oracle. But let’s look at search as it appears to a younger person today.


A decayed foundation created via smart software on the Mage.space system. A flawed search and retrieval system can make the structure built on the foundation crumble like Southwest Airlines’ reservation system.

First, the primary means of access is via a mobile device. Surprisingly, the source of information for many is video content delivered by the China-linked TikTok or the advertising remora YouTube.com. In some parts of the world, the go-to information system is Telegram, developed by Russian brothers. This is a centralized service, not a New Wave Web 3 confection. One can use the service and obtain information via a query or a group. If one is “special,” an invitation to a private group allows access to individuals providing information about open source intelligence methods or the Russian special operation, including allegedly accurate video snips of real-life war or disinformation.

The challenge is that search is everywhere. Yet in the real world, finding certain types of information is extremely difficult. Obtaining that information may be impossible without informed contacts, programming expertise, or money to pay what would have been called “special librarian research professionals” in the 1980s. (Today, it seems, everyone is a search expert.)

Here’s an example of the type of information which is difficult if not impossible to obtain:

  • The ownership of a domain
  • The ownership of a Tor-accessible domain
  • The date at which a content object was created, the date the content object was indexed, and the date or dates referenced in the content object
  • Certain government documents; for example, unsealed court documents, US government contracts for third-party enforcement services, authorship information for a specific Congressional bill draft, etc.
  • A copy of a presentation made by a corporate executive at a public conference.
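Take the first bullet. Even a programmatic WHOIS lookup usually dead-ends in a privacy shield. The record below is a made-up sample, not a live query, but it mirrors what such lookups commonly return:

```python
import re

# A typical privacy-shielded WHOIS record (fabricated sample text,
# not the output of a live query).
SAMPLE_WHOIS = """\
Domain Name: example-site.com
Registrar: Example Registrar, LLC
Registrant Name: REDACTED FOR PRIVACY
Registrant Organization: Privacy service provider
Registrant Email: Select Contact Domain Holder link at registrar site
"""

def registrant(record):
    """Pull the registrant name field out of a raw WHOIS record."""
    match = re.search(r"Registrant Name:\s*(.+)", record)
    return match.group(1).strip() if match else None

print(registrant(SAMPLE_WHOIS))  # REDACTED FOR PRIVACY
```

The lookup “succeeds,” yet the ownership question stays unanswered — which is exactly the findability flaw the list above describes.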

I can provide other examples, but I wanted to highlight the flaws in today’s findability.


How Regulation Works: Irritate Taylor Swift and Find Out

December 29, 2022

Ticketmaster and its parent company Live Nation have been scamming consumers for decades. There was a lawsuit in the 2010s about inflated service fees that Ticketmaster lost. Plaintiffs were awarded gift certificates in minuscule amounts that could not be combined and had expiration dates. The bigger question, Engadget asks, is why it took a pop star to force the federal government into action: “Ticketmaster’s Taylor Swift Fiasco Sparks Senate Antitrust Hearing.”

Ticketmaster screwed up tickets for Taylor Swift’s first tour in five years. The ticket seller’s systems were overwhelmed by fourteen million people, including bots, when tickets went up for sale. Ticketmaster’s Web site was hit with 3.5 million system requests.

Ticketmaster assured Swift it could handle the mass of fans, but she was “pissed off” when it failed.

“Sens. Amy Klobuchar (D-MN) and Mike Lee (R-UT), the chair and ranking member of the Senate Judiciary Subcommittee on Competition Policy, Antitrust and Consumer Rights, have announced a hearing to gather evidence on competition in the ticketing industry. They have yet to confirm when the hearing will take place or the witnesses that the committee will call upon.”

New York Representative Alexandria Ocasio-Cortez stated Live Nation should be broken up. The US government has been investigating Live Nation’s monopoly for several months, but the Swift fiasco has garnered the issue more public attention.

Ticketmaster was sued in the past for similar issues and the company lost. Why is Live Nation allowed to continue its poor business practices?

Whitney Grace, December 29, 2022

Want Clicks? Put War Videos on TikTok

December 20, 2022

Here is another story about the importance of click-throughs to social media companies, repercussions be damned. BBC News reports, “Russian Mercenary Videos ‘Top 1 Bn Views’ on TikTok.” The mercenary band in these videos, known as the Wagner Group, is helping Russia fight its war against Ukraine. Writer Alexandra Fouché cites a recent report from NewsGuard as she reveals:

“NewsGuard said it had identified 160 videos on the short-video platform that ‘allude to, show, or glorify acts of violence’ by the mercenary group, founded by Yevgeny Prigozhin, a close ally of President Vladimir Putin. Fourteen of those videos showed full or partial footage of the apparent killing of former Russian mercenary Yevgeny Nuzhin which saw high engagement within days of being uploaded last month, it said.”

That brutal murder, which was performed with a sledgehammer, was viewed over 900,000 times on TikTok before ByteDance took it down. Nuzhin was apparently killed because he switched sides and denounced the Wagner Group. Sadly but surely, there are many viewers who would seek out such footage; why blame TikTok for its spread? The article continues:

“NewsGuard found that TikTok’s algorithm appeared to push users towards violent Wagner Group content. When an analyst searched for the term ‘Wagner’, TikTok’s search bar suggested searches for ‘Wagner execution’ and ‘Wagner sledgehammer’. The same search in Russian resulted in the suggestions ‘Wagner PMC’, ‘Wagner sledgehammer’ and ‘Wagner orchestra’. Wagner refers to its fighters as ‘musicians’. NewsGuard also found that videos could be found on TikTok showing another Wagner murder involving an army deserter in Syria in 2017 and that they had reached millions of users.

The online analysis group said it had also identified other music videos on the platform that advocated violence against Ukrainians, including calls to kill Ukrainians claiming they were ‘Nazis’.”

Funny, when I searched Google for “Wagner,” the first three results my filter bubble turned up were composer Richard Wagner’s Wikipedia page, Wagner paint sprayer’s home page, and Staten Island’s Wagner College. Some actual news articles about the Wagner Group followed, but nary a violence glorification video in sight. TikTok certainly knows how to generate clicks. But what about China’s “reeducation” camps? The Chinese company is not circulating videos of those, is it? It seems the platform can be somewhat selective, after all.

Cynthia Murrell, December 20, 2022
