Cheap Training for Machine Learning Is Not Hyped Enough. Believe It or Not

December 6, 2022

I read an interesting article titled “Counting the Cost of Training Large Language Models.” The write up contains a statement which provides insight into the type of blind spots that plague whiz bang smart software companies. Here’s the statement which struck me as amusing and revelatory:

It has been becoming increasingly clear – anecdotally at least – just how expensive it is to train large language models and recommender systems…

Two points. Anyone who took the time to ask about the cost of retraining a Bayesian and neurolinguistic system from the late 1990s would have learned: [a] Smart software, even a relatively simple implementation, requires refined and curated training data before a system is deployed. This work is tedious and requires subject matter specialists. Then there is testing and fiddling with knobs and dials before the software becomes operational. [b] The smart software requires retraining with updated data sets, calibration, and testing on a heartbeat. For some Autonomy plc type systems, the retraining could be necessary every 180 days or when “drift” became evident. Users complain, and that’s how one knows the system is lost in the tiny nooks and crannies of lots of infinitesimals adding up to a dust pile in a dark corner of a complex system.

After three decades of information available about the costs of human centric involvement in making smart software less stupid, one would think that the whiz kids would have done some homework. Oh, right. If the information is not in the first 15 items in a Google search result, there are no data. Very modern.

The write up identifies a number of companies with ways to chop down training costs. To be clear, the driving idea for Snorkel from the Stanford AI Lab is reducing the costs of building training sets. The goal is to be “close enough for horseshoes” or “good enough.” Cut the costs and deal with issues with some software wrappers. Package up the good enough training data and one has a way to corner the market for certain ML applications. But it’s not just the Google. Amazon AWS is in the hunt for this off-the-shelf approach to machine learning. I think of it as the 7-11 approach to getting a meal: Cheap, quick, and followed by a Big Gulp.

The write up has a number of charts. These are okay, but I am not sure about the provenance of the data presented. But that’s just my skepticism for content marketing type write ups. There are even “cost per one million parameters” data. Interesting but who compiled the data, what methods were used to generate the numbers, and who vetted the project itself? Annoying questions? Sure. Important? Not to true believers.

But I know the well educated, well informed funding sources and procurement officials will love this conclusion:

some people will rent to train, and then as they need to train more and also train larger models, the economics will compel them to buy.

Yep, but what about the issue of “close enough for horseshoes”? Yep, here’s another annoying question: Is this article the kick off for another hype campaign? My initial reaction is, “Yes.”

Stephen E Arnold, December 6, 2022

Small Snowden Item: Not Rooting for US Soccer Team?

December 6, 2022

I think the answer to the question, “Is Edward Snowden rooting for the US soccer team?” is no. I read “Edward Snowden Swears Allegiance to Russia and Receives Passport, Lawyer Says”. [Note: In the spirit of capitalism, you will have to pay to view the original story.] The Bezos affiliated real news outfit said:

It’s unclear whether Snowden swore the oath of allegiance at the same time as he was granted a passport, but the two are common procedures when foreigners become Russian citizens. The text includes swearing “to protect the freedom and independence of the Russian Federation, to be loyal to Russia, to respect its culture, history and traditions,” and to promise to “perform the duties of a citizen of the Russian Federation for the good of the state and society.” Kucherena [The estimable Mr. Snowden’s legal eagle] added that Snowden’s wife, Lindsay Mills, was also undergoing the Russian citizenship application process and that the couple’s children would likely attend Russian schools, when ready.

Interesting. I assume information will surface about the forthcoming Russian film “Dinner with Vlad” starring the bold, brave bag man Mr. Snowden and the somewhat weighty Mr. Seagal. The plot is, as I understand it, Vlad asks his guests about Russia’s most appealing aspect. Mr. Snowden says, “It’s the great Internet connections”, and Mr. Seagal says, “It’s the food.” The three stars drink Russian vodka and engage in an arm wrestling competition. Vlad wins and the three drooks head to a cover band featuring Pussy Riot tunes. Mr. Snowden and Mr. Seagal give inspired lectures during the band’s break. Males in the audience are enlisted. Females? Well, fade to black.

Stephen E Arnold, December 6, 2022

Apple Factoid or Why a US Company Shows Affection for Pandas (Digital and Furry)

December 6, 2022

I spotted an article with a killer title: “Apple Reaches Highest Ever Monthly Market Share in China.” What’s the factoid? The write up provides what may be a semi credible factoid:

One in every four devices sold in China during October 2022 was an iPhone.

Here’s a passage from the write up I found intriguing:

Apple has been reaching new heights in terms of market share in China during the last two years. It reached a record monthly market share in November and December 2020, and in October, November and December 2021. Notably, 2020 was also the year when US sanctions were imposed on Huawei.

The article provides no information about why a US company is thriving in an environment of restrictions on certain Chinese-US interactions. Perhaps there is information to be found, but it is not in reports of what appear to be significant sales by a US firm in the Middle Kingdom.

Stephen E Arnold, December 6, 2022

TikTok: Back in the Surveillance Spotlight?

December 6, 2022

In western countries, especially the United States, TikTok is a platform showcasing the worst of its citizens. It also encourages poor behavior due to mob mentality/crowd psychology. Did you know that China owns TikTok and uses it to collect data on US citizens? It is probably manipulating algorithms to show Americans the worst of the worst as well. The FBI is finally catching on that TikTok is not a benign social media platform, but it is probably too little too late.

CNBC wrote that, “FBI Is ‘Extremely Concerned’ About China’s Influence Through TikTok On US Users.” FBI Director Christopher Wray warned US lawmakers about the potential threat TikTok poses:

“ ‘We do have national security concerns at least from the FBI’s end about TikTok,’ Wray told members of the House Homeland Security Committee in a hearing about worldwide threats. ‘They include the possibility that the Chinese government could use it to control data collection on millions of users. Or control the recommendation algorithm, which could be used for influence operations if they so chose. Or to control software on millions of devices, which gives it opportunity to potentially technically compromise personal devices.’”

TikTok’s parent company ByteDance denies any bad actions and condemns anyone who claims TikTok is anything more than a short video-sharing platform. The Hill has a similar take on the same story “FBI Head: China Has ‘Stolen More’ US Data ‘Than Every Other Nation Combined’” and uses the same quote from Wray but includes an additional one:

“There are still unresolved questions about data sharing between Chinese companies and the government in Beijing, said Wray, adding that ‘there’s a number of concerns there as to what is actually happening and actually being done.’”

What is interesting about China is that it is one of the world’s oldest countries and its cultural mentality is different from that of the West. China could be patiently playing the long game to subvert the US government with the help of its citizens. How? It systematically uses TikTok to condition Americans’ attention spans to be shorter and to encourage bad behavior.

Why is the FBI only concerned now?

Whitney Grace, December 6, 2022

Hot Take Resulting from Google Method

December 5, 2022

I read “Hot Take: Google Has a Company Strategy, Not a Product Strategy.” The write up explains that Google thinks like this:

Hire all the smart people and let them build. Hire all the smart people so they can’t work at a competitor. Hire all the smart people even if we don’t have something important for them to work on. Google acts like a venture capitalist, investing in promising people with the expectation that most will fail. They invest broadly in search of the idea that will deliver 100x. Let 1000 flowers bloom, and see which are the best.

You may agree or disagree with this statement. It is probably helpful if one has worked as an employee at Google or a consultant to the firm. But that does not stop Silicon Valley types from expressing their views of the world as information gleaned from an Egyptian ruler’s tomb.

I noted this statement in the comments to the article:

romwell said: Hot take: Google doesn’t have a strategy, period. Neither company, nor product.

In numerous articles and my monographs about Google, I have emphasized one point which, to me, encapsulates the company’s remarkable 25 year trajectory.

The firm made use of ideas developed at GoTo.com, Overture.com, and Yahoo.com. Those ideas converted Google from a mechanism for searching the content on the Web into a platform for advertising. By keeping one’s eye on the advertising ball, it’s clear that Alphabet YouTube Google DeepMind has been struggling to find a revenue winner.

Net net: As romwell said, “Google doesn’t have a strategy, period.” Had Yahoo not settled the court case for $1 billion prior to the IPO, Google would have become another AllTheWeb.com, Lycos.com, or one of the many other outfits indexing problematic content.

Stephen E Arnold, December 5, 2022

A Legal Information Truth Inconvenient, Expensive, and Dangerous

December 5, 2022

The Wall Street Journal published “Justice Department Prosecutors Swamped with Data As Cases Leave Long Digital Trails.” The write up addressed a problematic reality without craziness. The basic idea is that prosecutors struggle with digital information. The consequences are higher costs and in some cases allowing potentially problematic individuals to go to Burger King or corporate practices to chug along with felicity.

The article states:

Federal prosecutors are swamped by data, as the way people communicate and engage in behavior scrutinized by investigators often leaves long and complicated digital trails that can outpace the Justice Department’s technology.

What’s the fix? This is a remarkable paragraph:

The Justice Department has been working on ways to address the problem, including by seeking additional funding for electronic-evidence technology and staffing for US attorney’s offices. It is also providing guidance in an annual training for prosecutors to at times collect less data.

Okay, more money which may or may not be spent in a way to address the big data issues, more lawyers (hopefully skilled in manipulating content processing systems functions), annual training, and gather less information germane to a legal matter. I want to mention that misinformation, reformation of data, and weaponized data are apparently not present in prosecutors’ data sets or not yet recognized as a problem by the Justice Department.

My response to this interesting article includes:

  1. This is news? The issue has been problematic for many years. The vendors of specialized systems to manage evidence, index and make searchable content from disparate sources, and output systems which generate a record of what lawyer accessed what and when are asserting their systems can handle this problem. Obviously licensees discover that the systems either don’t work like the demos or cannot handle large flows of disparate content.
  2. The legal industry is not associated with groundbreaking information innovation. I may be biased, but I think of lawyers knowing more about billing for their time than making use of appropriate, reliable technology for managing evidence. Excel timesheets are one thing. Dark Web forum content, telephone intercepts, and context free email and chat messages are quite different. Annual training won’t change the situation. The problem has to be addressed by law schools and lawyer certification systems. Licensing a super duper search system won’t deal with the problem no matter what consultants, vendors, and law professors say.
  3. The issue of “big data” is real, particularly when there are so many content objects available to a legal team, its consultants, and the government professionals working on a case or a particular matter. It is just easier to gather and then try to make sense of the data. When the necessary information is not available, time or money runs out and everyone moves on. Big data becomes a process that derails some legal proceedings.

My view is that similar examples of “data failure” will surface. The meltdown of crypto? Yes, too much data. The downstream consequences of certain medical products? Yes, too much data and possibly the subtle challenge of data shaping by certain commercial firms? The interlocks among suppliers of electrical components? Yes, too much data and possibly information weaponization by parties to a legal matter?

When online meant keyword indexing and search, old school research skills and traditional data collection were abundant. Today, short cuts and techno magic are daily fare.

It is time to face reality. Some technology is useful, but human expertise and judgment remain essential. Perhaps that will be handled in annual training, possibly on a cruise ship with colleagues? A vendor conference offering continuing education credits might be a more workable solution than smart software with built in workflow.

Stephen E Arnold, December 5, 2022

Google and Its Hard Data Approach to a Soft Skill: Firing People

December 5, 2022

Who knew that Google would embrace highly subjective methods such as performance reviews. Yep, a person provides input about another person. What could go wrong? Nothing because the data are Googley by definition. (Interesting how that works, isn’t it?)

The scoop on the method plus a somewhat less than enthusiastic comment are the guts of “Google’s Plan to Lay Off 10,000 Poor Performing Employees Is Based on a Big Lie: Can Performance Reviews Really Do the Trick.”

The Google approach appears to equate fewer employees with lower costs. Okay, sure. But why not focus on the core problem: For me, Google is losing its magnetism. The company, like Apple, is embracing more aggressive methods of generating revenue. Do you enjoy the promotions for Google’s spin on cable TV? I love them: Repetitive and invasive. What’s not to like?

Here’s the Google plan:

Reports indicate that performance reviews are rolling out companywide. Google leadership is turning to the reviews so that they can rely on supposedly hard data to maintain fairness, remove bias, protect against favoritism, and have something to point to when needing to justify their decision for which 10,000 get laid off.

But the information in the write up which caught my attention was this passage’s payload:

But, according to this Harvard professor, [the write up in the best tradition of Silicon Valley real news does not identify Tsedal Neeley as the expert who is calling Google’s method hogwash] it’s all one big lie. Many experts claim that the layoffs in big tech are the result of new corporate strategy, failed big bets coming out of the pandemic, and austerity measures entering the recession. This angers the public (not to mention the employees at these companies), because now the decision feels less objective — less fair.

My take on this is that Google’s multi-decade approach has been a high school science club approach to management. Now the company is embracing the ways of the dinobabies. Will this work? In my opinion, it will work like most of Google’s technology, in a way that is good enough.

Google’s personnel milestones include some notable, high profile events. My hunch is that 2023 will feature some newsworthy benchmarks as well; for example, fairness, equal treatment for those from certain backgrounds, and unbiased selection of those who can find their future elsewhere.

Worth watching because the Twitter email notification about termination may be an ideal fit for Gmail’s capabilities.

Stephen E Arnold, December 5, 2022

Google: Is This Like a Radio Payola Event?

December 5, 2022

In a savvy marketing move, Google worked with iHeartMedia to have social media stars promote the Pixel 4. Just one problem—most of those paid to extoll the phone’s virtues had allegedly never used one. Engadget reports, “Google Sued by FTC and Seven States Over ‘Deceptive’ Pixel 4 Ads.” Writer Jon Fingas elaborates:

“Promos aired between 2019 and 2020 featured influencers that extolled the features of phones they reportedly didn’t own — Google didn’t even supply Pixels before most of the ads were recorded, officials said. iHeartMedia and 11 other radio networks ran the Pixel 4 ads in ten large markets. They aired about 29,000 times. It’s not clear how many people listened to the commercials. The FTC aims to bar Google and iHeartMedia from making any future misleading claims about ownership. It also asks both companies to prove their compliance through reports. The states, including Arizona, California, Georgia, Illinois, Massachusetts, New York and Texas, have also issued judgments demanding the firms pay $9.4 million in penalties.”

A Google spokesperson hastened to explain the company had settled with only six of the seven states. Oh is that all? Fingas reminds us phone companies have a habit of misrepresentation, from presenting stock DSLR photos as taken with their cameras to, yes, celebrities pretending to use their phones. He writes:

“However, the accusations here are more serious. The FTC and participating states are contending that Google set out to use false testimonials. It had a ‘blatant disrespect’ for truth-in-ads rules, according to FTC consumer protection director Samuel Levine. While the punishment is tiny compared to the antitrust penalties Google has faced so far, it could damage trust in the company’s campaigns for newer Pixels and other hardware.”

Perhaps. But are consumers paying attention?

Cynthia Murrell, December 5, 2022

FTX: What Does B Stand For?

December 2, 2022

I am not a krypto kiddie. After the mysterious Nakamoto white paper became available, I made an informed judgment: Bad actors will love this crypto thing. My hunch was correct. The meltdown of a crypto wizard and his merry band of teetotaling worker bees has demonstrated that cyber fraud can be entertaining.

I read “Does B Stand for Bankman-Fried or Bankruptcy?” The write up asks a simple question. I noted this passage from the “real” Silicon Valley write up:

SBF said FTX failed on risk management and he didn’t “knowingly co-mingle funds.”

There you go.

Now what does B stand for? Here are my suggestions:

bamboozle – to rip off, fool, or deceive
bane – a source of ruin, harm, or evil
baseborn – a nice way to question one’s family position in society
bebotherer – one who brings trouble
besotted – drunk and incoherent
bonkers – a few cans short of a six pack
brock – a nasty, little, furred creature

I am leaning toward bamboozle but I think brock has a certain charm. Perhaps a combo; to wit:

The brock bamboozled himself and others.

Close enough for horseshoes as the “we’re not talking” analytics folks like to say among friends at lunch.

Stephen E Arnold, December 2, 2022

A Paradox at the Center of the Internet: No Big Deal

December 2, 2022

The Internet is a mess, but compared to how it was in its early decades it is way more organized. The organization of the Internet is called centralization. Gordon Brander of Unconscious wants the Internet to be decentralized. He says that will happen only after it first becomes more centralized. Read his explanation here: “Centralization Is Inevitable.” Brander says that the best way to understand the benefits of decentralization is to understand how centralization happens in the first place.

While there are many ways to map centralization, the Internet is concentrated into different hubs or a scale-free network. The best way to define a scale-free network is:

“The defining characteristic of scale-free networks is a power law distribution with a long tail. A small number of nodes with an extremely large number of links, and an extremely large number of nodes with a small number of links. Think Twitter. Most users have a few followers, while a few influencers have millions. This power law distribution grants the biggest hubs a lot of power over the network. It also makes hubs important to the functioning of the network in ways that are not immediately obvious, like keystone species in an ecology.”

These networks emerge through preferential attachment, or “the rich-get-richer” scenario. Users prefer a hub, ergo it receives more attention, trust, users, etc., which attracts still more links. Scale-free networks are also more efficient, because the hubs keep the paths between nodes short.

Another advantage is that they are resilient to random failure, i.e. if one ordinary node fails, the rest of the system continues to run. The same structure also makes these networks more vulnerable to targeted attacks, because a well-placed virus that knocks out the hubs could take much of the network down with them.
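For readers who want to see the rich-get-richer effect rather than take it on faith, here is a rough Python sketch of preferential attachment. It is not from Brander's write up. The function names `grow_network` and `top_share`, the one-link-per-new-node rule, and the network size are all assumptions made for illustration.

```python
import random
from collections import Counter


def grow_network(num_nodes, seed=0):
    """Grow a network one node at a time with preferential attachment.

    Hypothetical helper for illustration. Each new node links to one
    existing node, chosen with probability proportional to that node's
    current number of links (the rich-get-richer rule).
    Returns a Counter mapping node id -> number of links.
    """
    rng = random.Random(seed)
    # Start with two nodes joined by a single link.
    # `endpoints` records both ends of every link, so a node with k links
    # appears k times and is k times as likely to be drawn.
    endpoints = [0, 1]
    for new_node in range(2, num_nodes):
        target = rng.choice(endpoints)
        endpoints.extend([new_node, target])
    return Counter(endpoints)


def top_share(degrees, fraction=0.01):
    """Share of all link endpoints held by the top `fraction` of nodes."""
    ranked = sorted(degrees.values(), reverse=True)
    top_n = max(1, int(len(ranked) * fraction))
    return sum(ranked[:top_n]) / sum(ranked)


if __name__ == "__main__":
    degrees = grow_network(10_000)
    ranked = sorted(degrees.values(), reverse=True)
    print("largest hub links:", ranked[0])
    print("median node links:", ranked[len(ranked) // 2])
    print("share of links held by top 1% of nodes:", round(top_share(degrees), 3))
```

Run it and compare the largest hub with the median node. A handful of early nodes collect a disproportionate share of the links, while most nodes keep only the one link they arrived with. That is the long tail the quoted passage describes, produced by nothing more than a simple popularity rule.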

Brander ends his spiel by stating the centralization and decentralization of the Internet is the circle of life: random start-ups, exponential growth, consolidation, collapse, then repeat. Someone cue The Lion King’s opening song!

Whitney Grace, December 2, 2022
