Shaping Data Is Indeed a Thing and Necessary

April 12, 2021

I gave a lecture at Microsoft Research many years ago. I brought up the topic of Kolmogorov’s complexity idea and making fast and slow smart software sort of work. (Remember that Microsoft bought Fast Search & Transfer, which danced around making automated indexing really super wonderful like herring worked over by a big time cook.) My recollection of the Microsoft group’s reaction was, “What is this person talking about?” There you go.

If you are curious about the link between this Russian math person (once dumb enough to hire one of my relatives to do some grunt work) and deep learning, check out the 2019 essay “Are Deep Neural Networks Dramatically Overfitted?” Spoiler: You betcha.

The essay explains that mathy tests can signal when a dataset is just right: no more and no less data than needed. Thus, if the data are “just right,” the outputs will be on the money, accurate, and close enough for horseshoes.
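Kolmogorov complexity itself is not computable, but a crude classroom proxy is compressed length: regular data admit a short description, noisy data do not. Here is a minimal illustrative sketch (the function name and the toy data are mine, not from the essay):

```python
import random
import zlib

def complexity_proxy(data: bytes) -> int:
    """Crude stand-in for Kolmogorov complexity: the length of the
    zlib-compressed representation. (True Kolmogorov complexity is
    uncomputable; compressed size is a standard classroom proxy.)"""
    return len(zlib.compress(data, level=9))

structured = b"abc" * 1000                # 3,000 very regular bytes
noisy = random.Random(0).randbytes(3000)  # 3,000 pseudo-random bytes

print(complexity_proxy(structured))  # small: the pattern compresses away
print(complexity_proxy(noisy))       # near 3,000: little structure to exploit
```

The gap between the two numbers is the point: a model that memorizes noise needs a description nearly as long as the data itself.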

The write up states:

The number of parameters is not correlated with model overfitting in the field of deep learning, suggesting that parameter counting cannot indicate the true complexity of deep neural networks.

Simplifying: “Oh, oh.”

Then there is a work around. The write up points out:

The lottery ticket hypothesis states that a randomly initialized, dense, feed-forward network contains a pool of subnetworks and among them only a subset are “winning tickets” which can achieve the optimal performance when trained in isolation. The idea is motivated by network pruning techniques — removing unnecessary weights (i.e. tiny weights that are almost negligible) without harming the model performance. Although the final network size can be reduced dramatically, it is hard to train such a pruned network architecture successfully from scratch.

Simplifying again: “Yep, close enough for most applications.”
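The pruning step the quoted passage mentions (removing tiny, almost-negligible weights) can be sketched in a few lines. This is a minimal, hypothetical magnitude-pruning example, not the paper’s actual procedure:

```python
import numpy as np

def magnitude_prune(weights, keep_fraction=0.2):
    """Toy magnitude pruning: keep only the top `keep_fraction` of
    weights by absolute value and zero out the rest, returning the
    pruned weights and the binary mask of survivors."""
    flat = np.abs(weights).ravel()
    k = max(1, int(len(flat) * keep_fraction))
    # Threshold below which weights count as "negligible"
    threshold = np.partition(flat, -k)[-k]
    mask = np.abs(weights) >= threshold
    return weights * mask, mask

# Toy demonstration: a random "layer" of 1,000 weights
rng = np.random.default_rng(0)
w = rng.normal(size=(1000,))
pruned, mask = magnitude_prune(w, keep_fraction=0.2)
print(int(mask.sum()))  # 200 weights survive
```

As the quote notes, the hard part is not finding the mask; it is training the surviving 20 percent from scratch and getting the same performance.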

What’s the fix? Keep the data small.

Doesn’t that create other issues? Sure does. For example, what about real time streaming data that diverge from the data used to train the smart software? You know, the “change” thing, when historical data no longer apply. Smart software is possible as long as the aperture is small and the data shaped.

There you go. Outputs are good enough but may be “blind” in some ways.

Stephen E Arnold, April 12, 2021

HPE Machine Learning: A Benefit of the Autonomy Tech?

April 8, 2021

This sounds like an optimal solution from HPE (formerly known as HP); too bad it was not available back when the company evaluated the purchase of Autonomy. Network World reports, “HPE Debuts New Opportunity Engine for Fast AI Insights.” The machine-learning platform is called the Software Defined Opportunity Engine, or SDOE. It is based in the cloud, and will greatly reduce the time it takes to create custom sales proposals for HPE channel partners and customers. Citing a blog post from HPE’s Tom Black, writer Andy Patrizio explains:

“It takes a snapshot of the customer’s workloads, configuration, and usage patterns to generate a quote for the best solution for the customer in under a minute. The old method required multiple visits by resellers or HPE itself to take an inventory and gather usage data on the equipment before finally coming back with an offer. That meant weeks. SDOE uses HPE InfoSight, HPE’s database which collects system and use information from HPE’s customer installed base to automatically remediate infrastructure issues. InfoSight is primarily for technical support scenarios. Started in 2010, InfoSight has collected 1,250 trillion data points in a data lake that has been built up from HPE customers. Now HPE is using it to move beyond technical support to rapid sales prep.”

The write-up describes Black’s ah-ha moment when he realized that data could be used for this new purpose. The algorithm-drafted proposals are legally binding—HPE must have a lot of confidence in the system’s accuracy. Besides HPE’s existing database and servers, the process relies on the assessment tool recently acquired when the company snapped up CloudPhysics. We learn that the tool:

“… analyzes on-premises IT environments much in the same way as InfoSight but covers all of the competition as well. It then makes recommendations for cloud migrations, application modernization and infrastructure. The CloudPhysics data lake—which includes more than 200 trillion data samples from more than one million virtual machines—combined with HPE’s InfoSight can provide a fuller picture of their IT infrastructure and not just their HPE gear.”

As of now, SDOE is only for storage systems, but we are told that could change down the road. Black, however, was circumspect on the details.

Cynthia Murrell, April 8, 2021

AI: Are Algorithms House Trained?

March 30, 2021

“Containment Algorithms Don’t Work for Our Machines” includes a thought-provoking passage; namely:

Director of the Center for Humans and Machines, Iyad Rahwan, described it this way: “If you break the problem down to basic rules from theoretical computer science, it turns out that an algorithm that would command an AI not to destroy the world could inadvertently halt its own operations. If this happened, you would not know whether the containment algorithm is still analyzing the threat, or whether it has stopped to contain the harmful AI. In effect, this makes the containment algorithm unusable.”

What’s the write up’s take on this “challenge”? Here’s the statement in the article:

The lesson of the study’s computability theory is that we do not know how or if we will be able to build a program that eliminates the risk associated with a sufficiently advanced artificial intelligence. As some AI theorists and scientists believe, no advanced AI systems can ever be guaranteed entirely safe. But their work continues; nothing in our lives has ever been guaranteed safe to begin with.

With the US doing yoga to maintain its perceived lead in smart software, the trajectory of smart software and its receptivity to house training may reside elsewhere.

Stephen E Arnold, March 30, 2021

Amazon: Where Does It Get AI Technology?

March 26, 2021

I saw an interesting table from Global Data Financial Deals Database. What’s interesting is that Apple, Facebook, Google, and Microsoft were active purchasers of AI companies. I understand that “taking something off the table” is a sound business tactic. Even if the AI technology embodied in a takeover is wonky, a competitor cannot take advantage of the insights or the people in a particular firm.

I found the inclusion of Accenture in the table interesting. The line between “consulting” and “smart software” seems to be permeable. One wonders how other big dog consulting firms will address what appears to be their smart software gaps. I have long believed that blue chip consulting firms were 21st century publishing companies. The combination of renting smart people and providing smart technology to clients is the type of amalgamation which appears to meet certain needs of Fortune 1000 firms, major government entities, and some non-governmental organizations.

What jumped out at me as I looked at the data and scanned the comments about it was the absence of one key question:

What’s Amazon doing to get its artificial intelligence technology?

Buying AI technology to leverage it or take it away from competitors is one method. Has Amazon found another way? Which approach is “better” in terms of intellectual property and real world applications?

I will address this question and a couple of other equally obscure facts about what may be one of the smartest companies in the world in my lecture at the upcoming 2021 National Cyber Crime Conference.

Buying AI capabilities is, it seems, the go to method for some high profile outfits. But is it the only path to smart software? No, it is not.

Stephen E Arnold, March 26, 2021

What Makes MIT a Great Institution: No, Not Jeffrey Epstein

March 24, 2021

News flash. Complex math is a challenge. People who can parse said complex math in a meaningful way are not plentiful compared to Wendy’s workers and art history majors.

“Auditors Are Testing Hiring Algorithms for Bias, but Big Questions Remain” presents a brilliant insight. This is a eureka moment unequalled in MIT’s rich history of research, analysis, and judicious decision making. (No, I am not talking about Jeffrey Epstein, the cover up, and the sweep-under-the-rug response.)

What is this brilliant empirical insight? I quote:

For all the attention that AI audits have received, though, their ability to actually detect and protect against bias remains unproven. The term “AI audit” can mean many different things, which makes it hard to trust the results of audits in general.

I have to sit down, take a breath, calm myself.

Complex numerical procedures, smart software, and outputs with opaque programmer controls are very hard to audit.

What’s the fix?

None other than ideas like “let the government do it.”

Yeah, brilliant.

Stephen E Arnold, March 24, 2021

Eschewing the Google: Career Suicide or Ethical Savvy?

March 19, 2021

I spotted an interesting quote in Wired’s “The Departure of 2 Google AI Researchers Spurs More Fallout.” Here’s the quote:

“Google has shown an astounding lack of leadership and commitment to open science, ethics, and diversity in their treatment of the Ethical AI team.”

It’s been several months since the Google engaged in Gebru-gibberish; that is, the firm’s explanations about the departure of a PhD who wrote a research paper suggesting that the Google’s methods may not be a-okay.

The Google is pressing forward with smart software, which is the future of the company. I thought online advertising was, but what do I know.

The article also mentions that a high profile AI researcher would not attend a Google AI event. The reason? Here’s what Wired reports:

Friday morning, Kress-Gazit emailed the event’s organizers to say she would not attend because she didn’t wish to be associated with Google research in any way. “Not only is the research process and integrity of Google tainted, but it is clear, by the way these women were treated, that all the diversity talk of the company is performative,” she wrote. Kress-Gazit says she didn’t expect her action to have much effect on Google, or her own future work, but she wanted to show solidarity with Gebru and Mitchell, their team, and their research agenda.

A few years ago, professionals would covet a Google tchotchke like a mouse pad or a flashing Google LED pin. (Mine tarnished and went dead years ago.) Now high profile academics are unfriending Messrs. Brin and Page’s online ad machine.

Interesting shift in attitude toward the high school science club company in a few pulses of Internet time.

Stephen E Arnold, March 19, 2021

AI Suffers the Slings and Arrows of Outrageous Marketing

March 19, 2021

I read “Loose Lips Sink AI Ships.” Amusing. The write up begins with a sentence designed to catch my attention:

Cognitive computing is not an IBM fraud. [Emphasis added. Editor.]

Imagine. IBM and fraud in the same sentence. Even more tasty is the phrase “cognitive computing.” The phrase evokes zeros and ones which think. The implication is that smart computers are as good as a mere mortal, perhaps even better at some things.

Fraud. Hmmm.

The write up explains that one naysayer is missing the boat. The naysayer took umbrage at a marketing person’s characterization of the IBM Watson artificial intelligence platform as being able to “outthink human brains in areas where finding insights and connections can be difficult due to the abundance of data.”

My goodness. A marketing person exaggerating. Plus the “abundance” word evokes the image of a tsunami of information. That’s an original metaphor too.

The write up explains that AI is a whiz bang deal. The case example is Covid research. I was hoping that the author would explain how IBM Watson was lashed to a gurney and wheeled into the parking lot at a major Houston, Texas, hospital. But no. The example was Covid.

The write up explains that AI is better with bigger and faster computers. That’s good news for some companies. Also, computer reasoning is “increasing quickly.” I like increased reasoning.

There is some less than sunny news too. What a surprise. For example, neural networks are clever, not intelligent. Clever was good enough for the Google, but not enough for real AI yet. And AI systems mimic human intelligence; the systems are not quite like your next door neighbor. (I think computers are quite like my next door neighbor, but I live in rural Kentucky. That’s a consideration.)

The write up seems to strive for balance if one relates to big data, big computers, and big marketing.

Let’s ask Watson. Well, maybe not.

Stephen E Arnold, March 19, 2021

Alphabet Google: Just Helping the Public

March 17, 2021

I usually don’t read insurance industry trade publications. Decades ago I brushed up against the world of “real” insurance, and I have a deep aversion for this industry. Betting on death is not my thing, but those big insurers are a jolly group.

I read “Alphabet’s Waymo Says Its Tech Would Avoid Fatal Human Crashes.” For convenience, I will refer to Alphabet Waymo with its “real” name: The Google.

The write up explains:

The autonomous-car artificial intelligence from Alphabet Inc.’s Waymo avoided or mitigated crashes in most of a set of virtually recreated fatal accidents, according to a white paper the company published Monday.

This is lingo for a model, just like the ones “real” MBAs and alleged “data scientists” run using Excel or a facsimile on steroids. The model ingests assumptions and data. The wizard at the keyboard pretty much plugs in threshold values and checks the output. Need a little more oomph; change the threshold. Once the numbers flow. Bingo. Good to go.
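The “plug in a threshold, check the output, adjust” loop the paragraph describes can be sketched as a toy simulation. Everything here (the severity scores, the thresholds, the pass rates) is invented for illustration, not taken from the Waymo white paper:

```python
import random

def simulate_crash_model(threshold, trials=10_000, seed=42):
    """Hypothetical toy model: each simulated crash gets a random
    'severity' score in [0, 1); the system 'avoids' any crash whose
    severity falls below the chosen threshold. Returns the avoid rate."""
    rng = random.Random(seed)
    severities = [rng.random() for _ in range(trials)]
    avoided = sum(1 for s in severities if s < threshold)
    return avoided / trials

# "Need a little more oomph; change the threshold."
for threshold in (0.80, 0.90, 0.95):
    rate = simulate_crash_model(threshold)
    print(f"threshold={threshold}: avoided {rate:.1%}")
```

Nudge the threshold up and the headline number climbs with it, which is exactly why the assumptions matter more than the output.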

What I found interesting was this passage in the insurance industry centric PR piece of marketing collateral:

Waymo says it published the study for the benefit of the public, rather than regulators specifically.

But can you die riding in a smart EV from The Google?

Absolutely. The write up reports:

The Driver system failed to avoid or mitigate simulated accidents only when the autonomous car was struck from behind, according to the study.

No problems. Adjust those actuarial tables accordingly. Come to think of it, “Why use human actuaries?” Take the output from The Google’s model and pump it into a smart analytics program and let ‘er rip.

Stephen E Arnold, March 17, 2021

Palantir and Anduril: Best Buds for Sure

March 12, 2021

I read “Anduril Industries Joins Palantir Technologies’ TITAN Industry Team.” In the good old days I would have been zipping from conference to conference outputting my ideas. Now I sit in rural Kentucky and fire blog posts into the datasphere.

This post calls attention to an explicit tie up between two Peter Thiel-associated entities: Palantir Technologies and Anduril. The latter is an interesting company with some nifty smart technology, including a drone which has the cheerful name “Anvil.”

For details about the new US Army project and the relationship between these two companies, see the blog post, which was online as of March 8, 2021. (Some information may be removed, and I can’t do much about what other outfits do.)

Information about Anduril is available at its Web site. Palantir is everywhere and famous in the intelware business and among some legal eagles. No, I don’t have a Lord of the Rings fetish, but some forever young folks do.

Stephen E Arnold, March 12, 2021

Who Should Watch Over Smart Software? No One. Self Regulation Is the Answer

March 11, 2021

I read an amusing article about an academic paper, “Someone to Watch Over AI and Keep It Honest – and It’s Not the Public!” The idea is that self regulation works. Full stop. Ignoring the 737 Max event and Facebook’s legal move to get anti-trust litigation dumped, the write up reports:

Dr Bran Knowles, a senior lecturer in data science at Lancaster University, says: “I’m certain that the public are incapable of determining the trustworthiness of individual AIs… but we don’t need them to do this. It’s not their responsibility to keep AI honest.”

And what’s the smart software entity figuring prominently in the write up? Amazon, the Google, or Twitter?

The idea, at least in the construct of the cited article, is that trust is important. And whom does one trust?

How do I know there’s an element of trust required to accept this fine scholarly article?

Here’s a clue:

The paper is co-authored by John T. Richards, of IBM’s T.J. Watson Research Center, Yorktown Heights, New York.

Yep, the home of the game show winner and arguably one of the few smart software systems to be put on a gurney and rolled out the door of a Houston, Texas, medical facility.

But just in case the self regulation thing doesn’t work, the scholarly experts’ findings point to “a regulatory ecosystem.”

Yep, regulations. How’s that been working out in the last 20 years?

Why not ask IBM Watson?

Stephen E Arnold, March 11, 2021
