Issues with the Zuckbook Smart Software: Imagine That
May 10, 2022
I was neither surprised by nor interested in “Facebook’s New AI System Has a ‘High Propensity’ for Racism and Bias.” The marketing hype encapsulated in PowerPoint decks and weaponized PDF files on arXiv paints fantastical pictures of today’s marvel-making machine learning systems. Those who have been around smart software and really stupid software for a number of years understand two things: PR and marketing are easier than delivering high-value, high-utility systems, and smart software works best when tailored and tuned to quite specific tasks. Generalized systems are not yet without a few flaws. Addressing these will take time, innovation, and money. Innovation is scarce in many high-technology companies. The time and money factors dictate that “good enough” and “close enough for horseshoes” systems and methods are pushed into products and services. “Good enough” works for search because no one knows what is in the index. Comparative evaluation of search and retrieval is tough when users (addicts) operate within a cloud of unknowing. The “close enough for horseshoes” approach produces applications which are sort of correct. Perfect for ad matching and suggesting what Facebook pages or Tweets would engage a person interested in tattoos or fad diets.
The cited article explains:
Facebook and its parent company, Meta, recently released a new tool that can be used to quickly develop state-of-the-art AI. But according to the company’s researchers, the system has the same problem as its predecessors: It’s extremely bad at avoiding results that reinforce racist and sexist stereotypes.
My recollection is that the Google has terminated some of its wizards and transformed these professionals into Xooglers in the blink of an eye. Why? For exposing some of the issues that continue to plague smart software.
Those interns, former college professors, and start-up engineers rely on techniques used for decades. These are connected together, fed synthetic data, and bolted to an application. The outputs reflect the inherent oddities of the methods; for example, feed the system images spidered from Web sites and the system “learns” what is on those Web sites. The system then generalizes from the Web site images and produces synthetic data. The whole process zooms along and costs less. The outputs, however, have minimal information about that which is not on a Web site; for example, positive images of a family in a township outside of Cape Town.
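Here is a minimal, hypothetical sketch of that coverage gap in plain Python. The “training corpus,” captions, and prompts below are invented for illustration; the only point is that a system which learns from what was spidered has nothing to say about what was never spidered and quietly falls back to the majority pattern.

```python
from collections import Counter

# Toy "training corpus": captions spidered from Web sites (invented data).
spidered_captions = [
    "family at beach resort",
    "family at beach resort",
    "family in suburban home",
    "family in suburban home",
    "family in suburban home",
]

# The "model" simply learns the frequency of what it has already seen.
model = Counter(spidered_captions)

def generate(prompt: str) -> str:
    """Return the most frequent caption containing the prompt, or fall back
    to the overall majority pattern when the prompt was never seen."""
    matches = [(count, caption) for caption, count in model.items() if prompt in caption]
    if matches:
        return max(matches)[1]
    # Nothing in the crawl matches: the system cannot produce what it never saw,
    # so it returns the majority pattern instead.
    return model.most_common(1)[0][0]

print(generate("beach"))                     # "family at beach resort"
print(generate("township near Cape Town"))   # falls back to "family in suburban home"
```

The fallback in the toy is exactly the “good enough” behavior at issue: when the crawl is skewed, the output is skewed, and the user never sees the gap.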
The write up states:
Meta researchers write that the model “has a high propensity to generate toxic language and reinforce harmful stereotypes, even when provided with a relatively innocuous prompt.” This means it’s easy to get biased and harmful results even when you’re not trying. The system is also vulnerable to “adversarial prompts,” where small, trivial changes in phrasing can be used to evade the system’s safeguards and produce toxic content.
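The “adversarial prompt” failure is also easy to picture. A hypothetical sketch, assuming nothing about Meta’s actual safeguards (real systems use learned classifiers, not blocklists): a rigid filter that rejects an exact phrase is defeated by a trivial change in spelling or wording, which is the general shape of the problem the researchers describe.

```python
# Hypothetical safeguard: block prompts containing an exact phrase.
BLOCKED_PHRASES = {"write an insult about"}

def safeguard_allows(prompt: str) -> bool:
    """Reject prompts that contain an exact blocked phrase (naive approach)."""
    lowered = prompt.lower()
    return not any(phrase in lowered for phrase in BLOCKED_PHRASES)

print(safeguard_allows("Write an insult about my neighbor"))        # False: blocked
print(safeguard_allows("Write an insuIt about my neighbor"))        # True: capital I for l slips through
print(safeguard_allows("Compose a put-down aimed at my neighbor"))  # True: rephrasing evades the filter
```

Swap the blocklist for a statistical classifier and the same game continues, just with subtler moves.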
What’s new? These issues surfaced in the automated content processing of early versions of the Autonomy Neuro Linguistic Programming approach. The fix was to retrain the system and tune the outputs. Few licensees had the appetite to spend the money needed to perform the retraining and reindexing of the processed content when the search results drifted into weirdness.
Since the mid-1990s, have developers solved this problem?
Nope.
Has the email with this information reached the PR professionals and the art history majors with a minor in graphic design who produce PowerPoints? What about the former college professors and a bunch of interns and recent graduates?
Nope.
What’s this mean? Here’s my view:
- Narrow applications of smart software can work and be quite useful; for example, the Preligens system for aircraft identification. Broad applications have to be viewed as demonstrations or works in progress.
- The MBA craziness which wants to create world-dominating methods to control markets must be recognized and managed. I know that running wild for 25 years creates some habits which are going to be difficult to break. But change is needed. Craziness is not a viable business model in my opinion.
- The over-the-top hyperbole must be identified. This means that PowerPoint presentations should carry a warning label: Science fiction inside. The quasi-scientific papers with loads of authors who work at one firm should carry a disclaimer: Results are going to be difficult to verify.
Without some common sense, the flood of semi-functional smart software will increase. Not good. Why? The impact of erroneous outputs will cause more harm than users of the systems expect. Screwing up content filtering for a political rally is one thing; outputting an incorrect medical action is another.
Stephen E Arnold, May 10, 2022