Why Black Boxes in Smart Software?

January 5, 2020

I read “Why Are We Using Black Box Models in AI When We Don’t Need To? A Lesson From An Explainable AI Competition.” The source is HDSR (Harvard Data Science Review), which appears to be hooked up to MIT. Didn’t MIT find an alleged human trafficker an ideal source of contributions and worthy of a bit of “black boxing”? (See “Jeffrey Epstein’s money bought a cover-up at the MIT Media Lab.”) The answer seems obvious: Keep prying eyes out. Prevent people from recognizing how mundane the flashy stuff actually is.

The write up from HDSR states:

The belief that accuracy must be sacrificed for interpretability is inaccurate. It has allowed companies to market and sell proprietary or complicated black box models for high-stakes decisions when very simple interpretable models exist for the same tasks.

The write up moves with less purpose than Jeffrey Epstein.

I noted this statement as well:

Let us insist that we do not use black box machine learning models for high-stakes decisions unless no interpretable model can be constructed that achieves the same level of accuracy. It is possible that an interpretable model can always be constructed—we just have not been trying. Perhaps if we did, we would never use black boxes for these high-stakes decisions at all.

I love the privileged tone of the passage.

Here’s my take:

Years ago I prepared an analysis of the algorithms used in smart software for a European country’s intelligence service. I thought this was an impossible job. But after making some calls, talking to wizards, and doing a bit of reading about what’s taught in computer science classes, my team and I unearthed several interesting factoids:

  1. The black box became the marketing hot button in the mid-1990s. The outfit adding oomph to mystery and secrecy was Autonomy. If you are not familiar with the company, think Bayesian maths. Keeping the neuro-linguistic programming mechanism under wraps differentiated Autonomy from its competition.
  2. Computer science and advanced mathematics programs around the world incorporated some useful and mostly reliable methods into their courses of study; for example, k-means. We identified another nine computational touchstones. Did we miss a few? Probably, but my team concluded that most of the fancy math outfits were using a handful of procedures and fiddling with thresholds, training data, and workflows to deliver their solutions. Why reveal to anyone that under the hood, most of the fancy stuff for NLP, text analytics, machine learning, and the other buzzwords which seem so 2020 was the same?
  3. My team also identified that each of the widely used methods (what we called the “good enough” methods) could be manipulated; see the sketch after this list. Change a threshold here, modify training data there, add a feedback loop and rules there, and the system outputs results that appear quite accurate, even useful. Putting the methods in a black box disguised for decades the simple techniques Cambridge Analytica used to skew outputs and probably elections. Differentiation does not come from the underlying methods; uniqueness is a result of the little bitty tweaks. Otherwise, most systems are just like the competition’s systems.
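Item 2 names k-means and item 3 claims that a small tweak changes the answer. The sketch below makes both points concrete. It is a generic textbook rendering of Lloyd’s algorithm in NumPy, an assumption about how such a method is typically coded, not anything pulled out of a vendor’s black box; the data, the function name, and the settings are all illustrative. The knob being fiddled with here is k itself: same data, same method, different setting, different output.

```python
# A minimal k-means sketch in plain NumPy. This is a generic textbook
# version of Lloyd's algorithm, not any vendor's actual code; the point
# is only that the method is short, simple, and sensitive to its knobs.
import numpy as np

def kmeans(points, k, iters=50, seed=0):
    """Cluster an (n, d) array of points into k groups."""
    rng = np.random.default_rng(seed)
    # Initialize centroids with k randomly chosen points.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        # Assign every point to its nearest centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its assigned points.
        for j in range(k):
            members = points[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    return labels, centroids

# The "tweak" in action: identical data, identical method, but nudging
# one setting (k) produces a different-looking answer every time.
rng = np.random.default_rng(1)
data = np.vstack([rng.normal(center, 0.5, size=(50, 2))
                  for center in ([0, 0], [5, 5], [0, 5])])
for k in (2, 3, 4):
    labels, _ = kmeans(data, k)
    print(f"k={k}: cluster sizes {np.bincount(labels)}")
```

None of the three printed answers is “wrong,” which is exactly what makes this kind of fiddling invisible from outside the box.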

Net net: Will transparent methods prevail? Unlikely. Making something clear reduces its perceived value. Just think how linking Jeffrey Epstein to MIT alters the outputs about good judgment.

Black boxes? Very useful indeed. Secrets? Selective revelation of facts? Millennial marketing? All useful.

Stephen E Arnold, January 5, 2020
