Frequentists Versus Bayesians: Is HP Amused?
February 19, 2014
I read a long report and then a handful of spin-off reports about HP and Autonomy, mid-February 2014 version. The Financial Times’s story is a for-fee job. You can get a feel for the information in “HP Executives Knew of Autonomy’s Hardware Sales Losses: Report.” There are clever discussions of this allegedly “new information” in a number of blogs. What is interesting is an allegedly accurate chunk of information in “HP Explores Settlement of Autonomy Shareholder Lawsuit.” My head is spinning. HP buys something. HP replaces the person who was on watch when the deal was worked out. HP gets a new boss and makes changes to its board of directors. HP then blames everyone except itself for buying Autonomy for a lot of money. HP then whips up the regulators, agitates accounting firms, and pokes Michael Lynch with a cattle prod.
As this activity was in the microwave, it appears that HP knew how the hardware/software deals were handled. If the reports are accurate, Dell hardware was more desirable than HP’s hardware.
But there is a more interesting twist. I refer you, gentle reader, to “A Fervent Defense of Frequentist Statistics.” Autonomy’s “black box” consists of Bayesian methods and what I call MCMC, or Markov chain Monte Carlo, techniques. The idea is that once some judgment calls are made, the Intelligent Data Operating Layer or IDOL can chug away without human involvement. When properly resourced and trained, the Autonomy system works for certain types of content processing and information retrieval applications. You can read more about IDOL in our for-fee analysis of IDOL. This document reviews several important patents germane to the Autonomy system. You can purchase a copy of this analysis at https://gumroad.com/l/autonomy.
In “A Fervent Defense,” an old battle line is reactivated. The “frequentists” are not exactly thrilled with the rise of Bayesian methods. Autonomy emerged from Cambridge University when some of the Bayesian methods were revealed as crucial to World War II activities. Frequentists point out that there are some myths about Bayesian methods. The write up is not for MBAs, failed Web masters, and unemployed middle school teachers. For example, the myths allegedly dispelled in the article are:
- “Bayesian methods are optimal.
- Bayesian methods are optimal except for computational considerations.
- We can deal with computational constraints simply by making approximations to Bayes.
- The prior isn’t a big deal because Bayesians can always share likelihood ratios.
- Frequentist methods need to assume their model is correct, or that the data are i.i.d.
- Frequentist methods can only deal with simple models, and make arbitrary cutoffs in model complexity (aka: “I’m Bayesian because I want to do Solomonoff induction”).
- Frequentist methods hide their assumptions while Bayesian methods make assumptions explicit.
- Frequentist methods are fragile, Bayesian methods are robust.
- Frequentist methods are responsible for bad science.
- Frequentist methods are unprincipled/hacky.
- Frequentist methods have no promising approach to computationally bounded inference.”
The key point is that HP is going to learn, already has learned, or learned and just forgotten that Bayesian methods are not suitable for every single information processing application. In fact, using a Bayesian method when a frequentist method is more appropriate can produce unsatisfactory results for a discriminating data scientist. The use of frequentist methods when Bayesian is more appropriate can yield equally dissatisfying outputs.
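A toy calculation makes the mismatch concrete. The sketch below is not Autonomy’s or HP’s code and says nothing about how IDOL works internally. It assumes a made-up task, estimating the share of relevant documents from a small sample, and compares the plain observed frequency with the posterior mean under a Beta prior. The function names, the counts, and the prior parameters are all hypothetical.

```python
def frequentist_estimate(hits, trials):
    """Maximum-likelihood estimate of a proportion: the observed frequency."""
    return hits / trials


def bayesian_estimate(hits, trials, prior_a, prior_b):
    """Posterior mean of a proportion under a Beta(prior_a, prior_b) prior."""
    return (hits + prior_a) / (trials + prior_a + prior_b)


if __name__ == "__main__":
    # Hypothetical sample: 9 of 10 checked documents are relevant.
    hits, trials = 9, 10

    # Observed frequency, no prior involved.
    print("frequentist:", frequentist_estimate(hits, trials))

    # Flat prior, Beta(1, 1): the estimate moves only a little.
    print("bayes, flat prior:", bayesian_estimate(hits, trials, 1, 1))

    # Strong prior that relevance is rare, Beta(2, 18): on ten
    # observations the prior outweighs the data.
    print("bayes, strong prior:", bayesian_estimate(hits, trials, 2, 18))

    # Same strong prior, 100 times more data: the data now dominate.
    print("bayes, more data:", bayesian_estimate(900, 1000, 2, 18))
```

With ten observations, the strong prior pulls the Bayesian estimate far below the observed frequency. With a thousand observations, the two estimates nearly agree. Whether the prior is a help or a hindrance depends on whether it fits the content being processed, which is the judgment call a licensee has to get right.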
The point is that if one buys a system built on one method and then applies it inappropriately, the knowledgeable user is going to be angry. It is possible that some disappointed users will take legal action, demand a license refund, or just hit the conference circuit and explain why such and such a system was a failure.
Will HP put the three-ring circus of buying Autonomy to rest and then find itself caught in the jaws of a Bayesian versus frequentist dispute? My hunch is, “Yep.”
Could HP have convinced itself that Autonomy was a universal fix-it kit for information processing problems? If the answer is, “Yes,” then HP is going to have to come to grips with licensees who are going to point out that the solution did not cure the problem.
In short, HP faces more excitement. The company will not be “idle” any time soon. HP may not be amused, but I am. Search is indeed a bit more difficult than some would have customers believe.
Stephen E Arnold, February 19, 2014