Prediction, Metadata, and Good Enough

June 14, 2012

Several PR mavens sent me multiple unsolicited emails today about their clients’ predictive statistical methods. I don’t like spam email. I don’t like PR advisories that promise wild and crazy benefits for predictive analytics applied to big data, indexing content, or figuring out what stocks to buy.

March Communications was pitching Lavastorm and Kabel Deutschland. The subject: analytics that are real time, predictive, and discovery driven.


Predictive analytics can be helpful in many business and technical processes. Examples range from figuring out where to sell an off lease mint green Ford Mustang convertible to planning when to ramp up outputs from a power generation station. Where predictive analytics are not yet ready for prime time is identifying which horse will win the Kentucky Derby and determining where the next Hollywood starlet will crash a sports car. Predictive methods can suggest how many cancer cells will die under certain conditions and assumptions, but the methods cannot identify which cancer cells will die.
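The "how many" versus "which" distinction can be made concrete with a little arithmetic. The sketch below is an illustration only: it assumes every cell dies independently with the same probability, and the numbers (1,000 cells, a 30 percent chance) are invented for the example. Under those assumptions the count of dead cells is binomial, so the total can be forecast within a narrow band even though no single cell's fate is known in advance.

```python
import math
import random

def aggregate_forecast(n_cells, p_die):
    """Expected number of deaths and its standard deviation (binomial model)."""
    expected = n_cells * p_die
    std_dev = math.sqrt(n_cells * p_die * (1 - p_die))
    return expected, std_dev

def simulate(n_cells, p_die, seed=0):
    """Return one True/False outcome per cell; True means the cell died."""
    rng = random.Random(seed)
    return [rng.random() < p_die for _ in range(n_cells)]

# Hypothetical inputs chosen only for illustration.
expected, std_dev = aggregate_forecast(1000, 0.3)
outcomes = simulate(1000, 0.3)

print(f"forecast deaths: {expected:.0f} plus or minus {std_dev:.1f}")
print(f"simulated deaths: {sum(outcomes)}")
# The model gives every cell the same 30 percent chance, so it offers
# no basis for saying which particular cells end up in the dead group.
```

The forecast for the total is tight relative to the population, while the best available statement about any one cell remains "30 percent."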

Can predictive analytics make you a big winner at the race track? If firms with rock solid predictive analytics could predict a horse race, would these firms be selling software or would these firms be betting on horse races?

That’s an important point. Marketers promise magic. Predictive methods deliver results that provide some insight but rarely rock solid outputs. Prediction is fuzzy. Good enough is often the best a method can provide.

In between is where hopes and dreams rise and fall with less clear cut results. I am, of course, referring to the use by marketers of lingo like this:

The idea behind these buzzwords is that numerical recipes can process information or data and assign probabilities to outputs. When one ranks the outputs from highest probability to lowest probability, an analyst or another script can pluck the top five outputs. These outputs are the most likely to occur. The approach works for certain Google-type caching methods, providing feedback to consumer health searchers, and figuring out how much bandwidth is needed for a new office building when it is fully occupied. Picking numbers at the casino? Not so much.
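The rank-and-pluck step described above is simple enough to sketch in a few lines of Python. This is a minimal illustration, not any vendor's method: the outcome labels and probabilities are made up, and the function name `top_outputs` is hypothetical. The hard part, assigning trustworthy probabilities in the first place, is assumed to have happened upstream.

```python
from heapq import nlargest

def top_outputs(scored, k=5):
    """Return the k (label, probability) pairs with the highest probability."""
    return nlargest(k, scored.items(), key=lambda pair: pair[1])

# Hypothetical probabilities produced by some upstream model.
scores = {
    "outcome_a": 0.31,
    "outcome_b": 0.22,
    "outcome_c": 0.17,
    "outcome_d": 0.11,
    "outcome_e": 0.08,
    "outcome_f": 0.06,
    "outcome_g": 0.05,
}

for label, probability in top_outputs(scores):
    print(f"{label}: {probability:.2f}")
```

The ranking is only as good as the probabilities fed into it, which is why the approach suits bandwidth planning better than the roulette table.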

The “good enough” approach makes intuitive sense. For certain types of data, predictive analytics work in a manner which I describe as “good enough”. The idea is that algorithms generate outputs which are usable but not 100 percent correct. In some mathematical processes, such as the former NuTech Solutions’ sparse data method, the system would calculate what it could from the data available. Then as new data became available, the system would update the outputs. Not every cell in a data table would have a value or a calculated value. Another set of processes could look at the sparse data and move forward with its calculation. As new data became available, the second system would recalculate its outputs. Over time, the system would deliver usable outputs based on the available data. With more data and more iterative calculations, the outputs would become more “accurate.” NuTech’s mathematical methods involved a discipline known as mereology, genetic algorithms, and various mathematical operations which sanded down the rough edges of the approach.
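The "calculate what you can, then update" idea can be illustrated with a toy example. To be clear, this is not NuTech's method, which is not documented here. It is a plain running average, and the class name `RunningEstimate` and the sample readings are assumptions made for the sketch. Missing cells are represented as `None` and simply skipped, so the estimate is always usable and is revised as each new value arrives.

```python
class RunningEstimate:
    """Keeps a usable average that is revised whenever new data shows up."""

    def __init__(self):
        self.count = 0      # number of real values seen so far
        self.mean = None    # current estimate; None until the first value

    def update(self, value):
        """Fold in one cell; a missing cell (None) leaves the estimate unchanged."""
        if value is None:
            return self.mean
        self.count += 1
        if self.mean is None:
            self.mean = float(value)
        else:
            # Incremental mean: shift the estimate toward the new value.
            self.mean += (value - self.mean) / self.count
        return self.mean

# Hypothetical sparse column: two of the five cells have no value yet.
readings = [10.0, None, 14.0, None, 12.0]

estimate = RunningEstimate()
history = [estimate.update(cell) for cell in readings]
print(history)  # [10.0, 10.0, 12.0, 12.0, 12.0]
```

A downstream process could read `estimate.mean` at any point and proceed with what is available, which is the "good enough" behavior described above.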

What I find interesting about the marketers’ use of predictive analytics, indexing, coding, search, etc., is that the complexities are often glossed over. What I find are assertions about the accuracy of the outputs. Predictive methods do generate usable results; however, the notion that the outputs of a predictive system are “good enough” is one of the big assumptions about predictive methods.

An investment firm set up by Bill Manning, one of the founders of Manning & Napier Advisors, was a big believer in mathematics as a turbocharger of investment strategy. The concept was that certain methods could contribute useful insights which could help a person make an informed investment decision. Mr. Manning was an advocate of beta surfing and a wide range of mathematical operations. Although Mr. Manning was successful in his investment tactics, mathematics was a contributing factor, not the determining factor in deciding what action to take. For financial services firms, predictive analytics are a tool, but firms relying exclusively on programmatic methods or “trigger trading” often find themselves involving humans. Rules can work better than real-time predictive analytics for certain types of transactions. Humans work even better for others.

If predictive methods delivered better than “good enough” outputs, the firm with the best algorithm would quickly dominate financial trading, pick the winners in legalized betting parlors, and clean out casinos. The reality is that buzzwords provide unscrupulous marketers with opportunities to befuddle researchers. I ran a query for predictive analytics on Google and received a pointer to Curious I followed the link and the site appeared. The WebStatDomain system finds similar sites. The results were mixed and in some cases confusing. For example, the number two “most like” site was Another baffling link was a pointer to Algorithms were operating to make these links appear, but the method was not working for me.

When I encounter a search or content processing company offering “predictive” methods, I take three actions. First, I look at what the company says on its Web site. Second, I review the available descriptions of what the company does. Third, I look for verification that the company’s technology does what the marketers assert. I found the functions of Digital Reasoning and Ikanow credible. The companies explain their approach. And the companies’ track record is discernible and positive. What about analytics companies which cannot pass my simple test? Caution is advised.

My view is that there will be quite a bit of “predictive” smoke being blown in the coming months. Words like “business information” and “big data analytics” have lost their punch. The notion of “predictive” as an all-purpose adjective may be the marketers’ new little black dress.

That’s my prediction.

Stephen E Arnold, June 14, 2012

Sponsored by Polyspot


6 Responses to “Prediction, Metadata, and Good Enough”

  1. Prediction, Metadata, and Good Enough | Ontologique – the Ontology Boutique on June 14th, 2012 2:12 pm


  2. Rich Turner on June 21st, 2012 10:01 am

    This is a good article – but the notion of “Predictive Analytics” doesn’t see its greatest value in picking the winning horse at the Kentucky Derby. Instead, it’s a way to leverage real human input, from experts, and use computing power to “predict” other information which is related or relevant. Just like with Search, Predictive Analytics has to start with something: what’s the desired outcome, what are the facts to date, where might the program identify trends, what data is being evaluated, etc. This is why in the legal community, where Predictive Coding using technology like CAAT from Content Analyst is showing great promise, the absolute first thing most of these products do is produce a statistical sample that is carefully reviewed and coded by experts. This gives the Analytics something to predict from, and empowers knowledge workers.

  3. 22 June 2012: Computerworld UK | Tap the 90 on June 22nd, 2012 2:04 pm

    […] Prediction, Metadata, and Good Enough […]

  4. FRE 702 and Predictive Coding : Beyond Search on July 2nd, 2012 12:09 am

    […] a Google sponsored link, dig into search results and find the best.  Last week we talked about how “good enough” was what many content analytics companies were chugging out to their clients.  We also offered […]

  5. cheap pocket pc pay n go mobiles on August 18th, 2012 6:49 am

    Do you have a spam issue on this blog; I also am a blogger, and I was curious about
    your situation; we have developed some nice procedures and we are
    looking to swap solutions with others, why not shoot me an e-mail if interested.

  6. Logan Deshazo on August 21st, 2012 4:29 pm

    Hey there! I know this is kind of off-topic but I needed to ask. Does managing a well-established blog such as yours take a large amount of work? I’m brand new to running a blog but I do write in my journal every day. I’d like to start a blog so I will be able to share my personal experience and thoughts online. Please let me know if you have any kind of suggestions or tips for brand new aspiring blog owners. Thank you!
