People and Big Data: Analytics for Mr and Ms Couch Potato

March 24, 2011

I have to admit that the idea of big data and the “people” was a concatenation new to me. I just read “Data Science Toolkit Brings Big Data Analysis to the People.” Let’s look at this snippet:

Data Science Toolkit offers OCR functionality to convert PDFs or scanned image files to text files, filter geographic locations from news articles and other types of unstructured data or find political district and neighborhood information for any given location. Data Science Toolkit is available as a web service online, but it can also be downloaded and run on an Amazon EC2 or VM virtual machine.

I live in Harrod’s Creek, Kentucky. The “people” in this metropolis of a couple of thousand consist of folks who use the Internet to look at pictures, send email, and maybe check out some online information about the local basketball scene. The sophisticated data consumers mostly work in my office. I know from my good morning chats at the local filling station cum junk food outlet that I am skewing the demographics with my generalization about Internet usage. Close enough for horseshoes, as my grandfather used to say.

I think the idea of “big data” is interesting. We publish a curated blog called Inteltrax that covers some of the interesting companies in the data fusion market. But if you think interest in a $1.0 million enterprise search system appeals to a narrow readership, data fusion has the same magnetism. There are not any “people.” There are college graduates with mathematical expertise and a compelling need to process information. Here in Harrod’s Creek, the “people” are more likely to check email and then fire up the flat screen to watch hoops.

Maybe the observation about “people” is a variant of Potomac Fever; that is, those exposed to the craziness of power and money in Washington, DC, think that “everyone” has the same visceral reaction to political push ups. I once heard a person who worked in a think tank describe the firm’s discussions about client engagements as “drinking our own Kool-Aid.” Tastes great, but the Kool-Aid is not enjoyed with the same lip smacking elsewhere. When was the last time you guzzled pumpkin or red bean Kool-Aid?

My view:

  1. A useful service such as the one described in the write up looks a heck of a lot more magnetic than it may be. That’s the unsupported assertion about “people” when the reality is that a tiny percentage of savvy folks will get with the big data program as a Web service.
  2. The notion that “people” can manipulate big data and find a pot of gold at the end of the analytics rainbow is charming, but essentially incorrect. There are quite a few considerations to evaluate in the big data game. A shortcut can save time but also put the rental car in the ditch.
  3. Big data are the norm in many online operations. What is helpful to me is to point out that only a tiny percentage of those with big data know what to do to squeeze nuggets from the log files.

Quite a story for me: I thought it was one of those PR, promo, search engine optimization type write ups. I then realized it was a Kool-Aid break after a lunch break in Silicon Valley where there is no Internet bubble. Absolutely not.

Stephen E Arnold, March 25, 2011



