Need Confidence in Your Big Data? InfoSphere Delivers Assurances
June 17, 2015
I spotted a tweet about a white paper titled “Improve the Confidence in Your Big Data with IBM InfoSphere.” The write-up was a product of Information Asset LLC, a company with which I was not familiar. The link in the tweet was dead, so I located a copy of the white paper on the IBM Web site at this link, which I verified on June 17, 2015. If it is dead when you look for the white paper, take it up with IBM, not me.
The white paper is seven pages long and explains that IBM’s InfoSphere is the hub of some pretty interesting functions; specifically:
- Big Data exploration
- Enhanced 360 [degree] view of the customer
- Application development and testing
- Application efficiency
- Security and compliance
- Application consolidation and retirement
- Data warehouse augmentation
- Operations analysis
- Security/intelligence extension.
I thought InfoSphere was a brand created in 2008 by IBM marketers to group IBM’s different information management software products into one basket. The Big Data thing is a new twist for me.
The white paper takes each of these nine topics and explains them one by one. I found some interesting tidbits in several of the explanations, but I have only enough energy and good humor to tackle one category, Big Data exploration.
The notion of exploring Big Data is an interesting one. I thought one normalized, queried, and reviewed results of a query. The exploration thing is foreign to me. Big Data, by definition, are, well, big. Big collections of data are difficult to explore. I formulate queries, look at results, review clusters, etc. I suppose I am exploring, but I think of the work as routine database lookups. I am so hopelessly old fashioned, aren’t I? Some outfits like Recorded Future generate reports which illustrate certain query results, but we are back to queries, aren’t we?
Here’s what I learned about InfoSphere’s capabilities. Keep in mind that InfoSphere is a collection of discrete software programs and code systems. The white paper states:
Data scientists need to explore and mine Big Data to uncover interesting nuggets that are relevant for better decision making. A large hospital system built a detailed model to predict the likelihood that patients with congestive heart failure would be readmitted within 30 days. Smoking status was a key variable that was strongly correlated with the likelihood of readmission. At the outset, only 25 percent of the structured data around smoking status was populated with binary yes/no answers. However, the analytics team was able to increase the population rate for smoking status to 85 percent of the encounters by using content analytics. The content analytics team was also able to use physicians’ and nurses’ notes to unlock additional information, such as smoking duration and frequency. There were a number of reasons for the discrepancy. For example, some patients indicated that they were non-smokers, but text analytics revealed the following in the doctors’ notes: “Patient is restless and asked for a smoking break,” “Patient quit smoking yesterday,” and “Quit.” IBM InfoSphere Big Insights offers strong text analytic capabilities. In addition, IBM InfoSphere Business Glossary provides a repository for key definitions such as “readmission.” IBM InfoSphere Master Data Management provides an Enterprise Master Patient Index to track readmissions for the same patient across multiple hospitals in the same network. Finally, IBM InfoSphere Data Explorer provides robust search capability across unstructured data.
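To make the smoking-status example concrete, here is a toy sketch in Python of what "content analytics" over clinical notes might amount to at its simplest: pattern matching that fills in a missing structured field. This is my own illustration, not IBM's method or any InfoSphere API. The patterns, the function names, the record layout, and the five sample encounters are all hypothetical; only the three quoted note fragments come from the white paper.

```python
import re

# Hypothetical patterns for illustration only; real clinical text
# analytics would need far more than two regular expressions.
FORMER_SMOKER = re.compile(r"\bquit\b", re.IGNORECASE)
CURRENT_SMOKER = re.compile(r"\b(smoking break|smokes|current smoker)\b", re.IGNORECASE)


def smoking_status_from_note(note):
    """Return 'former', 'current', or None when the note offers no hint."""
    if FORMER_SMOKER.search(note):
        return "former"
    if CURRENT_SMOKER.search(note):
        return "current"
    return None


def fill_from_notes(encounters):
    """Return copies of the encounters with missing statuses filled from notes."""
    filled = []
    for encounter in encounters:
        status = encounter["status"]
        if status is None:
            status = smoking_status_from_note(encounter["note"])
        filled.append({**encounter, "status": status})
    return filled


def population_rate(encounters):
    """Fraction of encounters whose smoking status is populated."""
    populated = sum(1 for e in encounters if e["status"] is not None)
    return populated / len(encounters)


# Made-up encounters; the three quoted notes are from the white paper.
encounters = [
    {"status": "never", "note": ""},
    {"status": None, "note": "Patient is restless and asked for a smoking break"},
    {"status": None, "note": "Patient quit smoking yesterday"},
    {"status": None, "note": "Quit."},
    {"status": None, "note": "Patient resting comfortably"},
]

before = population_rate(encounters)
after = population_rate(fill_from_notes(encounters))
print(f"populated before: {before:.0%}, after: {after:.0%}")
```

The sketch fills only empty fields. Catching the discrepancy the white paper mentions, where a self-reported non-smoker asks for a smoking break, would mean comparing the note-derived status against the populated field as well. Either way, it is a query against text.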
Okay, search is the operative word. I find this fascinating because IBM is working hard to convince me that Watson can ingest information and figure out what it means and then answer questions automatically. For example, if a cancer doctor does not know what treatment to use, Watson will tell her.
I must tell you that this white paper illustrates the fuzzy thinking that characterizes many firms’ approach to information challenges. Remember. The InfoSphere Big Data explorer is just one of nine capabilities of a marketing label.
Useful? Just ring up your local IBM regional office and solve nine problems with that phone call. Magic. Big insights too.
Stephen E Arnold, June 17, 2015