Pitching All Source Analysis: Just Do Dark Data. Really?
November 25, 2016
I read “Shedding Light on Dark Data: How to Get Started.” Okay, Dark Data. Like Big Data, the phrase is the fruit of the nomads at Garner Group. The person embracing this sort of old concept is an outfit OdinText. Spoiler: I thought the write up was going to identify outfits like BAE Systems, Centrifuge Systems, IBM Analyst’s Notebook, Palantir Technologies, and Recorded Future (an In-Q-Tel and Google backed outfit). Was I wrong? Yes.
The write up explains that a company has to tackle a range of information in order to be aware, informed, or insightful. Pick one. Here’s the list of Dark Data types, which the aforementioned companies have been working to capture, analyze, and make sense of for almost 20 years in the case of NetReveal (Detica) and Analyst’s Notebook. The other companies are comparative spring chickens with an average of seven years’ experience in this effort.
- Customer relationship management data
- Data warehouse information
- Enterprise resource planning information
- Log files
- Machine data
- Mainframe data
- Semi structured information
- Social media content
- Unstructured data
- Web content.
I think the company or non profit which tries to suck in these data types and process them may run into some cost and legal issues. Analyzing tweets and Facebook posts can be useful, but there are costs and license fees required. Frankly not even law enforcement and intelligence entities are able to do a Cracker Jack job with these content streams due to their volume, cryptic nature, and pesky quirks related to metadata tagging. But let’s move on. To this statement:
Phone transcripts, chat logs and email are often dark data that text analytics can help illuminate. Would it be helpful to understand how personnel deal with incoming customer questions? Which of your products are discussed with which of your other products or competitors’ products more often? What problems or opportunities are mentioned in conjunction with them? Are there any patterns over time?
Yep, that will work really well in many legal environments. Phone transcripts are particularly exciting.
How does one think about Dark Data? Easy. Here’s a visualization from the OdinText folks:
Notice that there are data types in this diagram NOT included in the listing above. I can’t figure out if this is just carelessness or an insight which escapes me.
How does one deal with Dark Data? OdinText, of course. Yep, of course. Easy.
Stephen E Arnold, November 25, 2016