Computational Limits: Just a Reminder to the Cheerleaders for Big Data and Analytics
December 1, 2016
“Let’s index everything” or “Let’s process all the digital data”. Ever hear these statements or something similar? I have. In fact, I hear this type of misinformed blather almost every day. I read “Big Data Coming in Faster Than Biomedical Researchers Can Process It.” The write up seems to have figured out that those yapping about capture and crunch are spitting out partial truths. (What’s new in the trendy world of fake news?)
The write up points out in a somewhat surprised way:
“It’s not just that any one data repository is growing exponentially, the number of data repositories is growing exponentially,” said Dr. Atul Butte, who leads the Institute for Computational Health Sciences at the University of California, San Francisco.
Now the kicker:
Prospecting for hints about health and disease isn’t going to be easy. The raw data aren’t very robust and reliable. Electronic medical records are often kept in databases that aren’t compatible with one another, at least without a struggle. Some of the potentially revealing details are also kept as free-form notes, which can be hard to extract and interpret. Errors commonly creep into these records. And data culled from scientific studies aren’t entirely trustworthy, either.
Net net: Lots of data. Inadequate resources. Inability to filter for relevance. Failure to hook “data” to actual humans. The yap about curing cancer or whatever disease generates a news release indicates an opportunity. But there’s no easy solution.
The resources to “make sense” of large quantities of historical and real time data are not available. But marketing is easy. Dealing with real world data is a bit more difficult. Keep that in mind if you develop a nifty disease and expect Big Data and analytics to keep the cookies from burning. Sure, the “data” about making a blue ribbon batch of chocolate chip cookies is available. Putting the right information into a context at the appropriate time is a bit more difficult even for the cognitive, smart software, text analytics cheerleaders.
Wait. I have a better idea. Why not just let a search system find and discover exactly what you need? Let me know how that works out for you.
Stephen E Arnold, December 1, 2016