Data Silos vs. Knowledge Graphs
May 26, 2021
Data scientist and blogger Dan McCreary has high hopes for his field’s future, describing what he sees as the upcoming shift “From Data Science to Knowledge Science.” He predicts:
“I believe that within five years there will be dramatic growth in a new field called Knowledge Science. Knowledge scientists will be ten times more productive than today’s data scientists because they will be able to make a new set of assumptions about the inputs to their models and they will be able to quickly store their insights in a knowledge graph for others to use. Knowledge scientists will be able to assume their input features:
- Have higher quality
- Are harmonized for consistency
- Are normalized to be within well-defined ranges
- Remain highly connected to other relevant data such as provenance and lineage metadata”
Why will this evolution occur? Because professionals are motivated to move past the current tedious state of affairs: we are told data scientists typically spend 50% to 80% of their time on data clean-up. This leaves little time to explore the nuggets of knowledge they eventually find among the weeds.
As McCreary sees it, however, the keys to a solution already exist. For example, machine learning can be used to feed high-quality, normalized data into accessible and evolving knowledge graphs. He describes how MarkLogic, where he used to work, developed and uses data quality scores. Such scores would be key to building knowledge graphs that analysts can trust. See the post for more details on how today’s tedious data science might evolve into this more efficient “knowledge science.” We hope his predictions are correct, but only time will tell. About five years, apparently.
Cynthia Murrell, May 26, 2021