Easy as 1,2,3 Common Mistakes Made with Data Lakes
December 15, 2015
The article titled Avoiding Three Common Pitfalls of Data Lakes on DataInformed explores several pitfalls that could negate the advantages of data lakes. The article begins with the perks, such as easier data access and of course, the cost-effectiveness of keeping data in a single hub. The first is sustainability (or the lack thereof), since the article emphasizes that data lakes actually require much more planning and management of data than conventional databases. The second pitfall raised is resource allocation,
“Another common pitfall of implementing data lakes arises when organizations need data scientists, who are notoriously scarce, to generate value from these hubs. Because data lakes store data in their native format, it is common for data scientists to spend as much as 80 percent of their time on basic data preparation. Consequently, many of the enterprise’s most valued resources are dedicated to mundane, time-consuming processes that considerably lengthen time to action on potentially time-sensitive big data.“
The third pitfall is technology contradictions or trying to use traditional approaches on a data lake that holds both big and unstructured data. Be not alarmed, however, the article goes into great detail about how to avoid these issues through data lake development with smart data technologies such as semantic tech.
Chelsea Kerwin, December 15, 2015