Dataset Management for Revelytix Loom and Cloudera Navigator
March 27, 2013
A surprising article from DBMS 2 (DataBase Management System Services) about dataset management includes an explanation of the new term, dataset. It was created for Revelytix, a big data software company that seems to have had trouble with the older term for what it does: metadata management. That older term is problematic because it could refer to several types of data. Dataset management describes both Revelytix and the recently released Cloudera Navigator. The author asserts,
“My idea for the term dataset is to connote more grandeur than would be implied by the term “table”, but less than one might assume for a whole “database”. I.e.:
A dataset contains all the information about something. This makes it a bigger deal than a mere table, which could be meaningless outside the context of a database.
But the totality of information in a “dataset” could be less comprehensive than what we’d expect in a whole “database”.”
Mid-tier consultants may try to use the new problem as a revenue lever. Products to look to are Cloudera Navigator, which is from a leading Hadoop company and starts with auditing, and Revelytix Loom, which already does lineage in addition to auditing and is the main product of a company that does metadata management.
Chelsea Kerwin, March 27, 2013
Sponsored by ArnoldIT.com, developer of Augmentext