Parsing Document: A Shift to Small Data
November 14, 2019
DarkCyber spotted “Eigen Nabs $37M to Help Banks and Others Parse Huge Documents Using Natural Language and Small Data.” The folks chasing the enterprise search pot of gold may need to pay attention to solving specific problems. Eigen uses search technology to identify the important items in long documents. The idea is “small data.”
The write up reports:
The basic idea behind Eigen is that it focuses [on] what co-founder and CEO Lewis Liu describes as “small data”. The company has devised a way to “teach” an AI to read a specific kind of document — say, a loan contract — by looking at a couple of examples and training on these. The whole process is relatively easy to do for a non-technical person: you figure out what you want to look for and analyze, find the examples using basic search in two or three documents, and create the template which can then be used across hundreds or thousands of the same kind of documents (in this case, a loan contract).
Interesting, but the approach seems similar to identifying several passages in a text and submitting these to a search engine. This used to be called “more like this.” But today? Small data.
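For readers who never met “more like this,” here is a minimal sketch of the old trick in Python, using only the standard library. It is not Eigen's method, which the write up does not detail. The function names, the TF-IDF weighting, and the toy documents are all assumptions made for illustration: a few example passages are glued into one query, and every document in a small corpus is ranked by cosine similarity to that query.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tf_idf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict) per document, plus the IDF table."""
    counts_per_doc = [Counter(tokenize(d)) for d in docs]
    n = len(docs)
    doc_freq = Counter()
    for counts in counts_per_doc:
        doc_freq.update(counts.keys())
    # Smoothed IDF so that no term gets a zero or negative weight.
    idf = {t: math.log((1 + n) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    vectors = [{t: c * idf[t] for t, c in counts.items()} for counts in counts_per_doc]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def more_like_this(example_passages, corpus):
    """Rank corpus documents by similarity to the example passages.

    Returns a list of (score, corpus_index) pairs, best match first.
    """
    vectors, idf = tf_idf_vectors(corpus)
    query_counts = Counter(tokenize(" ".join(example_passages)))
    # Terms the corpus has never seen cannot match anything, so drop them.
    query_vec = {t: c * idf[t] for t, c in query_counts.items() if t in idf}
    scored = [(cosine(query_vec, vec), i) for i, vec in enumerate(vectors)]
    return sorted(scored, reverse=True)


# Hypothetical toy data: two "example passages" and a three-document corpus.
examples = ["borrower shall repay the principal", "interest rate on the loan"]
corpus = [
    "The borrower shall repay the principal and interest on the loan by the maturity date.",
    "Quarterly picnic schedule for staff.",
    "This lease covers office space; the tenant pays rent monthly.",
]
for score, index in more_like_this(examples, corpus):
    print(round(score, 3), corpus[index])
```

Pick the passages, submit them, read the ranked list. Whether a trained template does much better than this on two or three examples is a question the write up leaves open.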
With the cloud coming back on premises and big data becoming user-identified small data, what’s next? Boolean queries?
DarkCyber hopes so.
Stephen E Arnold, November 14, 2019
Comments
One Response to “Parsing Document: A Shift to Small Data”
Steve just letting you know that your snarkiness makes at least one person laugh. -F