Has Enterprise Search Drowned in a Data Lake?
December 6, 2015
I had a phone conversation with a person who unluckily resides in a suburb of New York City. Apparently the long commute allows my contact to think big thoughts and formulate even bigger questions. He asked me, “What’s going to happen to enterprise search?”
I thought this was a C minus question, but New Yorkers only formulate A plus questions. I changed the subject to the new Chick-fil-A on Sixth. After the call, I jotted down some thoughts about enterprise search.
Here for your contemplation are five of my three comments which consumed three legal pad sheets. I also write small.
Enterprise Search Is Week Old Celery
In the late 1990s, when the Verity hype machine was rolling and the Alphabet boys were formulating big thoughts about search, enterprise search was the hot ticket. For some techno cravers, enterprise search was the Alpha and Omega. If information was digital, finding an item of information was the thrill ride ending in a fluffy pile of money. A few folks made some money, but the majority of the outfits jumping into search either sold out or ended up turning off the lights. Today, enterprise search is a utility and the best approach is to use an open source solution. There are some proprietary systems out there, but the appeal of open source is tough to resist. Remember. Search is a utility, not a game changer for many organizations. Good enough tramples over precision, recall, and relevance.
New Buzzwords and the Same Old Tune
Hot companies today do not pound out the chords of findability on their electric guitars. Take an outfit like Palantir. It is a search and information access outfit, but the company avoids the spotlight and positions its technology packages as super stealthy magic insight machines. Palantir likes analytics, visualizations, and similar next generation streamlined tangerine colored outputs. Many of the companies profiled in my monograph Cyberosint are, at their core, search systems. But “search” is tucked into a corner, and the amplified functions like fancy math, real time processing, and smart software dominate. From my point of view, these systems are search repackaged and enhanced for today’s procurement professionals. That’s okay. But search is still search no matter what the “visionaries” suggest. Many systems are enterprise search wrapped in new sheet music. The notes are the same.
I find the Big Data theme interesting. The idea of processing petabytes of data in a blink is future forward. The problem is that statistical procedures work by sidestepping the analysis of every single item. I can examine a grocery list of 10 items, but I struggle when presented with a real time updating of that list with trillions of data points a second. The reality of Big Data is that it has been around. A monk faced with copying two books in a couple of days has an intractable Big Data problem. The love of Hadoop and its extended family of data management tools does not bring the black sheep of the information family into the party room. Big Data requires pesky folks who have degrees in statistics or who have spent their youth absorbed in Mathematica, MATLAB, SPSS, or SAS. Bummer. Enterprise search systems can choke on modest data. Big Data kills some systems dead like a wasp sprayed with Raid.
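For the goslings who want to see the sidestep in action, here is a minimal sketch of estimating an average from a random sample instead of touching every item. Everything in it is my own assumption for illustration: the function names `estimate_mean` and `document_length`, the made-up formula that stands in for "look at item i", and the sample size. It is not how any particular Big Data product works.

```python
import random
import statistics


def estimate_mean(population_size, value_of, sample_size, seed=0):
    """Estimate the mean of value_of(i) over a population by sampling.

    Only sample_size items are ever examined; the rest are sidestepped.
    Returns the sample mean and its standard error.
    """
    rng = random.Random(seed)
    # Draw distinct item positions without materializing the population.
    indices = rng.sample(range(population_size), sample_size)
    values = [value_of(i) for i in indices]
    mean = statistics.fmean(values)
    stderr = statistics.stdev(values) / sample_size ** 0.5
    return mean, stderr


def document_length(i):
    # Hypothetical stand-in for inspecting item i (say, a document's word count).
    return 100 + (i * 7919) % 900


# Ten million items in the pile, two thousand actually looked at.
mean, stderr = estimate_mean(10_000_000, document_length, 2_000)
print(f"estimated mean: {mean:.1f} (standard error about {stderr:.1f})")
```

The estimate comes with a standard error, which is the polite statistical way of saying "close, probably." Deciding how large a sample is large enough is where the pesky folks with degrees in statistics earn their keep.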
For a client in the UK, I had to dig into the notion of real time. Guess what the goslings found. There was not one type of real time information system. I believe there were seven distinct types of real time information. Each type has separate cost and complexity challenges. The most expensive systems were the ones charged with processing financial transactions in milliseconds. Real time for a Web site might mean anything from every 10 seconds to every week or so. Real time is tough because no matter what technologies are used to speed up computer activities, the dark shadow of latency falls. When malformed data arrive, the real time system returns incomplete outputs. Yikes. Incomplete? Yep, missing info. Real time is easy to say, but tough to deliver at a price an average Fortune 1000 company or one of the essential US or UK government agencies can afford. Speed means lots of money. Enterprise search systems usually struggle with the real time thing.
Automatic, Smart Indexing, Outputs, Whatever
I know the artificial intelligence, cognitive approach to information is a mini megatrend. Unfortunately when folks look closely at these systems, there remains a need for slug like humans to maintain dictionaries, inspect outputs and tune algorithms, and “add value” when a pesky third party cooks up a new term, phrase, or code. Talk about smart software does not implement useful smart software. The idea is as appealing today as it was when Fulcrum in Ottawa pitched its indexing approach or when iPhrase talked about its smart system. I am okay with talk as long as the speakers acknowledge and include in the budget the humans who have to keep these perpetual motion Rube Goldberg confections on point. Humans are not very good indexers. Automated indexing systems are not very good indexers. The idea is, of course, that good enough is good enough. Sorry. Work remains for the programmers. The marketers just talk about the magic of smart systems. Licensees expect the systems to work, which is an annoying characteristic of some licensees and users.
Poor enterprise search. Relegated to utility status. Wrapped up in marketing salami. Celebrated by marketers who want to binge watch Parks and Recreation.
Enterprise search. You are still around, just demoted. The future? Good enough. Invest in hyper marketing and seek markets which do not have a firm grasp of search and retrieval. Soldier on. There are many streaming videos to watch if you hit the right combination on the digital slot machine.
Stephen E Arnold, December 6, 2015