Enterprise Search: Not Exactly Crazy but Close

April 13, 2020

I think I started writing the first of three editions of the Enterprise Search Report in 2003. I had been through the Great Search Procurement competition for the US government’s search system. The original name for the service was FirstGov.gov (the idea was that the service was the “first” place to look for public facing government information). The second name was USA.gov, and it was different from FirstGov because the search results were pulled from an ad supported Web index.

The highlight of the competition was Google’s losing the contract to Fast Search & Transfer. (Note: The first index exposed to the public was the work of Inktomi, a company mostly lost in the purple mists of Yahoo and time.) Google was miffed because Fast Search & Transfer had teamed with AT&T and replied to the SOW with some of the old fervor that characterized the company before Judge Greene changed the game. I recall one sticking point: Truncation. In fact, one of the Google founders argued with me about truncation at a search conference. I pointed out that Google had to do truncation whether the founders wanted to or not. My hunch is that you don’t know much about truncation and what it contributes. I won’t get into the weeds, but the function is important. Think stemming, inflections, etc.
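For readers who want the gist without the weeds, here is a minimal Python sketch of the two ideas. It is an illustration only, not how Google, Fast Search & Transfer, or any vendor implements the function. The names `truncation_match`, `naive_stem`, and `vocabulary` are made up for this example, and the suffix list is a toy assumption rather than a real stemming algorithm.

```python
def truncation_match(query: str, terms: list[str]) -> list[str]:
    """Match a right-truncated query such as 'index*' against a term list.

    A trailing '*' means "any term that starts with this prefix".
    Without the '*', only an exact (case-insensitive) match counts.
    """
    if query.endswith("*"):
        prefix = query[:-1].lower()
        return [t for t in terms if t.lower().startswith(prefix)]
    return [t for t in terms if t.lower() == query.lower()]


def naive_stem(word: str) -> str:
    """Strip a few common English suffixes. A toy, not a real stemmer."""
    word = word.lower()
    for suffix in ("ing", "es", "ed", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


# Hypothetical index vocabulary for the demonstration.
vocabulary = ["index", "indexes", "indexing", "indexed", "indices", "search"]

print(truncation_match("index*", vocabulary))
print(sorted({naive_stem(w) for w in vocabulary}))
```

The prefix match finds "indexes", "indexing", and "indexed" but misses the irregular plural "indices", and the toy stemmer turns "indices" into "indic" rather than "index". That gap between simple truncation and real inflection handling is the kind of detail a procurement team has to test.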

I examined more than 60 “enterprise” search systems, including the chemical structure search systems, the not-so-useful search tools in engineering design systems like AutoCAD, and a number of search systems now long forgotten like Delphis and Entopia, among others.

I have also written “The New Landscape of Search” published by Pandia and “Successful Enterprise Search Management” with Martin White, who is still chugging along with his brand of search expertise. Of course, I follow search and retrieval even though I have narrowed my focus to what I call intelware and policeware. These are next-generation systems which address the numerous shortcomings of the oversold, over-hyped, and misunderstood software allowing a commercial enterprise to locate specific items of interest from their hotchpotch of content.

In this blog, Beyond Search/DarkCyber, I write about some enterprise search systems. In general, I remain very critical of the technologies and the mostly unfounded assertions about what a search-and-retrieval system can deliver to an organization.

With this background, I reacted to “Enterprise Search Software Comparison” with sadness. I was not annoyed by the tone or desire to compare some solutions to enterprise content finding. My response was based on my realization of how far behind the understanding of enterprise search’s upsides and downsides remains, how wide the gap is between next-generation information retrieval systems and the “brand” names, and how shallow the grasp is of the challenges enterprise search poses for licensees, vendors, and users.

The write up “compares” these systems as listed in the order each is discussed in the source article cited above:

  • IBM Watson Discovery
  • Salesforce Einstein Search
  • Microsoft Search
  • Google Cloud Search
  • Amazon Kendra
  • Lucidworks
  • AlphaSense.

Each of these systems merits a couple of paragraphs. For comparison, the discussion of systems in the Enterprise Search Report typically required 15 or more pages. In CyberOSINT, I needed four pages for each system described. I had to cut the detail to meet the page limit for the book. A paragraph may be perfect for the thumb typing crowd, but detail does matter. The reason is that a misstep in selecting enterprise software can cost time and money and jobs. The people usually fired are those serving on the enterprise search system procurement team. Why? CFOs get very angry when triage to make a system work costs more than the original budget for the system. Users get angry when the system is slow (try 120 seconds to find a document in a content management system and then learn the document has not been indexed). Stakeholders get angry when the investment in search cannot be recovered without tricks, often illegal. There are other, similarly serious issues.

Let’s look at each of these systems described in the write up. I am going to move forward in roughly alphabetical order. The listing in the source implies best to worst, and I want to avoid that. Also, at the end of this post, I will identify a few other systems which anyone seeking an enterprise search system may want to learn about. I post free profiles at www.xenky.com/vendor-profiles. The newer profiles cost money, and you can contact me at benkent2020 at yahoo dot com. No, I won’t give you a free copy. The free stuff is on my Xenky.com Web site.

AlphaSense. This is a venture backed company focused on making search the sharp end of a business intelligence initiative. The company is influenced by Eric Schmidt, the controversial Xoogler. The firm has raised about $100 million. The idea is to process disparate information and allow users to identify gems of information. AlphaSense competes with next-generation information services like DataWalk, Voyager Labs, and dozens of other forward looking firms. Will AlphaSense handle video, audio, time series data, and information stored on a remote worker’s laptop? Yeah. To sum up: Not an enterprise search solution; it is a variant of intelware. That’s no problem. AlphaSense is a me too of a different category of software.

Amazon Kendra. Amazon has a number of search solutions. This is Lucene. Yes, Lucene can deliver enterprise search; however, the system requires a commitment. Amazon’s approach is to put enterprise search into AWS. There’s nothing quite like the security of AWS in the hands of individuals who have not been “trained” in the ways of Amazon and Lucene.

Google Cloud Search. This is the spirit of the ill fated Google Search Appliance. The problems of GSA are ameliorated by putting content into the Google Cloud. What’s Google’s principal business? Yep, advertising. Those Googlers are trustworthy: Infidelity among senior managers raises this question, “Can we trust you to keep your body parts out of our private data?” You have to answer that question for yourself. (Sorry. Can’t say. Legal eagles monitor me still.)

IBM Watson Discovery. Okay, this is Lucene, home brew, and acquired technology like Vivisimo. Does it work? Why not ask Watson? IBM does have robust next-generation search, but that technology like IBM CyberTap is not available to the author of the article or to most commercial organizations. So IBM has training wheels search which requires oodles of IBM billable hours. Plus the company has next-generation information access. Which is it? Why not ask Watson? (If you used ITRC in the 1980s, you experienced my contribution to Big Blue. Plus I took money. None of that J5 stuff either.)

Salesforce Einstein Search. If a company puts its sales letters and contacts into this system, one can find the prospect and the email a salesperson sent that individual. Why do companies want Salesforce search? When a salesperson quits, the company wants to make sure it has the leads, the sales story, etc. There are alternatives to Salesforce’s search system. Why? Maybe there are sufficient numbers of Salesforce customers who want to control what’s indexed and what employees can see? Just a thought.

Microsoft Search. I would like to write about Microsoft Search. (Yep, did a small thing for this outfit.) I would like to identify the acquisitions Microsoft completed to “improve” search. I would like to point out that Microsoft is changing Windows 10 search again. But that’s the story. One flavor of Microsoft Search is Fast Search & Transfer. It is so wonderful that a competitive solution is available from outfits like Surfray, EPI Server, and even Coveo (yep, the customer support and kitchen sink vendor). Why? Microsoft Search is very similar to the Google search: Young people fooling around in order to justify their salaries and sense of self worth. The result? I particularly like the racist chat bot and the fact that Microsoft bought Fast Search & Transfer as the criminal case for financial fraud was winding through Norway’s court system. Yep, criminal behavior. Why? Check out my previous write ups about Fast Search & Transfer.

Lucidworks. Okay, I did some small work for this outfit when it was called Lucid Imagination. Then the revolving door started to spin. The Lucene/Solr system collected many, many millions and started its journey to … wait for it… digital commerce and just about anything that could be slapped on open source software. Can one “do” enterprise search with Solr? Sure. Just make sure you have money and time. Lucidworks’ future is not exactly one that will thrill its funding sources. But there is hope for an acquisition or maybe an IPO. Is Lucidworks a way to get “faceted search” like Endeca offered in 1998? Sure, but why not license Endeca from Oracle? Endeca has some issues, of course, but I wanted to put a time mark in this essay so the “age” of Lucidworks’ newest ideas is anchored with a me-too peg.
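For anyone unsure what “faceted search” means in practice, the sketch below shows the basic mechanics in Python: count the values of a metadata field across a result set, then narrow the set when the user picks a value. This is a toy under stated assumptions, not Endeca’s, Solr’s, or Lucidworks’ implementation. The `documents` list, its field names, and the functions `facet_counts` and `refine` are all hypothetical.

```python
from collections import Counter

# Hypothetical result set; each document carries a few metadata fields.
documents = [
    {"title": "Pump maintenance manual", "type": "pdf", "dept": "engineering"},
    {"title": "Q3 sales deck", "type": "pptx", "dept": "sales"},
    {"title": "Pump warranty letter", "type": "pdf", "dept": "sales"},
    {"title": "Pump design review", "type": "docx", "dept": "engineering"},
]


def facet_counts(docs: list[dict], field: str) -> Counter:
    """Count how many documents carry each value of one metadata field."""
    return Counter(d[field] for d in docs if field in d)


def refine(docs: list[dict], **filters: str) -> list[dict]:
    """Keep only the documents whose fields equal every selected facet value."""
    return [d for d in docs if all(d.get(k) == v for k, v in filters.items())]


# Show the facet counts a user would see next to the result list.
print(facet_counts(documents, "type"))
print(facet_counts(documents, "dept"))

# The user clicks the "pdf" facet, then the "sales" facet.
narrowed = refine(documents, type="pdf", dept="sales")
print([d["title"] for d in narrowed])
```

The hard part in a real deployment is not this arithmetic. It is getting consistent metadata onto the content in the first place, which is where the money and time go.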

What vendors are not mentioned who can implement enterprise search?

I will highlight four briefly, just to make clear the distortion of the enterprise market that this article presents to a thumb typing millennial procurement professional:

  1. Exalead spawned a number of interesting content companies. One of them is Algolia. It works and has some Exalead DNA.
  2. SearchIT is an outfit in Europe. It delivers what I consider a basic enterprise search system.
  3. Maxxcat produces a search appliance which is arguably a bit more modern than the Thunderstone appliance.
  4. Elastic Elasticsearch. This is the better Compass. How many outfits use Elasticsearch? Lots. There’s a free version and for-fee help when fans of Shay Banon get stuck. Check out this how to.

There are others, of course, but my point is that mixing apples and oranges gives one a peculiar view of what is in the enterprise search orchard. It is better to categorize, compare and contrast systems that perform “enterprise search” functions. What are these? It took me 400 pages to explain what users expect, what systems can deliver, and the cost/engineering assumptions required to deliver a solution that is actually useful.

Search is hard. The next-generation systems point the way forward. Enterprise search has, in my opinion, not advanced very far beyond the original Smart system or IBM STAIRS III.

PS. Notice I did not use the jargon natural language processing, semantics, text analytics, and similar hoo haa. Why? Search has a different meaning for each worker in quite distinct business units. Do you expect a chemical engineer looking for Hexamethylene triperoxide diamine to use a word or a chemical structure? What about a marketing person seeking a video of a sales VP’s presentation at a client meeting yesterday? What about that intern’s Instagram post of a not-yet-released product prototype? What about the information on that sales VP’s laptop as he or she returns to the home office after a news story appeared about his or her talk? What about those human resource personnel data files? What about the eDiscovery material occupying the company’s legal team? What about the tweet a contractor sent to a big client about the cost of a fix to a factory robot that trashed a day’s production? What about the emails between an executive and a sex worker related to heroin? (A real need at a certain vendor of enterprise search!) Yeah! Enterprise search.

Stephen E Arnold, April 13, 2020

