What Has Happened to Enterprise Search?

June 28, 2018

I read “Enterprise Search in 2018: What a Long Strange Trip It’s Been.” I found the information presented interesting. The thesis is that enterprise search has been on a journey almost like a “Wizard of Oz” experience.

The idea of consolidation, from my point of view, boils down to executives who want to cash in, get out, and move on. The reasons are not far to seek: Over promising and under delivering, financial manipulations, and positioning a nuts and bolts utility as something that solves information problems.


Some, maybe many, licensees of proprietary enterprise search systems may have viewed their investment as an opportunity that delivered an unexpected but inevitable outcome. Where is that lush scenery? Where’s the beach?

The reality is that enterprise search vendors were aced by Shay Banon. His Act II of Compass: A Finding Story was Elasticsearch and the company Elastic. Why not use free and open source software? At least the code gets some bugs fixed, unlike old school proprietary enterprise search systems. Bug fixes? Yep, good luck with your Fast Search & Transfer ESP platform idiosyncrasies.

The landscape today is a bit like the volcanic transformation of Hawaii’s Vacationland. Real estate agents will be explaining that the lava flows have created new beach views, promising that eruptions are a low probability event.

The write up does not highlight one simple fact: Enterprise search has given way to “roll up” services or what I call “meta-plays.” The idea is that search is tucked inside systems like Diffeo, Palantir Gotham, and other “intelligence” platforms.

Why aren’t these enterprise grade solutions sold as “enterprise search” or “enterprise business intelligence and discovery solutions”?

The answer is that the information retrieval nest has been marginalized by the actions of vendors stretching back to the Smart system and to the present with “proprietary” solutions which actually include open source technology. These systems are anchored in the past.

Consider Diffeo?

Why offer enterprise search when one can provide a solution that delivers information in context, provides collaboration tools, and displays information in different ways with a single mouse click?


What Happens When Intelligence Centric Companies Serve the Commercial and Political Sectors?

March 18, 2018

Here’s a partial answer:






Years ago, certain types of companies with specific LE and intel capabilities maintained low profiles and, in general, focused on sales to government entities.

How times have changed!

In the DarkCyber video news program for March 27, 2018, I report on the Madison Avenue type marketing campaigns. These will create more opportunities for a Cambridge Analytica “activity.”

Net net: Sometimes discretion is useful.

Stephen E Arnold, March 18, 2018

AI’s Newest Hurdle Happens When the Machines Hallucinate

November 27, 2017

Artificial Intelligence has long been thought of as an answer to airport security and other areas. The idea of intelligent machines finding the bad guys is a good one in theory. But what if the machines aren’t as clever as we think? A stunning new article in The Verge, “Google’s AI Thinks This Turtle is a Gun and That’s a Problem,” made us sit up and take notice.

As you can guess by the title, Google’s AI made a huge flub recently:

This 3D-printed turtle is an example of what’s known as an “adversarial image.” In the AI world, these are pictures engineered to trick machine vision software, incorporating special patterns that make AI systems flip out. Think of them as optical illusions for computers. You can make adversarial glasses that trick facial recognition systems into thinking you’re someone else, or can apply an adversarial pattern to a picture as a layer of near-invisible static. Humans won’t spot the difference, but to an AI it means that panda has suddenly turned into a pickup truck.

This adversarial image news is especially concerning when you consider how quickly airports are implementing this technology. Dubai International Airport is already using self-driving carts for luggage. It’s only a matter of time until security screening goes the same way. You’d best hope they iron out adversarial image issues before it does.

Patrick Roland, November 27, 2017

Free Services: What Happens When They Are Killed Off?

November 3, 2017

In the salad days of online, one paid for “time” (the online connection) and one paid for the “content” (the citations, data, full text). Today data are free. Hooray.*

For users of the Google flight information, the news that Google was likely to shut down its flight data feed is bad news. Even worse, those nifty MBA inspired spreadsheets which happily omitted the cost of flight data are going to have to be re-imagined.

And Oath (remember Yahoo?) is, it seems, going to cut off the finance, if the story in Hacker News is accurate. The write up states:

Yahoo Finance has apparently killed its API. Zero warning. Lots of apps probably use this. Before, you could get stock information by using http://download.finance.yahoo.com/d/quotes.csv Now, you get the following message: It has come to our attention that this service is being used in violation of the Yahoo Terms of Service. As such, the service is being discontinued. For all future markets and equities data research, please refer to finance.yahoo.com. What violation of TOS? People have been using this for years without any issues. If you are going to cut this off, how about a warning and heads up? Guess that’s what we should expect from OATH / Verizon.

The comments are interesting.

Net net: The online model from the 1969 to 1995 phase of online may be poking its nose from a Rip Van Winkle snooze.

And those spreadsheets? MBAs are crafty. The numbers will work out—at least in Excel. In real life? Hmmm. Good question.

Stephen E Arnold, November 3, 2017

* Editor’s update: Heads up. Last night (November 3, 2017) I received an impassioned and mom-like communication from a person who wanted confidentiality about the information he was about to impart via Gmail email. (Isn’t that type of email parsed by smart software for the purpose of collecting ad revenue and data?) The alleged former Googler (aka Xoogler) was unaware that I was at dinner with my wife enjoying a grilled squirrel burger with the cheese on the bottom in the approved Google manner. But this write up was an urgent matter in the mind of the agitated Xoogler eager to share confidential information with me. Lucky me! The email included numbers and a statement that I had to rewrite this article because I was, as I have noted on numerous occasions in the course of this 10 year old Beyond Search blog, an “addled goose”. The email made clear that killing Google services and products does no harm, and I was wrong, incorrect, off base, and a Bambi brained deer. Please, check out the source story from Marketwatch. Make up your own mind, gentle reader, because I try to present my opinion whilst separating the giblets from the goosefeathers. My view is that abrupt, unilateral modification of services is a good thing for some developers and users. But I do enjoy confidential communications about the inner workings of my favorite search engine as I munch my burger with cheese on the bottom in the Sundar Pichai approved manner. Plus, I enjoy recalling the Google Reader, Google Talk, Google Health, Knol, Google Buzz, and my favorite and the fave of some Brazilians, Orkut. You don’t? Well, you, unlike me, are not trying to be Googley. To refresh your memory, check out the Google Graveyard. Do you have a problem with terminated services? In my opinion, termination with extreme prejudice is in your best interests. Now put the cheese on the bottom of the meat patty.

Enterprise Search Revisionism: Can One Change What Happened?

March 9, 2016

I read “The Search Continues: A History of Search’s Unsatisfactory Progress.” I noted some points which, in my opinion, underscore why enterprise search has been problematic and why the menagerie of experts and marketers have put search and retrieval on the path to enterprise irrelevance. The word that came to mind when I read the article was “revisionism” for the millennials among us.

The write up ignores the fact that enterprise search dates back to the early 1970s. One can argue that IBM’s Storage and Information Retrieval System (STAIRS) was the first significant enterprise search system. The point is that enterprise search as a productized service has a history of over promising and under delivering of more than 40 years.

Enterprise search with a touch of Stalinist revisionism.

Customers said they wanted to “find” information. What those individuals meant was to have access to information that provided the relevant facts, documents, and data needed to deal with a problem.

Because providing on point information was and remains a very, very difficult problem, the vendors interpreted “find” to mean a list of indexed documents that contained the users’ search terms. But there was a problem. Users were not skilled in crafting queries, which were essentially computer instructions placed between words the index actually contained.
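A minimal Python sketch of the “find” the vendors delivered: an inverted index that returns the documents containing every query term. The three documents, the function names, and the whitespace tokenization are illustrative assumptions, not the internals of STAIRS or any other product.

```python
from collections import defaultdict

def build_index(docs):
    """Map each lowercase term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def search_all_terms(index, query):
    """Return ids of documents containing every query term (Boolean AND)."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

# Hypothetical three-document corpus
docs = {
    1: "quarterly sales report for the northeast region",
    2: "sales forecast and revenue projections",
    3: "employee handbook and vacation policy",
}
index = build_index(docs)
hits = search_all_terms(index, "sales report")        # matches document 1 only
misses = search_all_terms(index, "income statement")  # no document has these words
```

A user who types “income statement” gets nothing back, even though the revenue document might help, because the index contains neither word. That gap between matching terms and meeting a need is the problem described above.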

After STAIRS came other systems, many other systems which have been documented reasonably well in Bourne and Bellardo-Hahn’s A History of Online Information Services 1963-1976. (The period prior to 1970 describes for-fee research centric online systems. STAIRS was among the most well known early enterprise information retrieval systems.) I provided some history in the first three editions of the Enterprise Search Report, published from 2003 to 2007. I have continued to document enterprise search in the Xenky profiles and in this blog.

The history makes painful reading for those who invested in many search and retrieval companies and for the executives who experienced the crushing of their dreams and sometimes careers under the buzz saw of reality.

In a nutshell, enterprise search vendors heard what prospects, workers overwhelmed with digital and print information, and unhappy users of those early systems were saying.

The disconnect was that enterprise search vendors parroted back marketing pitches that assured enterprise procurement teams of these functions:

  • Easy to use
  • “All” information instantly available
  • Answers to business questions
  • Faster decision making
  • Access to the organization’s knowledge.

The result was a steady stream of enterprise search product launches. Some of these, like Verity, were funded by US government money. Sure, the company struggled with the cost of infrastructure the Verity system required. The work arounds were okay as long as the infrastructure could keep pace with the new and changed word-centric documents. Tossing in other types of digital information, making the system perform ever faster indexing, and keeping the Verity system responding quickly was another kettle of fish.

Research oriented information retrieval experts looked at the Verity type system and concluded, “We can do more. We can use better algorithms. We can use smart software to eliminate some of the costs and indexing delays. We can [ fill in the blank ].”

The cycle of describing what an enterprise search system could actually deliver was disconnected from the promises the vendors made. As one moves through the decades from 1973 to the present, the failures of search vendors made it clear that:

  1. Companies and government agencies would buy a system, discover it did not do the job users needed, and buy another system.
  2. New search vendors picked up the methods taught at Cornell, Stanford, and other search-centric research centers and wrapped on additional functions like semantics. The core of most modern enterprise search systems is unchanged from what STAIRS implemented.
  3. Search vendors like Convera came, failed, and went away. Some hit revenue ceilings and sold to larger companies looking for a search utility. The acquisitions hit a high water mark with the sale of Autonomy (a 1990s system) to HP for $11 billion.

What about Oracle, as a representative outfit? Oracle database has included search as a core system function since the day Larry Ellison envisioned becoming a big dog in enterprise software. The search language was Oracle’s version of the structured query language. But people found that difficult to use. Oracle purchased Artificial Linguistics in order to make finding information more intuitive. Oracle continued to try to crack the find information problem through the acquisitions of Triple Hop, its in-house Secure Enterprise Search, and some other odds and ends until it bought in rapid succession InQuira (a company formed from the failure of two search vendors), RightNow (technology from a Dutch outfit RightNow acquired), and Endeca. Where is search at Oracle today? Essentially search is a utility and it is available in Oracle applications: customer support, ecommerce, and business intelligence. In short, search has shifted from the “solution” to a component used to get started with an application that allows the user to find the answer to business questions.

I mention the Oracle story because it illustrates the consistent pattern of companies which are actually trying to deliver information that the user of a search system needs to answer a business or technical question.

I don’t want to highlight the inaccuracies of “The Search Continues.” Instead I want to point out the problem buzzwords create when trying to understand why search has consistently been a problem and why today’s most promising solutions may relegate search to a permanent role of necessary evil.

In the write up, the notions of answering questions, analytics, federation (that is, running a single query across multiple collections of content and file types), the cloud, and system performance form the conclusion.
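Federation, in the sense glossed in the parenthesis above, can be sketched in a few lines of Python. The two collections, the term-count scoring, and the merge-by-score rule are all hypothetical; real systems must normalize scores, formats, and access controls across sources, which is where much of the cost comes from.

```python
def search_collection(name, docs, query):
    """Score each document by how many distinct query terms it contains."""
    terms = set(query.lower().split())
    results = []
    for doc_id, text in docs.items():
        score = len(terms & set(text.lower().split()))
        if score > 0:
            results.append((score, name, doc_id))
    return results

def federated_search(collections, query):
    """Run one query against every collection and merge into one ranked list."""
    merged = []
    for name, docs in collections.items():
        merged.extend(search_collection(name, docs, query))
    # Highest score first; ties broken by collection name, then document id
    return sorted(merged, key=lambda r: (-r[0], r[1], r[2]))

# Hypothetical collections standing in for separate repositories
collections = {
    "email": {"e1": "contract renewal with vendor", "e2": "lunch on friday"},
    "wiki": {"w1": "vendor contract renewal checklist", "w2": "holiday schedule"},
}
ranked = federated_search(collections, "vendor contract")
```

The sketch treats both collections as plain text with comparable scores. That assumption rarely holds in practice, which is why federation demands the data grooming discussed later in this post.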


The use of open source search systems means that good enough is the foundation of many modern systems. Palantir-type outfits, essentially enterprise search vendors describing themselves as “intelligence” providing systems, use open source technology in order to reduce costs and shift bug chasing to a community. The good enough core is wrapped with subsystems that deal with the pesky problems of video, audio, data streams from sensors or similar sources. Attivio, formed by professionals who worked at the infamous Fast Search & Transfer company, delivers active intelligence but uses open source to handle the STAIRS-type functions. These companies have figured out that open source search is a good foundation. Available resources can be invested in visualizations, generating reports instead of results lists, and graphical interfaces which involve the user in performing tasks smart software at this time cannot perform.

For a low cost enterprise search system, one can download Lucene, Solr, SphinxSearch, or any one of a number of open source systems. There are low cost (keep in mind that costs of search can be tricky to nail down) appliances from vendors like Maxxcat and Thunderstone. One can make do with the craziness of the search included with Microsoft SharePoint.

For a serious application, enterprises have many choices. Some of these are highly specialized like BAE NetReveal and Palantir Metropolis. Others are more generic like the Elastic offering. Some are free like the Effective File Search system.

The point is that enterprise search is not what users wanted in the 1970s when IBM pitched the mainframe centric STAIRS system, in the 1980s when Verity pitched its system, in the 1990s when Excalibur (later Convera) sold its system, in the 2000s when Fast Search shifted from Web search to enterprise search and put the company on the road to improper financial behavior, and in the efflorescence of search sell offs (Dassault bought Exalead, IBM bought iPhrase and other search vendors, and Lexmark bought Brainware and ISYS Search Software).

Where are we today?

Users still want on point information. The solutions on offer today are application and use case centric, not the silly one-size-fits-all approach of the period from 2001 to 2011 when Autonomy sold to HP.

Open source search has helped create an opportunity for vendors to deliver information access in interesting ways. There are cloud solutions. There are open source solutions. There are small company solutions. There are more ways to find information than at any other time in the history of search as I know it.

Unfortunately, the same problems remain. These are:

  1. As the volume of digital information goes up, so does the cost of indexing and accessing the sources in the corpus
  2. Multimedia remains a significant challenge for which there is no particularly good solution
  3. Federation of content requires considerable investment in data grooming and normalizing
  4. Multi-lingual corpuses require humans to deal with certain synonyms and entity names
  5. Graphical interfaces still are stupid and need more intelligence behind the icons and links
  6. Visualizations have to be “accurate” because a bad decision can have significant real world consequences
  7. Intelligent systems are creeping forward but crazy Watson-like marketing raises expectations and erodes the credibility of enterprise search’s capabilities.

I am okay with history. I am not okay with analyses that ignore some very real and painful lessons. I sure would like some of the experts today to know a bit more about the facts behind the implosions of Convera, Delphis, Entopia, and many other companies.

I also would like investors in search start ups to know a bit more about the risks associated with search and content processing.

In short, for a history of search, one needs more than 900 words mixing up what happened with what is.

Stephen E Arnold, March 9, 2016

Want to Know What Happens Online Every 60 Seconds?

December 4, 2015

I thought I knew. Time wasting, distractive behavior, and non productive behavior.

Wrong again. I read “What Happens Online Every Minute?” The document is an infographic which reveals a number of factoids. (Who knows if these are accurate or a 20 something daydream.)

  • Every minute Vine users play 1,041,666 videos. I like the precision of this number. The happenstance of the sign of the devil is a delight. Remember? 666.
  • In 60 seconds Alphabet Googlers who can probably spell “video” nine out of ten times upload 300 hours of new video. The idea is that in one minute, you have the opportunity to fritter away 300 hours of couch potato time whether in a Google self driving car, in your own car, or standing on a line to buy a slice in Manhattan.
  • In 1/60th of an hour, Twitter users send 347,222 tweets. How many of these are from marketers? No info. But again the precision of the number is outstanding. I like the 222 number which connotes faith. I have faith in Twitter. Also, 222 is a strobogrammatic number. Nifty, eh?
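The “precision” of the Vine and Twitter figures may be an artifact of arithmetic. A quick Python check, assuming the infographic started from round daily totals (an assumption, not something the infographic states), shows that 1.5 billion plays and 500 million tweets per day divided by 1,440 minutes produce exactly these numbers:

```python
MINUTES_PER_DAY = 24 * 60  # 1,440

# Assumed round daily totals; these are guesses, not figures from the infographic
assumed_daily_vine_plays = 1_500_000_000
assumed_daily_tweets = 500_000_000

# Integer division drops the fractional part of each per-minute rate
vine_per_minute = assumed_daily_vine_plays // MINUTES_PER_DAY
tweets_per_minute = assumed_daily_tweets // MINUTES_PER_DAY

print(f"{vine_per_minute:,}")    # 1,041,666
print(f"{tweets_per_minute:,}")  # 347,222
```

Truncating the repeating decimals (.666… and .222…) is where the devilish and faithful digits would come from if the assumption holds.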

View the original. There will be a factoid to make your day or at least a few seconds so you can get back to viewing the video goodness.

Stephen E Arnold, December 4, 2015

Whatever Happened to Social Search?

January 7, 2015

Social search was supposed to integrate social media and regular semantic search to create a seamless flow of information. This was one of the major search points for a while, yet it has not come to fruition. So what happened? TechCrunch reports that it is “Good Riddance To Social Search” and with good reason, because the combination only cluttered up search results.

TechCrunch explains that Google tried Social Search back in 2009, using its regular search engine and Google+. Now the search engine mogul is not putting forth much effort in promoting social search. Bing tried something by adding more social media features, but it is not present in most of its search results today.

Why did this endeavor fail?

“I think one of the reasons social search failed is because our social media “friendships” don’t actually represent our real-life tastes all that well. Just because we follow people on Twitter or are friends with old high school classmates on Facebook doesn’t mean we like the same restaurants they do or share the politics they do. At the end of the day, I’m more likely to trust an overall score on Yelp, for example, than a single person’s recommendation.”

It makes sense considering how many people consider their social media feeds filled with too much noise. Having search results free of the noise makes them more accurate and helpful to users.

Whitney Grace, January 07, 2015
Sponsored by ArnoldIT.com, developer of Augmentext

Appen Uses Humans to Improve Non-English Search Relevance

March 21, 2014

The Appen explanation titled Query Relevance delves into the work that the language, search and social technology company has done recently to improve natural language search. Linguist PhD Julie Vonwiller founded the company in 1996 with her engineer husband Chris Vonwiller. In 2010, Appen merged with Butler Hill Group and began making strides in language resources, search, and text. The article explores the issues at hand when it comes to natural language search:

“Even a query as seemingly simple as the word “blue” could be looking for any of the following: a description or picture of the color, a television show, a credit card, a misspelling of an electronic cigarette brand, or a rap artist. By analyzing what the most likely user intent is and returning valid and appropriate results in the correct order of relevance, we encourage a relationship whereby the user will return again and again to our client’s search engine.”

Appen has established a “global network” of locals who are trained experts in the language and local culture. This team allows for the most accurate interpretations of queries from regional users. The company is continually working to improve its processes, both through collaboration with users and advances in the program to provide the best possible results.

Chelsea Kerwin, March 21, 2014


What is Happening with Natural Language Processing?

May 29, 2013

“Why Are We Still Waiting for Natural Language Processing,” an article in The Chronicle of Higher Education, explores the failure of the 21st century to produce Natural Language Processing, or NLP. This would mean the ability of computers to process natural human language. The steps required are explained in the article:

“In the 1980s I was convinced that computers would soon be able to simulate the basics of what (I hope) you are doing right now: processing sentences and determining their meanings.

To do this, computers would have to master three things. First, enough syntax to uniquely identify the sentence; second, enough semantics to extract its literal meaning; and third, enough pragmatics to infer the intent behind the utterance, and thus discerning what should be done or assumed given that it was uttered.”

Currently, typing a question into Google can result in exactly the opposite information from what you are seeking. This is because the search engine is unable to infer, and natural conversation is full of gaps and assumptions that we are all trained to leap through without failure. According to the article, the one company that seemed to be coming close to this technology was Powerset in 2008. After making a deal with Microsoft, however, its site now only redirects to Bing, a Google clone. Maybe NLP, like Big Data, business intelligence, and predictive analytics, is just a buzzword with marketing value.

Chelsea Kerwin, May 29, 2013


A Whatever Happened To… HP and TeraText

May 3, 2013

My Overflight for search vendors generated an odd “recent” update. The item originated from Chrlettestuvv’s Blog. The story pointed to an item called “SAIC’s TeraText Solutions Signs Strategic Alliance Agreement with HP.” The source was an article from Software Industry Report, August 1, 2005.

HP apparently needed something more than TeraText, which shared some similarities with the now forgotten iPhrase and anticipated features in MarkLogic Server today. I find these search- and content-processing related tie ups interesting.

Each time I recall one, or some glitch in the Internet surfaces a partner factoid, I am more confident that search vendors and some growth hungry large corporations move from speed dating to speed dating activity. Do the engagements lead to marriages? Sometimes, I suppose. Other times the companies, like boyfriends and girlfriends in high school, just drift apart.

Search, however, remains mostly unchanged.

Stephen E Arnold, May 3, 2013

Sponsored by Augmentext
