Featured

Google and Findability without the Complexity

Shortly after writing the first draft of Google: The Digital Gutenberg, “Enterprise Findability without the Complexity” became available on the Google Web site. You can find this eight page polemic at http://bit.ly/1rKwyhd or you can search for the title on—what else?—Google.com.

Six years after the document became available, Google’s anonymous marketer/writer raised several interesting points about enterprise search. The document appeared just as the enterprise search sector was undergoing another major transformation. Fast Search & Transfer struggled to deliver robust revenues and a few months before the Google document became available, Microsoft paid $1.2 billion for what was another enterprise search flame out. As you may recall, in 2008, Convera was essentially non operational as an enterprise search vendor. In 2005, Autonomy bought the once high flying Verity and was exerting its considerable management talent to become the first enterprise search vendor to top $500 million in revenues. Endeca was flush with Intel and SAP cash, passing on other types of financial instruments due to the economic downturn. Endeca lagged behind Autonomy in revenues and there was little hope that Endeca could close the gap between it and Autonomy.

Secondary enterprise search companies were struggling to generate robust top line revenues. Enterprise search was not a popular term. Companies from Coveo to Sphinx sought to describe their information retrieval systems in terms of functions like customer support or database access to content stored in MySQL. Vivisimo donned a variety of descriptions, culminating in its “reinvention” as a Big Data tool, not a metasearch system with a nifty on the fly clustering algorithm. IBM was becoming more infatuated with open source search as a way to shift development an bug fixes to a “community” working for the benefit of other like minded developers.

image

Google’s depiction of the complexity of traditional enterprise search solutions. The GSA is, of course, less complex—at least on the surface exposed to an administrator.

Google’s Findability document identified a number of important problems associated with traditional enterprise search solutions. To Google’s credit, the company did not point out that the majority of enterprise search vendors (regardless of the verbal plumage used to describe information retrieval) were either losing money or engaged in a somewhat frantic quest for financing and sales).

Here are the issues Google highlighted:

  • User of search systems are frustrated
  • Enterprise search is complex. Google used the word “daunting”, which was and still is accurate
  • Few systems handle file shares, Intranets, databases, content management systems, and real time business applications with aplomb. Of course, the Google enterprise search solution does deliver on these points, asserted Google.

Furthermore, Google provides integrated search results. The idea is that structured and unstructured information from different sources are presented in a form that Google called “integrated search results.”

Google also emphasized a personalized experience. Due to the marketing nature of the Findability document, Google did not point out that personalization was a feature of information retrieval systems lashed to an alert and work flow component. Fulcrum Technologies offered a clumsy option for personalization. iPhrase improved on the approach. Even Endeca supported roles, important for the company’s work at Fidelity Investments in the UK. But for Google, most enterprise search systems were not personalizing with Google aplomb.

Google then trotted out the old chestnuts gleaned from a lunch discussion with other Googlers and sifting competitors’ assertions, consultants’ pronouncements, and beliefs about search that seemed to be self-evident truths; for example:

  • Improved customer service
  • Speeding innovation
  • Reducing information technology costs
  • Accelerating adoption of search by employees who don’t get with the program.

Google concluded the Findability document with what has become a touchstone for the value of the Google Search Appliance. Kimberly Clark, “a global health and hygiene company,” reduced administrative costs for indexing 22 million documents. The costs of the Google Search Appliance, the consultant fees, and the extras like GSA fail over provisions were not mentioned. Hard numbers, even for Google, are not part of the important stuff about enterprise search.

One interesting semantic feature caught my attention. Google does not use the word knowledge in this 2008 document.

Several questions:

  1. Was Google unaware of the fusion of information retrieval and knowledge?
  2. Does the Google Search Appliance deliver a laundry list of results, not knowledge? (A GSA user has to scan the results, click on links, and figure out what’s important to the matter at hand, so the word “knowledge” is inappropriate.)
  3. Why did Google sidestep providing concrete information about costs, productivity, and the value of indexing more content that is allegedly germane to a “personalized” search experience? Are there data to support the implicit assertion “more is better.” Returning more results may mean that the poor user has to do more digging to find useful information. What about a few, on point results? Well, that’s not what today’s technology delivers. It is a fiction about which vendors and customers seem to suspend disbelief.

With a few minor edits—for example, a genuflection to “knowledge—this 2008 Findability essay is as fresh today as it was when Google output its PDF version.

Several observations:

First, the freshness of the Findability paper underscores the staleness and stasis of enterprise search in the past six years. If you scan the free search vendor profiles at www.xenky.com/vendor-profiles, explanations of the benefits and functions of search from the 1980s are also applicable today. Search, the enterprise variety, seems to be like a Grecian urn which “time cannot wither.”

Second, the assertions about the strengths and weaknesses of search were and still are presented without supporting facts. Everyone in the enterprise search business recycles the same cant. The approach reminds me of my experience questioning a member of a sect. The answer “It just is…” is simply not good enough.

Third, the Google Search Appliance has become a solution that costs as much, if not more, than other big dollar systems. Just run a query for the Google Search Appliance on www.gsaadvantage.gov and check out the options and pricing. Little wonder than low cost solutions—whether they are better or worse than expensive systems—are in vogue. Elasticsearch and Searchdaimon can be downloaded without charge. A hosted version is available from Qbox.com and is relatively free of headaches and seven figure charges.

Net net: Enterprise search is going to have to come up with some compelling arguments to gain momentum in a world of Big Data, open source, and once burned twice shy buyers. I wonder why venture / investment firms continue to pump money into what is same old search packaged with decades old lingo.

I suppose the idea that a venture funded operation like Attivio, BA Insight, Coveo, or any other company pitching information access will become the next Google is powerful. The problem is that Google does not seem capable of making its own enterprise search solution into another Google.

This is indeed interesting.

Stephen E Arnold, July 28, 2014

Interviews

Elasticsearch: A Platform for Third Party Revenue

Making money from search and content processing is difficult. One company has made a breakthrough. You can learn how Mark Brandon, one of the founders of QBox, is using the darling of the open source search world to craft a robust findability business.

I interviewed Mr. Brandon, a graduate of the University of Texas as Austin, shortly after my return from a short trip to Europe. Compared with the state of European search businesses, Elasticsearch and QBox are on to what diamond miners call a “pipe.”

In the interview, which is part of the Search Wizards Speak series, Mr. Brandon said:

We offer solutions that work and deliver the benefits of open source technology in a cost-effective way. Customers are looking for search solutions that actually work.

Simple enough, but I have ample evidence that dozens and dozens of search and content  processing vendors are unable to generate sufficient revenue to stay in business. Many well known firms would go belly up without continual infusions of cash from addled folks with little knowledge of search’s history and a severe case of spreadsheet fever.

Qbox’s approach pivots on Elasticsearch. Mr. Brandon said:

When our previous search product proved to be too cumbersome, we looked for an alternative to our initial system. We tested Elasticsearch and built a cluster of Elasticsearch servers. We could tell immediately that the Elasticsearch system was fast, stable, and customizable. But we love the technology because of its built-in distributed nature, and we felt like there was room for a hosted provider, just as Cloudant is for CouchDB, Mongolab and MongoHQ are for MongoDB, Redis Labs is for Redis, and so on. Qbox is a strong advocate for Elasticsearch because we can tailor the system to customer requirements, confident the system makes information more findable for users.

When I asked where Mr. Brandon’s vision for functional findablity came from, he told me about an experience he had at Oracle. Oracle owns numerous search systems, ranging from the late 1980s Artificial Linguistics’ system to somewhat newer systems like the late 1990s Endeca system, and the newer technologies from Triple Hop. Combine these with the SES technology and the hybrid InQuira formed from two faltering NLP systems, and Oracle has some hefty investments.

Here’s Mr. Brandon’s moment of insight:

During my first week at Oracle, I asked one of my colleagues if they could share with me the names of the middleware buyer contacts at my 50 or so named accounts. One colleague said, “certainly”, and moments later an Excel spreadsheet popped into my inbox. I was stunned. I asked him if he was aware that “Excel is a Microsoft technology and we are Oracle.” He said, “Yes, of course.” I responded, “Why don’t you just share it with me in the CRM System?” (the CRM was, of course, Siebel, an Oracle product). He chortled and said, “Nobody uses the CRM here.” My head exploded. I gathered my wits to reply back, “Let me get this straight. We make the CRM software and we sell it to others. Are you telling me we don’t use it in-house?” He shot back, “It’s slow and unusable, so nobody uses it.” As it turned out, with around 10 million corporate clients and about 50 million individual names, if I had to filter for “just middleware buyers”, “just at my accounts”, “in the Northeast”, I could literally go get a cup of coffee and come back before the query was finished. If I added a fourth facet, forget it. The CRM system would crash. If it is that bad at the one of the world’s biggest software companies, how bad is it throughout the enterprise?

You can read the full interview at http://bit.ly/1mADZ29. Information about QBox is at www.qbox.com.

Stephen E Arnold, July 2, 2014

Latest News

Google Searches, Prediction, and Fabulous Stock Market Returns?

I read “Google Searches Hold Key to Future Market Crashes.” The main idea in my opinion is: Moat [female big thinker at Warwick Business School’ continued,... Read more »

July 28, 2014 | | Comment

Surprising Sponsored Search Report and Content Marketing

Content marketing hath embraced the mid tier consulting firms. IDC, an outfit that used my information without my permission from 2012 until July 2014, has published... Read more »

July 28, 2014 | | Comment

Pre Oracle InQuira: A Leader in Knowledge Assessment?

Oracle purchased InQuira in 2011. One of the writers for Beyond Search reminded me that Beyond Search covered the InQuira knowledge assessment marketing ploy in... Read more »

July 28, 2014 | | Comment

From Search to Sentiment

Attivio has placed itself in the news again, this time for scoring a new patent. Virtual-Strategy Magazine declares, “Attivio Awarded Breakthrough Patent for Big... Read more »

July 28, 2014 | | Comment

Searchcode Is a Valuable Resource for Developers

Here is a useful tool that developers will want to bookmark: searchcode does just what its name suggests—paste in a snippet of code, and it returns real-world... Read more »

July 28, 2014 | | Comment

Snowden Effect on Web Search

If you are curious about the alleged impact of intercepts and monitoring on search, you will want to read “Government Surveillance and Internet Search Behavior.”... Read more »

July 27, 2014 | | Comment

Sponsors of Two Content Marketing Plays

I saw some general information about allegedly objective analyses of companies in the search and content processing sector. The first report comes from the Gartner... Read more »

July 27, 2014 | | Comment

IDC and Reports by Schubmehl

I wanted to nail down a handful of facts. First, IDC published without a contract four reports in 2012. These reports were disseminated via the IDC Web site, various... Read more »

July 25, 2014 | | Comment

Microsoft Puts Machine Learning in the Cloud

Machine learning is ascending to the cloud. The Register asks, “Do Data Centers Dream of Electric Sheep? Microsoft Announces Machine Learning Cloud.” As competition... Read more »

July 25, 2014 | | 1 Comment

PetMatch for iOS Finds Furry Friends

A new image-based search tool can take some of the research out of adopting a pet. Lifehacker turns our attention to the free iOS app in, “PetMatch Searches for... Read more »

July 25, 2014 | | Comment