More on Marketing Confusion in Big Data Analytics

September 11, 2012

Search vendors are like a squirrel dodging traffic. Some make it across the road safely. Others? Well, there is a squirrel heaven I assume. Which search vendors will survive the speeding tractor trailers carrying big data, analytics, and visualization to customers who are famished for systems which make sense of information? I don’t know. No one really knows.

What is fascinating is to watch the Darwinian process at work among vendors of search and content processing. TextRadar’s “Content Intelligence: An Unexpected Collision Is Coming” makes clear that there are quite a few companies not widely known in the financial and health care markets. Some of these companies have opportunities to make the leap from government contract work to commercial work for Fortune 1000 companies.

But what about more traditional search vendors?

I received in the snail mail a copy of the September/October 2012 issue of Oracle Magazine. The article which caught my attention was “New Questions, Fast Answers.” The information was in the form of an interview between Rich Schwerin, an Oracle Magazine writer, and Paul Sonderegger, senior director of analytics at Oracle. Mr. Sonderegger was the chief strategist at Endeca, which is now part of the Oracle family of companies.

I have followed Endeca since I first learned about the company in 1999, 13 years ago. Like many traditional search vendors, the underlying technical concepts of Endeca date from the salad days of key word search. Endeca’s innovation was to use concepts, either human-assigned or generated by software, to group related information. The idea was that a user could run a query and then click on concepts to “discover” information not in the explicit key word match. Endeca dubbed the function “guided navigation” and applied the approach to eCommerce as well as search across the type of information found in a company. The core of the technology was the “Endeca MDEX” engine. At the time of Endeca’s market entrance, there were only a handful of companies competing for enterprise search and eCommerce. Since then, the field has narrowed in one sense, with the big name companies acquired by larger firms, and broadened in another. There are hundreds of vendors offering search, but the majority of these companies use different words to describe indexing and search.
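The “guided navigation” idea described above can be illustrated with a minimal sketch. This is not Endeca’s MDEX engine or its API; the sample records, the concept tags, and the function names (keyword_search, concept_counts, docs_for_concept) are all hypothetical and chosen only to show a key word match followed by a concept click.

```python
from collections import Counter

# Hypothetical records. The concept tags stand in for labels that are
# either human-assigned or generated by software.
DOCS = [
    {"id": 1, "text": "red running shoes for trail use", "concepts": {"footwear", "outdoor"}},
    {"id": 2, "text": "trail map and compass kit", "concepts": {"navigation", "outdoor"}},
    {"id": 3, "text": "leather dress shoes", "concepts": {"footwear", "formal"}},
    {"id": 4, "text": "waterproof hiking boots", "concepts": {"footwear", "outdoor"}},
]


def keyword_search(docs, query):
    """Plain key word match: keep docs whose text contains every query term."""
    terms = query.lower().split()
    return [d for d in docs if all(t in d["text"].lower().split() for t in terms)]


def concept_counts(results):
    """Tally the concepts attached to a result set; these are the clickable groups."""
    return Counter(c for d in results for c in d["concepts"])


def docs_for_concept(docs, concept):
    """Simulate a click on a concept: return every doc carrying that tag."""
    return [d for d in docs if concept in d["concepts"]]


if __name__ == "__main__":
    hits = keyword_search(DOCS, "shoes")
    print("key word hits:", [d["id"] for d in hits])
    print("concepts offered:", dict(concept_counts(hits)))
    related = docs_for_concept(DOCS, "outdoor")
    print("after clicking 'outdoor':", [d["id"] for d in related])
```

In this toy data set, the query “shoes” matches two records, yet a click on the “outdoor” concept surfaces the hiking boots and the compass kit, neither of which contains the query term. That is the “discovery” the approach promises. A production system would need an index, relevance ranking, and a way to generate the concept tags, none of which this sketch attempts.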

One Endeca executive (Peter Bell) told me in 2005 that the company had been growing at 100 percent each year since 2002. At the time of the Oracle buy out, I estimated that Endeca had hit about $150 million in revenues. Oracle paid about $1.1 billion for the company or what, if I am accurate, amounts to about seven times annual revenues. Endeca was a relative bargain compared to Hewlett Packard’s purchase of Autonomy for $10 billion. Autonomy, founded a few years before Endeca, had reached about $850 million in annual revenues, so the multiple on revenues, roughly 12 times, was greater than in the Endeca deal. The point is that these two search giants ranked one and two in enterprise search revenues. Both companies emphasized their technologies’ ability to handle structured and unstructured information. Both Autonomy and Endeca offered business intelligence solutions. In short, both companies had capabilities which some of the newcomers mentioned in the Text Radar article are now touting as fresh and innovative. One key point: It took 13 years for Endeca to hit $150 million and now Oracle has to generate more revenue from the aging Endeca technology. HP has the same challenge with Autonomy, of course. Revenue generation, in my opinion, has been time consuming and difficult. Of the hundreds of vendors past and present, only two have broken the $150 million in revenue barrier. Google and Microsoft would be quick to point out that their search systems are far larger, but these are special cases because it is difficult to unwrap search revenues from other revenue streams.

What does Mr. Sonderegger say in this Oracle Magazine interview? Let me highlight three points and urge you to read the full text of his remarks.

Easy Access

First, business users do not know how to write queries, so “guided navigation” services are needed. Mr. Sonderegger noted:

There has to be some easy way to explore, some way to search and navigate as easily as you do on an e-commerce site.

Most of the current vendors of analytics and findability systems seem to have made the leap from point-and-click to snazzy visualizations. The Endeca angle is that users want to discover and navigate. The companies referenced in the Text Radar story want to make the experience visual, almost video-game like.

Complementary Functions

Second, Mr. Sonderegger points out before explaining the components of the Endeca solution that:

BI [business intelligence] delivers proven answers to known questions, and that will continue to be important. Information discovery lives alongside BI and offers a way to get fast answers to new questions…Information discovery and BI complement one another in a virtuous circle.

For many years, Oracle has offered business intelligence systems. These have been positioned as “extreme analytics,” “an enterprise platform,” and “insight to action,” among other descriptions. Oracle purchased TripleHop Technologies, a content processing company. Guy Creese, writing in “Oracle Acquires TripleHop”, said: “In the BI space, the history has been that Oracle snaps up great or good technology and then it drops out of sight. Let’s hope there’s a TripleHop champion at Oracle that will keep it from vanishing beneath the waves.”

Many Sources, Quick Access

Mr. Sonderegger closed his interview with a description of the benefits of the Endeca method:

With Oracle Endeca Information Discovery, organizations now can take a different approach where the IT folks can say to business users, “Give me just two or three data sources that you think might make a difference to answering your questions. We will pour them into Oracle Endeca Information Discovery, which you can access and start to explore and tell us what new questions that inspires.” With Oracle Endeca Information Discovery, when the business discovers the need for new visualizations or additional data sources, IT can respond very quickly, modifying visualizations and adding new data sources. Oracle Endeca Server indexes those new sources, adds them into its index, changes its own model, and then shows those changes in Oracle Endeca Information Discovery.

The idea is that Endeca can ingest diverse structured and unstructured data and make it available so that a business user without coding skills can find answers to questions not previously asked. Endeca, therefore, delivers the type of real-time, interactive exploration of data which many of the newcomers are now touting.

So Where’s the Confusion?

For Oracle Endeca, I think the database company is following a well-worn path. The angle is “guided navigation” supplemented with eDiscovery features. Oracle’s market position and its ability to get to senior executives with the Endeca story are a significant advantage. I would expect Endeca to grow rapidly. When Oracle gets traction with the current Endeca positioning, Endeca should choke off sales opportunities for many start ups. I wonder if the investors in those start ups realize that Oracle is just one big company pursuing the “beyond search” market. IBM is a big player. SAP Business Objects is in the game. Microsoft is in the space as well. Some of the companies referenced in the Text Radar story may have to work a miracle to make substantive headway in this market space.

For the companies anchored in the government contracting world of intelligence and discovery, their advantage is that their technology seems fresher, newer, more vital. But how “fresh” is fresh? Endeca has been in the “beyond search” market for many years. The analytics are “old hat” to IBM, which has added to its arsenal with its acquisition of SPSS. And for some companies, hiring IBM or Oracle is “safe.” For other companies, technology which uses visualizations and predictive outputs instead of “guided navigation” seems, on the surface, to be in step with the alleged big data problems companies face. The downside is that few of the companies mentioned in the Text Radar article have the marketing clout to make rapid progress without significant financial and staff resources.

The confusion is that both the established companies like Endeca and the new companies like Centrifuge Systems or inTTENSITY, to pick two at random, are using explanations which strike me as similar to Mr. Sonderegger’s.

In my brief brush with vendors in these new fields of discovery, or what I call “beyond search,” I have learned that it is very tough to identify the unique features of each system. My team prepared a feature comparison chart for one client. The resulting table showed that each vendor offered the same functionality. There were differences, of course. But explaining those differences required a page or more. Vendors look for ways to sell without tedious explanations. Visualizations are eye grabbing and powerful marketing tools. Aren’t the technical differences important? Maybe not so much today?

Three questions:

  1. How will these “beyond search” companies differentiate themselves? Will potential buyers have the time and resources to figure out the actual differences between the companies?
  2. What will be the computational burden (cost) these systems impose on the licensee? What will the latency be? What staff are needed?
  3. Will users understand the “answers”? Will the visualizations and the assurances that a user can answer questions he or she did not know to ask improve decision making?

For the squirrel, an unfamiliar environment may trigger instinctive reactions which lead to an unhappy meeting with a speeding vehicle. The squirrel might not recognize the rules of the road. Will users of interactive systems have the survival skills necessary to make use of systems which cannot be easily differentiated, operate automatically, and deliver dynamic outputs?

Stephen E Arnold, September 11, 2012

