Enterprise Search: Is Search Big Data Ready?

January 17, 2015

At lunch on Thursday, January 15, 2015, one of my colleagues called my attention to “10 Hot Big Data Startups to Watch in 2015 from A to Z.” The story is by a professional at a company named Zementis. The story appears in or on a LinkedIn page, and I believe this may be from a person whom LinkedIn considers a thought leader.

The reason I perked up when my colleague read the list of 10 companies was twofold. First, the author put his company Zementis on the list. Second, the consulting services firm LucidWorks turned up. I write the name this way: LucidWorks (Really?).

Straight away, here’s the list of the “hot start ups” I am enjoined to “watch” in 2015. I assume that start up means “a newly established business,” according to Google’s nifty, attribution-free definition service. “New” means “not existing before; made, introduced, or discovered recently or now for the first time.” Okay, with the housekeeping out of the way, on to the list:

  • Alpine Data Labs, founded in 2010
  • Confluent, founded in 2014 by LinkedIn engineers
  • Databricks, founded in 2013
  • Datameer, founded in 2009
  • Hadoop, an open source project rather than a company, now about 10 years old; figure 2004
  • Interana, founded in 2014 by former Facebook engineers
  • LucidWorks (Really?), né Lucid Imagination, founded in 2007
  • Paxata, founded in 2012
  • Trifacta, founded in 2012
  • Zementis, founded in 2004

Of these 10 companies, the one that is not a commercial enterprise is Hadoop. Wikipedia suggests that Hadoop is an open source version of MapReduce, a set of algorithms Google developed prior to 2004.

Okay, now we have nine hot data startups.

I am okay with Confluent and Interana being considered as new. Now we have seven companies that do not strike me as either “hot” or “new”. These non-hot and non-new outfits are Alpine Data Labs (five years old), Databricks (two years old), Datameer (six years old), LucidWorks Really? (eight years old), Paxata (three years old), Trifacta (three years old), and Zementis (11 years old).

I guess I can see that one could describe five of these companies as startups, but I cannot accept the “new” or “hot” moniker without some client names, revenue data, or some sort of factual substantiation.

Now we have two companies to consider: LucidWorks Really? and Zementis.

LucidWorks Really? is a value added services firm based on Lucene/Solr. The company charges for its home-brew software and consulting and engineering services. According to Wikipedia, Lucene is:

Apache Lucene is a free open source information retrieval software library, originally written in Java by Doug Cutting. It is supported by the Apache Software Foundation and is released under the Apache Software License.

Apache offers this about Solr:

Solr is the popular, blazing-fast, open source enterprise search platform built on Apache Lucene. [Lucene is a trademark of Apache it seems]

As Elasticsearch’s success in combining several open source products as a mechanism for accessing large datasets shows, it is possible to use Lucene as a query tool for information. But, and this is a large but, both the thriving Elasticsearch and LucidWorks Really? are search and retrieval systems. Yep, good old keyword search with some frosting tossed in by various community members and companies repackaging and marketing special builds of what is free software. LucidWorks has been around for eight years. I have trouble perceiving this company and its repositionings as “new”. The Big Data label seems little more than a marketing move as the company struggles to generate revenues.

Now Zementis. Like Recorded Future (funded by the GOOG and In-Q-Tel), Zementis is in the predictive analytics game. The company focuses on “holistic and actionable customer insight across all channels.” I did not include this company in my CyberOSINT study because the company seems to focus on commercial clients like retail stores and financial services. CyberOSINT is an analysis of next generation information access companies primarily serving law enforcement and intelligence entities.

But the deal breaker for me is not the company’s technology. I find it difficult to accept that a company founded 11 years ago is new. As with LucidWorks Really?, the label start up has more to do with the need to find a positioning that allows the company to generate sales and sustainable revenue.

These are essential imperatives. Even so, I do not accept the assertions about new, startup, and, to some degree, Big Data.

Furthermore, the inclusion of a project as a startup just adds evidence to support this hypothesis:

The write up is a listicle with little knowledge value. See http://amzn.to/1rUoQyn.

Why am I summarizing this information? The volume of disinformation is the reason. Companies engaged in next generation information access are making the same marketing mistakes that pushed Delphes, Fast Search & Transfer, Entopia, Fulcrum Technology, iPhrase, and other hype oriented vendors into a corner.

Why not explain what a product does to solve a problem, offer specific case examples, and deal in concrete facts?

I assume that is just too much for the enterprise search and content processing “experts” to achieve in today’s business climate. Wow, what a confused listicle.

Stephen E Arnold, January 17, 2015
