LucidWorks and Its Clueless Graphic

October 21, 2014

I noted a link to a LucidWorks presentation in a tweet. I navigated to the presentation on Slideshare. The approach in the presentation was trendy. My approach to presentations is untrendy, so I am no judge.

I found one slide particularly suggestive of the company’s approach to marketing. On slide 20 I saw this:


I am not exactly certain what vowel the asterisk represents. The slide strikes me as possibly offensive. But I live in rural Kentucky. What do I know? I assume the message is clear.

Perhaps this type of marketing messaging is one of the reasons ElasticSearch appears to have more momentum in the commercialized open source search sector?

Here’s a representative ElasticSearch slide from “A Gentle Introduction to ElasticSearch.”


Which company’s presentation resonates with you? Cluelessness or clues?

Stephen E Arnold, October 22, 2014

ElasticSearch How To: A Useful Case Example

October 21, 2014

If you want to avoid the hassle of some proprietary search engines, you may want to take a look at this case study about ElasticSearch. Navigate to “Building Scalable Search from Scratch with ElasticSearch.” The author works through his process for putting ElasticSearch to work in a content space with a variety of information; for example, products, text collections, and user information.

What makes this write up useful is the logical layout of the article and the inclusion of a requirements summary, block diagrams, and code snippets.

This type of solid user support is one reason ElasticSearch is outpacing some open source search competitors like LucidWorks and Nutch.

Highly recommended. (As far as I can tell, no mid tier consulting firm has surfed on this content. Dave Schubmehl, this may be an opportunity.)

Stephen E Arnold, October 21, 2014

Autonomy: 33 APIs

October 21, 2014

Curious about Hewlett Packard’s Autonomy APIs? You can see the list of 33 at this link. If you are curious about Autonomy’s Big Data capabilities, you may be puzzled about the lack of explicit analytics application programming interfaces. Don’t be. The savvy developer selects operations, takes outputs, and pumps the data into a search based application, third party number crunching system, a data management system, or plain old Excel. What’s interesting is that the naming of the APIs makes clear the search-centric nature of Autonomy. The marketing of IDOL as a service or a cloud solution shifts attention away from search in my view.

Stephen E Arnold, October 21, 2014

Coveo Pivots to Federated Search

October 21, 2014

Through a post at their blog Coveo Insights, enterprise-search firm Coveo urges, “Power Your Customer Service with Unified Search Driven Knowledge.” The write-up gives a few reasons why such “omni-channel” (federated) search functionality is a wise choice for customer service. Writer and Coveo marketing director Tucker Hall explains:

“Customers … engage with companies across a growing number of channels — from self-service portals and contact centers, to social media and field service engagements. Today’s savvy customer expects (and deserves) a seamless and consistent service experience across all of these channels. Omni-channel customer service has now become essential for companies hoping to maximize customer engagement, satisfaction, and retention.

“Successful omni-channel customer service can prove difficult regardless of the specific technologies and systems an organization has in place. That’s because success demands that customers and support personnel alike have swift, intuitive access to the case-resolving knowledge and expertise they need, when and how they need it.”

Hall asserts that many companies are missing out because they “fail to appreciate” the reasons to choose federated search: data and expertise are located in many systems, crowd-sourcing is a thing, and analytics must be actionable. But you, dear reader, already knew those, didn’t you? More on these points can be found in Coveo’s solution brief on the subject (registration required).
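For readers who want to see what "data located in many systems" means in practice, here is a minimal sketch of the federated idea: send one query to several sources and merge the hits into a single list. Everything in it is an assumption made for illustration. The `crm` and `wiki` sources, the `make_source` helper, and the term-count scoring are hypothetical stand-ins, not Coveo's method or API.

```python
def make_source(documents):
    """Build a toy in-memory source (hypothetical stand-in for a real system)."""
    def search(query):
        terms = query.lower().split()
        hits = []
        for title in documents:
            # Naive score: how often the query terms occur in the title.
            score = sum(title.lower().count(term) for term in terms)
            if score:
                hits.append((title, score))
        return hits
    return search


def federated_search(query, sources, limit=5):
    """Send one query to every source and merge the hits by score."""
    merged = []
    for name, search in sources.items():
        for title, score in search(query):
            merged.append((score, name, title))
    # Highest score first; ties keep the order in which sources were queried.
    merged.sort(key=lambda hit: hit[0], reverse=True)
    return [(name, title) for _, name, title in merged[:limit]]


# Hypothetical content spread across two systems.
sources = {
    "crm": make_source(["Reset password case", "Billing dispute case"]),
    "wiki": make_source(["How to reset a password", "Password policy"]),
}
results = federated_search("reset password", sources)
print(results)
```

The hard part in a real deployment is that scores from different systems are rarely comparable, so the simple sort used here would need some form of normalization.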

It is interesting to note that, while Coveo and others focus on federated search, Microsoft is more into the search-without-searching method called Delve. Let many flowers bloom!

Coveo serves organizations large, medium, and small with solutions that aim to be agile and easy to use yet scalable, fast, and efficient. The company was founded in 2005 by members of the team which developed Copernic Desktop Search. Coveo maintains offices in the U.S., Netherlands, and Quebec.

Cynthia Murrell, October 21, 2014

Sponsored by, developer of Augmentext

Google and Objective Search Results

October 20, 2014

I recall that in one conference presentation in Boston about Google I attended, the Googler (Dave Girouard, now a Xoogler) emphasized the objectivity of Google search results. I have heard the objective claim from many quarters over the years.

I noted the PC Magazine story “Google ‘Fixes’ Stephen Colbert’s Height Listing.” Here’s the passage I noted:

While Google hasn’t exactly dropped a packet full of stock options off on Colbert’s doorstep, it has managed to address Google’s concerns about his height listing. First up, Colbert now appears as 5 foot 10.5 inches tall on Google’s search results when you query for “Stephen Colbert height.” If you prefer metric, his height is now listed as 1.79 meters… “-ish.”

From my hollow in Harrod’s Creek, this strikes me as an example of Google’s ability to modify search results quickly. I am not sure that the “objective” reference used by Mr. Girouard years ago applies today. If true, Google can intervene in the vaunted PageRank process and make results changes quickly and at will.

Are those claims of outfits like Foundem founded? Maybe, just maybe?

Stephen E Arnold, October 20, 2014

Google Scholar and Google Silos of Content

October 18, 2014

I read “Making the World’s Problem Solvers 10% More Efficient.” The article explains that the Google engineer who was “the key inventor” of Google Scholar is leaving the GOOG.

The write up discloses a couple of interesting factoids; for example:

  • Google Scholar has been around for 10 years
  • The founder of Google Scholar took charge of Google’s indexing in 2000
  • The inventor of Google Scholar had to figure out how to keep Google’s index fresh; that is, new and changed content are reflected in search results.

The most interesting point in the write up is this statement (I have added the boldface):

Also, the nature of academic papers presented some opportunities for more powerful ranking, particularly making use of the citations typically included in academic papers. Those same scholarly citations had been the original inspiration for PageRank, the technique that had originally made Google search more powerful than its competitors. Scholar was able to use them to effectively rank articles on a given query, as well as to identify relationships between papers.

What happened to Eugene Garfield? I know, “Who?” So does this passage mean that today’s Google Web search discards functionality originally included in 2000?
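The citation idea in the quoted passage can be sketched in a few lines. What follows is a textbook PageRank-style power iteration over a tiny, made-up citation graph. It is not Google Scholar's ranking, which the article does not describe in detail. The paper names, the damping value of 0.85, and the iteration count are assumptions for illustration.

```python
def citation_rank(citations, damping=0.85, iterations=50):
    """PageRank-style scores for papers, given who cites whom.

    citations maps each paper to the list of papers it cites.
    """
    papers = set(citations)
    for cited in citations.values():
        papers.update(cited)
    n = len(papers)
    rank = {paper: 1.0 / n for paper in papers}

    for _ in range(iterations):
        # Every paper starts each round with the same small baseline score.
        new_rank = {paper: (1.0 - damping) / n for paper in papers}
        for paper in papers:
            cited = citations.get(paper, [])
            if cited:
                # A paper passes its score, split evenly, to what it cites.
                share = damping * rank[paper] / len(cited)
                for target in cited:
                    new_rank[target] += share
            else:
                # A paper that cites nothing spreads its score over all papers.
                share = damping * rank[paper] / n
                for target in papers:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical citation graph: A, B, and D all cite C.
citations = {"A": ["C"], "B": ["C"], "C": [], "D": ["A", "C"]}
scores = citation_rank(citations)
top_paper = max(scores, key=scores.get)
print(top_paper)
```

In this toy graph the most cited paper, C, ends up with the highest score. That is the intuition behind using citations as a ranking signal.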

But the big point for me is that Google is supposed to deliver “universal search.” To make use of Google Scholar, one must navigate to the Scholar site and run separate queries. Is this universal? It seems to be old school siloing.

I like Google Scholar, but I think Google Web search may lack some of the refinements included in Google Scholar. Well, ads are important. Correction: Revenue is important. Perhaps Google will charge for access to Google Scholar and compete directly with commercial database vendors? In my view, Google Scholar had a negative impact on commercial database vendors who charge libraries, corporations, and individuals for access to curated and indexed professional and scholarly information. Google seems content to allow the Google Scholar service to drift along. Would more purpose be of value? Queries for patent 2012/0251502 A1’s “the isolated nucleic acid molecule includes the nucleotide sequence of SEQ ID NOs: 1 or 10, or a complement thereof. In another, the nucleic acid molecule includes a nucleotide sequence having at least 4, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1500, 2000, 2500, 3000, 3500, 4000, 4500, 4600, 4700, 4800, or 4900 contiguous nucleotides of the nucleotide sequence of SEQ ID NO: 1” would permit Google to match Ebola ads to Google Scholar content?

Stephen E Arnold, October 18, 2014

Blippex: By the People, For the People

October 17, 2014

Would Blippex be the search engine Alexis de Tocqueville would love? The search engine is, according to Bloomberg, “a new crowd sourced public search engine.” Blippex makes use of technology developed for Archify, a system providing users with access to their online history. According to CrunchBase, the system has received seed funding of $700,000.

A year ago, Blippex was described as “the first interesting search engine since Google?” Like Qwant, Blippex is a search system crafted in Europe. Like Qwant, Blippex has ambitions for nibbling into Google’s market share for Web search.

The idea is that the search system is “built by its own users,” a phrase used in the Quartz article to describe the system. Quartz continued:

One of Blippex’s key selling points is that Kossatz and Baeck [the founders] are fanatical about privacy. Though Blippex constructs its search results on the basis of data gathered from its users, it does it in a way that’s anonymous and untraceable to any individual Blippex user. This obsession with privacy allows Blippex to rank pages—i.e., decide which pages to show people—with an algorithm that Google can’t match, because if Google gathered the data that Blippex does, users would find it unacceptably creepy.

Blippex does not track its users. One of the key technologies for the system is WebRTC. WebRTC is an open project that enables Web browsers with Real-Time Communications (RTC) capabilities via simple JavaScript APIs. If you don’t want to fool around with browser add ins, you can use Blippex like any other Web search system.

I ran a query for “enterprise search.” The results were interesting. I did not know that solid state drives were related to a search by a sheriff’s department or to Lenovo.


The order of the results is determined by the amount of time a user spends on a page. This is the “dwell time.”
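A dwell-time ordering is easy to sketch. The snippet below ranks pages by the average number of seconds visitors spent on them. Blippex's actual formula is not given here, so the use of an average (rather than a total), the `rank_by_dwell_time` name, and the sample visit log are all assumptions for illustration.

```python
from collections import defaultdict


def rank_by_dwell_time(visits):
    """Order URLs by average seconds spent per visit, longest first."""
    totals = defaultdict(lambda: [0.0, 0])  # url -> [total seconds, visit count]
    for url, seconds in visits:
        totals[url][0] += seconds
        totals[url][1] += 1
    averages = {url: total / count for url, (total, count) in totals.items()}
    return sorted(averages, key=lambda url: averages[url], reverse=True)


# Hypothetical anonymous visit log: (url, seconds on page). No user IDs are kept.
visits = [
    ("example.com/a", 30.0),
    ("example.com/b", 120.0),
    ("example.com/a", 50.0),
    ("example.com/c", 5.0),
]
ranking = rank_by_dwell_time(visits)
print(ranking)
```

Note that the log carries no user identifier at all, which is in the spirit of the privacy claim. A real system would still need to match pages to the query, which this sketch ignores.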

Worth a look. A privacy centric European search system will have its supporters. The challenge, of course, is that Google dominates Web search in Europe. What is Google’s market share? 80 or 90 percent? Perhaps European regulators can adjust this situation?

Stephen E Arnold, October 17, 2014

Service Offline

October 16, 2014

This may be old news. We were updating our list of search engines and received an error from the service called, a metasearch system. Our last check for this system was in January 2013. At that time the company’s Web site was online and an Android app was available. The name is a variant of the Arabic phrase for “tell me”. More information about the system is available in a nine slide presentation at this link.

As you may recall, the service used a panel-style interface or what the company called “cards design”. Each panel corresponded to particular types of content.


The system was described as delivering “knowledge as a service.” One interesting feature of the search results was a grouping of links by domains.
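Grouping result links by domain is a simple operation. Here is a minimal sketch using Python's standard `urllib.parse`. The sample URLs are made up, and nothing here reflects how the offline service actually implemented the feature.

```python
from collections import defaultdict
from urllib.parse import urlparse


def group_by_domain(urls):
    """Bucket result links under the host name each one belongs to."""
    groups = defaultdict(list)
    for url in urls:
        groups[urlparse(url).netloc].append(url)
    return dict(groups)


# Hypothetical result links from a metasearch query.
links = [
    "https://example.com/search-basics",
    "https://example.org/enterprise-search",
    "https://example.com/metasearch",
]
grouped = group_by_domain(links)
for domain, members in grouped.items():
    print(domain, len(members))
```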

The company was based in Montréal and was a project of Al Akhawayn University. My search file suggests that the system architect may have been Jawad Jari and the service utilized Amazon Web services.

Web metasearch seems to be a harsh taskmaster.

Stephen E Arnold, October 17, 2014

Open Source Search and Kicking the Bukkit

October 15, 2014

There is a presentation “Kicking the Bukkit: Anatomy of an Open Source Meltdown” by Ryan Michela, a developer with experience in open source. Over several years, an open source game project rose and fell. I am not too interested in open source games. At the end of the Slideshare document, there are five reasons an open source game project failed.

Let me summarize these and encourage you to work through the full 55 slide deck. How many of these issues may have an impact on open source search systems? Keep in mind that commercial enterprises like Attivio and IBM make use of open source technology.

  1. Inclusion of decompiled code in an open source project
  2. License issues
  3. Tie ups within the community before a project gains momentum
  4. No contributor license agreement
  5. Disgruntled developers in the community.

The presentation includes a quote that I noted:

It only takes one unhappy developer to kill an unprotected project.

Is there an open source search company vulnerable to one or more of these issues? I can name a couple. I wonder if the firms’ funding sources are concerned about their investment “kicking the bucket”?

Stephen E Arnold, October 15, 2014

Search and Deceptive Ads

October 15, 2014

Short honk: I read “Study Says Google, Yahoo And Bing Are Running ‘Deceptive’ Ads — And Regulators Are Doing Nothing To Stop It.”

I assume this statement is a surprise to some folks:

Now, disclosure text has become very small, and the shading very subtle, meaning users often don’t realize they are clicking through to ads rather than the most relevant result for their query.

In an increasingly important quest for revenue, these allegedly deceptive ads may be just the beginning of math club maneuvers. Relevance has a new meaning. Perhaps it is a synonym for revenue?

Stephen E Arnold, October 14, 2014
