Snowden Effect on Web Search

July 27, 2014

If you are curious about the alleged impact of intercepts and monitoring on search, you will want to read “Government Surveillance and Internet Search Behavior.” You may have to pay to access the document. Here’s a passage I noted:

In the U.S., this was the main subset of search terms that were affected. However, internationally there was also a drop in traffic for search terms that were rated as personally sensitive.

Stephen E Arnold, July 27, 2014

Sponsors of Two Content Marketing Plays

July 27, 2014

I saw some general information about allegedly objective analyses of companies in the search and content processing sector.

The first report comes from the Gartner Group. The company has released its “magic quadrant” which maps companies by various allegedly objective methods into leaders, challengers, niche players, and visionaries.

The most recent analysis includes these companies:

BA Insight
Dassault Exalead
Expert System
HP Autonomy IDOL
Lucid Works
Perceptive ISYS Search

There are several companies in the Gartner pool whose inclusion surprises me. For example, Exorbyte is primarily an eCommerce company with a very low profile in the US compared to Endeca or New Zealand based SLI Systems. Expert System is a company based in Italy. This company provides semantic software which I associated with mobile applications. IHS (Information Handling Services) provides technical information and a structured search system. MarkLogic is a company with XML data management software that has landed customers in publishing and the US government. With an equally low profile is Mindbreeze, a home brew search system funded by Microsoft-centric Fabasoft. Dassault Exalead, PolySpot, and Sinequa are French companies offering what I call “information infrastructure.” Search is available, but the approach is digital information plumbing.

The IDC report, also allegedly objective, is sponsored by nine companies. These outfits are:

Earley & Associates
HP Autonomy IDOL

This collection of companies is also eclectic. For example, Earley & Associates does indexing training and consulting and does not have a deep suite of enterprise software. IHS (Information Handling Services) appears in the IDC report as a knowledge centric company. I think I understand the concept. Technical information in Extensible Markup Language and a mainframe-style search system allow an engineer to locate a specification or some other technical item like the SU 25. Lexalytics is a sentiment analysis company. I do not consider figuring out if a customer email is happy or sad the same as Coveo’s customer support search system. Smartlogic is interesting because the company provides tools that permit unstructured content to be indexed. Some French vendors call this process “fertilization.” I suppose that for purists, indexing might be just as good a word.

What unifies these two lists are the companies that appear in both allegedly objective studies:

IHS (Information Handling Services)

My hunch is that the five companies appearing in both lists are in full bore, pedal to the metal marketing mode.

Attivio and Coveo have ingested tens of millions in venture funding. At some point, investors want a return on their money. The positioning of these two companies’ technologies as search and the somewhat unclear knowledge quotient capability suggest that implicit endorsement by mid tier consulting firms will produce sales.

The appearance of HP and IBM on each list is not much of a surprise. The fact that Oracle Endeca is not in either report suggests that Oracle has other marketing fish to fry. Also, the fact that Elasticsearch, arguably the game changer in search and content processing, is not in either pool may be evidence that Elasticsearch is too busy to pursue “expert” analysts laboring in the search vineyard. On the other hand, Elasticsearch may have its hands full dealing with demands of developers, prospects, and customers.

IHS has not had a high profile in either search or content processing. The fact that Information Handling Services appears signals that the company wants to market its mainframe style and XML capable system to a broader market. Sinequa appears comfortable with putting forth its infrastructure system as both search and a knowledge engine.

I have not seen the full reports from either mid tier consulting firm. My initial impression of the companies referenced in the promotional material for these recent studies is that lead generation is the hoped for outcome of inclusion.

Other observations I noted include:

  1. The need to generate leads and make sales is putting multi-company reports back on the marketing agenda. The revenue from these reports will be welcomed at IDC and Gartner I expect. The vendors who are on the hook for millions in venture funding are hopeful that inclusion in these reports will shake the money trees from Boston to Paris.
  2. The language used to differentiate and describe the companies referenced in these two studies is unlikely to clarify the differences between similar companies or make clear the similarities. From my point of view, there are few similarities among the companies referenced in the marketing collateral for the IDC and Gartner studies.
  3. The message of the two reports appears to be “these companies are important.” My thought is that because IDC and Gartner assume their brand conveys a halo of excellence, the companies in these reports are, therefore, excellent in some way.

Net net: Enterprise search and content processing has a hurdle to get over: Search means Google. The companies in these reports have to explain why Google is not the de facto choice for enterprise search and then explain how a particular vendor’s search system is better, faster, cheaper, etc.

For me, a marketer or search “expert” can easily stretch search to various buzzwords. For some executives, customer support is not search. Customer support uses search. Sentiment analysis is not search. Sentiment analysis is a signal for marketers or call center managers. Semantics for mobile phones, indexing for SharePoint content, and search for a technical data sheet are quite different from eCommerce, business intelligence, and business process engineering.

A fruit cake is a specific type of cake. Each search and content processing system is distinct and, in my opinion, not easily fused into the calorie rich confection. A collection of systems is a lumber room stuffed with different objects that don’t have another place in a household.

The reports seem to make clear that no one in the mid tier consulting firms or the search companies knows exactly how to position, explain, and verify that content processing is the next big thing. Is it?

Maybe a Google Search Appliance is the safe choice? IBM Watson does recipes, and HP Autonomy connotes high profile corporate disputes.

Elasticsearch, anyone?

Stephen E Arnold, July 27, 2014

Honk Tracks Search Marketing Memes

July 26, 2014

The Honk page for Beyond Search now tracks information retrieval marketing memes. The page now includes a discussion of a coinage designed to sell “search” without using the word “search.” Is the approach likely to reverse the fortunes of search vendors who face increasingly intense uphill battles to generate substantive revenue? The Honk “Meme of the Moment” updates will keep you posted.

Stephen E Arnold, July 26, 2014

PetMatch for iOS Finds Furry Friends

July 25, 2014

A new image-based search tool can take some of the research out of adopting a pet. Lifehacker turns our attention to the free iOS app in, “PetMatch Searches for an Adoptable Pet Based on Appearance.” Now, pet lovers who see their perfect pet on the street can take a picture and find local doppelgangers in need of homes. Perhaps this will help lower dog-napping rates. Reporter Dave Greenbaum notes:

“You should never adopt an animal solely based on looks, of course—you should research the personality of the breed you want—but looks are a factor. This app works great for mixed breed dogs when you aren’t sure what kind of dog you are looking at. I like the fact it will look at local adoption agencies to find a match, too. Online services like help you find local pets to adopt, but you have to know which breed you are looking for first, and searching for mixed breed dogs (common at shelters) is difficult. This app makes it easy to do a reverse image search and do your research based on the results.”

Another point to note is that PetMatch includes a gallery of dog and cat breeds, so if the picture is in your head instead of your phone, you can still search for a look-alike. It’s a clever idea, and an innovative use of image search functionality.

Cynthia Murrell, July 25, 2014

Sponsored by, developer of Augmentext

Behind the Faster Place Search at Pinterest

July 23, 2014

One of the engineers over at Pinterest gets into the nitty-gritty of the site’s place search in, “Introducing a Faster Place Search” at the blog Making Pinterest. Last fall, the invitation-only image hoarding site launched Place Pins. Designed with aspiring travelers in mind, the tool allows users to link pictures to a map that indicates where they were taken. Since then, the Place Pins team has continued to tweak the software. Engineer Jon Parise writes:

“We launched Place Pins a little over six months ago, and in that time we’ve been gathering feedback from Pinners and making product updates along the way, such as adding thumbnails of the place image on maps and the ability to filter searches by Place Boards. The newest feature is a faster, smarter search for Web and iOS that makes it easier to add a Place Pin to the map. There are now more than one billion travel Pins on Pinterest, more than 300 unique countries and territories are represented in the system, and more than four million Place Boards have been created by Pinners. Here’s the story of how the Place Pins team built the latest search update.”

See the article for Parise’s breakdown of challenges and how the team addressed them. One example: Users familiar with a single search box weren’t fond of the original two-box configuration—one for subject and one for place. The seemingly simple fix, combining both terms into one box, required the algorithm to break the query into two parts and identify any geographic names that appear. For that adjustment, the engineers turned to open-source geocoder Twofishes for assistance.
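The single-box idea can be illustrated with a toy sketch. This is not Pinterest's code and it does not call Twofishes; the small place list, the `split_query` name, and the longest-match rule are all assumptions made for illustration. The sketch scans the query for the longest run of words that matches a known place name and treats the remaining words as the subject.

```python
# Toy gazetteer (assumption). A production system would ask a geocoder
# such as Twofishes instead of consulting a hard-coded set.
KNOWN_PLACES = {"paris", "new york", "san francisco", "tokyo"}


def split_query(query, places=KNOWN_PLACES, max_place_words=3):
    """Split a one-box query into (subject, place).

    Tries the longest word runs first so that "new york" wins over a
    shorter match. Returns (normalized query, None) if no place is found.
    """
    words = query.lower().split()
    for size in range(min(max_place_words, len(words)), 0, -1):
        for start in range(len(words) - size + 1):
            candidate = " ".join(words[start:start + size])
            if candidate in places:
                # Everything outside the matched run is the subject.
                subject = " ".join(words[:start] + words[start + size:])
                return subject, candidate
    return " ".join(words), None


if __name__ == "__main__":
    for q in ["coffee shops San Francisco", "Paris bakery", "street art"]:
        print(q, "->", split_query(q))
```

A real geocoder also has to cope with ambiguous names, misspellings, and places that share words with ordinary subjects, which is presumably why the team reached for an existing open-source tool rather than a lookup table.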

The post concludes with a note that more improvements are on the way. The updated place search has been incorporated into the iOS app, with inclusion in the Android app on the way “soon.”

Cynthia Murrell, July 23, 2014



July 23, 2014

When you first visit Algolia’s Web site, two things jump out at you. One is this quote:

“Search your database in realtime. Search is a key element of your user experience. To keep your users engaged, search results need to show up instantly and be relevant to them, even when they do typos. Index your database records with our API and get results in milliseconds.”

The second is a counter recording the number of API calls Algolia has handled. The counter adds hundreds of API calls per second. Algolia must be doing something right if it is answering billions of calls. So what sets Algolia apart from other search software companies?

Algolia offers the usual features: database search, search by type, mobile, analytics, linguistics, etc. Algolia does highlight a few features that, while standard at other search companies as well, it says it has taken to a different level than its rivals. Algolia claims to always be up and running with three high-end servers, low latency routing due to multiple data centers, and top of the line security.

Algolia is a search as a service company with domestic and international clients. Its Web site presents a comprehensive profile of its services and sells the company as a reliable search software provider.

Whitney Grace, July 23, 2014

Former Autonomy CFO Tosses Legal Flechette at HP

July 22, 2014

I read “Former Autonomy CFO seeks to Block HP-Shareholder Settlement.” You may have to answer some questions to see the document or try to log in to the paywalled Financial Times’ Web site. Yep, that’s the wonky orange newspaper that is a must read in London, but not so much in Harrod’s Creek, Kentucky.

The story seems to be straightforward. The former chief financial officer of Autonomy has “filed a legal motion to block” the Hewlett Packard shareholder deal. The idea is that if shareholders agree to let HP off the hook for its acquisition of Autonomy and the fascinating $5 billion write down, then HP will go after Autonomy. The law firm assisting HP will be the same outfit that helped shareholders sue HP for the deal in the first place. Got that?

The Financial Times quoted Mr. Hussain’s legal document about the legal action:

“HP seeks to forever bury from disclosure the real reason for its 2012 write down of Autonomy: HP’s own destruction of Autonomy’s success after the acquisition. And, by the broad bar order it seeks, HP seeks to absolve itself of its own responsibility for its losses.”

The FT did not include the link to the actual filing. You can find it at this link.

The issue, according to the Autonomy CFO’s document, is that HP is using the shareholder settlement to bury certain facts about HP’s handling of Autonomy. Autonomy’s argument is that HP fumbled the ball after it conducted due diligence and bought Autonomy.

Autonomy wants HP to provide proof that Autonomy fooled HP, its Board, and its consultants. The idea is that Autonomy allowed these folks to review the financials, the marketing collateral, and other sources of information before deciding to buy Autonomy for $11 billion.

I am no longer surprised by the claims and counter claims. Several observations:

First, search and content processing as business sectors generate a disproportionate amount of thrashing. HP analyzes Autonomy. HP buys Autonomy. HP sues Autonomy. Shareholders sue HP. An individual no longer employed at HP Autonomy sues HP. Etc., etc. Fast Search was the leader in post sale legal maneuvering. Autonomy seems to be following the “fast” track now.

Second, HP bought Autonomy and then said it was tricked. Remember this is not like buying a bagel. HP bought a company with thousands of customers and hundreds of millions in revenue. If a bagel is bad, I either demand a different one or walk to another bagel shop.

Third, the acquisition took place three years ago. In that time, the enterprise search sector has been subjected to considerable pressure. Just check out the latest Gartner Magic Quadrant, G00260831. Notice that Elasticsearch (the fastest growing search system) is not in the Gartner analysis. The Gartner enterprise search report appears to mirror the nature of the enterprise search market itself. The HP Autonomy matter AND the preceding Fast Search & Transfer matter have, in my view, contributed to a general malaise for the search and content processing software. The equation in my mind works like this: Buying a search system = Trouble.

Net net: With the parties to the matter allowing their attorneys to put the pedal to the metal, there will be more excitement in the near future. Billing functions at law firms have steamrollers to operate.

Stephen E Arnold, July 22, 2014


July 22, 2014

Each search software company has its own approach to improving search and increasing accuracy. Swiftype uses the slogan “the easiest way to add great search to your Web site” and while its search software may fulfill that statement, it is something other search companies claim as well. The questions, then, are these: Is it true, and what makes Swiftype different from its competition? The latter is easier to answer than the former. Instead of focusing on one section of the search market, Swiftype provides solutions for a variety of Web sites including WordPress, startups, knowledge bases, mobile, publishers, ecommerce, and even open source.

“Swiftype is a hosted software service that eliminates the need to create your own search software from scratch, making it possible for any website owner or mobile app developer to add great search to their product. Features include powerful relevance algorithms, customizable search result ordering, fast auto complete with typo protection, real-time analytics and more. Exceptionally simple to integrate into your existing software, but also remarkably flexible, Swiftype can be extensively customized to match the specific needs of your business.”

The support for the Web site variety is in Swiftype’s favor, but the company also offers real-time analytics and developer support. Search is still in its infancy for mobile devices, but Swiftype has dedicated an entire area that optimizes search for apps on different smartphone brands and mobile Web browsers. Swiftype already supports a hefty client list: Twitch, Twilio, TechCrunch, and Shopify. Swiftype is proving to be a big player in search. Maybe it will blaze new trails and leave its competition behind.

Whitney Grace, July 22, 2014

Search and Data-Starved Case Studies

July 19, 2014

LinkedIn discussions fielded a question about positive search and content processing case studies. I posted a link to a recent paper from Italy (you can find the url at this link).

My Overflight system spit out another case study. The publisher is Hewlett Packard and the example involves Autonomy. The problem concerns the UK’s National Health Service and its paperless future. You can download the four page document at

The Italian case study focuses on cheerleading for the Google Search Appliance. The HP case study promotes the Autonomy IDOL system applied to medical records.

The HP Autonomy document caught my attention because it uses a buzzword I first heard at Booz, Allen & Hamilton in 1978. Harvey Poppel, then a BAH partner, coined the phrase. The idea caught on. Mr. Poppel, who built a piano, snagged some ink in Business Week. That was a big deal in the late 1970s. Years later I met Alan Siegel, a partner at a New York design firm. He was working on promotion of the Federal government’s paperless initiative. About 10 years ago, I spent some time with Forrest (Woody) Horton, who was a prominent authority on the paperless office. Across the decades, talk about paperless offices generated considerable interest. These interactions about paperless environments have spanned 36 years. Paper seems to be prevalent wherever I go.

When I read the HP Autonomy case study, I thought about the efforts of some quite bright individuals directed at eliminating hard copy documents. There are reports, studies, and analyses about the problems of finding information in paper. I expected a reference to hard data or some hard data. The context for the paperless argument would have captured my attention.

The HP Autonomy case study talks about an integrator’s engineers using IDOL to build a solution. The product is called Evolve and:

It used 28 years of information management expertise to improve efficiency, productivity and regulatory compliance. The IDOL analytics engine was co-opted into Evolve because it automatically ingests and segments medical records and documents according to their content and concepts, making it easier to find and analyze specific information.

The wrap up of the case study is a quote that is positive about the Kainos Evolve system. No big surprise.

After reading the white paper, three thoughts crossed my mind.

First, the LinkedIn member seeking positive search and content processing case studies might not find the IDOL case study particularly useful. The information is more of an essay from an ad agency generated in-house magazine.

Second, the LinkedIn person wondered why there were so few positive case studies about successful search and content processing installations. I think there are quite a few white papers, case studies, and sponsored content marketing articles crafted along the lines of the HP Autonomy case study. The desire to give the impression that the product encounters no potholes scrubs out the details so useful to a potential licensee.

Third, the case study describes a mandated implementation. So the Evolve product is in marketing low gear. The enthusiasm for implementing a new product shines brightly. Does the glare from the polish obscure a closer look?

At a minimum, I would have found the following information helpful even if presented in bullet points or tabular form:

  1. What was the implementation time? What days, weeks, or months of professional work were required to get the system up and running?
  2. What was the project’s initial budget? Was the project completed within the budget parameters?
  3. What is the computing infrastructure required for the installation? Was the infrastructure on premises, cloud, or hybrid?
  4. What is the latency in indexing and query processing?
  5. What connectors were used “as is”? Were new connectors required? If yes, how long did it take to craft a functioning connector?
  6. What training did users of the system require?

Information at this level of detail is difficult to obtain. In my experience, most search and content processing systems require considerable attention to detail. Take a short cut, and the likelihood of an issue rises sharply.

Obviously neither the vendor nor the licensee wants information about schedule shifts, cost overruns or underruns, and triage expenses to become widely known. This jointly enforced fact void helps create case studies that are little more than MBA jargon.

Little wonder the LinkedIn member’s plea went mostly ignored. Paper is unlikely to disappear because lawyers thrive on hard copies. When litigation ensues, the paperless office and the paperless medical practice become a challenge.

Stephen E Arnold, July 19, 2014

What Most Search Vendors Cannot Pull Off

July 19, 2014

I recently submitted an Information Today column that reported about Antidot’s tactical play to enter the US market. One of the fact checkers for the write up alerted me that most of the companies I identified were unknown to US readers. Test yourself. How many of these firms do you recognize? How many of them provide information retrieval services?

  • A2ia
  • Albert (originally AMI Albert and AMI does not mean friend)
  • Dassault Exalead
  • Datops
  • EZ2Find
  • Kartoo
  • Lingway
  • LUT Technologies
  • Pertimm
  • Polyspot
  • Quaero
  • Questel
  • Sinequa

How did you do? The point is that French vendors of information retrieval and content processing technology find themselves in a crowded boat. Most of the enterprise search vendors have flamed out or resigned themselves to pitching to venture capitalists that their technology is the Next Big Thing. A lucky few sell out and cash in; for example, Datops. Others are ignored or forgotten.

The same situation exists for vendors of search technology in other countries. Search is a tough business. Even a former Googler like Marissa Mayer was the boss when Yahoo’s share of the Web search market sagged below 10 percent. In the same time period, Microsoft increased Bing’s share to about 14 percent. Google dogpaddled and held steady. Other Web search providers make up the balance of the market players. Business Insider reported:

This is a big problem for Yahoo since its search business is lucrative. While Yahoo’s display ad business fell 7% last quarter, revenue from search was up 6% on a year-over-year basis. Revenue from search was $428 million compared to $436 million from its display ad business.

Now enterprise search vendors have been trying to use verbal magic to unlock consistently growing revenue. So far only two vendors have been able to find a way to open the revenue vault’s lock. Autonomy tallied more than $800 million in revenue at the time of its sale to Hewlett Packard. The outcome of that deal was a multi-billion dollar write off and many legal accusations. One thing is clear through the murky rhetoric the deal produced. Hewlett Packard had zero understanding of search and has been looking for a scapegoat to slaughter for its corporate decision. This is not helping the search vendors chasing deals.

Google converted Web search into a $60 billion revenue stream. The core idea for online advertising originated with the pay-to-play company GoTo, which then morphed into Overture, which THEN was acquired by Yahoo. Think of the irony. Yahoo has the technology that makes Google a one trick, but very lucrative revenue pony. But, to be fair, Google Web search is not the enterprise search needed to locate a factoid for a marketing assistant. Feed this query “show me the versions of the marketing VP’s last product road map” to a Google appliance and check the results. The human has to do some old fashioned human-type work. To find this information with a Google Search Appliance or any other information retrieval engine for that matter is tricky. Basic indexing cannot do the job, so most marketing assistants hunt manually through files, folders, and hard copies looking for the Easter egg.

Many of the pioneering search engines tried explaining their products and services using euphemisms. There was question answering, content intelligence, smart content, predictive retrieval, entity extraction, and dozens and dozens of phrases that sound fine but are very difficult to define; for example, knowledge management and the phrase “enterprise search” itself or “image recognition” or “predictive analytics”, among others.

I had a hearty chuckle when I read “Don’t Sell a Product, Sell a Whole New Way of Thinking.” Search has been available for at least 50 years. Think RECON, Orbit, Fulcrum Technologies, BASIS, Teratext, and other artifacts of search and retrieval. Smart folks cooked up even the computationally challenged Delphes system, the metasearch system Vivisimo, and the essentially unknown Quertle.

A romp through these firms’ marketing collateral, PowerPoints, and PDFs makes clear that no buzzword has been left untried. Buyers did and do not know what the systems actually delivered. This is evidence that search vendors have not been able to “sell a whole new way of thinking.”

No kidding. The synonyms search marketers have used in order to generate interest and hopefully a sale are a catalog of information technology jargon. Here is a short list of some of the terms from the 1990s:

  • Business intelligence
  • Competitive intelligence
  • Content governance
  • Content management
  • Customer support, then customer relationship management
  • Knowledge management
  • Neurodynamics
  • Text analytics

If I accept the Harvard analysis, the failing of enterprise search is not financial fiddling and jargon. As you may recall, Microsoft paid $1.2 billion for Fast Search & Transfer. The investigation into allegations of financial fancy dancing was resolved recently with one executive facing a possible jail term and employment restrictions. There are other companies that tried to blend search with content only to find that the combination was not quite like peanut butter and jelly. Do you use Factiva or Ebsco? Did I hear a “what?” Other companies embraced slick visualizations to communicate key information at a glance. Do you remember Grokker? There was semantic search. Do you recollect Siderean Software?

One success story was Oingo, renamed Applied Semantics. Google understood the value of mapping words to ads and purchased the company to further its non search goals of generating ad revenue.

According to the HBR:

To find the shift, ask yourself a few questions. What was the original insight that led to the innovation? Where do you feel people “don’t get it” about your solution? What is the “aha” moment when someone turns from disinterested to enthusiastic?

Those who code up search systems are quite bright. Is this pat formula of shifting thinking the solution to the business challenges these firms face:

Attivio. Founded by Fast Search & Transfer alums, the company has ingested more than $35 million in venture funding. The company’s positioning is “an actionable 360 degree view of anything you need.” Okay. Dassault Exalead used the same line several years.

Coveo. The company has tapped venture firms for more than $30 million since the firm’s founding in 2004. Coveo uses the phrase “enterprise search” and wraps it in knowledge workers, customer service, engineering, and CRM. The idea is that Coveo delivers solutions tailored to specific business functions and employee roles.

SRCH2. This is a Xoogler founded company that, like Perfect Search before it, emphasizes speed. The pitch is that the alternative is better than open source search solutions.

Lucid Works. Like Vivisimo, Lucid Works has embraced Big Data and the cloud. The only slow downs Lucid has encountered have been turnover in CEOs, marketing, and engineering professionals. The most recent hurdle to trip up Lucid is the interest in Elasticsearch, fat with almost $100 million in venture funding and developers from the open source community.

IBM Watson. Based on open source and home grown technology, IBM’s marketers have showcased Watson on Jeopardy and garnered headlines for the $1 billion investment IBM is making in its “smart” information processing system. The most recent demonstration of Watson was producing a recipe for Bon Appetit readers.

Amazon’s search approach is to provide it as a service to those using Amazon Web services. Search is, in my mind, just a utility for Amazon. Amazon’s search system on its eCommerce site is not particularly good. Want to NOT out books not yet available on the system? Well, good luck with that query.

After I stopped chuckling, I realized that the Harvard article is less concerned with precision and recall than advocating deception, maybe cleverness. No enterprise search vendor has approached Autonomy’s revenues with the sole exception of Google’s licensing of the wildly expensive Google Search Appliance. At the time of its sale to Oracle, Endeca was chugging along at an estimated $150 million in revenue. Oracle paid about $1 billion for Endeca. With that benchmark, name another enterprise search vendor or eCommerce search vendor that has raced past Endeca. For the majority of enterprise search vendors, revenues of $3 to $10 million represent very significant achievements.

An MBA who takes over an enterprise search company may believe that wordsmithing will make sales. Sure, some sales may result, but will the revenue be sustainable? Most enterprise search sales are a knee jerk reaction to problems with the incumbent search system.

Without concrete positive case studies, talking about search is sophistry. There are comparatively few specific return on investment analyses for enterprise search installations. I provided a link to a struggling LinkedIn person about an Italian library’s shift from the 1960s BASIS system to a Google Search Appliance.

Is enterprise search an anomaly in business software? Will the investment firms get their money back from their investments in search and retrieval?

Ask a Harvard MBA steeped in the lore of selling a whole new way of thinking. Ignore 50 years of search history. Success in search is difficult to achieve. Duplicity won’t do the job.

Stephen E Arnold, July 19, 2014
