
Watson Joins the Hilton Family

April 30, 2016

It looks like Paris Hilton might have a new sibling, although the conversations at family gatherings will be lackluster.  No, the hotel-chain family has not adopted Watson; instead, a version of the artificial intelligence will work as a concierge.  Ars Technica informs us that “IBM Watson Now Powers A Hilton Hotel Robot Concierge.”

The Hilton McLean hotel in Virginia now has a new concierge dubbed Connie, after Conrad Hilton, the chain’s founder.  Connie is housed in a Nao, a French-made android that is an affordable customer relations platform.  Its brain is based on Watson’s program and answers verbal queries from a WayBlazer database.  The little robot assists guests by explaining how to navigate the hotel and find restaurants and tourist attractions.  It is unable to check in guests yet, but when the concierge station is busy and you do not want to pull out your smartphone or have any human interaction, it is a good substitute.

“‘This project with Hilton and WayBlazer represents an important shift in human-machine interaction, enabled by the embodiment of Watson’s cognitive computing,’ Rob High, chief technology officer of Watson said in a statement. ‘Watson helps Connie understand and respond naturally to the needs and interests of Hilton’s guests—which is an experience that’s particularly powerful in a hospitality setting, where it can lead to deeper guest engagement.’”

Asia already uses robots in service industries such as hotels and restaurants.  It is worrying that Connie-like robots could replace people in these jobs.  Robots are supposed to augment human life instead of taking jobs away from people.  While Connie-like robots will have a major impact on the industry, there is something to be said for genuine human interaction, which usually is the preference over artificial intelligence.  Maybe team the robots with humans in the service industries for the best all-around care?


Whitney Grace, April 30, 2016
Sponsored by, publisher of the CyberOSINT monograph

Unicorn Land: Warm Hot Chocolate and a Nap May Not Help

April 25, 2016

In the heady world of the unicorn, there are not too many search and content processing companies. I do read open source information about Palantir Technologies. Heck, I might even wrap up my notes about Palantir Gotham and make them available to someone with a yen to know more about a company which embraces secrecy but has a YouTube channel explaining how its system works.

I was poking around for open source information about how Palantir ensures that a person with a secret clearance does not “see” information classified at a higher level of access. From what I have read, the magic is in time stamps, open source content management, and some middleware. I took a break from reading the revelations from a person in the UK who idled away commute time writing about Palantir and noted “On the Road to Recap: Why the Unicorn Financing Market Just Became Dangerous for All Involved.”

I enjoy “all” type write ups. As I worked through the 5,600 word write up, I decided not to poke fun at the logic of “all” and jotted down the points which struck me as new information and the comments which I thought might be germane to Palantir, a company which (as I document in my Palantir Notebook) successfully ran fast cycles of financing between 2003 and 2015, when the pace appears to have slowed.

There is no direct connection between the On the Road to Recap article and Palantir, and I certainly don’t want to draw explicit parallels. In this blog post, let me highlight some of the passages from the source article and emphasize that you might want to read the original article. If you are interested in search and content processing vendors like Attivio, Coveo, Sinequa, Smartlogic, and others of their ilk, some of the “pressures” identified in the source article are likely to apply. If the write up is on the money, I am certainly delighted to be in rural Kentucky thinking about what to have for lunch.

The first point I noted was new information to me. You, gentle reader, may be MBAized and conversant with the notion of understanding the lay of the land; to wit:

most participants in the ecosystem have exposure to and responsibility for specific company performance, which is exactly why the changing landscape is important to understand.

Ah, reality. I know that many search and content processing vendors operate without taking a big picture view. The focus is on what I call “what can we say to close a deal right now” type thinking. The write up roasts that business school chestnut of understanding life as it is, not as a marketer believes it to be.

I noted this statement in the source article:

Late 2015 also brought the arrival of “mutual fund markdowns.” Many Unicorns had taken private fundraising dollars from mutual funds. These mutual funds “mark-to-market” every day, and fund managers are compensated periodically on this performance. As a result, most firms have independent internal groups that periodically analyze valuations. With the public markets down, these groups began writing down Unicorn valuations. Once more, the fantasy began to come apart. The last round is not the permanent price, and being private does not mean you get a free pass on scrutiny.

Write downs, to me, mean one might lose one’s money.

I then learned a new term, dirty term sheets. Here’s the definition I highlighted in a bilious yellow marker hue:

“Dirty” or structured term sheets are proposed investments where the majority of the economic gains for the investor come not from the headline valuation, but rather through a series of dirty terms that are hidden deeper in the document. This allows the Shark to meet the valuation “ask” of the entrepreneur and VC board member, all the while knowing that they will make excellent returns, even at exits that are far below the cover valuation. Examples of dirty terms include guaranteed IPO returns, ratchets, PIK Dividends, series-based M&A vetoes, and superior preferences or liquidity rights. The typical Silicon Valley term sheet does not include such terms. The reason these terms can produce returns by themselves is that they set the stage for a rejiggering of the capitalization table at some point in the future. This is why the founder and their VC BOD member can still hold onto the illusion that everything is fine. The adjustment does not happen now, it will happen later.

I like rejiggering. I have experienced used car sales professionals rejiggering numbers for a person who once worked for me. Not a good experience as I recall.

I then circled this passage:

One of the shocking realities that is present in many of these “investment opportunities” is a relative absence of pertinent financial information. One would think that these opportunities which are often sold as “pre-IPO” rounds would have something close to the data you might see in an S-1. But often, the financial information is quite limited. And when it is included, it may be presented in a way that is inconsistent with GAAP standards. As an example, most Unicorn CEOs still have no idea that discounts, coupons, and subsidies are contra-revenue.

So what’s this have to do in my addled brain with Palantir? I had three thoughts, which are my opinion, and you may ignore them. In fact, why not stop reading now.

  1. Palantir is a unicorn and it may be experiencing increased pressure to generate a right now pay out to its stakeholders. One way Palantir can do this is to split its “secret” business from its Metropolitan business for banks. The “secret” business remains private, and the Metropolitan business becomes an IPO play. The idea is to get some money to keep those who pumped more than $700 million into the company since 2003 sort of happy.
  2. Palantir has to find a way to thwart those in its “secret” work from squeezing Palantir into a niche and then marginalizing the company. There are some outfits who would enjoy becoming the go-to solution for near real time operational intelligence analysis. Some outfits are big (Oracle and IBM), and others are much, much smaller (Digital Reasoning and Modus Operandi). If Palantir pulls off this play, then the government contract cash can be used to provide a sugar boost to those who want some fungible evidence of a big, big pay day.
  3. Palantir has to amp up its marketing, contain overhead, and expand its revenue from non government licenses and consulting.

Is Palantir’s management up to this task? The good news is that Palantir has not done the “let’s hire a Google wizard” to run the company. The bad news is that Palantir had an interesting run of management actions which resulted in a bit of a legal hassle with i2 Group before IBM bought it.

I will continue looking for information about Gotham’s security system and method. In the back of my mind will be the information and comments in On the Road to Recap.

Stephen E Arnold, April 25, 2016

Alphabet Google Wants to Spell GE

March 28, 2016

I read a whizzy MBA-in-Silicon-Valley type analysis of the GOOG, which is now Alphabet. After working through the write up, I focused on one statement as interesting:

One way to understand Alphabet is as a vehicle to build essential physical infrastructure in the real world. What if you were to build a next-generation GE today?

GE had Neutron Jack, whom I had the pleasure of meeting. My employer (which shall remain nameless) screwed up a project and GE refused to pay a six figure bill. My boss took me to a meeting to learn how to get the bill paid AND to sell more work to Neutron Jack. To cut to the cob, my boss sold a $1 million job and got the unpaid bill settled in full.

What’s the difference between GE and the new Google as described in “Learning Larry Page’s Alphabet”?

The answer is not Neutron Jack, although he was a canny manager. The answer is, “My boss.”

The Alphabet Google thing is riding high. It has more money in the bank than the current president of the University of Louisville. (Keep trying, Dr. Ramsey. Keep trying.)

For Alphabet Google to become more than an online advertising outfit, the company is going to have to do more than cook up science club projects. A person who can look adversity in the eye (Neutron Jack) and then manage the situation into a big payday has to have his or her hands on the steering wheel. Sorry, an autonomous auto kill switch won’t do the job.

The article pivots on the assumption that many motor boats can maneuver more quickly than an aircraft carrier. How has that worked out at Google? After more than 15 years of effort, Alphabet Google’s stallion remains saddled with Steve Ballmer’s insight:

Google is a one trick pony.

I noted this passage in the write up:

Here’s another way to view the company’s costly moonshot habit: as a marketing expense.

Isn’t that evidence for the one trick pony observation by a person who owns a basketball team?

What’s the strategic vision? I highlighted this passage as a possible answer to the question:

This is why Alphabet is more than just a spectacular corporate reengineering. Page picked the perfect time to reset his company—at the very moment that analysts were heralding Peak Google. He knew that traditional corporate structure limits innovation at the pace he wants and needs. He broke his business into smaller pieces to make them simpler and focused them more narrowly to discourage drift and distraction, while trying to maintain the advantages of scale and resources and a compelling culture to recruit talent. Page isn’t ready to settle for status quo. He wants to make the world a better place—with electric cars and smart cities and universal Internet access and no more disease—and also find lucrative new businesses that keep the company part of the present and future. He wants everything, from A to Z.

The friction building in the Alphabet Google machine may cause the rocket ship to veer off course. Alphabet Google has to traverse the air space of the EC, Russia, and China. The US does not have a “no fly zone” in place to bedevil Google…yet. And there are the pesky annoyances doing business as Amazon and Facebook.

Stephen E Arnold, March 28, 2016

Amazon Web Services: Crushing the Competition?

March 23, 2016

I read “Attack! Run. WTF? A Decade of Enterprise Class Fear and Uncertainty with AWS.” I am not sure if Amazon’s Web Services’ business is being praised or criticized. Nevertheless, the write up has some interesting factoids. I highlighted these statements:

IBM’s Cloud Services

  • IBM, … was so flabbergasted [when Amazon won a US government contract] that the Blue Shirts of Armonk decided on the old-school route to victory and filed a legal complaint asking the government to re-evaluate IBM’s deal against that of Amazon, which Big Blue later withdrew.
  • Famed for re-inventing itself around software in the 1990s under Lou Gerstner, the majority of IBM’s focus for the 2000s was devoted to unloading the PC and the server businesses on China. The firm is now trapped in a maelstrom of transition, restructuring and layoffs. Like Microsoft, IBM seems to have believed AWS couldn’t happen to it, that what the world needed was the same server software and services. It was nearly seven years after AWS that IBM realized something was afoot – probably when it lost both the CIA deal and got slapped about its attempts to make the CIA love it – that Big Blue said it would spend $2bn buying computing player SoftLayer and in 2014 throw $1.2bn into a massive data centre expansion to host your data and compute.

Microsoft Cloud Services

  • Azure succumbed to classic innovator’s dilemma: how to sell a new platform as a package and at a price to maximize revenue without cannibalizing the company’s actual main money-makers – PC and server software. After delayed starts under Ray Ozzie and Bob Muglia, the technology roadmap only really clicked under new CEO Satya Nadella and executive software nerd Scott Guthrie. One brought the CEO-level commitment, the other made Azure work for developers.
  • Gartner today regards Azure as number two, behind AWS, and yet… According to Gartner’s incumbent Cloud Queen Lydia Leong, Azure lacks the polish of AWS.

Oracle Cloud Services

  • Oracle, which bought Sun, preferred to play a Game of Thrones that was corporate M&A to hold onto its position in IT. Sadly, it chose wrong; Oracle spent $8.5bn on Sun but ultimately discontinued the company’s fledgling utility computing service. Hardware and Java was what Oracle wanted.
  • Today, Oracle’s resultant hardware business makes just half the revenue of AWS and is shrinking – falling 13 per cent to $1.1bn – versus AWS’s 69 per cent growth last quarter to $2.4bn. That past complacency of Oracle’s CEO on cloud has put Oracle firmly in a pack of also rans behind AWS on platform cloud, with Oracle now throwing PR at a problem to convince Wall St it is credible as a provider of IT as a service.

And what about Amazon? The write up points out:

  • AWS is still attacking – growing at a phenomenal rate, 71 per cent in its recent quarter to $2.4bn and 69 per cent for the year to $7.88bn. The appetite among enterprises for AWS’s style of technology and model of delivery clearly hasn’t yet been satiated.
  • …the truth is AWS now has its fences across so much of the cloud, removing them isn’t an option. The big question then for AWS at the age of 10 is this: when will the old men of IT regain their wind? How big will be their counter-attack and will it be concerted? Will it pose a tangible threat and how would AWS respond?

I noted that Apple has shifted some of its cloud business to the Google from AWS. I assume the Board of Directors’ excitement is now behind the kids from Cupertino. What’s clear is that IBM and Oracle seem to face an uphill slog if I understand the write up. Read the original and decide for yourself. I love the WTF. Some stakeholders may be asking this question too.

Stephen E Arnold, March 23, 2016

Yellowfin: Emulating i2 and Palantir?

March 22, 2016

I read “New BI Platform Focuses on Collaboration, Analytics.” What struck me about this explanation of a new version of Yellowfin is that the company is adding the type of features long considered standard in law enforcement and intelligence. The idea is that visualizations and collaboration are components of a commercial business intelligence solution.

I noted this paragraph:

Other BI vendors have tried to push data preparation and analysis responsibilities onto business users “because it’s easier to adapt what they have to fulfill that goal.” But Yellowfin “isn’t a BI tool attempting to make the business user a techie. It is about presenting data to users in an attractive visual representation, backed-up with some of the most sophisticated collaboration tools embedded into a BI platform on the market.”

Involving analysts in the loading of data is a way to eliminate the issue of content ownership, indexing, and knowledge of what is in the system’s repository. I am not confident that any system which allows the user to whack away at whatever data have been processed by the system is ready for prime time. Sure, Google can win at Go, but the self driving auto ran into a bus.

The write up, which strikes me as New Age public relations, seems to want me to remember what’s new with Yellowfin with this mnemonic example: Curated. Baffled? Here’s what curated means:

  • Consistent: Governed, centralized and managed
  • Usable: by any business to consume analytics
  • Relevant: connected to all the data users need to do their jobs well
  • Accurate: data quality is paramount
  • Timely: Provide real time data and agile content development
  • Engaging: Offer a social or collaborative component
  • Deployed: widely across the organization.

Business intelligence is the new “enterprise search.” I am not sure the use of notions like curated and adding useful functions delivers the impact that some marketers promise. Remember that self driving car. Pesky humans.

Stephen E Arnold, March 23, 2016

HP Enterprise: Is Haven Autonomy IDOL after a Project Runway Touch Up?

March 16, 2016

Short honk: I read “HPE Launches Machine-Learning-As-a-Service on Microsoft Azure.” The hook for me was the pricing for a new cloud search and content processing service. I did not understand the approach; for example, what the heck is an “API unit”?


But what caused me to jot down this note was this list of HPE Haven OnDemand functions. Here’s the list I circled:

  • Advanced Text Analysis, which pulls concepts and sentiment from text.
  • Format conversion, which converts data wherever it lives.
  • Search tools across on-premises or cloud data.
  • Image recognition and face detection.
  • Knowledge graph analysis.
  • Pattern and speech recognition.

Based on my sketchy knowledge about Autonomy IDOL, this list seems to be a summary of Autonomy’s integrated data operating features. Most of these were added to the IDOL platform in the years before HP paid $11 billion for the 1998 system which, to be fair, had been upgraded in the intervening years.

The list also reminded me of some of the functions I associated with “augmented intelligence,” a niche currently occupied by outfits like Palantir and IBM i2.

In terms of pricing, the Palantir Hobbits charge for a license, training, support, and some other goodies. But the pricing is not variable. The IBM i2 folks deliver a collection of options and each option has a price tag.

HPE’s pricing is a bit of a mystery. How many API units fit on the head of a Big Data project? Whittling down that $11 billion investment suggests that the API units may be more expensive than the monthly fees suggest; for example, “the introductory offer offers 50,000 API units and 15 Resource Units for [the] first three months for all paid plans.” What’s a “Resource Unit”?

The write up raises more questions than it answers in my opinion. I wonder how Autonomy IDOL will look in fall fashions?

Stephen E Arnold, March 16, 2016

Weakly IBM: Watson, Where Are the Revenues?

March 12, 2016

I read “What’s happening at IBM (It’s Dying).” The article has a quote to note. I highlighted this snappy phrase:

Things aren’t going well at all in cloud, analytics, mobile, social and security land. When those kick-in (if they kick-in) IBM will be just one company in a crowd with no particular advantage over the others. IBM used to be able to count on its size, its people, its loyal customers, but all of those are going or gone.

If accurate, the observations in this paragraph are likely to trouble IBM’s stakeholders, partners, and employees.

I noted “KPMG Will Use the Power of IBM’s Watson.” Like the recipe play and the flow of information about curing disease, this tie up appears to unite two important companies in a stirring high technology activity. The article introduces an interesting idea:

IBM and KPMG have announced a partnership today, bringing IBM’s Watson supercomputer to KPMG’s professional services offerings.


Notice that IBM Watson has morphed into a supercomputer. Perhaps the author is exercising a bit of metaphorical freedom? Perhaps Watson is more than Lucene, home brew scripts, and a collection of disparate technologies which IBM acquired?

How will KPMG use the Watson supercomputer? I learned:

Watson will allow KPMG to analyze massive amounts of data with greater ease, delivering insights more quickly. It will also eliminate judgment-driven processes that usually happen in KPMG’s audit, tax, advisory and other professional services.

Google is making its system beat the pants off a human Go player. But Google continues to generate money from its advertising business. The company, in general, seems to be doing the science and math club projects without making headlines with massive layoffs and giving me a flow of material which Jack Benny’s comedy writers could have converted to entertainment gold.

IBM had technology which could have delivered on this Watson promise. Has anyone at IBM exploited the potential of the i2 platform, Cybertap, and other high value information systems? The answer is, “A little bit.”

Unfortunately a bunch of little bits don’t make a bite in the problems Mr. Cringely has identified and been pointing out for years.

Weakly moves IBM. Watson is not much of a bench presser in the heavy revenue gym it appears.

Stephen E Arnold, March 12, 2016

Watson Weakly: Jargon and Resource Allocations

March 9, 2016

In case you missed the news, IBM seems to be trimming its workforce. Does anyone remember Robert X. Cringely’s “IBM Is So Screwed”? I do. I would wager that Mr. Cringely remembers IBM’s suggestion that Mr. Cringely was off base with his analysis.

Perhaps Mr. Cringely is vindicated. I read  “IBM Job Cuts: US Tech Giant Begins Mass Firing One Third of Workforce.” Hmmm. One third of a workforce having an opportunity to find its future elsewhere? That sounds like a swell way to greet spring 2016. March in like a lion and march out like a lamb. Is the lamb heading to the local meat packers?

Against this cheerful seasonal background, I want to mention “Moving from Enterprise Search to Cognitive Exploration.” This is a recycling of an earlier white paper for which one must register in order to read or download the document. Please, note that you will have to jump through some hoops to get this March 2016 publication. Do not complain to me about the link, the involvement of a middleman, and the need to provide details about your interest in enterprise search. Take it up with IBM; that is, if someone will take your call or answer your email. Hey, good luck with that.

What’s notable about this white paper is this word pair: Cognitive Exploration. Original? Nah. The phrase turns up in the title of a collection of essays called Cognitive Exploration of Language and Linguistics in 1999. The phrase is some of the jingoism from the super reliable psychology linguistics disciplines. IBM has dallied with the phrase for a number of years but in the RA world, the phrase is getting a jump start. An example of IBM’s argument is that no one any longer runs a search across a customer service database. Nope, one cognitively explores that customer database.

Cognitive Exploration. It flows trippingly on the tongue, does it not? IBM does not fire people; IBM RA’s them. (RA. Resource allocation or termination or reduction in force.)

What is Cognitive Exploration? Well, it is Lucene search plus some home brew code and a dollop of acquired technology. IBM’s original commercial enterprise search system (STAIRS) is just not up to the task of cognitively exploring one’s information assets it seems.

The white paper is a tribute to the search buzzwords that have been used by marketers in the past. I just love Cognitive Exploration.

What is it? For the full answer, you will need to read the 13 pages of explanation. Here’s a sampling of the facts in the write up:

Analysts expect the total data created and copied to reach 44 ZB by the year 2020 (Analyst firm IDC).  After all, there are more than 204,000,000 emails launched every minute every day.  How do you manage, search, and process that data and turn it into usable information?

Yep, that’s a lot of information. How is an organization going to deal with “all” those zeros and ones? I suppose I would begin by using a system designed to manipulate large data flows. How about Palantir, BAE Systems, Leidos for starters. What no IBM? Bummer.

The IBM argument advances:

To meet today’s expectations, a search system must be able to access all of your important data sources and filter results based on a user’s access permissions within the organization.

I love the “all”. IBM obviously has nailed video, audio, binaries of various types, disparate file types, and dynamic content flows from intercepts, social media, and interesting sources from the Dark Web. I love “all” type solutions. Too bad these are science fiction based on my experience.

The fix is Cognitive Exploration. Thank you, IBM. A new buzzword to explain what search and retrieval has flubbed for, what, 50 years? IBM explains:

Cognitive exploration is the combination of search, content analytics, and cognitive computing. Not only can cognitive exploration accelerate the rate at which users can find and navigate information; by leveraging advanced technologies such as content analytics, machine learning, and reasoning it has the potential to augment human expertise.

I don’t want to be a party pooper, but this is perilously close to Palantir’s “augmented intelligence” jargon. Attivio, BA Insight, and even the French folks at Sinequa use similar lingo. Me-too’ism at its finest? Nah, this is IBM, the outfit taking Groupon (a discount coupon business) to court for allegedly infringing on Prodigy patents. Prodigy? Remember that online service?

After snoozing through the white paper’s three pillars of Cognitive Exploration, I raced to the finish line.

Cognitive Exploration involves the i2 type of relationship analysis, some good old fashioned cuddling between search and cognitive computing (think Watson, gentle reader), and a unified view or what a popular novelist calls “God’s eye” view. Please note that IBM offers some examples, but get the numbering wrong. Where is number one? Watson, Watson, can you assist me? Guess not. IBM’s cognitive exploration essay begins counting with number 2. I am okay with zero. I am okay with one. But I am not okay with an enumerated list beginning with the number two. Careless typo? Indifference? Rushing to the RA meeting? Don’t know. Cognitive Watson counts two, three, four, not one, two, three.

At the end of this remarkable description of Cognitive Exploration I learned:

The cognitive capabilities that can be leveraged by Watson Explorer are provided by the IBM Watson platform.

Isn’t this a recycling of some of the early 1990s marketing material from i2 Group Limited, which IBM bought? Isn’t this lingo influenced by Palantir’s explanations of its Gotham platform?

Omitted from the “all,” I assume, is the seamless interchange of Gotham files with i2 Analyst Notebook and i2 Analyst Notebook with Gotham. The users and customers have to learn that “all,” like Mr. Clinton’s “is,” may not be exactly congruent with one’s understanding of “federation” and “unified.”

Enough already. Go for the close:

IBM Watson Explorer unlocks the value within your data, utilizing that information to help employees make well-informed decisions, provide better support, and identify more customers and business opportunities. By reaching across multiple silos of information within your enterprise, search results will include information never previously integrated into single solutions. Users will benefit from search results from all the data in your company, structured and unstructured, and include data from outside as well. Rather than trying to make good decisions with limited insight, cognitive exploration users can now extract and understand all of the valuable information at their fingertips.

With such a wonderful tool at IBM’s disposal, why is IBM’s management unable to generate revenues? Perhaps the silliness of the marketing explanation of Cognitive Exploration does not deliver the results that obviously someone at IBM believes.

I am stuck on that error in numbering, the recycling of Palantir’s marketing lingo, and the somewhat silly phrase “Cognitive Exploration.”

I won’t sail my Nina, Pinta, and Santa Maria to that digital shore. I will use Google Earth and tools which I know sort of work.

Stephen E Arnold, March 9, 2016

Enterprise Search Revisionism: Can One Change What Happened

March 9, 2016

I read “The Search Continues: A History of Search’s Unsatisfactory Progress.” I noted some points which, in my opinion, underscore why enterprise search has been problematic and why the menagerie of experts and marketers have put search and retrieval on the path to enterprise irrelevance. The word that came to mind when I read the article was “revisionism” for the millennials among us.

The write up ignores the fact that enterprise search dates back to the early 1970s. One can argue that IBM’s Storage and Information Retrieval System (STAIRS) was the first significant enterprise search system. The point is that enterprise search as a productized service has a history of over promising and under delivering of more than 40 years.

Enterprise search with a touch of Stalinist revisionism.

Customers said they wanted to “find” information. What those individuals meant was have access to information that provided the relevant facts, documents, and data needed to deal with a problem.

Because providing on point information was and remains a very, very difficult problem, the vendors interpreted “find” to mean a list of indexed documents that contained the users’ search terms. But there was a problem. Users were not skilled in crafting queries which were essentially computer instructions between words the index actually contained.
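To make the “list of documents containing the search terms” idea concrete, here is a minimal sketch in Python of a word index with a Boolean AND query. This is my own illustration, not the method of STAIRS or any vendor; the function names `build_index` and `boolean_and` and the three sample documents are hypothetical.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercase word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for raw in text.lower().split():
            word = raw.strip(".,;:!?\"'()")  # drop surrounding punctuation
            if word:
                index[word].add(doc_id)
    return index


def boolean_and(index, query):
    """Return ids of documents that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    # .get() avoids creating empty entries for words the index never saw
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


# Hypothetical sample documents, invented for this illustration.
docs = {
    1: "Quarterly revenue report for the widget division",
    2: "Widget safety recall notice",
    3: "Income summary for Q3, widget group",
}
index = build_index(docs)

print(boolean_and(index, "widget revenue"))   # only document 1 matches
print(boolean_and(index, "widget earnings"))  # nothing matches
```

The second query shows the disconnect: document 3 is about widget earnings, but it says “income,” so a user who types “earnings” gets nothing back. The system did exactly what it was told. The user wanted something else.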

After STAIRS came other systems, many other systems which have been documented reasonably well in Bourne and Bellardo-Hahn’s A History of Online Information Services 1963-1976. (The period prior to 1970 describes for-fee research centric online systems. STAIRS was among the most well known early enterprise information retrieval systems.)  I provided some history in the first three editions of the Enterprise Search Report, published from 2003 to 2007. I have continued to document enterprise search in the Xenky profiles and in this blog.

The history makes painful reading for those who invested in many search and retrieval companies and for the executives who experienced the crushing of their dreams and sometimes careers under the buzz saw of reality.

In a nutshell, enterprise search vendors heard what prospects, workers overwhelmed with digital and print information, and unhappy users of those early systems were saying.

The disconnect was that enterprise search vendors parroted back marketing pitches that assured enterprise procurement teams of these functions:

  • Easy to use
  • “All” information instantly available
  • Answers to business questions
  • Faster decision making
  • Access to the organization’s knowledge.

The result was a steady stream of enterprise search product launches. Some of these, like Verity, were funded by US government money. Sure, the company struggled with the cost of infrastructure the Verity system required. The work arounds were okay as long as the infrastructure could keep pace with the new and changed word-centric documents. Tossing in other types of digital information, making the system perform ever faster indexing, and keeping the Verity system responding quickly was another kettle of fish.

Research oriented information retrieval experts looked at the Verity type system and concluded, “We can do more. We can use better algorithms. We can use smart software to eliminate some of the costs and indexing delays. We can [ fill in the blank ].”

The cycle of describing what an enterprise search system could actually deliver was disconnected from the promises the vendors made. As one moves through the decades from 1973 to the present, the failures of search vendors made it clear that:

  1. Companies and government agencies would buy a system, discover it did not do the job users needed, and buy another system.
  2. New search vendors picked up the methods taught at Cornell, Stanford, and other search-centric research centers and wrapped on additional functions like semantics. The core of most modern enterprise search systems is unchanged from what STAIRS implemented.
  3. Search vendors like Convera came, failed, and went away. Some hit revenue ceilings and sold to larger companies looking for a search utility. The acquisitions hit a high water mark with the sale of Autonomy (a 1990s system) to HP for $11 billion.

What about Oracle as a representative outfit? Oracle database has included search as a core system function since the day Larry Ellison envisioned becoming a big dog in enterprise software. The search language was Oracle’s version of the structured query language. But people found that difficult to use. Oracle purchased Artificial Linguistics in order to make finding information more intuitive. Oracle continued to try to crack the find information problem through the acquisition of Triple Hop, its in-house Secure Enterprise Search, and some other odds and ends until it bought, in rapid succession, InQuira (a company formed from the failure of two search vendors), RightNow (technology from a Dutch outfit RightNow acquired), and Endeca. Where is search at Oracle today? Essentially search is a utility, and it is available in Oracle applications: customer support, ecommerce, and business intelligence. In short, search has shifted from the “solution” to a component used to get started with an application that allows the user to find the answer to business questions.

I mention the Oracle story because it illustrates the consistent pattern of companies which are actually trying to deliver information that the user of a search system needs to answer a business or technical question.

I don’t want to highlight the inaccuracies of “The Search Continues.” Instead I want to point out the problem buzzwords create when trying to understand why search has consistently been a problem and why today’s most promising solutions may relegate search to a permanent role of necessary evil.

In the write up, the notions of answering questions, analytics, federation (that is, running a single query across multiple collections of content and file types), the cloud, and system performance are the conclusion.


The use of open source search systems means that good enough is the foundation of many modern systems. Palantir-type outfits, essentially enterprise search vendors describing themselves as providers of “intelligence” systems, use open source technology in order to reduce costs and shift bug chasing to a community. The good enough core is wrapped with subsystems that deal with the pesky problems of video, audio, and data streams from sensors or similar sources. Attivio, formed by professionals who worked at the infamous Fast Search & Transfer company, delivers active intelligence but uses open source to handle the STAIRS-type functions. These companies have figured out that open source search is a good foundation. Available resources can be invested in visualizations, generating reports instead of results lists, and graphical interfaces which involve the user in performing tasks smart software at this time cannot perform.

For a low cost enterprise search system, one can download Lucene, Solr, SphinxSearch, or any one of a number of open source systems. There are low cost (keep in mind that costs of search can be tricky to nail down) appliances from vendors like Maxxcat and Thunderstone. One can make do with the craziness of the search included with Microsoft SharePoint.

For a serious application, enterprises have many choices. Some of these are highly specialized like BAE NetReveal and Palantir Metropolitan. Others are more generic like the Elastic offering. Some are free like the Effective File Search system.

The point is that enterprise search was not what users wanted in the 1970s when IBM pitched the mainframe centric STAIRS system, in the 1980s when Verity pitched its system, in the 1990s when Excalibur (later Convera) sold its system, in the 2000s when Fast Search shifted from Web search to enterprise search and put the company on the road to improper financial behavior, or in the efflorescence of search sell offs (Dassault bought Exalead, IBM bought iPhrase and other search vendors, and Lexmark bought Brainware and ISYS Search Software).

Where are we today?

Users still want on point information. The solutions on offer today are application and use case centric, not the silly one-size-fits-all approach of the period from 2001 to 2011 when Autonomy sold to HP.

Open source search has helped create an opportunity for vendors to deliver information access in interesting ways. There are cloud solutions. There are open source solutions. There are small company solutions. There are more ways to find information than at any other time in the history of search as I know it.

Unfortunately, the same problems remain. These are:

  1. As the volume of digital information goes up, so does the cost of indexing and accessing the sources in the corpus
  2. Multimedia remains a significant challenge for which there is no particularly good solution
  3. Federation of content requires considerable investment in data grooming and normalizing
  4. Multi-lingual corpuses require humans to deal with certain synonyms and entity names
  5. Graphical interfaces still are stupid and need more intelligence behind the icons and links
  6. Visualizations have to be “accurate” because a bad decision can have significant real world consequences
  7. Intelligent systems are creeping forward but crazy Watson-like marketing raises expectations and exacerbates the credibility problems of enterprise search.

I am okay with history. I am not okay with analyses that ignore some very real and painful lessons. I sure would like some of the experts today to know a bit more about the facts behind the implosions of Convera, Delphis, Entopia, and many other companies.

I also would like investors in search start ups to know a bit more about the risks associated with search and content processing.

In short, for a history of search, one needs more than 900 words mixing up what happened with what is.

Stephen E Arnold, March 9, 2016

Hershey Chocolate: Semi Sweet Analytics?

March 4, 2016

I am wrapping up my profile of Palantir Technologies. I located a couple of references to Palantir’s activities in the non-government markets. One of the outfits allegedly wooed by the Hobbits was Hershey chocolate. A typical reference to the Hobbits and Kisses folks was “Hershey Turns Kisses and Hugs into Hard Data.”


When I read “The Hershey Company Partners with Infosys to Build Predictive Analytics Capability using Open Source Information Platform on Amazon Web Services,” I wondered why Palantir Technologies was not featured in the write up. Praescient Analytics, near Washington, DC, can plug industrial strength predictive analytics like Recorded Future’s into a Metropolitan installation without much hassle.

The write up makes clear that the chocolate outfit is going a new way. The path leads through Amazon Web Services to the Infosys Information Platform.

I find this quite a surprise. I have no doubt that Infosys has some competent folks on its team. But the questions flashing through my mind are:

  • What’s up with the Palantir system?
  • Why jump to Infosys when there are darned good outfits available in Boston and Washington, DC?
  • What’s an outsourcing firm able to deliver that specialists with deep experience in making sense of data cannot?

I never understood Mars, and now I don’t understand the makers of the York Peppermint Patty.

Perhaps this is a “whopper” of a project?

Stephen E Arnold, March 4, 2016
