Search: An Information Retrieval Fukushima?

May 18, 2011

Information about the scale of the horrific nuclear disaster in Japan at the Fukushima Daiichi nuclear complex is now becoming more widely known.

Expertise and Smoothing

My interest in the event is the engineering of a necklace of old-style reactors and the problems the LOCA (loss of coolant accident) triggered. The nagging thought I had was that today’s nuclear engineers understood the issues with the reactor design, the placement of the spent fuel pool, and the risks posed by an earthquake. After my years in the nuclear industry, I am quite confident that engineers articulated these issues. However, the technical information gets “smoothed” and simplified. The complexities of nuclear power generation are well known at least in engineering schools. The nuclear engineers are often viewed as odd ducks by the civil engineers and mechanical engineers. A nuclear engineer has to do the regular engineering stuff of calculating loads and looking up data in hefty tomes. But the nukes need grounding in chemistry, physics, and math, lots of math. Then the engineer who wants to become a certified, professional nuclear engineer has some other hoops to jump through. I won’t bore you with the details, but the end result of the process produces people who can explain clearly a particular process and its impacts.


Does your search experience emit signs of troubles within?

The problem is that art history majors, journalists, failed Web masters, and even Harvard and Wharton MBAs get bored quickly. The details of a particular nuclear process make zero sense to someone more comfortable commenting about the color of Mona Lisa’s gown. So “smoothing” takes place. The ridges and outcrops of scientific and statistical knowledge get simplified. Once a complex situation has been smoothed, the need for hard expertise is diminished. With these simplifications, the liberal arts crowd can “reason” about risks, costs, upsides, and downsides.


A nuclear fallout map. The effect of a search meltdown extends far beyond the boundaries of a single user’s actions. Flawed search and retrieval have major consequences, many of which cannot be predicted with high confidence.

Everything works in an acceptable or okay manner until there is a LOCA or some other problem like a stuck valve or a crack in a pipe in a radioactive area of the reactor. Quickly the complexities, risks, and costs of the “smoothed problem” reveal the fissures and crags of reality.

Web search and enterprise search are now experiencing what I call a Fukushima event. After years of contentment with finding information, suddenly the dashboards are blinking yellow and red. Users are unable to find the information needed to do their job or something as basic as locate a colleague’s telephone number or office location. I have separated Web search and enterprise search in my professional work.

I want to depart for a moment and consider the two “species” of search as a single process before the ideas slip away from me. I know that Web search processes publicly accessible content, has the luxury of ignoring servers with high latency, and filters content to create an index that meets the vendors’ needs, not the users’ needs. I know that enterprise search must handle diverse content types, must cope with security and access controls, and must perform more functions than one of those two inch wide Swiss Army knives on sale at the airport in Geneva. I understand. My concern is broader in this write up. Please, bear with me.

Search and Fukushima Style Problems

What I mean by a “Fukushima event” is that search is not working for most users. The knowledge of the problem is spreading and there is panic amongst vendors, chief financial officers, content producers and owners, and users. Let me illustrate:

I am fielding two to three phone calls or emails a day about search systems that do not deliver the results users expect. From the Web crowd, I hear about a loss of traffic from Google Panda, the latency in the Blekko index, or the weird results generated when searching for SharePoint in [...]. I also see odd changes at Google and Microsoft. Both companies are trying to provide access to social content, push ads that generate money into every nook and cranny of an output, and develop or buy new features and services that clutter, not answer, a finding journey.

In the enterprise, I hear about the “black hole that swallows money” when an enterprise tries to get content out of SharePoint or other content management systems. I hear from vendors outraged that my new study does not profile a specific system. With more than 200 vendors selling enterprise search, how am I or anyone going to profile in a meaningful way more than a handful of systems? I see data from user surveys which report dissatisfaction with search regardless of the system deployed. I hear from vendors who do key word search that their system now does customer support, business intelligence, or eDiscovery.

This is the visible and undeniable evidence of a search Fukushima event:

  1. Everything looks the same and nothing really works all that well unless appropriate resources are brought to bear on a situation. “Appropriate resources” is my way of saying you need money, expertise, infrastructure, and time. Without these resources, it is difficult to make search “work” as its developers intended.
  2. The complexity of search is ignored until something goes wrong, then the howls escalate.
  3. The licensees change horses in the middle of the stream, which in my experience is not a good idea for a technical tenderfoot.
  4. The users find other, usually less secure and often “off the reservation” solutions. Also, users complain to anyone who will listen, which usually means a team of interviewers I send into a company as part of a fact finding project.
  5. Consultants flower like Kentucky flora at Derby time.

A Danger Zone: Caution

My view is that Web and enterprise search have reached a high risk point. I am delighted to point out that there are some bright spots in search. I am happy with the relevance of the system for most of my queries. I am delighted to report that Exalead and its CloudView system have hundreds of happy customers. Digital Reasoning delivers high value outputs via predictive analytics. These are the exceptions, and there are a handful of others.

But for the overall search market, the radioactivity is leaking into the environment. Consider these warning signs on my Geiger counter:

First, Google has disrupted relevance in its most recent “code wrappers” for the PageRank system and method. The changes are not working for me and for others. When a big outfit in what I call a quasi-monopolistic situation cannot generate relevant results, something has gone wrong. A setting is incorrect or a pipe has sprung a leak but no one can plug the fracture.

Second, enterprise search vendors are changing their marketing stories with a speed that confuses me. Every time the revolving door spins, the search vendor creates a new “legend”. Keep in mind that a “legend” in my world is a fabricated myth that looks real to everyone except those who crafted the cover. Who knows whether Vendor A is really able to deliver “semantics”? Semantics is not defined and most people just think semantics sounds really cool. Sentiment analysis. Yep, really cool. Don’t know what it is, but it is cool. There are even more bizarre flights of fancy like “cyber situational awareness.” Yep, another somewhat desperate marketing effort. Vendors are literally all over the conceptual map.

Third, licensees have zero idea what they are buying. Examples include organizations happy with the solutions from IBM, Microsoft, and Oracle. Why? These are safe choices. Do they work? Sure, but we are back to the resources issue. Without appropriate resources, these systems, like many others, generate huge costs and user dissatisfaction in most cases. Other licensees abandon the commercial, proprietary ship and embrace open source. Open source solutions like Lucid Imagination’s work well as long as the appropriate resources are available. Even Web search works when one has the time and know-how to run a query across multiple information services, deduplicate the results, analyze the unique records, and extract the needed information. But who has the “resources” for this activity? MBAs, English majors, mathematicians: none do this type of work, so the information gathered is often accepted without verification or ignored altogether.
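For readers who want to see what that multi-service routine involves, here is a minimal sketch. It assumes each service has already returned a list of (title, URL) pairs. The service names, the sample records, and the `normalize_url` and `merge_results` helpers are all hypothetical, and the URL matching is deliberately crude.

```python
from urllib.parse import urlsplit


def normalize_url(url):
    """Build a rough comparison key: lowercase host minus 'www.', plus the
    path without a trailing slash. Query strings are ignored (an assumption)."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def merge_results(results_by_service):
    """Deduplicate records across services and note which services returned each."""
    merged = {}
    for service, records in results_by_service.items():
        for title, url in records:
            key = normalize_url(url)
            # Keep the first title and URL seen for a given key.
            entry = merged.setdefault(key, {"title": title, "url": url, "sources": []})
            if service not in entry["sources"]:
                entry["sources"].append(service)
    return list(merged.values())


# Hypothetical results from two made-up services for the same query.
sample = {
    "service_a": [
        ("SharePoint search tips", "http://www.example.com/sharepoint/"),
        ("Enterprise search costs", "http://example.org/costs"),
    ],
    "service_b": [
        ("SharePoint Search Tips", "http://example.com/sharepoint"),
        ("Open source search", "http://example.net/open-source"),
    ],
}

unique = merge_results(sample)
for record in unique:
    # A record returned by more than one service is a candidate for closer reading.
    print(len(record["sources"]), record["url"])
```

Even this toy version leaves the hard part, reading and verifying the unique records, to a person with time and expertise.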

There you have it. Web search and enterprise search are now experiencing a Fukushima event. What’s the concomitant reaction to a search Fukushima? First, I think a changing of the guard is likely to happen. This means that today’s leaders will be tomorrow’s followers. Second, there will be more pushback when the “safe” choices do not resolve access and findability issues. Third, old style research may make a comeback, particularly for mission critical fact gathering operations. Fourth, I see faster spinning revolving doors on the vendor side and within the client organizations as well. Fifth, there will be more pushback from licensees about costs and support, and more attention to contractual terms and conditions.

Will Japan’s power industry survive? Sure. Fukushima is a bad, bad problem. But it is somewhat isolated in the context of the nation state. Will search survive? Sure. The important first step is to recognize that search has a Fukushima event happening. Until that reality is recognized, I do not anticipate a significant change in information retrieval. But when it is too late, work will begin in earnest. The question becomes, “Is too late really too late?” Nah. The problem will be smoothed and the cycle will repeat. One can see this happening in the number of conferences that talk about “governance” and search. What’s really happening is a massive cleanup. Like radioactivity, a lousy search system leaves dangerous material for a long, long time. Good for consultants. Not so good for licensees, vendors, and users.

Search is a tough problem and will remain one for the foreseeable future. Will attending a conference remediate these information retrieval “hot spots”? Not a chance. Look at the upside. You might get a cool T shirt.

Stephen E Arnold, May 18, 2011


