Groping the Enterprise Search Elephant

May 12, 2008

In the 2000 to 2003 period, ArnoldIT.com delivered a number of tutorials about search. Some of these presentations were held in conjunction with conferences such as the Boston Search Engine Meeting, Gilbane’s conferences, and the Information Today lineup of professional programs. Others were delivered to small groups at various financial institutions, search vendors, and government entities.

[Image: the search elephant]

This is the search elephant. In a meeting, you will hear many people talk about search. Each person will have a specific meaning and assume that the others in the room will know exactly what’s meant when she uses the word search. If you take all these individual meanings of search and put them together, you have a better idea of what a search system is supposed to deliver.

In each tutorial, I had to take more time than budgeted to define the different types of search encountered in enterprise behind-the-firewall deployments. The issue surfaced again this weekend when I spoke with a colleague who was grousing about the different perceptions of search in a consulting firm in Europe.

The purpose of this essay is to provide an abbreviated and hopefully useful look at the different meanings of search. To help make these ideas concrete, most sections end with an example of a vendor or product. You can learn more about this subject in Enterprise Search Report and the brand-new Beyond Search study that came out in April 2008. I wrote the first three editions of ESR and played a minor part in the current edition; you will get more color on this topic in those for-fee analyses.

Everybody Knows about Search

The definition issue is skipped over because most people today believe they know about search. At dinner last night, people said, “I did a search for a cruise to Brazil”, “I looked up my health care benefits and found they were reduced”, “[…] and I’m not sure it’s worth seeing”, and “My boss had me find a proposal he thought he had lost when his laptop was stolen”. None of these people were information retrieval professionals or computer scientists. But each of them talked about search as if it were a routine activity like finding a parking space.

The need for a definition goes up when people assume others mean the same thing by search. Let’s look at the meanings of search in an enterprise.

Enterprise Search or Behind-the-Firewall Search

This is the buzzword of the moment. Companies know intuitively that if a worker can’t find information on the company’s own internal network, the worker is going to waste time looking for what’s needed. Even worse, the employee can’t find accurate information and makes a boneheaded decision.

Enterprise search is a contradiction. No boss in the world wants “everything” indexed and searchable. Problems come from indexing “everything”. A few of the bombs in the enterprise search minefield are:

  • Email on topics that are or can be problematic
  • Information about company secrets like Coca-Cola’s formula for the fizzy drink
  • Information about legal matters
  • Information an employee puts on a company server about non-company activities
  • Personal, salary, and medical information
  • Pricing information
  • Stolen software, information from a third-party provider used without paying a license fee or obtaining copyright permission, and information about a competitor that was obtained via an email from a friend

Search works best when the domain of information to index is narrowly defined and subject to a formal review and approval policy. Ad hoc indexing of behind-the-firewall information can trigger big trouble fast.
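
One way to picture a narrowly defined, approved indexing domain is an allow-list check that runs before any document reaches the indexer. The Python sketch below is only an illustration under stated assumptions: the names `APPROVED_ROOTS`, `BLOCKED_PARTS`, and `should_index`, and the folder paths, are hypothetical and are not taken from any vendor's product.

```python
from pathlib import PurePosixPath

# Hypothetical policy: only these reviewed locations may be indexed.
APPROVED_ROOTS = [
    PurePosixPath("/shares/marketing"),
    PurePosixPath("/shares/support"),
]

# Hypothetical deny set: any path component matching one of these
# names keeps the document out of the index.
BLOCKED_PARTS = {"legal", "hr", "salary", "pricing"}


def should_index(path: str) -> bool:
    """Return True only if the path sits under an approved root and
    none of its components appear in the deny set."""
    p = PurePosixPath(path)
    under_approved = any(root in p.parents for root in APPROVED_ROOTS)
    blocked = any(part.lower() in BLOCKED_PARTS for part in p.parts)
    return under_approved and not blocked
```

The point of the sketch is the default: anything not explicitly approved stays out of the index, which is the opposite of ad hoc indexing.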

Desktop Search

Most people working in an organization can’t find files on their local computer or elsewhere in the organization. An enterprise search system can index the contents of an individual user’s computer, but you have to do some homework to determine what the indexing policy should be with regard to an employee’s computer. This type of search need becomes more complicated when a user or her department takes search into its own hands. A free desktop search system or a Google Search Appliance may provide local access. This type of solution runs into problems with laptop computers, USB drives, and cloud-based mail systems.

You may need to survey employees and inventory documents to figure out what information resides on these devices or systems. Some employees think only of local information until they need access to previous versions of documents, boilerplate text for a proposal, or some other information.

In any event, this type of “local” search is not simple. The person in the meeting who narrows search to specific file types or email may have a particular need. Search, when broad or complex from that person’s point of view, is a problem. Example: Google Desktop Search and the Google Search Appliance.

Everything Search

Busy people often tell colleagues, “I need everything about Company X’s new product now.” The idea is that this executive will scan the information and recognize what she needs to do her job. The scope of the indexing job is not understood by this type of person. The costs of a system to deliver “everything” don’t make any sense to this type of person either. The cost of the system is someone else’s problem. I don’t have any bulletproof way to manage this type of search definition. Indexing “everything” is a problem because fine-grained access controls have to be applied before shoving content into the search system.
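
To make the access-control point concrete, here is a minimal Python sketch in which each document carries its permitted groups from the moment it is indexed, and a query returns only what the user's groups allow. The `Document` class, the `search` function, the group names, and the sample documents are all hypothetical; a real system would map these to its own directory service and security model.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    doc_id: str
    text: str
    # Groups allowed to see this document, attached at index time.
    allowed_groups: set = field(default_factory=set)


def search(index, query, user_groups):
    """Naive substring match that returns only the IDs of documents
    the user is permitted to see."""
    hits = []
    for doc in index:
        visible = bool(doc.allowed_groups & set(user_groups))
        if visible and query.lower() in doc.text.lower():
            hits.append(doc.doc_id)
    return hits


index = [
    Document("d1", "Company X product launch notes", {"sales", "executives"}),
    Document("d2", "Company X litigation memo", {"legal"}),
]
```

In this sketch a document with no groups attached is visible to no one, so content that skipped the access-control step cannot leak into results.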

The “everything” search does not mean only content residing on the company’s servers. This type of search definition reaches out to the Internet, to third-party content providers who enforce copyright and charge for access, and to information that may be off limits for most users; for example, information about a legal matter. Example: any Big Name Vendor.

Web Site Search

Some people define search as looking for information about their own company on Google. An all-too-frequent situation is an employee who says, “I couldn’t find the information on our Web site search system, so I used Google to find this PowerPoint of the president’s talk last month.” Jaws drop because most people don’t realize what information is on the company’s public-facing Web server.

This type of search can bend the minds of many people in an organization. The Web site has a search function. The idea that Google’s search finds information on the company’s Web site faster and more efficiently is not understood. In many cases, senior managers don’t know what’s on the company’s Web site or that the Web site search function doesn’t work very well.

There are ways to make a company’s Web site searchable along with information residing on the organization’s servers. Effort is required to figure out what’s on the Web site and the method for making that Web site information available on the behind-the-firewall search system. When the Web site is maintained by a third party, the company has to get everyone on the same page. In my experience, this is neither easy nor without political friction. Example: Google Custom Search Engine.

Competitor Search

Some people in the enterprise use the word search to mean “what do we know about this topic”. The search system has to provide a way to see what Vivisimo calls “information overlook” about a competitor, a market trend or some other exogenous action.

This person’s definition of search is related to the “everything” definition of search with one important difference: A report is what’s needed. A laundry list of results is not what the person involved in competitive intelligence wants. Analysts grunt through raw information. The person who defines search in terms of competitive intelligence wants to know what’s important and then have access to selective details.

Enterprise search system vendors assert that their technology delivers this type of solution. In most cases, these systems have to be slathered with money, time, and programming to conform to this definition of search. Many organizations want to buy intelligence, but these organizations lack the resources to make their dream come true. Example: Coveo, ISYS Search Software, SAS, Siderean Software.

Database Search

Search can be defined in terms of a specific work process. The people who need to answer customer support questions or deal with inquiries about a customer shipment that’s lost in transit want search embedded in the system used to perform work. The notion of search is not entering words in a search box. These individuals want a way to access specific data or data related to a task without any typing at all. Figuring out the work flow and then embedding saved queries in that process is search. The people who want this type of access may not be interested in looking up any other information.

This type of search may require forms-based queries or reports. It’s important to figure out if the search system for this definition should be separate from a general search service or part of it. In my experience I hear about this type of search from employees who want to query a database and sometimes get related information. A generalized search system that can index structured information may not be what these people mean by search.

These folks use the word search to mean “just have the system give me what I need without my having to type guesses in a search box or running around to find out who knows what I need to answer a question”. Example: Dieselpoint, Intelligenx.
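
The idea of embedding saved queries in a work process can be sketched in a few lines of Python using the standard `sqlite3` module. Everything specific here is an assumption made for illustration: the `shipments` table, its columns, the `lost_shipment` task name, and the sample rows are hypothetical and do not describe any particular vendor's system.

```python
import sqlite3

# Hypothetical saved queries, keyed by the task a support agent performs.
SAVED_QUERIES = {
    "lost_shipment": (
        "SELECT order_id, carrier, status FROM shipments "
        "WHERE customer_id = ? AND status = 'in transit'"
    ),
}


def run_task(conn, task, customer_id):
    """Run the saved query for a task; the agent never types a search string."""
    return conn.execute(SAVED_QUERIES[task], (customer_id,)).fetchall()


# Small in-memory database standing in for the shipping system.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE shipments "
    "(order_id TEXT, customer_id TEXT, carrier TEXT, status TEXT)"
)
conn.executemany(
    "INSERT INTO shipments VALUES (?, ?, ?, ?)",
    [
        ("A100", "C7", "FastFreight", "in transit"),
        ("A101", "C7", "FastFreight", "delivered"),
        ("A102", "C9", "RoadRunner", "in transit"),
    ],
)

rows = run_task(conn, "lost_shipment", "C7")
```

The customer identifier would come from the screen the agent already has open, so the “search” happens as a side effect of doing the work.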

Who-Do-We-Know Search

The person who defines search in terms of business contacts or experts inside an organization, or who uses buzzwords like “social graphs”, wants to monitor actions and data, then mine those data for connections among people, places, things, and values like phone numbers.

When a person uses search in this sense, it’s important to understand the individual’s role in the organization and what the budget authority is. Most search systems do a lousy job of monitoring and analyzing relationships. If your search system must perform these types of functions, you will want to conduct additional data collection on the need for this type of system. Trying to make a bare-bones search system operate for who-do-we-know search can be a thankless job.

When a key sales professional quits, management wants to “search” this person’s contacts, sales letters, and proposals to make sure the information has not “walked”. One of the charms of a hosted contact management system paid for by a company is that the sales contact information is in one place and accessible. The nightmare is that the sales professional kept these data in a personal account the company cannot access.

Search, privacy, and security are three siblings who travel together. Example: Tacit

Observations

You may want to quibble with my major categories. If you have others, please post them as a comment to this essay. If you want to add a category, let me know. I will update this post as I get reader-submitted information. Here are the points that are in my mind after summarizing these definitions of search:

  1. Search has different meanings. One way to keep a search procurement on track is to define what you mean by search and spell out a budget for the implementation. Cost overruns are endemic because the meanings of search are not narrowed and then agreed upon before work on search begins.
  2. Search touches upon security, privacy, regulated actions, and confidentiality. Ignoring these nuances often translates into expensive retooling.
  3. Google’s Web search and the search in the Google Search Appliance are quite different. Allowing the notions of Google Web search to color a behind-the-firewall search system or a Web site search system leads to employee confusion.
  4. Vendors exploit confusion about search. A vendor’s generalization is often heard by people with different notions of search as the ideal remedy for information pains.

Discovering the different meanings of search is interesting. Getting the term defined before a vendor is selected is a useful intellectual exercise.

Stephen Arnold, May 12, 2008

