A 12-Step Program for Behind-the-Firewall Search

March 28, 2008

In 2006, one of the young engineers working on a search system at a large company said to me, “I’m in a 12-step program for this !%$&^ search system–two six packs of beer.”

This clever and stressed young engineer was the “owner” of her employer’s blue-chip, high-profile, it-slices-it-dices search system. The young wizard was learning that high marks in computer science do not a smooth behind-the-firewall search system make.

I kept this “12-step” tag in my mind. In late 2006, I used this graphic to illustrate one way to deploy a behind-the-firewall search system with few hassles and certainly no recourse to alcohol.

[Graphic: 12 steps]

Let me run through the 12 steps and conclude with a reminder that shortcuts can lead to some interesting challenges.

Step 1. You will need a team to assist you with your behind-the-firewall search project. Search has quite a few moving parts. Working alone is not a good idea.

Step 2. You need to know a great deal about the content you plan to index. You want to know how much content you must index; how much change occurs in the content; how much new content becomes available every day, week, month, and year; access constraints; file types; and special issues such as chemical structures that must be indexed, among other points.
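A first pass at the Step 2 questions can be automated. The sketch below is only an illustration and assumes the content sits on a file share that Python can walk; the function name `inventory` and the seven-day window are my own placeholders, not part of any vendor's toolkit. It counts files, total bytes, file types, and how many files changed recently.

```python
import os
import time
from collections import Counter


def inventory(root, recent_days=7, now=None):
    """Summarize the files under `root`: count, size, types, and recent changes.

    `recent_days` is an assumed window for "recently changed"; adjust to taste.
    """
    now = time.time() if now is None else now
    cutoff = now - recent_days * 86400  # seconds in a day
    by_type = Counter()
    total_files = 0
    total_bytes = 0
    recently_changed = 0

    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                # Unreadable file: skip it here, but note that this is
                # itself an access constraint worth recording.
                continue
            ext = os.path.splitext(name)[1].lower() or "(none)"
            by_type[ext] += 1
            total_files += 1
            total_bytes += info.st_size
            if info.st_mtime >= cutoff:
                recently_changed += 1

    return {
        "total_files": total_files,
        "total_bytes": total_bytes,
        "by_type": dict(by_type),
        "recently_changed": recently_changed,
    }
```

Running `inventory` against the same share once a week and comparing the numbers gives a rough picture of growth and change. It says nothing about content in databases, document management systems, or special formats such as chemical structures, which need their own audit.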

Step 3. You need to know what problem your behind-the-firewall search system is to solve. Is it keyword search relevancy, or are you deploying a business intelligence system?

Step 4. You need to have a clear idea about who can access what information. If your organization has a security officer who handles these details, bond with this person. If not, you will need to take steps to manage access to information processed by the system. Allowing colleagues to see health and salary data without authorization creates new challenges.
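One common way to enforce "who can access what" is to filter the result list against the user's group memberships before anything is displayed. The sketch below illustrates that idea under simplifying assumptions: `Document`, `allowed_groups`, and `trim_results` are hypothetical names, and a real deployment would pull group data from the organization's directory service rather than hard-code it.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Document:
    """A search hit plus the groups allowed to see it (hypothetical model)."""
    doc_id: str
    title: str
    allowed_groups: frozenset = field(default_factory=frozenset)


def trim_results(results, user_groups):
    """Return only the documents the user may see.

    Deny by default: a document with no allowed groups is shown to no one.
    """
    groups = set(user_groups)
    return [doc for doc in results if doc.allowed_groups & groups]
```

For example, if a salary report lists only the group "hr" in `allowed_groups`, a user whose groups are `{"staff"}` never sees it in a result list. The deny-by-default choice means an unlabeled document stays hidden until someone decides who should see it, which is the safer failure mode for health and salary data.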

Step 5. You need to have a clear statement of system requirements. Keep in mind that you want to focus on the must-have features. The “nice to have” requirements should be winnowed from the “must have” requirements. Focus on the “must haves”.

Step 6. Get your budget ducks in a line. You need to know how much you have to spend. You need to know how you will spend the money. You need to know how to get more money if your search ship runs aground. If you don’t have adequate money for the search system, stall until you do.

Step 7. Create a clear, detailed request for quotation (RFQ). Have a short list of vendors who deploy systems that have a track record of success. Vendors have competencies, so take steps to match vendors to your RFQ. If you have $10,000, don’t call a vendor whose starting license fee is $330,000.

Step 8. Invest time evaluating the proposals. You may want to arrange for demos before you create your short list of vendors who receive the RFQ, or you may want to get the demo before you winnow your candidates. Study the proposals, ask questions, and kick the tires before you sign a contract.

Step 9. Make your selection. Remember: you are “marrying” this vendor. Look beyond the honeymoon.

Step 10. Deploy your system in a methodical way. Set up a test system. Index some content. Seek feedback. Make changes. Then index more content. Customize if you need to. Tune the system. When you are certain you have security and other key functions in hand, deploy to a group of guinea pigs. When you are certain you are ready for prime time, release the system.

Step 11. Evaluate your work. You want to talk to your users, conduct surveys, and analyze the log file data. Understand what’s good, bad, and indifferent.
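For the log file part of Step 11, two numbers are a useful starting point: which queries users run most often, and which queries return nothing. The sketch below assumes a made-up log format of one search per line with four tab-separated fields (timestamp, user, query, result count). Real search systems each have their own log layout, so the parsing would need to change; `summarize_log` is a hypothetical helper name.

```python
from collections import Counter


def summarize_log(lines, top_n=10):
    """Summarize search activity from tab-separated log lines.

    Assumed line format: timestamp <TAB> user <TAB> query <TAB> result_count
    Lines that do not match this shape are skipped.
    """
    queries = Counter()
    zero_hits = Counter()
    total = 0

    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 4:
            continue
        _timestamp, _user, raw_query, hits = fields
        # Normalize case and whitespace so "Vacation  Policy" and
        # "vacation policy" count as the same query.
        query = " ".join(raw_query.lower().split())
        if not query or not hits.isdigit():
            continue
        total += 1
        queries[query] += 1
        if int(hits) == 0:
            zero_hits[query] += 1

    zero_total = sum(zero_hits.values())
    return {
        "total_searches": total,
        "top_queries": queries.most_common(top_n),
        "zero_result_queries": zero_hits.most_common(top_n),
        "zero_result_rate": zero_total / total if total else 0.0,
    }
```

The function accepts any iterable of lines, so an open file handle works as input. The zero-result list is a good agenda for the user conversations and surveys: each entry is either content that is missing from the index or vocabulary the system does not understand.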

Step 12. Make changes. Search systems require tuning, enhancing, and customizing. Search is–and you are not going to want to hear this–a 24×7 job.

When you have worked through your 12-step program, you have not completed your work. You start over again. By the time you work through these 12 steps, the technology and requirements will have changed. With search, the work is never done.

As you work through your search system plan, you will encounter roadblocks–often unexpected ones. Here in rural Kentucky, the simplest chore such as going to the store often surprises me. If a familiar task can “gang aft agley”, imagine what happens to “the best laid schemes o’ mice an’ men”.

[Image: Roadblock]

Following this type of methodical approach helps minimize the risk of these common pitfalls:

  1. Your system works but it doesn’t meet the needs of the users. Users might not say much to you, but the users will ignore the system. You know you have a problem when the marketing department buys its own Google Search Appliance and tells you, “The search system sucks.”
  2. Your budget is a disaster. Not only have you exceeded your spend limit, but you have no idea how to estimate the costs for the next six months. You know you have a problem when the CFO stops by your office and says, “Get your invoices together. We need to talk.”
  3. Your colleagues in information technology drag their feet when you ask for support. The search system has created headaches and you sit alone in your cube sifting through invoices and ducking colleagues who complain that documents that should be in the search system are not.

To conclude, behind-the-firewall search requires a by-the-book discipline. If you take your eye off the ball, you may get bonked in the snoot.

Stephen Arnold, March 28, 2008

Comments

4 Responses to “A 12-Step Program for Behind-the-Firewall Search”

  1. Martin White on March 30th, 2008 3:07 pm

    Steve

    Can I emphasise the comment you make in #1 in your excellent list. I have seen so many companies install a search application without any understanding of what it takes in staff terms to implement it.

    My view is that the skills needed are a) a search manager, b) someone with the IT skills to maintain and optimise all the technology, c) someone who is going to work through the search logs and d) someone who is going to provide user support and training, and run usability tests. Depending on the application you may also need taxonomy/metadata management support.

    At a recent conference I sat next to a sales person from FAST Search who said that I was way over the top and just two people would be needed. When he presented the latest technology from FAST later in the conference he had a question from the audience about whether FAST could assist in developing taxonomies. He replied that the organisation would have to do that work. So who was going to actually do it – fairies?

    I would go so far as to say that if there is not the commitment from the organisation to provide the support team up front at #1 then there is no point in doing any more of the steps.

    Until search vendors come clean on what it takes to make search work, and organisations invest in a search team, all the prior work so clearly set out in your 12 steps is going to be wasted.

    Martin White

  2. Stephen E. Arnold on March 30th, 2008 6:08 pm

    Thanks for taking the time to comment. The list is a bit long in the tooth, but it makes the point that behind-the-firewall search cannot be taken too casually. The more steps, the more likely a ‘slip twixt cup and lip’.
    Stephen Arnold, 7 pm, March 30, 2008

  3. jed cawthorne on April 1st, 2008 2:39 am

    Hi Steve

    The list might be a bit long in the tooth, but valuable nonetheless to see it laid out explicitly. My current organisation is one of Martin’s customers and to pick up on his comments, we have recently had presentations from many vendors (environmental scanning) – and they all answer that question with “we have a customer who is XXX big, and is searching a corpus of XXX TB and manages with 1.5 FTE administrators” – totally missing Martin’s point ref taxonomy, content management, ‘user liaison and training’ etc., in other words the ‘organisation’ bits of Information Organisation and Access.

  4. Stephen E. Arnold on April 1st, 2008 3:34 pm

    Jed, thanks for commenting. I thought the list was already in circulation. I am delighted that you find it useful. Watch for a series of three short essays about the information tokamak, which is a new way to talk about the needs, strengths, and deficiencies of search and content processing.
    Stephen Arnold, April 1, 2008, 16:34 Eastern time
