Power Leveling
February 20, 2008
Last week I spoke with a group of young, enthusiastic programmers. In that lecture, I used the phrase power leveling. I didn’t coin this term. In my preparation for my lecture, I came across an illustration of a maze.
What made the maze interesting was that a rat had broken through the maze’s dividers. From the start of the maze to the cheese at the exit, the rat bulldozed through the barriers. Instead of running the maze, the rat went from A to B in the shortest, most direct way.
Power leveling.
When I used the term, I was talking about solving some troublesome problems in search and retrieval. What I learned in the research for Beyond Search was that many companies get trapped in a maze. Some work very hard to figure out one part of the puzzle and fail to find the exit. Other companies solve the maze, but the process is full of starts and stops.
Two Approaches Some Vendors Take
In terms of search and retrieval, many vendors develop solutions that work in a particular way on a specific part of the search and retrieval puzzle. For example, a number of companies performing intensive content processing generate additional indexes (now called metatags) for each document processed. These companies extract entities, assign geospatial tags, and classify documents and those documents’ components. The thorough indexing is often overkill. When these systems crunch through email, which is often cryptic, the intense indexing can go off the rails. The user can’t locate the needed email using the index terms and must fall back on searching by date, sender, or subject. This type of search system is like the rat that figures out how to solve one corner of the maze and never gets to the exit and freedom.
The other approach does not go directly to the exit. These systems iterate, crunch, generate indexes, and rerun processes repeatedly. With each epoch of the indexing process, the metatags get more accurate. Instead of a blizzard of metatags, the vendor delivers useful metadata. The vendor achieves the goal with the computational equivalent of using a submachine gun to kill the wasp in the basement. As long as you have the firepower, you can fire away until you solve the problem. The collateral damage is the computational equivalent of shooting up your kitchen. Instead of an AK-47, these vendors require massive amounts of computing horsepower, equivalent storage, and sophisticated infrastructure.
Three Problems to Resolve
Power leveling is neither of these approaches. Here’s what I think more developers of search-and-retrieval systems should do. You may not agree. Share your views in the comments section of this Web log.
First, find a way around brute force solutions. The most successful systems often use techniques that are readily available in textbooks or technical journals. The trick is to find a clever way to do the maximum amount of work in the fewest cycles. Even though today’s processors are pretty darn quick, you will deliver a better solution by letting software innovations do the heavy lifting. Search systems that expect me to throw iron at bottlenecks are likely to become a money pit at some point. A number of high-profile vendors are suffering from this problem. I won’t mention any names, but you can identify the brute force systems by doing some Web research.
Second, how can you or a vendor get the proper perspective on the search-and-retrieval system? It is tough to get from A to B in a nice Euclidean way if you keep your nose buried in a tiny corner of the larger problem space. In the last few days, two different vendors were thunderstruck that my write ups of their systems described their respective products more narrowly than the vendors saw the products. My perspective was broader than theirs. These two vendors struggled and are still struggling to reconcile my narrow perception of their systems with the broader and, I believe, inaccurate descriptions of these systems.
I have identified a third problem with search-and-retrieval systems. Vendors work hard to find an angle, a way to make themselves distinct. In this effort to be different, some vendors have created systems that are useful only when certain, highly specific requirements call for their functions. Most organizations don’t want overly narrow solutions. The need is to have a system that allows the major search-and-retrieval functions to be performed at a reasonable cost on relatively modest hardware. Just as important, the customers want a system that an information technology generalist can understand, maintain, and enhance. In my experience, most organizations don’t want rocket science. Overly complex systems are fueling interest in SaaS (software as a service). Believe me, there are search-and-retrieval vendors selling systems that are so convoluted, so mind-bogglingly complicated that their own engineers can’t make some changes without consulting the one or two people who know the “secret trick”. Mere mortals cannot make these puppies work.
Perhaps predictably, the 50 or 60 people at my lecture were surprised to hear me make suggestions that put so much emphasis on being clever, finding ways to go through certain problems, keeping the goal in sight, and keeping their egos from getting between their customers and what those customers need to do with a system.
A Tough Year Ahead
Too many vendors find themselves in a very tough competitive situation. The prospects often have experience with search-and-retrieval systems. The reason these prospects are talking to vendors of search-and-retrieval systems is that the incumbent system doesn’t do the job.
With chief financial officers sweating bullets about costs, search-and-retrieval vendors will have to deliver systems that work, can be maintained without hefty consulting add-ons, and get the customer from point A to B.
I think search-and-retrieval as a separate software category is in danger of being commoditized. Lucene, for example, is a good enough solution. The hundreds of companies chasing a relatively modest pool of potential buyers are ripe for a shakeout and consolidation. Vendors may find themselves blocked by super platforms that bundle search and content processing with other, higher value enterprise applications.
Search-and-retrieval vendors may want to print out the power leveling illustration and tape it to their desks. Inspiration? Threat? You decide.
Stephen Arnold, February 20, 2008