Search: An Old Taxi with a Faux Cow Hide Interior
July 2, 2008
The last time I was in a big city I hailed a taxi. What a clunker. It smelled of fast food, incense, and hot plastic. One fender was dented and the curb side door would not open. The window would not go down. “She dead,” smiled the driver. The interior of the taxi had a set of blinking lights popular at holiday times. The taxi was a mess, but the faux cow interior was unusual, and the lights were working.
Thanks to ABC Australia for the photo. The original is here. http://www.abc.net.au/news/newsitems/200610/s1770336.htm
I have been clicking and scanning the opinions about the Microsoft Powerset deal. Scanning the links at Congoo.com, Megite.com, and Techmeme.com will take a long time. I have been a slacker, clicking at random and looking for some substantive news.
Why is search like a lousy taxi with a useless faux cow hide interior?
My thought for this evening is that search is string matching. The other functions are ways to:
- Make it easier for a busy person who does not have time or the desire to read a traditional document; that is, a multi page report.
- Show the user what is available and push the user toward that information. The user, who doesn’t want to make this effort, will let the software do the work.
- Support a user who is not too swift when it comes to thinking about abstract digital data.
- Reduce the time a user spends fumbling for information.
- Put training wheels on a worker who forgets work processes the way I forget where I put my automobile keys five minutes ago.
What’s happening is that key word search, string matching, and its kissing cousin Boolean are the lousy taxi. Good enough but not too pleasant.
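To make the “lousy taxi” concrete, here is a minimal sketch of key word search with Boolean AND and OR. It is my own illustration, not any vendor’s implementation. The function names (tokenize, build_index, search_and, search_or) and the three sample documents are made up for the example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase a string and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to the set of document ids whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search_and(index, query):
    """Boolean AND: ids of documents that contain every query term."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


def search_or(index, query):
    """Boolean OR: ids of documents that contain at least one query term."""
    result = set()
    for term in tokenize(query):
        result |= index.get(term, set())
    return result


# Hypothetical sample documents, invented for this illustration.
docs = {
    1: "Old taxi with a dented fender",
    2: "Enterprise search plan for a large company",
    3: "Taxi driver explains enterprise search",
}
index = build_index(docs)

print(search_and(index, "taxi search"))  # documents with both terms
print(search_or(index, "taxi search"))   # documents with either term
print(search_and(index, "taxis"))        # exact matching: the plural finds nothing
```

The last query is the point. Plain string matching knows nothing about plurals, synonyms, or intent, which is the gap the enhancements listed next try to cover.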
The cow interior for search consists of these types of enhancements:
- Assisted navigation, a fancy term for Use For and See also references
- Clustering, putting like things together in a folder or under a heading
- Discovery, an interface that provides an overview of information
- Semantic search, a system that figures out what you mean when you type a two word query
- Natural language processing, a term that now means answering a question, assuming that someone takes the time to think up a question and type it into a search box
- Dashboards, a report that has panels or containers, each containing different information. Some dashboards look like speedometers with text; others can be quite fanciful.
- Access to metadata about what person in an organization gets the most email about a specific technical issue. This type of monitoring and analysis is now called social search because surveillance is not politically correct in many circles.
You get the idea.
Possible impacts
Let’s consider the consequences.
First, enterprise search is complicated. Today I spoke with an enthusiastic and young professional. The call touched upon creating a plan for enterprise search. Like most organizations, this outfit has three separate enterprise search systems. None work all that well, so the phone rings. This is a common situation, and I am not too optimistic that enterprise search will work very well when there are competing factions each with a favorite search engine to support. Adding whizzy new functionality adds to the cost and complexity, and I am not convinced users want to do much more than find the needed information and move on to another task.
Autonomy PR Coup: The Financial Times Essay
July 1, 2008
I gave up on the hard copy of the Financial Times. The Pearson operations wizards could not do reliable daily delivery in rural Kentucky. Fortunately, a colleague in a more civilized part of the world sent me a copy of “Embracing the Friend, Taming the Beast–Web 2.0 in the Enterprise” by Sir Michael Lynch, Ph.D., Founder and Chief Executive Officer of Autonomy. You can try to find the article on the Financial Times’s Web site here. (Note: The FT is working to correct the search sins in its past, but it is a work in progress. You can also read the good summary by Oliver Marks, a ZDNet columnist, here.)
The thrust of Dr. Lynch’s essay is Autonomy, not Web 2.0. The idea is that a host of technologies–social networks, folksonomies, wikis, blogs, Web services–are finding their way into organizations. For me, the key point of Dr. Lynch’s essay was:
Next-generation solutions are available to help enterprises organize, manage and regulate user-generated content in a secure, consistent and scalable manner to ensure that employees benefit from instant access to relevant information and that brand integrity is properly protected. Such solutions bring conceptual understanding and an unprecedented level of automation to content management and address liabilities by continually reading entries, spotting problematic content and removing it in real-time. In addition, they can automatically reconcile tags that differ but are close in meaning, or actually provide the level of specificity needed in the enterprise that social methods struggle to deliver.
If you review other write ups about Dr. Lynch’s essay, you will find emphasis placed on the whizzy technologies enumerated above. My view is that the essay sets forth Autonomy’s value proposition for its enterprise software, IDOL or the Integrated Data Operating Layer.
The public relations coup is that this outstanding positioning piece appears about six weeks after the Cazenove report about Autonomy. This is not a public report, but I wrote a short note about the report and its assertions here. As I understood the analysis, Autonomy has to make its business model perform at peak efficiency and create an environment in which Autonomy’s acquisitions can generate strong growth.
The difference between my reading of this excellent essay and Mr. Marks’s interpretation boils down to the difficulty of pinpointing exactly which business some vendors focus their sales efforts on. To illustrate: Mr. Marks refers to Autonomy as a CMS vendor. “CMS” means to me “content management system”. I do not think of Autonomy as a primary provider of content management solutions. The company makes a strong case on its Web site and in some of the presentations I have witnessed that it is a leader in search with strong competency in video search, fraud detection, and eDiscovery. The blurring of what Autonomy “does” makes it difficult for some potential customers to know in which software category to place Autonomy.
My reading is informed by my knowledge of Autonomy’s search technology. The paragraph I highlighted above says to me, “Autonomy can contribute significantly to a world in which user-generated content exists with other types of information.”
So, I say, “PR coup.” I anticipate similar “essays” from Endeca, Microsoft Fast, and possibly Oracle. Allowing Autonomy to define the terms for “search” cedes some market influence to Autonomy. The editor of the Financial Times will be invited to some interesting lunches as vendors and their public relations professionals lobby to place another “essay” in front of Financial Times’s readers.
Stephen Arnold, July 1, 2008
Noetix Search
July 1, 2008
Noetix, based in Redmond, has announced its Noetix Search system. The system supports Oracle and PeopleSoft. The idea is that Noetix generates reports, not laundry lists of results, when querying Oracle tables and PeopleSoft information stores. The company says:
Noetix Search is a web-based search application that enables Noetix users to quickly identify [sic] the right Noetix views for their reports via a familiar, intuitive user interface. Using their favorite browser (including Internet Explorer, Firefox, and Safari), users enter a search term and are able to browse logical classifications of the most accurate results available for their specific data request.
Noetix Search takes a template-based approach to reports. The idea is that users can sort and filter, explore related information, and search across view names, view column names, table names, and table column names. Noetix Search works as a cloud-based service and integrates with third-party query tools, including Microsoft Excel. The full text of the announcement of Noetix Search is here.
Noetix has landed customers eager to make Oracle data more accessible. Customers include Cummins, Motorola, Starbucks, Toshiba, Welch’s, and Visa, among others. The company was founded in 2000 and has funding from Polaris Venture Partners and Sigma Partners.
My thought is that the word “search” may not be the appropriate one to describe this particular application. You can obtain more information about the company at its Web site http://www.noetix.com.
Stephen Arnold, July 1, 2008
Business Week: Microsoft’s Search Moves Analyzed
July 1, 2008
Catherine Holahan’s “Microsoft’s Plan B for Search” popped into my news reader this morning. The interesting essay–almost a business school-type write up–appears on the Business Week Web site here. I think the story will appear in some form in the hard copy magazine, but I read the online version this morning, July 1, 2008.
Ms. Holahan looks at the alleged Powerset buy out by Microsoft. The “Plan B” is acquiring additional search technologies in the aftermath of Redmond’s failed Yahoo deal. Her analysis is closely reasoned, so it is difficult for me to summarize the argument.
I did find one point of particular interest; that is:
Rather than focus on creating one consumer-facing site capable of answering any query, like Google has, Microsoft has split its search engine into specific categories—a comparison-shopping engine, Microsoft Live Cashback; a travel search engine, Farecast; and a health-specific search engine, health.live.com. Today, semantic search engines do best with such category-specific searches, which help them to scan a smaller set of pages in detail. Scanning the entire Web in that much detail is difficult to do quickly.
Business Week has done a good job of explaining that Microsoft has a more fractionalized approach to search than Google. Keep in mind, however, that Google is not a single piece of digital cloth. There are different search mechanisms in operation at Google; specifically, the search system used for Google Base differs from the search system used for the search box on Google.com. The Google Search Appliance is also moving in its own direction.
In general, I applaud Ms. Holahan for identifying the different initiatives within Microsoft. She has also identified two other interesting “semantic engines”. The first is Hakia, a company that offers a “Compare Hakia” function here and the Berggi Search for mobile devices other than the BlackBerry or iPhone, which limits the market somewhat. Hakia is working to generate the type of buzz that Powerset’s team found so effortless. She also mentions Expert System, based in Modena, Italy, and founded in 1989. The firm has beefed up its US presence with a new president and a more focused public relations campaign. You can learn more about Expert System here. Expert System has gained some traction for its software components in the mobile search market and has a lower profile in North America than in Italy.
Observations
- The buzz about semantic search is gaining pitch and volume. My view is that semantic search is not an end in itself; it is a component of a search system. Vendors of semantic search are likely to find warmer welcomes as utilities or refinement functions within larger constellations of information retrieval methods. I guess I don’t buy the notion of “semantic search”.
- The key difference between Google and Microsoft boils down to the fact that Google has been working on its infrastructure for a decade. Without a honking big super computer, semantic technology is tough to implement on [a] large amounts of content and [b] content that changes frequently. The well known problem of updating indexes becomes quite challenging.
- Fragmented search is not necessarily a bad thing. But when there are many different search systems, costs become a problem quickly. Each system requires its own technology, engineers, and infrastructure. Google–while not homogeneous–avoided the “pushcart full of junk” approach taken by Yahoo. Microsoft, with its purchase of Fast Search & Transfer, may be unconsciously following the Yahoo model. Google’s approach of greater, not less, search homogeneity is the lower cost path. I was surprised Business Week’s B-school analysis of Microsoft’s Plan B ignored cost as a factor. Cost is a very big deal in search, which is the reason search vendors crash and burn. There’s no money to buy fuel.
Agree? Disagree? Use the comments section of this Web log to inform me of my intellectual shortcomings.
Stephen Arnold, July 1, 2008
Microsoft Data Centers: Spending Billions
July 1, 2008
GigaOM has another scoop. You can learn about Microsoft’s next-generation data centers in this exclusive video or read a summary of the main points. Both items are available on the GigaOM.com Web site here. The information is quite dense, and I won’t try to summarize it. Navigate to GigaOM and watch the video. The key point for me is this statement:
Microsoft is taking the design of servers into its own hands. “We are doing some unique things in the mother board designs, server designs, and because we are Microsoft, operating systems.”
The quote in the quote is from Microsoft’s corporate VP of global foundation services, Debra Chrapaty. I have a diagram of Microsoft’s data center design circa 2006. If I can find it, I will post it and offer some observations. My research conducted in 2007 for a financial institution indicated that the “old data centers” posed some challenges for Microsoft; namely:
- Caching was needed to make Web search fast. The need for expensive edge services contributed to Microsoft’s purchase of Savvis.
- Microsoft was using a range of techniques to move data among data centers including some peer to peer technology and a bit of Linux.
- SQL Server access was a bit of a bottleneck, so a mechanism was set up to minimize direct access to SQL Server data tables.
The new changes may make Microsoft more competitive. But whatever Microsoft does, the company has to leapfrog Google. The Sergey and Larry team has been working on data centers and infrastructure for a decade. Time may be running out for Microsoft to bound over the still growing Googzilla.
Stephen Arnold, July 1, 2008
IBM Search: Circling Back
July 1, 2008
I learned that an engineer named Michael Moran worked on IBM’s public facing search system for many years. You can read about this person’s contributions here. (Click this link quickly. The Yahoo news disappears faster than Yahooligans resign.) Mr. Moran has left IBM to join Converseon, a social media company. I hope there was no connection between my critique of IBM’s Web search system, Planetwide, and his departure. But it is pretty terrible. Because Mr. Moran will not join Converseon until September 2008, he has time to tweak Planetwide and IBM’s e-commerce sub system as well.
To be fair to Big Blue, I dived back into the Web site. This time I focused on buying something. Providing an e-commerce function seems a reasonable expectation. Plus I own NetFinity 5500 servers, and I sometimes need parts.
Let’s take a look. You can look at these tiny WordPress processed screen shots or navigate to the e-commerce splash page and run this sample query.
Finding the Store Front
On the www.ibm.com splash page is a tab labeled “Shop For”. So far, so good. I click the tab and the drop down bar displays my choices.
I decide to shop for a workstation. Years ago I owned a ZPro workstation, and it was a workhorse. The case fell apart, but the guts kept on ticking for years.
Here’s the page for workstations. Remember. I want to buy something.
Instead of an Amazon- or eBay-like listing, I see a picture of a workstation. Okay, I click on the smaller workstations. The system shows me more text. Here is the product information for the Unix workstations that I wanted to buy, but I am now getting frustrated. Where are the products? Dell Computer in its darkest days with its sluggish e-commerce search system does better than this. Amazon, despite the baloney promoting the Kindle and showing me crazy recommendations, lets me get to products. Not IBM. The pages look alike. In fact, I am not sure that the display has changed. I like consistency, but I also like to see products.
I wade through the text in the center column under the picture and I click on IBM Intellistation POWER 265 Express. I get this screen:
More choices and more text. I scroll to the bottom of the page and I get a list of features. I am convinced. I scroll back to the top of the page where the “Browse and Buy” button is. I click it, and finally I get some bite-sized information and a price in red no less.