Answering Questions: Holy Grail or Wholly Frustrating
July 2, 2008
The cat is out of the bag. Microsoft has acquired Powerset for $100 million. You can read the official announcement here. The most important part of the announcement to me was:
We know today that roughly a third of searches don’t get answered on the first search and first click…These problems exist because search engines today primarily match words in a search to words on a webpage [sic]. We can solve these problems by working to understand the intent behind each search and the concepts and meaning embedded in a webpage [sic]. Doing so, we can innovate in the quality of the search results, in the flexibility with which searchers can phrase their queries, and in the search user experience. We will use knowledge extracted from webpages [sic] to improve the result descriptions and provide new tools to help customers search better.
I agree. The problem is that delivering on these promises is akin to an archaeologist finding the Holy Grail. In my experience, delivering “answers” and “better results” can be wholly frustrating. Don’t believe me? Just take a look at what happened to AskJeeves.com or any of the other semantic / natural language search systems. Little of that doubt is evident in the dozens of posts about this topic on Techmeme.com this morning.
So, I’m going to offer a different view. I think the same problems will haunt Microsoft as it works to integrate Powerset technology into its various Live.com offerings.
Answering Questions: Circa 1996
In the mid-1990s, Ask Jeeves differentiated itself from the search leaders with its ability to answer questions. Well, some questions. The system worked for this query, which I dredged from my files:
What’s the weather in Chicago, Illinois?
At the time, the approach was billed as natural language processing. Google does not maintain comprehensive historical records in its public-facing index. But you can find some information about the original system here or in the Wikipedia entry here.
How did a start-up in the mid-1990s answer a user’s questions online? Computers were slow by today’s standards and expensive. Programming was time-consuming. There were no tools comparable to Python or Web services. Bandwidth was expensive, and modems chugged along south of 56 kilobits per second, eagerly slowing down over the course of a dial-up session.
I have no inside knowledge about AskJeeves.com’s technology, but over the years, I have pieced together some information that allows me to characterize how AskJeeves.com delivered NLP (natural language processing) magic.
Humans.
AskJeeves.com compiled a list of frequently asked questions. Humans wrote answers. Programmers put the data into database tables. Scripts parsed the user’s query and matched it to the answers in the tables. The real magic, from my point of view, was that AskJeeves.com kept the weather table updated, so when the system received my query “What is the weather in Chicago, Illinois?”, it pulled the data from the weather table and displayed an answer. The system also showed links to weather sites in case the canned answer was incorrect or not what the user wanted.
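Based on that characterization, a minimal sketch of the approach might look like the following. The pattern, the table contents, and the function name are my inventions for illustration, not AskJeeves.com’s actual code.

```python
# Hypothetical sketch of a canned-answer system in the AskJeeves.com
# style. The template, table, and fallback text are invented.
import re

# Humans author the template and answers; a script refreshes the weather
# table on a schedule so the canned answer stays current.
weather_table = {"chicago, illinois": "82F and partly cloudy"}

def canned_answer(query):
    q = query.lower().strip(" ?")
    match = re.match(r"what(?:'s| is) the weather in (.+)", q)
    if match:
        city = match.group(1)
        # Pull the current value from the human-maintained table; offer
        # links as a hedge when the table has no entry.
        return weather_table.get(city, "Try these weather sites instead.")
    return None  # no human-authored template matched this query

print(canned_answer("What is the weather in Chicago, Illinois?"))
# -> 82F and partly cloudy
```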
Over time, AskJeeves.com monitored what questions users asked and added these to the system.
What happened when the system received a query that could not be matched to a canned answer in a data table? The system picked the question closest to what the user asked and displayed that answer. So a question such as “What is the square of aleph zero plus N?” generated an answer along the lines of “The Cubs won the pennant in 1918” or something equally crazy.
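The forced fallback is easy to caricature. In this toy version, Python’s difflib stands in for whatever matcher the real system used, and the question list is invented:

```python
# Toy illustration of a forced "closest question" fallback. difflib is a
# stand-in for whatever matcher the real system used.
import difflib

known_questions = [
    "what is the weather in chicago, illinois",
    "who won the pennant in 1918",
    "what is the capital of france",
]

query = "what is the square of aleph zero plus n"
# cutoff=0.0 forces a "best" match even when nothing is remotely close,
# so the user gets the canned answer for an unrelated question.
best = difflib.get_close_matches(query, known_questions, n=1, cutoff=0.0)
print(best[0])
```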
AskJeeves.com discovered several facts about its approach to natural language processing:
- Humans were expensive. AskJeeves.com burned cash. The company tried to apply its canned question answering system to customer support and ended up part of the Barry Diller empire. Humans can answer questions, but paying humans to craft templates, create answer tables, and code the system was too expensive then and remains cash hungry today.
- Humans asked questions but did not really mean what they asked. Humans are perverse. A question like “What’s a good bar in San Francisco?” can go off the rails in many ways. For example, what type of bar does the user require? Biker, rock, blue collar? And which San Francisco? The Mission, the Sunset, or Powell Street? The problem with answering questions, then, is that humans often have a tough time formulating the right question.
- Information changes. The answer today may not be the answer tomorrow. A system, therefore, has to have some way of knowing what the “right” answer is at this moment. As it turns out, the notion of “real time” (that is, accurate information at this moment) is an interesting challenge. In terms of stock prices, the “now quote” costs money; the quote from yesterday’s closing bell is free. Not only is it tricky to keep the index fresh; having current information may impose additional costs, as the sketch after this list suggests.
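To make the “now quote” economics concrete, here is a toy cache with a freshness window. The TTL, the fee, and the data feed are all assumptions:

```python
# Toy model of the freshness tradeoff: a free, stale value versus a
# metered, current one. The TTL, fee, and feed below are invented.
import time

TTL_SECONDS = 15 * 60             # how stale the "free" answer may be
REALTIME_FEE = 0.01               # hypothetical cost of one "now quote"

cache = {"GOOG": (527.00, time.time() - 3600)}  # an hour-old price, free

def fetch_realtime(symbol):
    return 530.25  # placeholder for a metered market-data call

def get_quote(symbol, budget):
    price, fetched_at = cache.get(symbol, (None, 0.0))
    if price is not None and time.time() - fetched_at < TTL_SECONDS:
        return price, budget                 # fresh enough; costs nothing
    if budget >= REALTIME_FEE:               # pay for accuracy in the moment
        price = fetch_realtime(symbol)
        cache[symbol] = (price, time.time())
        return price, budget - REALTIME_FEE
    return price, budget                     # settle for the stale answer

print(get_quote("GOOG", budget=1.00))        # pays the fee: (530.25, 0.99)
```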
This mini-case sheds light on two challenges in natural language processing.
First, in order to make NLP (or any variant of semantic processing) “work,” programmers have to create systems and methods that allow software to think like a human, preferably a subject matter expert in tune with what other humans intend when asking a question. There are fancy terms to describe this bit of software magic. I’m going to skip blithely over disambiguation, synonym expansion, context, and other bits of search impedimenta. You can find some guidance in the glossary to my study Beyond Search, published by the Gilbane Group in April 2008. Solving this problem is akin to squaring a circle or using telekinesis to freshen your cup of coffee.
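To give one of those skipped terms a concrete shape, here is a toy synonym-expansion step. The synonym table is invented; production systems lean on large, domain-tuned knowledge bases:

```python
# Toy synonym expansion: one small piece of the machinery skipped above.
# The synonym table is invented for illustration.
SYNONYMS = {
    "bar": ["pub", "tavern", "saloon"],
    "good": ["decent", "recommended"],
}

def expand(query):
    clauses = []
    for term in query.lower().split():
        variants = [term] + SYNONYMS.get(term, [])
        clauses.append("(" + " OR ".join(variants) + ")")
    return " AND ".join(clauses)

print(expand("good bar"))
# -> (good OR decent OR recommended) AND (bar OR pub OR tavern OR saloon)
```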
Second, the nature of information means that some mechanism has to be devised to keep the answer-producing indexes up to date. Again, solving this particular problem is technically difficult. The costs associated with refreshing indexes and handling user questions have a built-in buoyancy. In short, the infrastructure to obtain information, index it, and perform query processing to answer questions is expensive to operate, maintain, and scale. The reason is that humans and systems keep creating more information, and more available information translates into more indexing to keep the indexes current. This is a perverse cost feedback loop that baffles some venture capitalists and clever MBAs.
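A back-of-the-envelope model shows why the loop surprises the spreadsheet crowd. Every figure below is an assumption:

```python
# Back-of-the-envelope model of the cost feedback loop: more content
# means more indexing to stay current. Every figure here is invented.
docs = 1_000_000           # documents indexed in year one
growth = 0.40              # assumed annual growth in content
cost_per_1k = 2.00         # dollars to fetch, parse, and index 1,000 docs

for year in range(1, 6):
    bill = docs / 1_000 * cost_per_1k
    print(f"year {year}: {docs:>12,.0f} docs, ${bill:>10,.0f} to stay fresh")
    docs *= 1 + growth
```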
Semantic Search 2008
In the dozen years since, computer scientists have made significant progress in making search and retrieval systems somewhat more intelligent. The poster child for search innovation is Google. I know this rankles the likes of Autonomy and Endeca, but let’s face it: Google has revenues that are orders of magnitude greater than either Autonomy’s or Endeca’s. Google has a business model that generates billions of dollars every 12 weeks. Google has wacky believers who are dragging a consumer-centric company into government agencies and Fortune 500 companies, while most search companies have sales people who push products into companies.
Google cheats. The company looks at what is popular and concludes, “If it is popular, then we can assume that when a person types the query ‘spears’ we can show this result and be right about 80 percent of the time.” Type the query “spears” on July 2, 2008, and this is what Google showed me:
The Google system displayed results in what the company calls its “universal search” form. This is a type of federated search that shows content from different indexes. In this example, I see pictures of the starlet, links to specific information organized in categories, and a hot link to the Wikipedia write-up. The system also shows me a link to information about a “spear” as a weapon in case the 80 percent answer is not what I want.
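A toy version of that popularity-first hedge might look like this. The senses, percentages, and index names are my assumptions, not Google’s internals:

```python
# Toy popularity-first disambiguation in the "universal search" spirit:
# lead with the sense most users want, keep a hedge link to the rest.
# The senses, percentages, and index names are invented.
SENSES = {
    "spears": [
        {"sense": "Britney Spears (person)", "share": 0.80,
         "indexes": ["images", "news", "wikipedia"]},
        {"sense": "spear (weapon)", "share": 0.20,
         "indexes": ["wikipedia"]},
    ],
}

def universal_results(query):
    senses = sorted(SENSES.get(query, []), key=lambda s: -s["share"])
    if not senses:
        return ["(fall back to plain keyword results)"]
    top, rest = senses[0], senses[1:]
    lines = [f"Primary: {top['sense']} via {', '.join(top['indexes'])}"]
    lines += [f"See also: {s['sense']}" for s in rest]  # the hedge link
    return lines

for line in universal_results("spears"):
    print(line)
```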
So, with a single word, I get information derived from traditional indexing plus Google’s fancy math. The answer is not just good enough; it is good enough to give Google 70 percent of the world’s Web search traffic. Enough said.
So now the problem becomes, “How can another company leapfrog Google?” The answer for Microsoft today is, “Buy Powerset.” My knowledge of Powerset is limited to a talk I heard by Ron Kaplan, one of the Powerset wizards; my use of the system to search Wikipedia (a pretty small set of content by today’s standards); and the amazing public relations that Powerset and its venture backers were able to generate. Powerset gets an average grade in search technology and an A plus in media relations.
The guts of Powerset combine technology licensed from the Xerox Palo Alto Research Center. Presumably, Powerset’s technology is better than the Inxight Software technology that emerged from the same facility in 1997. Inxight’s technology won some key government customers and went on to provide tools to other search systems. Before Business Objects bought Inxight, Inxight had matured its product line to include a “Discovery Server” and enterprise search. The Inxight technology worked well, but appropriate resources were required to keep the indexes fresh. Inxight, from my point of view, was a path breaker, but like many innovators in content processing, it hit a glass ceiling in terms of revenue.
Identified entities from an Inxight method.
Historically, most search and content processing companies don’t generate big revenues; that is, revenues north of $500 million. Right now, Google is the search leader because of its proven ability to smash through the glass ceiling that keeps most search and content processing companies’ revenues modest. In the enterprise sector, Autonomy is in the $300 million range. Google, without much effort, has pushed past $400 million when I add partner revenue, maps, and educational accounts to Google’s reported $188 million in search revenue. So, Google is the biggest enterprise search vendor at this time. This assertion evokes howls from other search vendors, but it is what it is. Other vendors need to generate more revenues and blow past Google. Not likely this year, I think.
Here is an older screen shot that shows the type of parsing a semantic system has to do in order to figure out what a document means. Keep in mind that the system has to parse the user’s query, disambiguate it, match the query against the system’s indexes, process the results, and display something that resembles an answer to the user. Just to remind you: these are machine processes that involve iterative methods, statistical guesstimates, and lookups in knowledge bases of some type. Humans perform these steps quickly. Software, even when parallelized and distributed, does not work as quickly or as well as humans at figuring out meaning and intent.
ClearForest example. Copyright Thomson Reuters, 2008.
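Here is a skeletal rendering of the pipeline just described. Every stage below is a stub standing in for an iterative, statistical process with knowledge base lookups:

```python
# Skeleton of the pipeline described above: parse the query, disambiguate,
# match against indexes, process results, display. Each stage is a stub.
def parse(query):
    return query.lower().split()              # real parsers build full trees

def disambiguate(terms):
    return [(t, "best-guess sense") for t in terms]  # statistical guesswork

def match(senses, index):
    return [doc for doc in index if any(t in doc for t, _ in senses)]

def rank(hits):
    return sorted(hits, key=len)              # placeholder scoring

def display(hits):
    for hit in hits:
        print("-", hit)

index = ["britney spears news", "spear fishing guide", "weather in chicago"]
display(rank(match(disambiguate(parse("Spears")), index)))
# -> - britney spears news
```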
To my knowledge, Powerset has not found a way to reduce the computational burden or the cost buoyancy of rich text processing. To say this another way: chip makers like Intel confront the laws of physics in the effort to make smaller traces on a die. Search companies face equally problematic laws of computation in the effort to figure out the meaning of information. Intel can’t beat the laws of physics without a major leapfrog innovation. Powerset and its new owner can’t beat the cost laws of search without a comparable innovation. With chunks of Powerset running on Amazon Web Services, Microsoft may find its massive investments in data centers too little and too late. To get the cycles needed to leverage Powerset’s technology quickly, Microsoft may have to outsource Powerset processing to Amazon. Just as Microsoft had to buy Savvis to keep some online costs under control, Microsoft may have to cut some deal with Amazon. In the long run, it might be cheaper to buy Amazon and use that infrastructure to close the gap with Google.
Everyone has to repeat after me, “Google is not standing still. Google is innovating and moving forward.” Microsoft has to run faster and jump over Google. Microsoft has to get far enough ahead to force Google to burn cash, lose market share, and make stupid mistakes.
After spending $1.23 billion on Fast Search & Transfer’s enterprise and Web search technology, Microsoft is spending one-tenth that amount on Powerset. Gentle reader, what is the cost of integrating these systems into the already complex, search-choked Microsoft? My view is that Microsoft does not know the cost of integration. More to the point, I don’t think Microsoft has worked out the cost of implementing rich text processing on the scale required to make much of a difference in the market Google controls. Microsoft has money. Hopefully it will have enough to rewrite the laws of rich text processing before Google does.
Observations
The Powerset acquisition warrants several observations:
- Until the technology is integrated, it is impossible to know if there is significant value in the deal for Microsoft.
- Technology from a company that in turn licenses technology from a third party is not likely to put Microsoft in a position to out-Google Google.
- Google will react with incremental semantic functions. You can see some of this capability if you go to the Google.com “ig” or individualized Google service and customize your “ig” page. You can also run the query “air schedule sfo lga” from Google.com and see another bit of Google magic. But there’s more, much more. I have mentioned Ramanathan Guha and dataspaces. Semantics are alive and well at Google, just not front and center like Powerset’s Wikipedia demo.
Agree? Disagree? Use the comments feature of this Web log to set me straight.
Stephen Arnold, July 2, 2008