Adaptive Search

June 9, 2008

Technology Review, a publication affiliated with the Massachusetts Institute of Technology, has an important essay by Erica Naone about adaptive computing. Her story, “Adapting Websites [sic] to Users,” provides a useful rundown of high-profile sites that change what’s displayed for a particular user based on what actions the user takes on a Web page. I found the screen shots of a prototype British Telecom service particularly useful. When a large, fuzzy telecommunications company embraces autonomous computing on a Web site, I know a technology has arrived. Telcos enforce rigorous testing of even trivial technology to make certain an errant chunk of code won’t kill the core system.

For me, the most interesting point in the article is a quotation Ms. Naone attributes to John Hauser, a professor at MIT’s business school; to wit:

Suddenly, you’re finding the website [sic] is easy to navigate, more comfortable, and it gives you the information you need. The user, he says, shouldn’t even realize that the website [sic] is personalized.

User Confusion?

I recall my consternation when one of the versions of Microsoft software displayed reduced menus based on my behaviors. The first time I encountered this change in appearance, I was confused. Then I rooted around in the guts of the system to turn off the adaptive function. I have a visual memory that allows me to recall locations, procedures, and methods using that eidetic ability. Once I see something and then it changes, it throws off a wide range of automatic mental processes. In college, I recall seeing an updated version of an economics book, and I could pinpoint which charts had been changed, and I found one with an error almost 20 years after taking the course.

This is a schematic I prepared of a simplified autonomous computing process. Note that the core system represented by the circle receives inputs from external and internal processes and sources. The functions in the circular area are, therefore, able to adapt to information about different environmental factors.

Adaptive displays, for me, are a problem. If you want to sell products or shape information for those without this eidetic flaw, adaptive Web pages are for you.

As I thought about the implications of this on-the-fly personalization, I opened a white paper sent to me by a person whom I met via the comments section of my Web log “Beyond Search.”

Microsoft Active in the Field Too

The essay is “What Is Autonomous Search?”, and it is a product of Microsoft’s research unit. The authors are Youssef Hamadi, Eric Monfroy, and Frédéric Saubion. Each author has an academic affiliation, and I will let you download the paper and sort out its provenance. You can locate the paper here.

In a nutshell, the paper makes it clear that Microsoft wants to use autonomous techniques to make certain types of search smarter. The idea is a deeper application of algorithms and methods that morph a Web page to suit a user’s behaviors. Applied to search, autonomous functions monitor information, machine processes, and user behaviors via log files. When something significant changes, the system modifies a threshold or setting in order to respond to a change.
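The monitor-and-adjust loop described above can be sketched in a few lines. This is only an illustration of the general idea, not the method in the Microsoft paper: the class name AdaptiveThreshold, its parameters, and the rule "move the setting by a fixed step when the rolling average drifts past a tolerance" are all assumptions made for the example.

```python
from collections import deque


class AdaptiveThreshold:
    """Hypothetical controller: nudges one setting when a monitored signal shifts."""

    def __init__(self, threshold=0.5, window=5, tolerance=0.2, step=0.05):
        self.threshold = threshold          # the setting the system may change
        self.tolerance = tolerance          # how big a shift counts as significant
        self.step = step                    # size of each adjustment
        self.recent = deque(maxlen=window)  # rolling window of log-derived values
        self.baseline = None                # average of the first full window

    def observe(self, value):
        """Record one measurement from the logs; adapt if the signal drifted."""
        self.recent.append(value)
        if len(self.recent) < self.recent.maxlen:
            return self.threshold           # not enough data yet
        average = sum(self.recent) / len(self.recent)
        if self.baseline is None:
            self.baseline = average         # first full window sets the baseline
            return self.threshold
        change = average - self.baseline
        if abs(change) > self.tolerance:
            # Significant change: move the setting and adopt the new baseline.
            self.threshold += self.step if change > 0 else -self.step
            self.baseline = average
        return self.threshold


controller = AdaptiveThreshold(window=3)
for clicks_per_query in [1.0, 1.0, 1.0, 2.0]:
    current = controller.observe(clicks_per_query)
```

A real system would watch many signals at once and would need guard rails on how far a setting can move, which is where the human oversight discussed later in this post comes in.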

The system automatically makes inferences. A simple example might be a surge in information and clicks on a soccer player; for example, Gomez. The system would note this name and automatically note that Gomez was associated with the German Euro 2008 team. Relevance is automatically adjusted. Other uses of the system range from determining what to cache to what relationships can be inferred about users in a geographic region.
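To make the Gomez example concrete, here is a toy sketch of surge detection followed by a relevance boost. The function names surging_terms and boosted_score, the add-one smoothing, and the factor and weight values are assumptions chosen for illustration; the paper does not prescribe them, and the query data is invented.

```python
from collections import Counter


def surging_terms(previous_queries, current_queries, factor=3.0, min_count=3):
    """Return terms whose count in the current window jumped by at least `factor`."""
    before = Counter(previous_queries)
    now = Counter(current_queries)
    surging = {}
    for term, count in now.items():
        if count < min_count:
            continue  # ignore terms too rare to judge
        # Add-one smoothing so a previously unseen term does not divide by zero.
        ratio = count / (before.get(term, 0) + 1)
        if ratio >= factor:
            surging[term] = ratio
    return surging


def boosted_score(base_score, document_terms, surging, weight=0.1):
    """Raise a document's relevance score for each surging term it mentions."""
    boost = sum(surging[term] for term in document_terms if term in surging)
    return base_score * (1.0 + weight * boost)


# Invented query logs: "gomez" jumps from one query to eight.
last_week = ["gomez", "ballack", "ballack"]
this_week = ["gomez"] * 8 + ["ballack"] * 2
hot = surging_terms(last_week, this_week)
score = boosted_score(1.0, {"gomez", "germany", "euro"}, hot)
```

Linking Gomez to the German Euro 2008 team would take a further step, such as noting which terms co-occur with the surging name, which this sketch leaves out.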

Google: Automatic with a Human Wrangler Riding Herd

Not surprisingly, Google has a keen interest in autonomous functions. What is interesting is that in the short essay I wrote about Peter Norvig’s conversation with Anand Rajaraman here, Dr. Norvig–now on a Google leave of absence–emphasized Google’s view of automated functions. As I understand what Mr. Rajaraman wrote, Google wants to use autonomous techniques, but Google wants to keep some of its engineers’ hands on the controls. Some autonomous systems can run off the tracks and produce garbage.

I can’t name the enterprise search systems with this flaw, but those search systems that emphasize automated processes that run after ingesting training sets are prone to this problem. The reason is that the thresholds determined by processing the training sets don’t apply to new information entering the system. A simple thought experiment reveals why this happens.

Assume you have a system designed to process information about skin cancer. You assemble a training set of skin cancer information. The search and retrieval system generates good results on test queries; for example, precision and recall scores in the 85 percent range. You turn the system loose on content that is now obtained from Web sites, professional publishers, and authors within an organization. The terminology differs from author to author. The system–anchored in a training set–cannot handle the diffusion of terms or even properly resolve new terms; for example, a new treatment methodology from a different research theater. Over time, the system works less and less well. Training autonomous systems is a tricky business, and it can be expensive.
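The thought experiment can be run as a toy. Assume, purely for illustration, that the "trained" system is nothing more than a vocabulary of terms seen in the training set and that a document is retrieved when it contains one of them. The functions, the vocabulary, and the documents below are all invented; real systems are far more elaborate, but the failure has the same shape.

```python
def precision_recall(retrieved, relevant):
    """Precision = hits / retrieved; recall = hits / relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def retrieve(documents, vocabulary):
    """Return ids of documents containing at least one term from the vocabulary."""
    return [doc_id for doc_id, text in documents.items()
            if vocabulary & set(text.lower().split())]


# Vocabulary frozen at training time (hypothetical).
trained_vocabulary = {"melanoma", "carcinoma"}

# Content that resembles the training set.
familiar = {
    1: "melanoma staging guide",
    2: "basal cell carcinoma treatment",
    3: "quarterly budget report",
}
# Later content: document 5 is relevant but uses terms the system never saw.
drifted = {
    4: "melanoma screening schedule",
    5: "checkpoint inhibitor therapy for skin tumours",
    6: "cafeteria menu",
}

familiar_scores = precision_recall(retrieve(familiar, trained_vocabulary), {1, 2})
drifted_scores = precision_recall(retrieve(drifted, trained_vocabulary), {4, 5})
```

On the familiar content both scores are perfect; on the drifted content recall falls by half because the relevant document written in new terminology is never retrieved. Nothing in the system signals the drop, which is why someone has to keep measuring.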

Google’s approach, therefore, bakes in an expensive human process to keep the “smart” algorithms from becoming dumber over time. The absent mindedness of an Albert Einstein is a quirk. A search system that becomes stupid is a major annoyance.

You can read more about Google’s approach to intelligent algorithms by sifting through the papers on the subject here. If you enjoy patent applications and view their turgid, opaque prose as a way to peek under Google’s kimono, I recommend that you download US2008/0022267. This invention by H. Bruce Johnson, Jr. and Joel Webber discloses how a smart system can handle certain programming chores at Google. The idea is that busy, bright Googlers shouldn’t have to do certain coding manually. An autonomous system can handle the job. The method involves the types of external “looks” and internal “inputs” that appear in the Microsoft paper by Hamadi, Monfroy, and Saubion.

Observations

I anticipate more public discussion of autonomous computing systems and methods in the near future. Because the technology is out of sight, it is out of mind. It does have some interesting implications for broader social computing issues as well as enterprise search; for example:

Control. Some users–specifically, me–want to control what they see. If there are automatic functions, I want to see the settings and have the ability to fiddle the dials. Denied that, I will spend considerable time and energy trying to get control of the system. If I can’t, then I will take steps to work around the automated decisions.
Unexpected costs. Fully automated systems can go off the rails. In the enterprise search arena, a licensee must be prepared to retrain an automatic system or assign an expensive human to ride herd on the automated functions. Most search vendors provide administrative interfaces to allow a subject matter expert to override or input a correction. Even Google in its new site search and revamped Google Mini allows a licensee to weight certain values such as time.
Suspicion of nefarious intent. When a system operates autonomously, how is a user to know that a particular adjustment has been made to “help” the user? Could the adjustment be made to exploit a psychological weakness of the user? Digital used car sales professionals could become popular citizens in the Internet community.
Ineffective regulation. Government officials may have a difficult time understanding autonomous systems and methods. As a result, the wizards of autonomous computing operate without any effective oversight.

The concern I have is that “big data” makes autonomous computing work reasonably well. It follows that the company with the “biggest data” and the ability to crunch those data will dominate. In effect, autonomous computing may set the stage for an enterprise that takes the best pieces of the US Steel, the Standard Oil, and J.P. Morgan models to build a new type of monopoly. Agree? Disagree? Use the comments section to let me know your thoughts.

Stephen Arnold, June 9, 2008

Written by Stephen E. Arnold · Filed Under Enterprise, News, Search

Comments

One Response to “Adaptive Search”

jordhy ledesma on December 4th, 2008 12:43 am

Autonomous search is the future of search technology. However, I respectfully disagree with the conclusion of this article. It will not be the companies with the largest datasets that have the advantage in autonomous search.

If that were the case, small and medium enterprises would be left behind in search technologies. However, early adopters and, in particular, effective early adopters of database, data mining and internet technologies were small companies with less data than their established competitors. This was particularly true for Google, Yahoo and now Facebook.

I’m firmly convinced that the key (by type of query and data) and the subsequent application of the most efficient search algorithm for that specific domain.


Stephen E. Arnold monitors search, content processing, text mining and related topics from his high-tech nerve center in rural Kentucky. He tries to winnow the goose feathers from the giblets. He works with colleagues worldwide to make this Web log useful to those who want to go "beyond search". Contact him at sa [at] arnoldit.com. His Web site with additional information about search is arnoldit.com.