The Future of Search? It’s Here and Disappointing

August 13, 2008

AltSearchEngines.com–an excellent Web log and information retrieval news source–tapped the addled goose (me, Stephen E. Arnold) for some thoughts about the future of search. I’m no wizard, being a befuddled fowl, but I did offer several hundred words on the subject. I even contributed one of my semi-famous “layers” diagrams. These are important because each layer represents a slathering of computational crunching. The result is an incremental boost to the underlying search system’s precision, recall, interface outputs, and overall utility. The downside is that as the layers pile up, so does complexity and its girlfriend, costs. You can read the full essay and look at the diagram here. A happy quack to the AltSearchEngines.com team for [a] asking me to contribute an essay and [b] having the moxie to publish the goose feathers I generate. The message in my essay is no laughing matter. The future of search is here, and in many ways it is deeply disappointing and increasingly troubling to me. For an example, click here.

Stephen Arnold, August 13, 2008

Comments

5 Responses to “The Future of Search? It’s Here and Disappointing”

  1. Charlie Hull on August 13th, 2008 10:56 am

    Interesting article, Stephen. However, I think there’s a more positive spin to be put on some of these developments: the availability of powerful, scalable open-source search technology will open up search to customers who previously couldn’t afford it. With even small enterprises now producing millions of documents, there is even more of a need for enterprise search, but these small enterprises don’t want (or even need) all the bells and whistles that Autonomy or Endeca promise. They simply want their employees to be able to find stuff, and to have some control of what the engine is actually doing.

  2. Dr. Kathleen Dahlgren on August 13th, 2008 1:27 pm

    It’s a good article, but you say that advances in search are just “layering of functions on top of basic key word indexing”. That is true of Google’s popularity and many statistically-based semantic search technologies.

    However, one alternative method does not use the pattern or key word as the basis. That is a linguistic semantic approach in which the string is interpreted for meaning, one term at a time, so that the index is not of strings, but of word meanings (or concepts).

    Take the string “strike”. It has 22 meanings in English, such as “hit or beat”, “discover”, “labor walkout”, “occur to someone”, “state of the game of baseball”, “ignite”, and so on. Pattern-matchers save the string “strike” in the index. A linguistic semantic search engine first determines the meaning of “strike” in context and then saves that meaning in the index. So “strike on the head” is interpreted as “strike1” meaning “hit or beat” and “head6” meaning a part of the body. On the other hand, “the workers went out on strike” is interpreted as “worker1” meaning “laborer”, “go20” meaning “walk out” and “strike5” meaning “labor walkout”.

    In searching, meanings in the query are matched to meanings in the document base, dramatically improving precision. Recall is improved while retaining precision, because synonyms of just the desired meanings of a term can be found. For example, if the query is “Did Fred strike Harry on the head”, a document with “Fred beat Harry on noggin” is returned, but not “Fred struck the head of match for Harry”, because strike and head don’t have the same meaning in the document as in the query.
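
    As a rough illustration of the sense-indexing idea described above, here is a minimal Python sketch. It is not Cognition's implementation. The sense labels strike1, strike5, and head6 come from the examples above; the labels strike_ignite and head_tip, the cue words, the tiny lemma and stopword lists, and the cue-overlap disambiguation rule are all assumptions made up for this toy.

```python
import re
from collections import defaultdict

# All tables below are hypothetical toy data, not a real lexicon.
STOPWORDS = {"did", "on", "the", "of", "for"}
LEMMAS = {"struck": "strike"}

# word -> candidate senses, each with context cue words (assumed).
SENSES = {
    "strike": [("strike1", {"on"}),
               ("strike5", {"workers", "went"}),
               ("strike_ignite", {"match"})],
    "beat": [("strike1", set())],      # synonym of the "hit or beat" sense
    "head": [("head6", {"on"}),
             ("head_tip", {"match"})],
    "noggin": [("head6", set())],      # synonym of the body-part sense
}


def lemmas(text):
    """Lowercase, tokenize on letters, and map inflected forms to lemmas."""
    return [LEMMAS.get(t, t) for t in re.findall(r"[a-z]+", text.lower())]


def concepts(text):
    """Return the set of meanings (not strings) found in the text."""
    words = lemmas(text)
    found = set()
    for i, word in enumerate(words):
        if word not in SENSES:
            if word not in STOPWORDS:
                found.add(word)        # unknown words are kept as-is
            continue
        context = set(words[:i] + words[i + 1:])
        # Pick the sense whose cues overlap the context most;
        # ties fall back to the first sense listed.
        best = max(SENSES[word], key=lambda s: len(s[1] & context))
        found.add(best[0])
    return found


def build_index(docs):
    """Map each concept to the ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for concept in concepts(text):
            index[concept].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every concept in the query."""
    postings = [index.get(c, set()) for c in concepts(query)]
    return set.intersection(*postings) if postings else set()


DOCS = {
    "a": "Fred beat Harry on noggin",
    "b": "Fred struck the head of match for Harry",
}
INDEX = build_index(DOCS)
RESULT = search(INDEX, "Did Fred strike Harry on the head")
```

    In this toy setup, RESULT contains only document "a": "beat" and "noggin" resolve to the same meanings as the query's "strike" and "head", while in document "b" the cue word "match" pushes both words to different senses. A plain keyword match on lemmas would do the opposite, returning "b" and missing "a".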

    You also mention Powerset as an example of a technology that builds upon key word technology. If Powerset is disambiguating words as described above, it should not be classified as using a key word approach.

    Another source of precision in linguistic semantics is the interpretation of phrases. “Bill of Rights” is interpreted as a fixed and frequent phrase, so that a document with “Bill has a lesion on his right leg” is not returned in response to a query about the “Bill of Rights”.

    You can see all of this at work on the demo sites at http://www.cognition.com.

  3. Stephen E. Arnold on August 13th, 2008 3:56 pm

    C. Hull,

    I agree that I can be more positive. However, in my line of work I see train wrecks more than I see pots of gold. Feel free to amend my comments either in this Web log or on the AltSearchEngines.com Web log. My heart attack and skin cancer have nibbled into my optimistic bank balance in life’s check book.

    Stephen Arnold, August 13, 2008

  4. Stephen E. Arnold on August 13th, 2008 3:58 pm

    Dr. Dahlgren,

    Thanks for writing. I try to take a position and see what shaking the apple tree yields. Feel free to expand on your views in the comments section of this Web log.

    Stephen Arnold, August 13, 2008

  5. The Future of Search? It’s Here and Disappointing | Easycoded on August 14th, 2008 8:46 am

    […] ScottGu AltSearchEngines.com–an excellent Web log and information retrieval news source–tapped the addled goose (me, Stephen E. Arnold) for some thoughts about the future of search. I’m no wizard, being an a befuddled flow, but I did offer several hundred words on the subject. I even contributed one of my semi-famous “layers” … […]
