Is the AskJeeves Approach the Next Big Thing Again?

March 14, 2024

This essay is the work of a dumb dinobaby. No smart software required.

Way back when I worked in Silicon Valley, or Plastic Fantastic as one 1980s wag put it, AskJeeves burst upon the Web search scene. The idea was that a user would ask a question and the helpful digital butler would fetch the answer. This worked for questions like “What’s the temperature in San Mateo?” The system did not work for the types of queries with which my little group of full-time equivalents peppered assorted online services.


A young wizard confronts a knowledge problem. Thanks, MSFT Copilot. How’s that security today? Okay, I understand. Good enough.

The mechanism involved software and people. The software processed the query and matched it to an answer stored in a knowledge base. The humans wrote rules. If there was no rule and no knowledge, the butler fell on his nose. It was the digital equivalent of nifty marketing applied to a system about as aware as the manservant in Kazuo Ishiguro’s The Remains of the Day.
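For the code-minded, here is a minimal sketch of that rules-plus-knowledge-base idea. It is not how AskJeeves was actually built; the rule pattern, the `ask_butler` function, and the temperature value are all invented for illustration.

```python
import re

# Hypothetical rules written by humans: a question pattern mapped to a topic.
RULES = [
    (
        re.compile(r"what'?s the temperature in (?P<place>[\w\s]+)\??$", re.IGNORECASE),
        "temperature",
    ),
]

# Hypothetical knowledge base. The value is made up for the example.
KNOWLEDGE_BASE = {
    ("temperature", "san mateo"): "62 F",
}


def ask_butler(question):
    """Return an answer only when a rule AND a knowledge-base entry exist."""
    # Normalize curly apostrophes so the pattern sees plain ASCII.
    normalized = question.strip().replace("\u2019", "'")
    for pattern, topic in RULES:
        match = pattern.match(normalized)
        if match:
            place = match.group("place").strip().lower()
            # A rule matched, but the knowledge may still be missing.
            return KNOWLEDGE_BASE.get((topic, place))
    # No rule at all: the butler falls on his nose.
    return None
```

A question the rule writers anticipated gets an answer. Anything else, including a known question about an unknown place, returns `None`.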

I thought about AskJeeves as a tangent notion as I worked through “LLMs Are Not Enough… Why Chatbots Need Knowledge Representation.” The essay is an exploration of options intended to reduce the computational cost, power sucking, and blind spots in large language models. Progress is being made and will be made. A good example is this passage from the essay which sparked my thinking about representing knowledge. This is a direct quote:

In theory, there’s a much better way to answer these kinds of questions.

  1. Use an LLM to extract knowledge about any topics we think a user might be interested in (food, geography, etc.) and store it in a database, knowledge graph, or some other kind of knowledge representation. This is still slow and expensive, but it only needs to be done once rather than every time someone wants to ask a question.
  2. When someone asks a question, convert it into a database SQL query (or in the case of a knowledge graph, a graph query). This doesn’t necessarily need a big expensive LLM, a smaller one should do fine.
  3. Run the user’s query against the database to get the results. There are already efficient algorithms for this, no LLM required.
  4. Optionally, have an LLM present the results to the user in a nice understandable way.
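The four quoted steps can be sketched in a few lines of Python. This is a toy under loud assumptions, not the essay author's implementation: the `facts` table, the function names, and the hand-typed rows are hypothetical, and a keyword match stands in for both the LLM extraction pass and the smaller query-translation model.

```python
import sqlite3


def build_knowledge_store():
    """Step 1: store extracted knowledge once.

    The rows are hand-typed placeholders for what an LLM extraction
    pass might produce.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE facts (subject TEXT, attribute TEXT, value TEXT)")
    conn.executemany(
        "INSERT INTO facts VALUES (?, ?, ?)",
        [("France", "capital", "Paris"), ("Japan", "capital", "Tokyo")],
    )
    return conn


def question_to_query(question):
    """Step 2: convert a question into a parameterized SQL query.

    A keyword match stands in for the small model the quote mentions.
    """
    words = question.lower().rstrip("?").split()
    if "capital" in words and "of" in words:
        subject = " ".join(words[words.index("of") + 1:])
        sql = "SELECT value FROM facts WHERE lower(subject) = ? AND attribute = ?"
        return sql, (subject, "capital")
    return None


def answer(conn, question):
    translated = question_to_query(question)
    if translated is None:
        return "No query could be built for that question."
    sql, params = translated
    # Step 3: run the query against the database, no LLM required.
    row = conn.execute(sql, params).fetchone()
    if row is None:
        return "The knowledge store has no answer."
    # Step 4: present the result; a plain template stands in for an LLM.
    return f"The answer is {row[0]}."
```

Calling `answer(build_knowledge_store(), "What is the capital of France?")` walks all four steps. Ask about a country that was never extracted, or phrase a question the translator does not recognize, and the sketch fails in exactly the two ways the butler did.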

Like AskJeeves, the idea is a good one. Create a system to take a user’s query and match it to information answering the question. The AskJeeves approach embodied what I called rules. As long as one has the rules, the answers can be plugged into a database. A query arrives, the system looks for the answer, and presents it. Bingo. Happy user with relevant information catapults AskJeeves to the top of a world filled with less elegant solutions.

The question becomes, “Can one represent knowledge in such a way that the information is current, usable, and ‘accurate’ (assuming one can define accurate)?” Knowledge, however, is a slippery fish. Small, well-defined domains chock full of specialist knowledge should be easier to represent. Well, IBM Watson and its adventure in Houston suggest that the method is okay on paper, but it did not work. Larger-scale systems like an old-fashioned Web search engine just used “signals” to produce lists which presumably presented answers. “Clever,” correct? (Sorry, that’s an IBM Almaden bit of humor. I apologize for the inside baseball moment.)

What’s interesting is that youthful laborers in the world of information retrieval are finding themselves arm wrestling with some tough but elusive problems. What is knowledge? The answer, “It depends,” does not provide much help. Where does knowledge originate? The answer, “No one knows for sure,” does not advance the ball downfield either.

Grabbing epistemology by the shoulders and shaking it until an answer comes forth is a tough job. What’s interesting is that those working with large language models are finding themselves caught in a room of mirrors intact and broken. Here’s what TheTemples.org has to say about this imaginary idea:

The myth represented in this Hall tells of the divinity that enters the world of forms fragmenting itself, like a mirror, into countless pieces. Each piece keeps its peculiarity of reflecting the absolute, although it cannot contain the whole any longer.

I have no doubt that a startup with venture funding will solve this problem even though a set cannot contain itself. Get coding now.

Stephen E Arnold, March 14, 2024
