Another Way to Inject Ads into Semi-Relevant Content?
May 25, 2021
It looks like better search is just around the corner. Again. MIT Technology Review proclaims, “Language Models Like GPT-3 Could Herald a New Type of Search Engine.” Google’s PageRank has reigned over online search for more than two decades. Even today’s AI search tech complements that system, helping to rank results or better interpret queries. Now Googley researchers suggest a way to replace the ranking system altogether with an AI language model. This new technology would serve up direct answers to user queries instead of supplying a list of sources. Writer Will Douglas Heaven explains:
“The problem is that even the best search engines today still respond with a list of documents that include the information asked for, not with the information itself. Search engines are also not good at responding to queries that require answers drawn from multiple sources. It’s as if you asked your doctor for advice and received a list of articles to read instead of a straight answer. Metzler and his colleagues are interested in a search engine that behaves like a human expert. It should produce answers in natural language, synthesized from more than one document, and back up its answers with references to supporting evidence, as Wikipedia articles aim to do. Large language models get us part of the way there. Trained on most of the web and hundreds of books, GPT-3 draws information from multiple sources to answer questions in natural language. The problem is that it does not keep track of those sources and cannot provide evidence for its answers. There’s no way to tell if GPT-3 is parroting trustworthy information or disinformation—or simply spewing nonsense of its own making.”
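The gap Heaven describes, a ranked list of documents versus one synthesized answer, is easy to see in miniature. Here is a toy sketch of the two paradigms; nothing in it comes from the Google paper, and the data, function names, and `generate` stand-in are all hypothetical:

```python
# Toy contrast between the two search paradigms. Illustrative only;
# nothing here comes from the Google paper.

from collections import defaultdict

# --- Today: an inverted index returns a ranked list of documents ---
documents = {
    "doc1": "pagerank scores pages by their link structure",
    "doc2": "language models generate answers in natural language",
}

index = defaultdict(set)
for doc_id, text in documents.items():
    for term in text.split():
        index[term].add(doc_id)

def classic_search(query: str) -> list[str]:
    """Return document IDs ranked by how many query terms they match."""
    hits = defaultdict(int)
    for term in query.lower().split():
        for doc_id in index.get(term, ()):
            hits[doc_id] += 1
    return sorted(hits, key=hits.get, reverse=True)

# --- Proposed: a language model synthesizes one direct answer ---
def model_search(query: str, generate) -> str:
    """`generate` stands in for a large language model. It returns prose,
    not a list of links, which is exactly the traceability problem."""
    return generate(f"Answer the question directly: {query}")

print(classic_search("language models"))  # a list of sources: ['doc2']
```

The second function can answer questions the index cannot, but notice that it hands back no list of sources. That is the problem the researchers want to solve.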
The next step, then, is to train the AI to keep track of its sources when it formulates answers. We are told no models are yet able to do this, but it should be possible to develop that capability. The researchers also note that the thorny problem of AI bias will have to be addressed before this approach can be viable. Furthermore, as search expert Ziqi Zhang at the University of Sheffield points out, technical and specialist topics often stump language models because there is far less relevant text on which to train them. His example: there is much more data online about e-commerce than about quantum mechanics.
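What might keeping track of sources look like in practice? Here is a hedged sketch, assuming a retrieve-then-generate design with a hypothetical `generate` stand-in for the language model; it is our illustration, not anything from the researchers’ paper:

```python
# Hedged sketch of "keeping track of sources": retrieve passages first,
# then generate an answer that carries the passage IDs along as citations.
# The `generate` callable stands in for a language model; as noted above,
# no current model does this step reliably.

from dataclasses import dataclass

@dataclass
class Passage:
    doc_id: str
    text: str

def retrieve(query: str, corpus: list[Passage], k: int = 3) -> list[Passage]:
    """Crude lexical retrieval: rank passages by query-term overlap."""
    terms = set(query.lower().split())
    return sorted(
        corpus,
        key=lambda p: len(terms & set(p.text.lower().split())),
        reverse=True,
    )[:k]

def answer_with_citations(query: str, corpus: list[Passage], generate) -> dict:
    """Condition the model on retrieved passages and keep their IDs."""
    passages = retrieve(query, corpus)
    context = "\n".join(f"[{p.doc_id}] {p.text}" for p in passages)
    answer = generate(f"Using only these sources:\n{context}\n\nQuestion: {query}")
    # The provenance travels with the answer, Wikipedia-style.
    return {"answer": answer, "sources": [p.doc_id for p in passages]}
```

The hard part, of course, is the generate step: making the model actually confine itself to, and faithfully cite, the retrieved passages rather than whatever it absorbed in training.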
Then there are the physical limitations. Natural-language researcher Hanna Hajishirzi at the University of Washington warns that the shift to such large language models would gobble up vast amounts of memory and computational resources. For this reason, she believes a language model will not be able to supplant indexing. Which researchers are correct? We will find out eventually. That is okay; we are used to getting ever less relevant search results.
Cynthia Murrell, May 25, 2021