An Oath from the Past: Yahoo Web Scale Semantic Search

October 9, 2020

I spotted a link to “Yahoo: Web Scale Semantic Search.” You remember Yahoo, don’t you? This is the outfit with the data breaches, the clueless business model, and the sale to the Baby Bell Verizon. The executives, too, are memorable: Marissa, Alex, Terry, and the Peanut Butter memo man.

The link displayed a presentation by Edgar Meij, a laborer in Yahoo Labs. The topic was an X-ray view from Mt. Olympus intended to reveal Web scale semantic search.

The slide deck requires 62 clicks to traverse. There are many riches in the presentation. I want to highlight three of these, and invite you to make your own determination of these insights.

First, there is a “text” accompanying the deck. It contains a riot of jargon and buzzwords. In fact, I have saved the text, despite a portion being truncated, as a glossary of Web search jive talk; for example “s a sequence of terms s 2 s drawn from the set S, s ? Multinomial(?s) e a set of entities e 2 e.” (I knew you would experience the same thrill I did when I read this line.) True to Slideshare’s attention to detail, the text for slides 32 to 62 has been removed. Great loss indeed.

Second, Yahoo cares about knowledge. Consider this diagram:

[Image: diagram from the Yahoo deck]

The idea is that there are three steps: knowledge acquisition (I assume this means scraping and indexing Web site content), knowledge integration (creating a big index), and knowledge consumption (maybe finding something when a user or system sends a query to the search subsystem). The key point is that “knowledge” is important. How about that? Yahoo search was focusing on knowledge? Is that why Yahoo floundered in search for many, many years before bowing to failure?

Third, Yahoo’s approach to semantic search requires humans. Here’s proof:

[Image: slide from the Yahoo deck]

When Yahoo announced Vin Diesel was dead, he was alive. So much for smart software.

Why am I mentioning this blast from the past?

Knowledge came up in my interview with Dr. Stavros Macrakis. We tackled the difference between Web search and enterprise search. This Yahoo deck illustrates that talk about knowledge is one thing. Delivering useful results to a user is quite another.

Jargon in search and retrieval has made more progress than search technology itself. That’s why the Yahoo deck could have been crafted yesterday by one of the search vendors still chasing a huge market in the era of Lucene/Solr and “good enough” information access.

Stephen E Arnold, October 9, 2020
