Google Nails Patent for Query Synonyms in Query Context
December 24, 2009
If you want to know how smart Steven Baker is, you won’t find the information in the Google index. His patent 7,409,383 is also a slippery fish. Where did he go? I have an email for him, a blog post about search quality, and some odd references to programming. Not much online as of December 22, 2009 at 9 pm Eastern. In fact, I have noticed in my addled goose way that some Google wizards are easy to find; for example, Jeff Dean. Others, like Steven Baker, are tough to find. Steven Baker and John Lamping (also a wizard) invented the system and method disclosed in “Determining Query Term Synonyms with Query Context.” This type of process is significant, and at Google’s scale, the invention is quite interesting. The crystal-clear prose of Google’s full-time and rental legal eagles says:
A method is applied to search terms for determining synonyms or other replacement terms used in an information retrieval system. User queries are first sorted by user identity and session. For each user query, a plurality of pseudo-queries is determined, each pseudo-query derived from a user query by replacing a phrase of the user query with a token. For each phrase, at least one candidate synonym is determined. The candidate synonym is a term that was used within a user query in place of the phrase, and in the context of a pseudo-query. The strength or quality of candidate synonyms is evaluated. Validated synonyms may be either suggested to the user or automatically added to user search strings.
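For readers who like to see the gears turn, here is a rough Python sketch of the steps in that abstract. It is a toy built from the abstract alone, not Google's implementation. The placeholder token `<*>`, the function names, the two-word phrase limit, and the "seen in at least N sessions" strength test are all my assumptions; the abstract says only that strength or quality is evaluated, not how.

```python
from collections import Counter, defaultdict

TOKEN = "<*>"  # hypothetical placeholder; the abstract does not name the token


def pseudo_queries(query, max_len=2):
    """Yield (pseudo_query, phrase) pairs, replacing each phrase with TOKEN."""
    words = query.lower().split()
    for n in range(1, max_len + 1):
        if n >= len(words):
            break  # a pseudo-query must keep at least one word of context
        for i in range(len(words) - n + 1):
            phrase = " ".join(words[i:i + n])
            pseudo = " ".join(words[:i] + [TOKEN] + words[i + n:])
            yield pseudo, phrase


def candidate_synonyms(log):
    """Count, per (phrase, candidate) pair, the sessions in which both
    phrases filled the same pseudo-query.

    log is an iterable of (user_id, session_id, query) tuples.
    """
    # Step 1: sort queries by user identity and session.
    sessions = defaultdict(list)
    for user, session, query in log:
        sessions[(user, session)].append(query)

    counts = Counter()
    for queries in sessions.values():
        # Step 2: map each pseudo-query to the phrases seen in its slot.
        contexts = defaultdict(set)
        for q in queries:
            for pseudo, phrase in pseudo_queries(q):
                contexts[pseudo].add(phrase)
        # Step 3: phrases sharing a context are candidates for each other.
        pairs = set()
        for phrases in contexts.values():
            for a in phrases:
                for b in phrases:
                    if a != b:
                        pairs.add((a, b))
        for pair in pairs:
            counts[pair] += 1  # one vote per session
    return counts


def validated(counts, min_sessions=2):
    """Keep pairs seen in at least min_sessions sessions (assumed strength test)."""
    return sorted(pair for pair, n in counts.items() if n >= min_sessions)
```

Feed `candidate_synonyms` a list of `(user_id, session_id, query)` tuples and pass the result to `validated`. The threshold is the crude part; a production system would presumably weigh far more evidence before suggesting a synonym or rewriting a search string.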
You can breeze over to the USPTO and download this open source document. I recommend checking out the cross references to other Google patents, the method of organizing user queries over time, the numerical recipes disclosed, and the 19 claims. Another piece of the semantic puzzle nailed, in my opinion. This invention at Google scale is darned nifty.
Stephen E. Arnold, December 23, 2009
Oyez, oyez, I wish to disclose that I was not paid to highlight this patent document nor to point out that Google engineer Steve Baker has become a tough lad to whom to link. I wonder why. Do you? He has had some interesting computing pals. Think Jon Kleinberg of Clever fame. Maybe I should write the Bureau of Missing Googlers?