Inside Microsoft Search
May 7, 2009
The Register ran an intriguing article by Cade Metz called “Microsoft’s New Search – Built on Open-Source.” The article stated:
In July of last year, Microsoft acquired Powerset, a San Francisco startup intent on bringing natural language processing to web search. And like the original Hotmail, the startup’s semantic search engine leans heavily on open source code.
Mr. Metz asserted that:
Powerset generates its search index via Hadoop, the same open-source distributed computing platform that juices Yahoo!’s search engine. Based on Google’s MapReduce distributed computing platform and GFS file system, Hadoop was originally developed by open-source maven Doug Cutting, now on the Yahoo! payroll. But it was Powerset that originated Hadoop’s HBase project, an effort to mimic Google’s famous distributed storage system, BigTable.
You will want to read the original story to get the full analysis. I want to highlight a scintillating sentence: “And it’s [Hadoop] the bastard child of the Google Chocolate Factory.”
My thoughts swirled when I read this write-up. I recalled hearing that open source had been used in the Fast Search & Transfer system too. I don’t know what to think about this article. Quite a challenging story.
Stephen Arnold, June 8, 2009