Hakia to Accelerate Semantic Analysis of the Web
July 10, 2008
A somewhat bold headline hopped from my news reader screen this morning (July 10, 2008). A news release from Hakia, one of the players in the semantic search football match, told me: “Hakia Leverages Yahoo Search BOSS to Accelerate Its Semantic Analysis of the World Wide Web.” You can get a copy of this release from Farrah Hamid (farrah at hakia dot com). As of 8:50 am, the news release is not on the Hakia Web log nor is there a link to this Hakia announcement.
The key point in the news release is that Hakia is using Yahoo’s Build Your Own Search Service or BOSS. The idea is that Hakia will use Yahoo’s search infrastructure to “accelerate Hakia’s crawling of the Web to identify quality documents for semantic analysis using its advanced QDEX (Query Detection and Extraction) technology.” The “its” refers to Hakia’s patented technology, not Yahoo’s BOSS service.
Using Yahoo makes sense for two reasons. First, scaling to index Web content is expensive, a fact lost on many search mavens who don’t have a sense of the economics of content processing. Second, Yahoo’s BOSS makes it reasonably easy to tap into Yahoo’s plumbing. I wondered why other semantic search vendors have not looked at this type of hook up to better demonstrate the power of their systems. A couple of years ago, Siderean Software processed the Delicious.com content, and I found that a particularly good demo of the Siderean technology as well as a very useful resource. I have lost track of Siderean’s Delicious index, so I will need to do a bit of sleuthing later today.
Also, you can refresh your recollection of BOSS at http://www.developer.yahoo.com/boss. While you are at the Yahoo site, check out Yahoo’s own semantic search system, which left me a trifle disappointed. This system is shod with this url http://www.yr-bcn.es/demos/microsearch/. My write up about yr-bcn is here. One hopes the Hakia system raises the bar for Yahoo-based semantic efforts. It would be useful if Hakia put up a head-to-head comparison of its system with Yahoo’s. You can see the Hakia comparison with Google here.
The choice of the BOSS service is understandable. Yahoo these days seems pliable. Cutting a deal with Google is fuzzy, often depending on which Googler one tracks down via email or at a conference. In my opinion, Google has been playing hardball in the semantic space. I am starting to think Google has designs on jump starting the semantic search “revolution” and putting its own systems and methods in place. The semantic Web certainly has not taken off, so why not entertain the notion of Google as the Semantic Web? Makes sense to me.
Microsoft, fresh from its hunt for semantic technology, is a big outfit, so it is also difficult to find an “owner” of the service a company like Hakia wants to use. Microsoft can put a price tag on accessing its index, which one cheery Redmonian told me now contained 25 billion Web pages. I told the Redmonian, “My tests suggest that the index is in the 5 to 7 billion page range.” I was told that I was an addled goose. So, what’s new?
Yahoo–troubled outfit that it is–probably welcomes an opportunity to allow Hakia to get the portal some positive media coverage. But if I had been advising Hakia (which I am not), I would have suggested Hakia give Exalead in Paris, France, a jingle. Exalead’s Web index is fresh, contains eight billion or so Web pages, and its engineers are quite open to new ideas. Yandex also might have made my list of partners.
Check out the Hakia system at http://www.hakia.com. When I get additional information, I will try to update this post.
Stephen Arnold, July 10, 2008
Update: July 10, 2008, 10 am: My Hakia post is part of a larger fabric of Yahoo BOSS coverage. You will want to read “Yahoo Radically Opens Web Search with BOSS” in the July 9, 2008, TechCrunch. Mark Hendrickson’s coverage is a very good summary of the information on Yahoo’s Web site. He also takes a positive stance, noting “BOSS is the second concrete product to come out of Yahoo’s Open Strategy. The first was Search Monkey back in April.” I am not ready to even think about being positive. These types of announcements are coming when the firm is in disarray. Any announcement, therefore, may be moving deck chairs on the Titanic. I will take a more skeptical position and say, “Let’s see how this plays out.” Yahoo is in flux, and its own semantic search system, referenced in the essay above, is not too good.
Update 2, July 10, 2008, 10:10 am Eastern time: Hakia provided this information to me just a few moments ago.
- The news release is on the Hakia Web site at http://company.hakia.com/pr-070308.html. Don’t forget the dots. (How about an explicit link on the splash page, Hakia?)
- You can find other Hakia news releases at this location http://company.hakia.com/press.
- The “official” Yahoo release is here: This url is too crazy to reproduce.