Microsoft and Its Entity Cube

December 30, 2009

Entity extraction has been around for a long, long time. Microsoft Research has revivified the discipline with its EntityCube. Here’s the description of EntityCube in the Microsoft EntityCube team’s words:

EntityCube is a research prototype for exploring object-level search technologies, which automatically summarizes the Web for entities (such as people, locations and organizations) with a modest web presence. The Chinese-language version is called Renlifang. The need for collecting and understanding Web information about a real-world entity (such as a person or a product) is mostly collated manually through search engines. However, information about a single entity might appear in thousands of Web pages. Even if a search engine could find all the relevant Web pages about an entity, the user would need to sift through all these pages to get a complete view of the entity. EntityCube generates summaries of Web entities from billions of public Web pages that contain information about people, locations, and organizations, and allows for exploration of their relationships. For example, users can use EntityCube to find an automatically generated biography page and social-network graph for a person, and use it to discover a relationship path between two people.

Microsoft Research points out that this is a test, eschewing the Google term “beta”. There are some issues, which include entity extraction, name disambiguation, entity ranking, and relationship extraction, among others. Softpedia’s “Introducing Microsoft’s EntityCube” is a useful overview.

I ran the query “dataspace” on Microsoft Academic, which has some EntityCube features enabled. I got a results list and, as shown in the screenshot below, a list of entities on the left side of the results screen. I reviewed the hits, and the ranking was somewhat unexpected. Experts whom I expected to appear toward the top of the results list were buried.

[Screenshot: EntityCube entity list in Microsoft Academic results]

There was a more general purpose version of the system available at http://entitycube.research.microsoft.com/, but I did not want to install Silverlight, the bane of Major League Baseball, on this underpowered netbook. If you want quotes and more bells and whistles you can walk down this path.

According to Geek in Disguise, EntityCube offers some interesting features when you search for information about a person. Here’s what Geek in Disguise said, “Specifically, EntityCube automatically generates:

  • A biography page for a person.
  • A social-network graph for a person.
  • A shortest-relationship path between two people.
  • All titles of a person that are found on the Web.”
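The “shortest-relationship path” item is the one feature in that list with an obvious textbook analogue. None of the sources quoted here says how EntityCube computes it, so what follows is only a minimal sketch of one standard approach, breadth-first search over a graph of people, and not Microsoft's method. The function name `shortest_relationship_path`, the `toy_graph` data, and the people in it are all made up for illustration.

```python
from collections import deque


def shortest_relationship_path(graph, start, goal):
    """Return the shortest chain of people linking start to goal, or None.

    graph maps each person to the people they are directly related to.
    This is plain breadth-first search; it is an assumption for
    illustration, not EntityCube's documented algorithm.
    """
    if start == goal:
        return [start]
    previous = {start: None}  # person -> who we reached them from
    queue = deque([start])
    while queue:
        person = queue.popleft()
        for neighbor in graph.get(person, ()):
            if neighbor in previous:
                continue  # already reached by a path at least as short
            previous[neighbor] = person
            if neighbor == goal:
                # Walk the back-pointers to rebuild the chain.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None  # no chain of relationships connects the two


# Hypothetical relationship graph; the names are invented.
toy_graph = {
    "Alice": ["Bob", "Carol"],
    "Bob": ["Alice", "Dave"],
    "Carol": ["Alice", "Dave"],
    "Dave": ["Bob", "Carol", "Erin"],
    "Erin": ["Dave"],
    "Frank": [],
}

print(shortest_relationship_path(toy_graph, "Alice", "Erin"))
```

Breadth-first search visits people in order of distance from the starting person, so the first chain that reaches the target is a shortest one. A production system working from billions of Web pages would first have to solve the harder problems of extracting and disambiguating the entities that make up the graph.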

On10.net reported that EntityCube:

builds a dynamic Wikipedia page for the entity or person you search for. The types of information you’ll find include biographies, a social-network graph, relationships between people (mouse over the link to see how they are connected), and titles of people.

I checked my files and found a note to myself that a similar technology was in use to pluck product names from content for Microsoft’s Live.com product catalog service. You can compare the Microsoft technology with that of Cluuz.com. I find Cluuz.com’s approach more useful for my research. The set of content for Cluuz.com comes from the Yahoo.com index. Results from the Microsoft demo seem sparse. Cluuz.com seems to have addressed the problems identified with the Microsoft service.

Stephen E. Arnold, December 30, 2009

A freebie. If anyone were working at a watchdog agency in Washington, I would report this sad state of money-free writing.
