Entity Extraction with Solr

July 11, 2013

Entity extraction is a feature that many enterprise users want to build into their architecture. Solr 4 includes features that allow a workaround, or “poor man’s,” entity extraction. Erik Hatcher, one of the founders of LucidWorks, explains how in his SearchHub blog entry, “Poor Man’s ‘Entity’ Extraction with Solr.”

The instructions begin:

“Entity extraction, as defined on Wikipedia, ‘seeks to locate and classify atomic elements in text into predefined categories such as the names of persons, organizations, locations, expressions of times, quantities, monetary values, percentages, etc.’ When drilling down into the specifics of the requirements from our customers, it turns out that many of them have straightforward solutions using built-in (Solr 4.x) components, such as: Acronyms as facets; Key words or phrases, from a fixed list, as facets; Lat/long mentions as geospatial points.”
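To make the three cases in the quote concrete, here is a minimal sketch in plain Python. It does not reproduce the Solr configuration from Hatcher's post, which relies on built-in Solr 4.x components at index time. It only illustrates the kind of values each case would produce for a document. The function name `extract_entities`, the `KEYWORDS` list, and both regular expressions are assumptions made for illustration.

```python
import re

# Hypothetical fixed keyword list; a real deployment would supply its own.
KEYWORDS = ["entity extraction", "faceted search", "geospatial"]

# Assumption: an acronym is a standalone run of two or more capital letters.
ACRONYM_RE = re.compile(r"\b[A-Z]{2,}\b")

# Assumption: lat/long mentions look like "38.8951,-77.0364" (decimal degrees).
LATLON_RE = re.compile(r"(?<![\d.])(-?\d{1,2}\.\d+)\s*,\s*(-?\d{1,3}\.\d+)")


def extract_entities(text, keywords=KEYWORDS):
    """Return acronyms, fixed-list keywords, and lat/long points found in text."""
    # Acronyms: unique, sorted, so they could serve as facet values.
    acronyms = sorted(set(ACRONYM_RE.findall(text)))

    # Keywords: case-insensitive substring match against the fixed list.
    lowered = text.lower()
    found_keywords = [k for k in keywords if k in lowered]

    # Lat/long: keep only pairs that fall inside valid coordinate ranges.
    points = []
    for lat, lon in LATLON_RE.findall(text):
        lat_f, lon_f = float(lat), float(lon)
        if -90 <= lat_f <= 90 and -180 <= lon_f <= 180:
            points.append(f"{lat_f},{lon_f}")

    return {"acronyms": acronyms, "keywords": found_keywords, "points": points}


if __name__ == "__main__":
    sample = "The CIA and NASA met at 38.8951,-77.0364 to discuss entity extraction."
    print(extract_entities(sample))
```

In a Solr setup, values like these would typically land in separate fields so that acronyms and keywords can be faceted on and points can be queried spatially. The sketch keeps them in a dictionary only to show the shape of the result.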

SearchHub is one of many channels through which LucidWorks provides support and training to Apache Lucene/Solr developers as well as LucidWorks customers. LucidWorks users find that both the LucidWorks Big Data and LucidWorks Search solutions work out of the box yet allow the kind of customization and scalability that Hatcher demonstrates above.

Emily Rae Aldridge, July 11, 2013

Sponsored by ArnoldIT.com, developer of Beyond Search

