Watson Speaks Naturally
September 3, 2015
While many companies offer accurate natural language comprehension software, fully understanding the complexities of human language still eludes computers. IBM reports that it is close to overcoming natural language barriers with IBM Watson Content Analytics, as described in the tutorial “Discover And Use Real-World Terminology With IBM Watson Content Analytics.”
The tutorial points out that any analytics program that relies only on structured data loses about four-fifths of the information, which is a big disadvantage in the big data era, especially when insights are supposed to be hidden in unstructured data. Watson Content Analytics is a search and analytics platform that uses rich-text analysis to find and extract actionable insights from new sources, such as email, social media, Web content, and databases.
Watson Content Analytics can be used in two ways:
- “Immediately use WCA analytics views to derive quick insights from sizeable collections of contents. These views often operate on facets. Facets are significant aspects of the documents that are derived from either metadata that is already structured (for example, date, author, tags) or from concepts that are extracted from textual content.
- Extracting entities or concepts, for use by WCA analytics view or other downstream solutions. Typical examples include mining physician or lab analysis reports to populate patient records, extracting named entities and relationships to feed investigation software, or defining a typology of sentiments that are expressed on social networks to improve statistical analysis of consumer behavior.”
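To make the idea of facets more concrete, here is a minimal Python sketch of how facet counts could be built from both structured metadata and concepts pulled out of text. It is not the WCA API and not how the tutorial implements it; the sample documents, the `TERM_TO_CONCEPT` dictionary, and the function names are all hypothetical and exist only for illustration.

```python
from collections import Counter

# Hypothetical sample documents: structured metadata plus free text.
DOCUMENTS = [
    {"author": "lee", "date": "2015-08-01",
     "text": "Patient reports chest pain and mild fever."},
    {"author": "kim", "date": "2015-08-02",
     "text": "No fever today; the headache persists."},
    {"author": "lee", "date": "2015-08-03",
     "text": "Headache resolved after treatment."},
]

# Hypothetical domain dictionary mapping surface terms to concept labels.
TERM_TO_CONCEPT = {
    "chest pain": "symptom/chest_pain",
    "fever": "symptom/fever",
    "headache": "symptom/headache",
}


def extract_concepts(text):
    """Return the sorted concept labels whose term occurs in the text.

    Uses naive case-insensitive substring matching, so negation
    ("no fever") and word boundaries are ignored.
    """
    lowered = text.lower()
    return sorted(
        {concept for term, concept in TERM_TO_CONCEPT.items() if term in lowered}
    )


def facet_counts(documents):
    """Count documents per facet value.

    The "author" facet comes from existing metadata; the "concept"
    facet comes from terms found in the document text.
    """
    counts = {"author": Counter(), "concept": Counter()}
    for doc in documents:
        counts["author"][doc["author"]] += 1
        for concept in extract_concepts(doc["text"]):
            counts["concept"][concept] += 1
    return counts


if __name__ == "__main__":
    for facet, values in facet_counts(DOCUMENTS).items():
        print(facet, dict(values))
```

The simple substring lookup counts "No fever" as a fever mention, which is the kind of gap that real-world, domain-specific terminology work is meant to close.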
The tutorial walks through a domain-specific terminology application for Watson Content Analytics. The walkthrough gets very intensive, but it shows how Watson Content Analytics may go beyond the typical big data application.