AI Is Learning To Read

September 19, 2014

Machines can read in the sense that they have been programmed to process letters and numbers. They do not, however, comprehend what they are “reading” and cannot restate it for users. Google’s Research Blog takes up the problem in “Teaching Machines To Read Between The Lines (And A New Corpus With Entity Salience Annotations),” which describes how the search engine giant is using the New York Times Annotated Corpus to teach machines entity salience. Entity salience is a measure of how central an entity is to a document; a machine that can judge it is closer to comprehending what it is “reading,” locating the required information, and putting it to use. The New York Times Corpus is a large dataset with 1.8 million articles spanning twenty years. If a machine can learn salience from anything, it would be this collection.

Entity salience is determined by term ratios and complex search indexing, brought to you by Knowledge Graph. The machine reading the article records the salience indicator, byte offsets, entity index, each entity’s mention count as determined by a coreference system, and other information needed to digest the document.
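The kind of per-entity record described above could be modeled roughly as follows. This is a minimal sketch only: the class name, the field names, and the mention_ratio helper are illustrative assumptions, not the corpus's actual file format or Google's salience method. The ratio is just a crude mention-count baseline.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EntityAnnotation:
    """One entity record for a document (hypothetical field names)."""
    entity_index: int   # position of the entity in the document's entity list
    salient: bool       # salience indicator for this entity
    mention_count: int  # number of mentions, as found by a coreference system
    start_byte: int     # byte offset where the first mention starts
    end_byte: int       # byte offset where the first mention ends
    name: str           # surface text of the entity


def mention_ratio(entity: EntityAnnotation,
                  all_entities: List[EntityAnnotation]) -> float:
    """Share of all entity mentions in the document that belong to `entity`.

    A naive stand-in for a salience signal: entities mentioned more
    often are assumed to be more central to the document.
    """
    total = sum(e.mention_count for e in all_entities)
    return entity.mention_count / total if total else 0.0


# Made-up example data for a single document.
entities = [
    EntityAnnotation(0, True, 6, 0, 4, "WNBA"),
    EntityAnnotation(1, False, 2, 40, 51, "Los Angeles"),
]
ratios = {e.name: mention_ratio(e, entities) for e in entities}
```

Here "WNBA" accounts for six of the eight recorded mentions, so its ratio is higher than that of "Los Angeles". A real salience model would draw on far more than raw counts.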

The system does work better with proper nouns:

“Since our entity resolver works better for named entities like WNBA than for nominals like “coach” (this is the notoriously difficult word sense disambiguation problem, which we’ve previously touched on), the annotations are limited to names.”

On a similar note, the Team Leada blog lets people put questions to Google’s Director of Research, Peter Norvig. He was asked:

“What is one of the most-often overlooked things in machine learning that you wished more people would know about or would study more? What are some of the most interesting data science projects Google is working on?”

Norvig responded that there are many problems, depending on the project you are working on, and that Google is doing a lot of data science projects, but he named nothing specific.

Machine learning and machine reading are both being worked on. In short, machines are going to school.

Whitney Grace, September 19, 2014
Sponsored by ArnoldIT.com, developer of Augmentext
