BLEURT Blabs NLG Well

June 18, 2020

 

Humans want their technological offspring to sound like them. AI and machine learning have advanced to the point where computers can carry simple conversations, but they are far from fluent. Natural language generation has grown in recent years, and its performance is measured by both humans and automatic metrics. Neowin discusses Google’s BLEURT, an automatic metric for natural language models, in “BLEURT Is A Metric For Natural Language Generators Boasting Unprecedented Accuracy.”

BLEURT works like some other smart software:

“At the heart of BLEURT is machine learning. And for any machine learning model, perhaps the most important commodity is the data it trains on. However, the training data for an NLG performance metric is limited. Indeed, the WMT Metrics Task dataset, which is currently the largest collection of human ratings, contains approximately 260,000 human ratings apropos the news domain only. If it were to be used as the sole training dataset, the WMT Metrics Task dataset would lead to a loss of generality and robustness of the trained model.”

BLEURT’s research team employed transfer learning to improve upon the WMT Metrics Task dataset and also used a novel pre-training scheme to bolster its robustness and accuracy. The metric underwent two training phases: language modeling followed by evaluating NLG models. BLEURT scored the highest among similar technologies. BLEURT’s goal is to improve Google’s language abilities.

Whitney Grace, June 18, 2020

