Google Continues to Improve Voice Search
November 5, 2015
Google’s research arm continues to make progress on voice search. The Google Research Blog brings us up to date in “Google Voice Search: Faster and More Accurate.” The Google Speech Team begins by referring back to 2012, when they announced their Deep Neural Network approach. They have since built on that concept; the team now employs acoustic models built upon recurrent neural networks and trained with connectionist temporal classification (CTC) and sequence discriminative training techniques, a combination they report is both faster and more accurate. The write-up goes into detail about how speech recognizers work and what makes their latest iteration the best yet. I found the technical explanation fascinating, but it is too lengthy to describe here; please see the post for those details.
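For readers who want a taste of the idea without the full write-up, here is a minimal sketch of the label-collapsing step commonly associated with connectionist temporal classification. It is not Google's code: the function name `ctc_collapse`, the `_` blank symbol, and the toy frame sequence are all assumptions made for illustration, and letters stand in for phonemes. A model trained this way emits one label per audio frame, including a special "blank"; the final sequence is read off by merging consecutive repeats and then dropping the blanks.

```python
from itertools import groupby

BLANK = "_"  # hypothetical blank symbol; real systems use a reserved label index


def ctc_collapse(frame_labels, blank=BLANK):
    """Collapse per-frame labels into an output sequence.

    Step 1: merge runs of identical consecutive labels.
    Step 2: drop the blank symbol.
    """
    merged = [label for label, _ in groupby(frame_labels)]
    return [label for label in merged if label != blank]


# Toy per-frame output (letters standing in for phonemes).
frames = ["_", "h", "h", "_", "e", "_", "l", "l", "_", "l", "o", "o", "_"]
print("".join(ctc_collapse(frames)))  # prints "hello"
```

Note that the blank between the two runs of "l" is what lets a genuinely doubled label survive the merge step.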
I am still struck whenever an article mentions that an algorithm has taken the initiative. This time, researchers had to rein in their model’s insightful decision:
“We now had a faster and more accurate acoustic model and were excited to launch it on real voice traffic. However, we had to solve another problem – the model was delaying its phoneme predictions by about 300 milliseconds: it had just learned it could make better predictions by listening further ahead in the speech signal! This was smart, but it would mean extra latency for our users, which was not acceptable. We solved this problem by training the model to output phoneme predictions much closer to the ground-truth timing of the speech.”
At least the AI will take direction. The post concludes:
“We are happy to announce that our new acoustic models are now used for voice searches and commands in the Google app (on Android and iOS), and for dictation on Android devices. In addition to requiring much lower computational resources, the new models are more accurate, robust to noise, and faster to respond to voice search queries – so give it a try, and happy (voice) searching!”
We always knew natural-language communication with machines would present huge challenges, ones many said could never be overcome. It seems such naysayers were mistaken.
Cynthia Murrell, November 5, 2015
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph