Meta Algorithm Can Now Grok Like The Best Of Us
February 10, 2022
AI algorithms are far from heralding the robot revolution, but each day brings us closer to more intuitive technology. For example, The Register reports that “Meta Trains Data2vec Neural Network To Grok Speech, Images, Text So It ‘Can Understand The World.’”
Researchers at Meta, formerly Facebook, say they developed a new algorithm, dubbed data2vec, that can decipher speech, classify objects, check grammar, and even perform accurate sentiment analysis. Are you grokking this? The news brief reads like a description from a Heinlein science-fiction novel! But Meta is pushing the boundaries of AI science, and data2vec is certainly not fiction.
Instead of training data2vec on one type of data, Meta trained it on three: images, text, and speech. While data2vec handles all three types, it processes them separately. Data2vec is a transformer-based neural network designed to learn patterns through self-supervision in audio, NLP, and computer vision:
“The model learns to operate with different types of data by learning how to predict how the representation of data it’s given; it knows it has to guess the next group of pixels when given an image, or the next speech utterance in audio, or fill in the words in a sentence.”
Data2vec differs from other AI algorithms in how it processes each type of data separately:
“ ‘We train separate models for each modality but the process through which the models learn is identical,’ Alexei Baevski, a research engineer at Meta AI told The Register. ‘We hope that it will enable future work to build high performing self-supervised models that combine modalities and are more effective than specialized models. Different modalities can add additional information to the same piece of content – for example body language from video, prosodic information from audio, and text can combine into a richer representation of a dialog. The algorithms that currently try to combine multi-modal information exist but they do not yet perform well enough to replace specialized algorithms and we hope our work will help change that.’”
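For the curious, the "separate models, identical process" idea can be sketched in a few lines of Python. This is only an illustrative toy, not Meta's code: the function names `mask_sequence` and `training_example` and the sample data are hypothetical, and the sketch hides raw items, whereas data2vec is trained to predict learned representations of the input. The point is simply that one hide-and-guess routine can prepare training examples for text, image patches, or audio frames alike.

```python
import random


def mask_sequence(items, mask_ratio=0.3, mask_token=None, seed=0):
    """Hide a random subset of items; return the masked copy and the hidden positions."""
    rng = random.Random(seed)
    n_masked = max(1, int(len(items) * mask_ratio))
    positions = sorted(rng.sample(range(len(items)), n_masked))
    masked = list(items)
    for p in positions:
        masked[p] = mask_token
    return masked, positions


def training_example(items, **kwargs):
    """Build (masked input, targets) where targets maps hidden position -> original item."""
    masked, positions = mask_sequence(items, **kwargs)
    targets = {p: items[p] for p in positions}
    return masked, targets


# Hypothetical stand-ins for the three kinds of data mentioned in the article.
text = ["meta", "trains", "a", "neural", "network"]
image_patches = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]
audio_frames = [0.01, -0.02, 0.03, 0.00, 0.05, -0.01]

# The same routine is applied to each kind of data; a separate model would be
# trained on each stream of (masked input, targets) pairs.
for name, data in [("text", text), ("image", image_patches), ("audio", audio_frames)]:
    masked, targets = training_example(data)
    print(name, masked, targets)
```

Each model would then be asked to guess what sits behind the hidden positions, which is the fill-in-the-blank style of learning the quotes describe.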
Rather than reuse existing multi-modal designs, Meta AI approached the problem differently. To give data2vec its grokking ability, Meta AI broke down the process and simplified how the algorithm learns. The concept is similar to breaking a LEGO construction down into its individual bricks, then rebuilding it with knowledge of how and why each brick works in its specific place.
Meta AI is closer to making AI capable of human, even Martian, grokking. That is grokking unbelievable.
Whitney Grace, February 10, 2022