Believe It or Not: The First AI to Surpass Humans
May 12, 2021
Say what you will about other aspects of Google, but you have to hand it to the company’s research arm. Researchers at Google Brain, DeepMind, and the University of Toronto have developed the reinforcement-learning AI DreamerV2. According to Analytics India Magazine, “Now DeepMind’s New AI Agent Outperforms Humans” as measured by the Atari benchmark. The tech evolved from last year’s Dreamer agent, created by the same team. It uses a world model, an approach that is more adept at forming generalizations than traditional trial-and-error machine learning processes. World models have not been as accurate as many other algorithms, however. Until now. Reporter Ambika Choudhury writes:
“Dreamer learns a world model from the past experience and efficiently learns far-sighted behaviors in its latent space by backpropagating value estimates back through imagined trajectories. DreamerV2 is the successor of the Dreamer agent. … This new agent works by learning a world model and uses it to train actor-critic behaviors purely from predicted trajectories. It is built upon the Recurrent State-Space Model (RSSM) — a latent dynamics model with both deterministic and stochastic components — allowing to predict a variety of possible futures as needed for robust planning, while remembering information over many time steps. The RSSM uses a Gated Recurrent Unit (GRU) to compute the deterministic recurrent states. DreamerV2 introduced two new techniques to RSSM. According to the researchers, these two techniques lead to a substantially more accurate world model for learning successful policies: [a] The first technique is to represent each image with multiple categorical variables instead of the Gaussian variables used by world models; [b] The second new technique is KL balancing. This technique lets the predictions move faster toward the representations than vice versa.”
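For readers who want a feel for what "KL balancing" means, here is a rough sketch in plain Python. It is not the team's code. It assumes a single categorical variable, hand-derived gradients in place of an autodiff framework with stop-gradients, and hypothetical names throughout (`balanced_step`, `alpha`, `lr`). The weight `alpha = 0.8` is an assumed value chosen for illustration. The idea shown: both distributions are nudged to shrink the KL divergence between them, but the prediction (the prior) takes the larger share of the step.

```python
import math

def softmax(logits):
    """Turn raw logits into a categorical probability vector."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl(q, p):
    """KL(q || p) for two categorical distributions over the same classes."""
    return sum(qi * (math.log(qi) - math.log(pi)) for qi, pi in zip(q, p))

def balanced_step(post_logits, prior_logits, alpha=0.8, lr=0.5):
    """One gradient-descent step on KL(posterior || prior).

    The prior (the model's prediction) is updated with weight alpha and the
    posterior (the representation) with weight 1 - alpha. All names and
    default values here are illustrative assumptions.
    """
    q = softmax(post_logits)   # representation
    p = softmax(prior_logits)  # prediction
    div = kl(q, p)
    # d KL / d prior_logits[j] = p_j - q_j
    grad_prior = [pi - qi for pi, qi in zip(p, q)]
    # d KL / d post_logits[j] = q_j * (log(q_j / p_j) - KL)
    grad_post = [qi * ((math.log(qi) - math.log(pi)) - div)
                 for qi, pi in zip(q, p)]
    new_prior = [z - lr * alpha * g for z, g in zip(prior_logits, grad_prior)]
    new_post = [w - lr * (1 - alpha) * g for w, g in zip(post_logits, grad_post)]
    return new_post, new_prior

post = [2.0, 0.0, -1.0]   # hypothetical representation logits
prior = [0.0, 0.0, 0.0]   # hypothetical prediction logits (uniform)
before = kl(softmax(post), softmax(prior))
new_post, new_prior = balanced_step(post, prior)
after = kl(softmax(new_post), softmax(new_prior))
print(before, after)
```

With `alpha` set to 1.0 the representation would not move at all and only the prediction would chase it; values between 0.5 and 1.0 let both move while favoring the prediction.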
See the write-up for a chart of DreamerV2’s performance compared to previous world models. And all this on a single GPU. Curious readers can check out the team’s paper here. We believe.
Cynthia Murrell, May 12, 2021