The artificial intelligence called Agent57 has learned to play all 57 Atari video games in the Arcade Learning Environment, a collection of classic games that researchers use to test the limits of their deep learning models. Developed by DeepMind, Agent57 uses the same deep reinforcement learning algorithm to reach superhuman levels of play, even in games that previous AIs have struggled with. Being able to learn 57 different tasks makes Agent57 more versatile than previous game AIs.
The artificial intelligence that plays video games
Video games are a great way to test AIs. They provide a variety of challenges that force an AI to devise a range of strategies, yet they have a clear measure of success, a score, to train against.
Four Atari games in particular have proven especially hard to beat. In Montezuma’s Revenge and Pitfall, the AI must try many different strategies before finding one that wins. And in Solaris and Skiing, there can be long waits between an action and its reward, making it difficult for an AI to learn which moves pay off.
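To see why a long wait between action and reward makes learning hard, here is a minimal sketch of the discounted return, a standard quantity in reinforcement learning. The function name and the discount value of 0.99 are illustrative assumptions, not details taken from Agent57.

```python
def discounted_return(rewards, gamma=0.99):
    """Sum of rewards, each scaled by gamma ** (number of steps until it arrives)."""
    total = 0.0
    # Walk backwards so each step folds in the already-discounted future.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# The same reward of 1, arriving immediately versus after 500 empty steps.
immediate = discounted_return([1.0])
delayed = discounted_return([0.0] * 500 + [1.0])

print(f"immediate: {immediate:.4f}, delayed: {delayed:.4f}")
```

Under these assumptions, the delayed reward contributes less than one hundredth of the immediate one to the value of the action that earned it, so the learning signal for that action is very weak.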
To address these challenges, Agent57 brings together multiple improvements DeepMind has made to its Deep Q-Network (DQN), the AI that first beat a handful of Atari games in 2013. These include a form of memory that allows it to base its decisions on things it has seen earlier in the game, and reward systems that encourage the AI to explore its options more thoroughly before settling on a strategy. These techniques are managed by a meta-controller, which balances the trade-off between pursuing a particular strategy and exploring further.
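The idea of a controller that trades off exploiting a known strategy against exploring others can be illustrated with a multi-armed bandit. The sketch below uses the textbook UCB1 rule as a simplified stand-in; it is not DeepMind's implementation. The class name `UCBMetaController`, the three candidate policies, and their fixed returns in `FAKE_RETURNS` are all hypothetical.

```python
import math


class UCBMetaController:
    """Chooses among candidate policies with the UCB1 bandit rule."""

    def __init__(self, num_policies, exploration_bonus=1.0):
        self.counts = [0] * num_policies    # how often each policy was chosen
        self.means = [0.0] * num_policies   # running mean return per policy
        self.exploration_bonus = exploration_bonus

    def select(self):
        # Try every policy once before trusting any estimate.
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm
        total = sum(self.counts)
        # Score = estimated return + a bonus that shrinks as a policy is tried more.
        scores = [
            mean + self.exploration_bonus * math.sqrt(2 * math.log(total) / count)
            for mean, count in zip(self.means, self.counts)
        ]
        return scores.index(max(scores))

    def update(self, arm, episode_return):
        self.counts[arm] += 1
        # Incremental update of the mean return for the chosen policy.
        self.means[arm] += (episode_return - self.means[arm]) / self.counts[arm]


# Hypothetical fixed returns for three candidate policies (illustration only).
FAKE_RETURNS = [0.2, 0.5, 0.9]

controller = UCBMetaController(num_policies=len(FAKE_RETURNS))
for _ in range(200):
    arm = controller.select()
    controller.update(arm, FAKE_RETURNS[arm])

print(controller.counts)
```

In this toy setting the controller samples every policy at least once, keeps occasionally revisiting the weaker ones, and spends most episodes on the policy with the highest return.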
Why has it been a challenge for an artificial intelligence to play video games?
Despite their success, the best deep learning models we have today are not very versatile. Most tend to be good at one thing and one thing only. Training an AI to excel at more than one task is one of the biggest open challenges in deep learning. The ability to learn 57 different tasks makes Agent57 more versatile than previous game AIs, but it cannot learn to play all 57 games at once: it must be retrained for each new game, although it can use the same algorithm to do so. In this respect Agent57 is similar to AlphaZero, DeepMind’s deep reinforcement learning algorithm, which can learn to play chess, Go, and shogi, but again, not all at once.
In short, true versatility, which comes so easily to a human child, is still far beyond the reach of AIs.