The gamer punches in round after endless round of the Atari classic Space Invaders. Through an interminable chain of failures, the gamer adapts its gameplay strategy to reach for the highest score. But this is no human with a joystick in a 1970s basement. Artificial intelligence is learning to play Atari games. The Atari addict is a deep-learning algorithm called DQN.
This algorithm began with no previous information about Space Invaders or, for that matter, the other 48 Atari 2600 games it is learning to play and sometimes master after two straight weeks of gameplay. In fact, it wasn't even designed to take on old video games; it is a general-purpose, self-teaching computer program. Yet after watching the Atari screen and fiddling with the controls over two weeks, DQN is playing at a level that would humiliate even a professional flesh-and-blood gamer.
Volodymyr Mnih and his team of computer scientists at Google, who have just unveiled DQN in the journal Nature, say their creation is more than just an impressive gamer. Mnih says the general-purpose DQN learning algorithm could be the first rung on a ladder to artificial intelligence.
"This is the first time that anyone has built a single general learning system that can learn directly from experience to master a wide range of challenging tasks," says Demis Hassabis, a member of Google's team. The algorithm runs on little more than a powerful desktop PC with a souped-up graphics card.
At its core, DQN combines two separate advances in machine learning. The first is a reward-driven reinforcement learning method called Q-learning. This is where DQN, or Deep Q-Network, gets its middle initial. Q-learning means that DQN is constantly trying to make the joystick and button-pressing decisions that maximize a quantity computer scientists call "Q." In simple terms, Q is the algorithm's estimate of the biggest possible future reward for each decision. For Atari games, that reward is the game score.
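To make the idea of "Q" concrete, here is a minimal sketch of Q-learning in its simplest, table-based form. It is not the DQN code: the five-cell corridor "game", the action names, and the settings `alpha`, `gamma`, and `epsilon` are all assumptions chosen for illustration. The player earns a point only by reaching the far right end, and the table `q` holds the current estimate of future reward for each decision.

```python
import random

ACTIONS = ("left", "right")  # hypothetical controls for the toy game
GOAL = 4                     # hypothetical corridor: cells 0..4, points at cell 4


def step(state, action):
    """Toy game rules (an assumption): move one cell, score 1 point at the goal."""
    next_state = min(state + 1, GOAL) if action == "right" else max(state - 1, 0)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.5, gamma=0.9):
    """Nudge Q(state, action) toward reward + discounted best future estimate."""
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    target = reward + gamma * best_next
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (target - old)
    return q[(state, action)]


def greedy_action(q, state, actions, rng=random):
    """Pick the action with the highest current estimate, breaking ties randomly."""
    best = max(q.get((state, a), 0.0) for a in actions)
    return rng.choice([a for a in actions if q.get((state, a), 0.0) == best])


def train(episodes=500, epsilon=0.2, seed=0):
    """Play many games, exploring at random with probability epsilon."""
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = greedy_action(q, state, ACTIONS, rng)
            next_state, reward, done = step(state, action)
            q_learning_update(q, state, action, reward, next_state, ACTIONS)
            state = next_state
    return q


if __name__ == "__main__":
    learned = train()
    for cell in range(GOAL):
        print(cell, greedy_action(learned, cell, ACTIONS))
```

The player starts out knowing nothing and moves at random, just as the article describes. Each point scored raises the estimates for the decisions that led to it, so over many games the table comes to favor moving toward the goal. A table like this only works for a handful of states; a game screen has far too many.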
Knowing which decisions will lead it to the high scorers' list, though, is no simple task. Keep in mind that DQN starts with zero information about each game it plays. To understand how to maximize your score in a game like Space Invaders, you have to recognize a thousand different facts: how the pixelated aliens move, the fact that shooting them gets you points, when to shoot, what shooting does, the fact that you control the tank, and many more assumptions, most of which a human player understands intuitively. And then, if the algorithm changes to a racing game, a side-scroller, or Pac-Man, it must learn an entirely new set of facts.
That's where the second machine learning advance comes in. DQN is also built upon a vast artificial neural network, partially inspired by the human brain. Simply put, the neural network is a complex program built to separate information from noise. It tells DQN what is and isn't important on the screen.
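As a rough picture of how a neural network can stand in for a lookup table, the sketch below feeds the pixels of a tiny made-up screen through two layers and produces one estimated Q value per action. Everything here is an assumption for illustration: the 4x4 frame, the layer sizes, the action names, and the random, untrained weights. The network described in Nature is far larger and its weights are learned from gameplay.

```python
import random

ACTIONS = ("left", "right", "fire")  # hypothetical joystick actions


def make_layer(n_in, n_out, rng):
    """Random weights and zero biases for one fully connected layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(inputs, layer, relu):
    """Weighted sum per output unit, optionally clipped at zero (ReLU)."""
    weights, biases = layer
    outputs = []
    for row, bias in zip(weights, biases):
        total = bias + sum(w * x for w, x in zip(row, inputs))
        outputs.append(max(0.0, total) if relu else total)
    return outputs


def q_values(frame, hidden_layer, output_layer):
    """Map raw pixel brightness (0..255) to one estimated Q value per action."""
    pixels = [p / 255.0 for row in frame for p in row]
    hidden = dense(pixels, hidden_layer, relu=True)
    return dense(hidden, output_layer, relu=False)


def best_action(frame, hidden_layer, output_layer):
    """Choose the action whose estimated Q value is highest."""
    scores = q_values(frame, hidden_layer, output_layer)
    return ACTIONS[scores.index(max(scores))]


rng = random.Random(0)
hidden_layer = make_layer(16, 8, rng)            # 16 pixels in, 8 hidden units
output_layer = make_layer(8, len(ACTIONS), rng)  # one output per action

# A made-up 4x4 screen: one bright "alien" pixel on top, one "tank" pixel below.
FRAME = [
    [0, 0, 255, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 255, 0, 0],
]

if __name__ == "__main__":
    print(q_values(FRAME, hidden_layer, output_layer))
    print(best_action(FRAME, hidden_layer, output_layer))
```

With random weights the chosen action is meaningless. Learning consists of adjusting those weights so that the outputs match the future rewards the player actually goes on to collect, which is how the network comes to respond to the pixels that matter.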
Via Dr. Stefan Gruenwald