The news: AlphaStar, DeepMind’s latest learning algorithm, defeated professional Starcraft II players for the first time, scoring 10 wins and one loss against two pros known as TLO and MaNa. The popular real-time strategy game involves players competing as one of three races to build structures and engage in combat across a sprawling battlefield.
Practice, practice: AlphaStar learned to play within an environment called the AlphaStar League. A large neural network first observed replays of expert human games. It was then pitted against versions of itself, using a machine-learning technique called reinforcement learning to improve over time. Importantly, the program’s speed of action, and its view of the battlefield, were limited so that it didn’t have an unfair edge over humans.
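To make the idea of a program improving by playing against versions of itself concrete, here is a minimal sketch in Python. It is not AlphaStar's method: rock-paper-scissors stands in for Starcraft II, and a much simpler technique called fictitious play stands in for reinforcement learning. The function names and the starting move are assumptions chosen purely for illustration.

```python
# Toy illustration only: rock-paper-scissors stands in for Starcraft II,
# and fictitious play stands in for AlphaStar's reinforcement learning.

MOVES = ("rock", "paper", "scissors")

# PAYOFF[a][b] is the score for playing move a against move b.
PAYOFF = (
    (0, -1, 1),   # rock: ties rock, loses to paper, beats scissors
    (1, 0, -1),   # paper: beats rock, ties paper, loses to scissors
    (-1, 1, 0),   # scissors: loses to rock, beats paper, ties scissors
)


def best_response(opponent_counts):
    """Return the index of the move that scores best against a move history."""
    scores = [
        sum(PAYOFF[a][b] * opponent_counts[b] for b in range(3))
        for a in range(3)
    ]
    # Ties are broken in favor of the earliest move in MOVES.
    return scores.index(max(scores))


def self_play(rounds):
    """Play the agent against its own history; return its move frequencies."""
    counts = [1, 0, 0]  # hypothetical starting point: a single game of "rock"
    for _ in range(rounds):
        # The "opponent" is the agent's own past play.
        counts[best_response(counts)] += 1
    total = sum(counts)
    return [c / total for c in counts]


if __name__ == "__main__":
    for move, freq in zip(MOVES, self_play(10_000)):
        print(f"{move}: {freq:.3f}")
```

Because every move in this toy game can be beaten, the agent keeps switching to whatever counters its own past habits, and over many rounds its mix of moves drifts toward an even split. The real system faces a vastly larger game and uses a neural network rather than a table of counts.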
Who gives a Zerg? AlphaStar had to display new kinds of intelligence in order to master the game. The techniques developed for playing the game could potentially prove useful in many practical situations where complex strategy is required: think trading or even military planning.
Higher score: Starcraft II is not only extremely complex; it is also a game of “imperfect information,” meaning players cannot always see what their opponents are up to. There is no single best strategy for playing. And it takes time for the results of a player’s actions to become clear, making it harder for an algorithm to learn through experience. DeepMind’s team used a very specialized neural network architecture to address these issues.
Game theory: DeepMind is most famous for developing the software that learned to beat the world’s best Go and chess players. But before that, the company developed several algorithms that learned to play simple Atari games. Playing video games is a neat way to measure progress in artificial intelligence, and to compare computers with humans. It is, however, also a very narrow test: AlphaStar, like its predecessors, can only do one task, albeit incredibly well.