Pro gamers, look out — for the first time ever, a world champion e-sports team has lost to an AI team.
In a series of live competitions between the reigning Dota 2 world champion team OG and the five-bot team OpenAI Five, the AI won two matches back-to-back, settling the best-of-three tournament. With 45,000 years of practice at Dota 2 under its belt, the system looked unstoppable — deftly navigating strategic decisions and racing to press its advantages with uncannily good judgment.
It’s a milestone that the top AI labs in the world have been frantically pushing toward in the last few years. Online games offer an opportunity to show off the strategic decision-making, coordination and long-term planning skills of their creations. Powerful new techniques mean that AI can do things regarded, less than a decade ago, as nearly impossible.
Combat strategy games are a great way to showcase everything AI can do
Dota 2 is a multiplayer online battle arena game, a style of strategy game where players coordinate to achieve strategic objectives — like destroying or conquering enemy towers, ambushing enemy units, and upgrading and improving their own defenses.
In Dota 2, each team has five players, each player directing their own “hero” with unique abilities. It’s a complex game with computer-controlled units on both sides, more than 100 potential “heroes” with different abilities, computer-controlled neutral units, an in-game economy, and a big map where players fight to destroy the enemy base (their “Ancient”) while protecting their own.
OpenAI Five plays a simplified version of the game, with a limited subset of its heroes and without a few game features such as summons (where the player creates and controls additional units) and illusions (where the player can create copies of themself). The OpenAI researchers I spoke to pointed out that excluding summons and illusions actually helps the humans — controlling the detailed movements of lots of units is the kind of thing AIs are very good at.
Within the simplified bounds of the game, OpenAI Five was an astounding triumph. One thing to look for in evaluating the performance of an AI system on a strategy game is whether it’s merely winning with what gamers call “micro” — the second-to-second positioning and attack skills where a computer’s reflexes are a huge advantage.
OpenAI Five did have good micro, but it also did well in ways that human players, now that they’ve seen it, may well choose to emulate — suggesting that it didn’t just succeed through superior reflexes. The commentators watching the game criticized OpenAI Five’s eagerness to buy back into the game when its heroes died, for example, but the tactic was borne out — maybe suggesting that human pros should be a bit more willing to pay to rejoin the field.
And OpenAI Five had a deeper strategic understanding of the board than the human commentators. When the commentators were asserting that the game looked evenly matched, OpenAI Five would declare that it perceived a 90% chance of victory. (It turns out that soberly announced probability estimates make for great trash talk, and these declarations frequently rattled its opponents, OG.) To us the game may have seemed open, but to the computer, it was obviously nearly over.
Of course, that’s still an example of the computer leveraging skills that computers are good at — like making accurate predictions and tracking lots of information about the world. But those are skills with far broader applicability than fast reflexes and good attack timing, so seeing them demonstrated is significantly more impressive.
AIs are picking up new abilities at an astonishing pace
In 2016, when OpenAI, the then brand-new AI nonprofit co-founded by Elon Musk, announced it was going to teach a computer to play Dota 2, it was promising something that no one had ever done before. AI systems were capable of some cool stuff: speech recognition was rapidly advancing, AlphaGo had just won 4 out of 5 matches against a top Go player, and companies were optimistic that they could make progress on tough problems like autonomous vehicles and translation.
But playing complex strategy games as well as the top professionals was beyond them. OpenAI had to start on highly simplified versions of the game — 1 player versus 1 instead of 5 versus 5, only a handful of heroes available, major game elements stripped out for the sake of simplicity.
And the AI still wasn’t that great. As late as last year, it lost its exhibition matches at the major Dota 2 tournament The International.
But AI has kept advancing, at a breakneck pace, and our understanding of what it’s capable of just keeps changing. OpenAI’s competition triumph against a Dota 2 pro team isn’t even the first such event this year; in January, its competitor DeepMind rolled out AlphaStar, a bot that competes with the pros at StarCraft II and won its matches 10-1.
Anyone inclined toward cynicism about these advances still has grounds to be unimpressed. OpenAI Five plays with only 17 of the game’s 115 heroes, and restricts some major, game-altering abilities. Skeptics of DeepMind’s AlphaStar observed that the computer, despite being rate-limited, was still winning with micro that a human couldn’t compete with. And OpenAI Five took 45,000 years of Dota 2 gameplay to reach its current level of ability, so humans still learn faster.
But it’s impossible to deny that AIs are casually achieving things that would have been unimaginable only a few years ago.
OpenAI wants to tell us that AI is our ally, not our enemy
Competitive games are a great environment to show off what AI can achieve. But there’s a downside to showing the wider world what AI can achieve only via exhibition matches in which AI crushes humans — it makes it feel like AI is our steadily advancing enemy. OpenAI argues that, far from it, AI should be thought of as a resource for humans.
Toward that end, the team invited me to do a demo of a new OpenAI Five feature — where human players play the game alongside some AI bots, named “Friend 1”, “Friend 2”, “Friend 3” and “Friend 4”. While I clumsily moved my dragon around the screen — I am very far from a pro Dota player — my teammates swooped around coming to my rescue in ambushes. (The public will be able to try this out in a few weeks through OpenAI Arena.)
In the public co-op match a little later, the humans were sometimes impressed with, and sometimes frustrated with, their AI allies. It was, as promised, a different experience of AI’s potential.
That’s what OpenAI’s researchers want. The team hopes that as AI becomes more advanced, it’ll be used to assist human decision-making, with its probability estimates helping us interpret medical scans and its modeling abilities helping us understand how proteins fold so we can develop new drugs.
Some people might question, if we want AI to be a friendly ally in improving the world, whether teaching AIs to conquer and kill their enemies at war strategy games is a great idea.
It’s not as ill-advised as it might sound. These AIs are taught with reinforcement learning, meaning they have a ‘reward function’: a rule that scores outcomes in the game, specifying what is rewarded. They learn, through practice, which actions maximize that reward. The AIs aren’t learning the general concepts of ‘conquest’ and ‘killing’; they are just learning what actions increase their odds of winning the games.
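For readers who want to see the idea in miniature, here is a toy sketch of learning from a reward function. It is not OpenAI Five's training code or anything close to its scale. The action names, win probabilities, and the simple trial-and-error (epsilon-greedy) rule are all assumptions made up for illustration. The point is only that the learner never sees a concept like "conquest"; it sees a number after each game and drifts toward whatever earns more of it.

```python
import random

# Hypothetical toy "game": each action wins with a fixed, hidden probability.
# These names and numbers are illustrative assumptions, not from OpenAI Five.
WIN_PROBABILITY = {"push_tower": 0.7, "farm_jungle": 0.5, "wander": 0.2}


def reward(won: bool) -> float:
    """Reward function: 1 for a win, 0 for a loss. Nothing else is scored."""
    return 1.0 if won else 0.0


def train(episodes: int = 5000, epsilon: float = 0.1, seed: int = 0) -> dict:
    """Estimate each action's average reward by repeated play."""
    rng = random.Random(seed)
    actions = list(WIN_PROBABILITY)
    value = {a: 0.0 for a in actions}  # running average reward per action
    count = {a: 0 for a in actions}    # how often each action was tried

    for _ in range(episodes):
        # Mostly pick the action that looks best so far; sometimes explore.
        if rng.random() < epsilon:
            action = rng.choice(actions)
        else:
            action = max(actions, key=value.get)

        won = rng.random() < WIN_PROBABILITY[action]

        # Nudge the running average toward the reward just observed.
        count[action] += 1
        value[action] += (reward(won) - value[action]) / count[action]

    return value


estimates = train()
best_action = max(estimates, key=estimates.get)
print(best_action, estimates)
```

Changing the reward function changes what gets learned, which is why a system rewarded only for winning one specific game has no reason to pick up anything broader.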
The techniques that are used to train systems like OpenAI Five and AlphaStar are powerful and generalizable, but the reward functions themselves are super specific, and won’t be inspiring Skynet. There really isn’t anything to fear from OpenAI Five itself — except, maybe, what it portends about the pace of AI progress.
When it comes to AI advances more generally, though, there’s a lot that gives researchers pause. Many experts believe that, as AI systems become more powerful, we’re opening ourselves up to potentially deadly mistakes. We might design AI systems with goals that don’t accurately reflect what we want, or systems that are vulnerable to external attackers. If those mistakes occur with moderately powerful systems, they might cause stock market crashes, power system failures, and costly accidents; if they occur with extremely powerful systems, the effects could be much worse.
These are, no doubt, all problems we can solve with time. But AI is advancing so quickly that some AI policy analysts worry we won’t have spent enough time on safety and policy planning by the time powerful systems are deployed.
In an interview last week, OpenAI CTO Greg Brockman compared the ways AI will transform society to the ways the internet did: “In a lot of ways, we’ve had 40, 50 years to have the internet play out in society. And honestly that change has still been too fast. You look at recent events and — it’d just be nice if we’d spent more time to understand how this would affect us.”
But AI, he notes, will transform the world much, much faster than that. It’s been eight months since OpenAI Five struggled at the Dota competition The International. Now it’s nearly unbeatable.
“It hurts. We’re doomed,” OG player Johan Sundstein said after the second loss. On Twitter, he added, “Just hope they remember how nice and mannered we were once they own the planet.”