Go is a two-player board game that originated in China more than 2,500 years ago. The rules are simple, but Go is widely considered the most difficult strategy game to master. For artificial intelligence researchers, building an algorithm that could take down a Go world champion represents the holy grail of achievements.
Well, consider the holy grail found. A team led by Google DeepMind researchers David Silver and Demis Hassabis designed an algorithm, called AlphaGo, which in October 2015 handily defeated back-to-back-to-back European Go champion Fan Hui five games to zero. As a side note, AlphaGo won 494 out of 495 games played against existing Go computer programs before its match with Hui, and it even spotted inferior programs four free moves.
“It’s fair to say that this is five to 10 years ahead of what people were expecting, even experts in the field,” Hassabis said in a news conference Tuesday.
Deep Blue took humans to the woodshed in chess. IBM’s Watson raked in winnings in Jeopardy! Silver and Hassabis in 2015 unveiled an algorithm that taught itself to conquer classic Atari games. Every year, it seems, humanity waves fewer and fewer title belts over computers in the world of games.
In March, 32-year-old Lee Sedol, the greatest Go player of the decade, will represent mankind in a Kasparov-like battle of wits against AlphaGo in Seoul, South Korea. Should Sedol fall, consider Go yet another game in which flesh and blood has ceded mastery to silicon.
But there’s one prize that computers will struggle to take — for a while, at least — from humans: a World Series of Poker bracelet. Ten-player, no-limit poker is the final vestige of our recreational supremacy, and the reasons computers struggle to win this game illustrate a big-picture problem that AI researchers are working to solve. AlphaGo was a step in that direction.
The 2006 WSOP bracelet. (Credit: flipchip/LasVegas.com/via Wikimedia)
Go represented the ultimate AI challenge because it’s a game with an astounding number of possible moves on a given turn. In chess, for example, a player typically has about 35 moves to consider on a given turn. In Go, a player has more than 300. The sheer volume of scenarios to contemplate each turn earned Go its holy grail designation.
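To get a feel for how quickly that gap compounds, here is a back-of-the-envelope sketch in Python. It takes the round figures above (35 moves per turn in chess, 300 in Go) and counts the move sequences a program would face when looking a fixed number of turns ahead. The 10-turn horizon and the function name are illustrative assumptions, not numbers or code from the researchers.

```python
import math


def log10_sequences(branching: int, plies: int) -> float:
    """Return log10 of branching**plies, the number of move sequences of that length."""
    return plies * math.log10(branching)


# Branching factors are the article's round figures. The 10-ply horizon is an
# arbitrary illustrative choice, not a number from the source.
CHESS_BRANCHING = 35
GO_BRANCHING = 300
PLIES = 10

chess_exp = log10_sequences(CHESS_BRANCHING, PLIES)
go_exp = log10_sequences(GO_BRANCHING, PLIES)

print(f"chess: about 10^{chess_exp:.1f} sequences after {PLIES} plies")
print(f"go:    about 10^{go_exp:.1f} sequences after {PLIES} plies")
print(f"go has about 10^{go_exp - chess_exp:.1f} times as many")
```

With these inputs, looking just 10 turns ahead means roughly 10^15 sequences in chess and nearly 10^25 in Go, a gap of about nine orders of magnitude that keeps widening with every extra turn.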
To conquer Go, Hassabis and Silver combined deep learning with tree search capabilities to pare down the amount of information AlphaGo needed to sift through. Deep learning algorithms rely on artificial neural networks that are loosely modeled on the connections in our brain, and they allow computers to identify patterns from mounds of data at a speed humans could never attain.
Hassabis and Silver started by feeding AlphaGo a collection of 30 million moves from games played by skilled human Go players, until it could correctly predict a player’s next move 57 percent of the time; the previous record was 44 percent. Then AlphaGo played thousands of games against its own neural networks to improve its skills through trial and error. AlphaGo’s success is in its combination of two networks: a value network and a policy network.
“The policy network cuts down the number of possibilities that you need to look at with any one move. The valuation network allows you to cut short the depth of the search,” says Hassabis. “Rather than looking all the way to the end of the game, you can look at a certain move in the game and judge who is winning.”
This is the key breakthrough. To this point, solving games like chess or checkers involved throwing more resources at the problem to search deeper. Past algorithms have relied on more and more computing power to run ever more simulations of a game all the way to the end, an approach known as brute force, to optimize a strategy. A chess program like Deep Blue used brute force, but combined that tactic with windowing techniques to narrow the search and spend less time examining bad moves. However, pruning moves at shallow levels of the search can lead to errors. AlphaGo is different. It uses deep learning networks to evaluate board positions in isolation and determine who’s winning, with no look-ahead search needed for that judgment. Researchers published their results Wednesday in the journal Nature.
“They were able to build an evaluation function that assesses its position that is much more accurate than anything we’ve seen before. And that’s amazing,” says Jonathan Schaeffer, dean of faculty of science at the University of Alberta.
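The two cuts Hassabis describes can be sketched on a toy game. The Python below is a minimal illustration, not AlphaGo’s actual method: the game is a simple stone-taking game rather than Go, the search is a plain depth-limited negamax, and `toy_policy` and `toy_value` are hand-written, hypothetical stand-ins for the learned policy and value networks. All names and parameters are assumptions made for the example.

```python
from typing import Dict, List, Optional, Tuple

MAX_TAKE = 3  # toy rule: remove 1 to 3 stones per turn; taking the last stone wins


def legal_moves(pile: int) -> List[int]:
    return list(range(1, min(MAX_TAKE, pile) + 1))


def toy_policy(pile: int) -> List[int]:
    """Hypothetical stand-in for a policy network: moves ranked most promising first."""
    # Hand-written prior: prefer moves that leave a multiple of MAX_TAKE + 1.
    return sorted(legal_moves(pile), key=lambda m: (pile - m) % (MAX_TAKE + 1))


def toy_value(pile: int) -> float:
    """Hypothetical stand-in for a value network: score for the player to move."""
    return -1.0 if pile % (MAX_TAKE + 1) == 0 else 1.0


def search(pile: int, depth: int, top_k: Optional[int], stats: Dict[str, int]) -> float:
    """Depth-limited negamax score from the point of view of the player to move."""
    stats["nodes"] += 1
    if pile == 0:
        return -1.0  # the previous player took the last stone, so the mover has lost
    if depth == 0:
        return toy_value(pile)  # value cut: judge the position instead of playing it out
    candidates = toy_policy(pile)
    if top_k is not None:
        candidates = candidates[:top_k]  # policy cut: examine only the top-ranked moves
    return max(-search(pile - m, depth - 1, top_k, stats) for m in candidates)


def best_move(pile: int, depth: int, top_k: Optional[int]) -> Tuple[int, int]:
    """Return (chosen move, positions examined). Requires depth >= 1 and pile >= 1."""
    stats = {"nodes": 0}
    candidates = toy_policy(pile)
    if top_k is not None:
        candidates = candidates[:top_k]
    scored = [(-search(pile - m, depth - 1, top_k, stats), m) for m in candidates]
    _, move = max(scored)
    return move, stats["nodes"]


# Brute force: every legal move, searched all the way to the end of the game.
full_move, full_nodes = best_move(10, depth=10, top_k=None)
# Truncated: two candidate moves per turn, two turns deep, then ask the value function.
fast_move, fast_nodes = best_move(10, depth=2, top_k=2)

print(f"brute force: take {full_move}, {full_nodes} positions examined")
print(f"truncated:   take {fast_move}, {fast_nodes} positions examined")
```

In this toy both searches pick the same move, but the truncated one looks at far fewer positions. That only works because the stand-in value function happens to be exact for this game; with a poor evaluation or a poor move ranking, the shortcut would produce the kind of pruning errors described above, which is why the accuracy of AlphaGo’s learned evaluation matters so much.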
Why Poker Poses a Challenge
Games like chess, checkers and Go are played within a framework of well-defined rules. Players have “complete information” on any given turn: You can see the whole board, and the situation is clear. Computer algorithms thrive in this environment. On the other hand, in a game like no-limit poker, players are working with incomplete information.
“You may not know what card your opponents have. There’s uncertainty. Those are the games where we have the most challenge: games where there’s chance and incomplete information,” says Toby Walsh, an artificial intelligence professor at Australia’s University of New South Wales and Data61. “Apart from the other features of poker (uncertainty and randomness), there’s a third feature: psychology.”
Bluffing and reading opponents for tics and other tells are key skills for top-notch poker players. Psychology, communication and collaboration still pose challenges for machines. Understanding this information requires troves of knowledge about the world. These are things that humans can do instantaneously.
“I can look at a face of a friend and recognize they are my friend, even if they’re in a funny pose,” says Subbarao Kambhampati, an artificial intelligence researcher and professor at Arizona State University. “If you play chess and you win, you can provide a reasonable explanation based on the rules of the game. But how did you know that person was your friend? You have a much harder time explaining.”
The Next Big Step
Teaching an algorithm to go beyond well-defined rules to make assessments about its environment is the next big step in artificial intelligence. This is what Cornell University computer science professor Bart Selman calls “common sense understanding,” or computers that see the world like we do. An algorithm with common sense could be the giant leap that ties disparate technologies together. Think of delivery drones and cars that interpret feedback from the environment to navigate, or a super-Siri that never says, “I don’t quite understand that.”
“Imagine running Uber without human drivers, or a truly useful virtual assistant. If I’m the first to do that, there’s an enormous hidden capital there,” says Selman.
A little common sense will go a long way to help current technology make the next leap forward. That’s the big prize, so it’s no wonder companies and universities are investing heavily in AI research.
“We will spend more of our lives interacting with (computers), and it will be important for them to understand our emotional states,” says Walsh. “For computers to be truly intelligent, they’ll have to have emotions.”
No Reason to Fear
AlphaGo is a step toward an “enlightened” AI because, as Schaeffer says, AlphaGo is a first example of an AI with “general intelligence.”
“There’s nothing in the algorithms that are fundamental to the game of Go. You could apply it to other games,” says Schaeffer. “That allows us to move toward a more general AI — one that can play games, drive a car or do poetry. We aren’t there, but this paper represents an advancement.”
But all this talk of machines with emotions and common sense can send some into apocalyptic fever dreams. Elon Musk and Stephen Hawking have issued their own doomsday warnings about the power of AI. The experts who weighed in on the latest and greatest AI achievement, however, aren’t so worried.
“People shouldn’t be fearing this. This program has no autonomy. It has no desires to do anything other than play Go,” says Walsh. “The challenge here is not intelligence; you can have really smart computers and no ethical challenge. The problem is autonomy, systems that can act in the real world.”
For Walsh and others, the more immediate concern is the impact on people’s jobs — especially those with well-defined tasks and outcomes. That’s where the conversation should begin, they say. Still, advances in CRISPR and AI should spur the world’s top minds to have a discussion on ethics, and that’s what’s happening at AI conferences around the world.
“We already had the ability to annihilate the world without intelligence systems, and I think these systems will only improve our ability to control that sort of damage,” says Kambhampati. “The what-ifs are probably overblown, and they make for more interesting press, but I don’t think anyone who is thinking about these issues is worried about AI taking over the world.”