
Communications of the ACM

ACM News

Poker-Playing AI Beats Top Human Players



The artificial intelligence Libratus came out on top in a 20-day Heads-Up No-Limit Texas Hold'em poker tournament.

Credit: Thinkstock

An artificial intelligence developed at Carnegie Mellon University (CMU) beat four top professional poker players late last month.

The win constituted a rematch, and marked the first time an AI has beaten top players at the game. Back in 2015, CMU professor Tuomas Sandholm and his graduate students pitted the Claudico AI they had developed against four top human players at Rivers Casino in Pittsburgh for 13 days in April and May. The humans won that tournament, which led Sandholm and his students to scrap Claudico completely and start over.

In January, Sandholm and Ph.D. student Noam Brown returned to Rivers Casino with a new AI they had developed and named Libratus (which means 'balanced' in Latin), to face four top-ranked human opponents, including two from the initial tournament. Over the 20-day Heads-Up No-Limit Texas Hold'em rematch, Libratus beat the humans across 120,000 hands.

Libratus beat four of the world's top poker professionals by 14.7 big blinds per hundred hands, making its win statistically significant at the 99.98% confidence level, according to Brown.
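As a back-of-the-envelope check, the quoted win rate connects to the dollar figure the players mention later in the article. The calculation below assumes $50/$100 blind stakes, a detail this article does not state:

```python
# Back-of-the-envelope: convert Libratus' win rate into a chip total.
bb_per_100 = 14.7   # big blinds won per 100 hands (from the article)
hands = 120_000     # total hands played (from the article)
big_blind = 100     # dollars; assumed stake, not stated in this article

chips_won = bb_per_100 / 100 * hands * big_blind
print(chips_won)  # 1764000.0
```

Under that assumed stake, the result lands close to the $1.78 million figure Les cites.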

As play continued, the background algorithms that address weaknesses in the AI's strategy kicked in, foiling the professional poker players, whose basic approach was to find and exploit those weaknesses in Libratus. Human opponents' weaknesses are often "hard-wired" into their strategies and may even be subconscious; Libratus, on the other hand, constantly evaluated the success of its strategies and changed them when they were not productive.

During the match, the humans at times did well against the AI, but eventually succumbed to its relentless onslaught. 

Dong Kim took first place among the humans, for which he expressed pride after the match, even though he said, "there may never be a next time." Placing second among the humans, Daniel McAulay added, "it was a pleasure battling, but in the end we really got beat." Third-place human player Jimmy Choi said, "This has been the most challenging experience in my life."

The remaining human, Jason Les, said it was very demoralizing to lose this badly to a machine. "I thought it would be a lot closer, so to sum it up—it was a truly monumental achievement in AI," said Les. "The computer won by $1.78 million, but we are not paying up." (In reality, the players split the $200,000 in prize money; each received a minimum of $20,000, with the balance of the money allocated based on how well they played against the AI.)

Powering Libratus was the Pittsburgh Supercomputing Center's newest Bridges supercomputer, which supported it with about 25 million core-hours of computation, both during play and at night, when the AI learned from the previous day's hands.

Making the difference in the tournaments’ results, Sandholm said, was that "We have much better algorithms in Libratus than we did in Claudico,” including “the algorithms that compute approximations of game-theoretic strategies ahead of the match; the endgame-solving algorithms that refine the strategy in real time during the play of a hand; and the algorithms that fix weaknesses in the AI’s strategy all the time in the background."
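The "approximations of game-theoretic strategies" Sandholm mentions come from self-play equilibrium-finding methods; Brown and Sandholm's published work on Libratus builds on counterfactual regret minimization (CFR). As an illustration of the core idea only (a toy game, not Libratus's actual algorithms), the sketch below uses regret matching, the update rule at the heart of CFR, to recover the Nash equilibrium of rock-paper-scissors:

```python
# Toy sketch: regret matching (the update rule inside counterfactual regret
# minimization) converging to the equilibrium of rock-paper-scissors.
# This illustrates the general idea only, not Libratus's actual code.

N = 3  # actions: 0=rock, 1=paper, 2=scissors
# PAYOFF[a][b] = payoff to a player choosing action a against an opponent playing b
PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]

def strategy(regrets):
    """Mix over actions in proportion to positive cumulative regret."""
    pos = [max(r, 0.0) for r in regrets]
    total = sum(pos)
    return [p / total for p in pos] if total > 0 else [1.0 / N] * N

def train(iters=20000):
    # Start slightly off-equilibrium so the dynamics are visible.
    regrets = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    strat_sum = [[0.0] * N, [0.0] * N]
    for _ in range(iters):
        strats = [strategy(regrets[p]) for p in (0, 1)]
        for p in (0, 1):
            opp = strats[1 - p]
            # Expected payoff of each pure action against the opponent's mix.
            vals = [sum(opp[b] * PAYOFF[a][b] for b in range(N)) for a in range(N)]
            ev = sum(strats[p][a] * vals[a] for a in range(N))
            for a in range(N):
                regrets[p][a] += vals[a] - ev   # regret for not having played a
                strat_sum[p][a] += strats[p][a]
    total = sum(strat_sum[0])
    return [s / total for s in strat_sum[0]]  # time-averaged strategy

avg = train()
print(avg)  # each probability approaches 1/3, the game's unique equilibrium
```

Note that it is the time-averaged strategy, not the current one, that converges to equilibrium; Libratus applied vastly scaled-up descendants of this idea to a game tree astronomically larger than this toy example.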

Sandholm said Libratus was designed not just to play poker, but also to assist humans in any strategic imperfect-information situation, so he hopes to apply its capabilities to more serious domains. "I mean any multi-agent setting that can be formalized," said Sandholm. "More generally, this shows that the best AI's ability to do strategic reasoning under imperfect information has surpassed that of the best humans.”

The AI's algorithms are "game-independent," Sandholm said, and "have many potential applications in negotiation, cybersecurity, military settings, auctions, finance, strategic pricing, as well as steering evolution and biological adaptation. Many of these applications I have already worked on for years; others are newer."

Libratus’ victory in the poker tournament “was a landmark step for AI,” Sandholm said.  “This is the first time that AI has been able to beat the best humans at Heads-Up No-Limit Texas Hold'em. In all the other games where AI research has been seriously done for decades, the best AI has surpassed the best humans, such as in checkers, chess, heads-up limit Texas Hold’em, and Go."

The victory demonstrated, according to Sandholm, that the best human efforts in any game can be outplayed by an AI with expertise in strategic reasoning under conditions of imperfect knowledge, as Libratus showed.

In addition, he says, AI can now reason better about imperfect information for people in all walks of life, beginning a revolution of using AI to improve our own skills.

Sandholm claims his implementation is unique and superior to other, better-funded efforts. IBM's Watson, for instance, was written by thousands of programmers and is often described as pioneering this trail, but according to Sandholm, Libratus is vastly superior in its ability to perform strategic reasoning, rather than rapidly sifting through Big Data to come up with answers to questions.

"Libratus and Watson do totally different things. Libratus is a system for strategic reasoning; Watson is a question-answering system."

R. Colin Johnson is a Kyoto Prize Fellow who has worked as a technology journalist for two decades.


 
