To become the best, you have to learn from the best. Poker players aiming for the top of their game can certainly learn a thing or two from Cepheus, but they should tuck away any hopes of beating it.
Developed by a team from the University of Alberta, Cepheus is a computer program designed to beat Texas Hold'em. Specifically, it plays "heads-up limit hold'em," a two-player version of the poker game and also the simplest one. The team has been working on Cepheus for the last 10 years, but it took only two months of computation for the program to learn everything it needed to become the ultimate Texas Hold'em expert.
The algorithm that Cepheus uses is called CFR+, an improved version of the counterfactual regret minimization (CFR) technique. CFR algorithms work iteratively, adjusting the strategy at every decision point over many small steps, and they have been used on poker in the past. They had never been used to solve the full game of heads-up limit Texas Hold'em, however, because of the massive amount of memory required.
CFR+ is a much more efficient version of CFR, taking fewer but bigger steps toward the best possible solution. By applying compression, the researchers brought the storage requirement down from 262 terabytes to less than 11 terabytes for the counterfactual values and a mere 6 terabytes for the main strategy computation. That storage was spread across 200 computation nodes, each of which stored its values locally on 1-TB disks. Each node also had 32 GB of RAM and 24 AMD cores rated at 2.1 GHz.
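To give a feel for the kind of update involved, here is a minimal Python sketch of regret-matching+, the per-decision update rule that CFR+ is built on, run in self-play on a tiny two-player zero-sum matrix game. It is an illustration only and not the researchers' code: the function names, the 2x2 payoff matrix, and the iteration count are all assumptions made for this example, and the real system applies the idea at every decision point of a vastly larger game tree.

```python
def normalize(weights):
    """Turn non-negative weights into a probability distribution."""
    total = sum(weights)
    if total > 0:
        return [w / total for w in weights]
    return [1.0 / len(weights)] * len(weights)


def regret_matching_plus(payoff, iterations=20000):
    """Self-play on a zero-sum matrix game (hypothetical helper).

    payoff[i][j] is what the row player wins when row plays i and
    column plays j; the column player wins the negative of that.
    Returns the averaged strategies of both players.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    regret_row, regret_col = [0.0] * n_rows, [0.0] * n_cols
    avg_row, avg_col = [0.0] * n_rows, [0.0] * n_cols

    for t in range(1, iterations + 1):
        # Play each action in proportion to its accumulated regret.
        strat_row, strat_col = normalize(regret_row), normalize(regret_col)

        # Value of each pure action against the opponent's current mix.
        value_row = [
            sum(payoff[i][j] * strat_col[j] for j in range(n_cols))
            for i in range(n_rows)
        ]
        value_col = [
            -sum(payoff[i][j] * strat_row[i] for i in range(n_rows))
            for j in range(n_cols)
        ]
        ev_row = sum(p * v for p, v in zip(strat_row, value_row))
        ev_col = sum(p * v for p, v in zip(strat_col, value_col))

        # The "+" part: accumulated regret is never allowed below zero.
        regret_row = [max(r + v - ev_row, 0.0) for r, v in zip(regret_row, value_row)]
        regret_col = [max(r + v - ev_col, 0.0) for r, v in zip(regret_col, value_col)]

        # Later iterations get more weight in the averaged strategy.
        avg_row = [a + t * p for a, p in zip(avg_row, strat_row)]
        avg_col = [a + t * p for a, p in zip(avg_col, strat_col)]

    return normalize(avg_row), normalize(avg_col)


# Made-up toy game, chosen only because its equilibrium is easy to check by hand.
toy_payoff = [[2.0, -1.0], [-1.0, 1.0]]
row_strategy, col_strategy = regret_matching_plus(toy_payoff)
print(row_strategy, col_strategy)
```

In this toy game the equilibrium is for each player to pick the first action 40 percent of the time, so the averaged strategies should settle close to that mix. Both players are updated at the same moment here to keep the sketch short; that is a simplification chosen for this example.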
Aside from computing the best solution to heads-up limit Texas Hold'em, Cepheus also showed that there is truth in the belief that the dealer has a significant advantage in the game. It also confirmed a common poker strategy: on the first action, raising the bet rather than just calling, that is, matching the highest bet.
The researchers hope to build on their gains with the CFR+ algorithm to take on more complex games of poker involving more players, but the algorithm may also offer advantages in real-life applications of game theory. For example, CFR+ may be used for developing security strategies, allowing officials to efficiently determine which resources should be deployed to particular areas at varying times of the day without becoming predictable.
"Just like poker, you have to bluff to play it perfectly. If you don't bluff, an opponent or attacker can take advantage of that," said Neil Burch, one of the researchers.