Complexity of Checkers and Draughts on board sizes 6x6/8x8/10x10/12x12/14x14The complexity of games is usually expressed by two numbers: the state-space complexity (SSC) and the game-tree complexity (GTC). The SSC is the number of positions that can be reached from the initial position. The GTC is usually estimated by b^d, where b is the branching factor and d is the average depth (game length in ply).
I'm interested in the complexity of checkers on board sizes 6x6 (small) and 8x8 (normal), and the complexity of draughts on board sizes 6x6 (small), 8x8 (Russian/Brazilian), 10x10 (International), 12x12 (Canadian) and 14x14 (South-African, a.k.a. Dumm). So we have 2 sets of rules and 5 board sizes.
Attachment:
5_sizes_dia.png [ 13.33 KiB | Viewed 2725 times ]
State-Space ComplexityIn [1] the SSC of 8x8 checkers is calculated, and we can generalize the given formula to:
Attachment:
SSC formula.png [ 7.82 KiB | Viewed 2725 times ]
for NxN boards, where n = max. # of pieces on the board, m = N(N-2) / 4 = max. # of pieces per color, b = # black men, B = # black kings, w = # white men, W = # white kings, f = # black men on the first row, R = N/2 = # dark squares on 1 row, and S = N2 / 2 = # dark squares on the board.
Applying this formula on the 5 board formats, we obtain (for 1 color to move):
Code:
Board size SSC <=
6x6 63838220543 = 6.4 x 10^10
8x8 500995484682338672639 = 5.0 x 10^20 [1]
10x10 2301985676687843186738387267616767 = 2.3 x 10^33
12x12 5732895173337388545809530684864049716501476802559 = 5.7 x 10^48
14x14 4969247364259229765636136727875248396725305632355107910329548406783 = 5.0 x 10^66
These are only upper bounds; the real numbers are (much) lower, even if we take into account that we need the number for both colors to move (a factor 2). For 6x6 checkers and 6x6 draughts I calculated the exact numbers, as they can be counted using a simple non-recursive function that traverses the entire game tree, keeping a large array in RAM with 1 bit per position (for both colors to move: 63838220543 x 2 = 127676441086 positions).
Code:
Game SSC % of all Max.Depth Time Nodes/sec
6x6 checkers 27,364,174,047 21.4% 407,888 14:07:29 538,146
6x6 draughts 10,075,772,480 7.9% 123,536 5:33:57 502,858
In [2] an estimate is given for the SSC of 10x10 draughts: 10^30 (no calculation given).
Game-Tree ComplexityIn [2] estimates are given for the GTC of 8x8 checkers (b^d = 2.8^70 = 10^31) and 10x10 draughts (b^d = 4^90 = 10^54). These are still often cited, see e.g. [4]. However, in [3] the GTC of 8x8 checkers was corrected to b^d = 6.14^50 = 10^40, using several games databases.
10x10 draughts needs a similar correction. Using all 500,000+ games from the database Turbo Dambase [5], I find
b^d = 8.22^101 = 2.5 x 10^92. So like checkers much higher than previously estimated.
The GTC of the other variants is unknown. To estimate these, we need to know the branching factor and the average game length. These are determined experimentally. To this end, I developed a basic engine (named 'General') that can play all games at a reasonable level. Properties: data structure: single-dimensional array; search: principal variation search, transposition table, history heuristic, repetition check, single-move extension, quiescence search (side captures only), single thread, no forward pruning; evaluation: material (king = 1.5 men for checkers, king = sqrt(N) men for NxN draughts, runaways, outpost safety, balance, center control, mobility, holes, drawish material, trapped kings for checkers, mainly missing element: locks; parameters are hand-tuned; no opening book; no endgame databases; randomized: shuffle root move list before search. Tested for 10x10 draughts, the version with positional evaluation scored 78.5% against the material-only version (158 games, 5 sec/move).
The resulting program can beat me, a former 10x10 draughts club player, in all variants. The 10x10 version scores about 1 draw per 20 games against Maximus (equal time), which amounts to expert level.
Before starting the experiment (playing a large amount of games) the following questions have to be answered.
1. When do we stop a game that will probably end in a draw?
2. How much time (or depth) do we give the program, for games to be representative?
3. How many games do we play?
For draughts there is a natural moment to stop a game: if the search score is near zero and we have drawish material: the player who is behind has a king and the other player has 3 or less pieces (10x10, 12x12, 14x14) or 2 or less pieces (6x6, 8x8; exception: 1 king against 3 is also a draw when the king-alone is on the main diagonal - if I am not mistaken).
For checkers there is no such moment. 2 kings win against 1, and sometimes also 1 against 1. The solution is to play with a high max. game length, and later truncate the games when a certain number of pieces is left. This corresponds to stopping when a virtual endgame database is reached (and we only want to know the game length, not the result). And this corresponds roughly to human players agreeing to a draw or resigning when they can predict the outcome. For 8x8 checkers the game length corresponds to the game length in human games if a 10-pieces (virtual) database is used - the largest feasible database that actually exists. For 10x10 draughts this is 8-pieces, for 12x12 draughts this would be 7-pieces, and for 14x14 draughts 6-pieces. For 6x6 checkers and 6x6 draughts I generated the 8-pieces database, but truncating games at 8 pieces results in artificially short games so I used 6-pieces in these cases (a bit arbitrary).
It is unclear which time or depth would be appropriate, so I chose to use incremental depth, which also makes it possible to observe the trend. I decided to play 10,000 games at each depth from 0 to at least 7, giving at least 70,000 games per variant. The games were stored, later truncated, duplicate games were removed, then the averages per game and per position were calculated. In total, over 600,000 games were played, of which over 450,000 unique (182 CPU days; 21 calendar days on 3 multi-core computers).
Attachment:
Branching Factor.png [ 18.1 KiB | Viewed 2725 times ]
Each datapoint represents <= 10,000 unique games.
Attachment:
Game Length.png [ 24.61 KiB | Viewed 2725 times ]
The branching factors found for 8x8 checkers and 10x10 draughts are higher than the previously found values for human games (checkers: 6.32 vs. 6.14; draughts: 8.95 vs. 8.22). This is influenced by the value of the evaluation mobility parameter. A lower value for mobility results in a lower branching factor. I tried several values and decided to use the one that resulted in the strongest play. Also the game length for 10x10 draughts is 10 ply more than in human games. But human games contain blunders, losing on time and grandmaster-draws, so the question is which number is more reliable.
Anyway, it means that we should take the following results with a grain of salt, they are ballpark figures. For the branching factor I took the average over all games with search depth 3 and higher, for the game length I took the average over all games with the 3 highest search depths. This gives:
Code:
Game Trunc.#p. b^d GTC >=
6x6 checkers 6 4.37^38 = 2.2 x 10^24
8x8 checkers 10 6.32^49 = 1.7 x 10^39
6x6 draughts 6 3.75^19 = 8.1 x 10^10
8x8 draughts 10 6.13^43 = 7.3 x 10^33
10x10 draughts 8 8.95^110 = 5.0 x 10^104
12x12 draughts 7 13.36^200 = 1.4 x 10^225
14x14 draughts 6 18.42^320 = 7.8 x 10^404
6x6 checkers and 6x6 draughts were also weakly solved (both are a draw), by building the 8-pieces WDL (win/draw/loss) databases and by using PNS (proof-number search), without a transposition table to avoid the GHI problem. But even the 4-pieces db's are sufficient for a quick proof (a few seconds). These games were previously solved by Rein Halbersma [6], using alpha-beta and a transposition table but without endgame databases (and disregarding GHI).
The 8-pieces DTW (distance-to-win) databases for 6x6 checkers and 6x6 draughts were also built. The longest win in 6x6 checkers is 147 ply (with 68 consecutive king-slide-moves), in 6x6 draughts this is 83 ply.
ConclusionsThe following figure shows the results found, compared to some popular games from literature.
Attachment:
Complexity.png [ 22.14 KiB | Viewed 2725 times ]
Concluding, we can say that:
1. 6x6 checkers and 6x6 draughts are weakly solved.
2. 8x8 checkers is weakly solved [3]. 8x8 draughts is a candidate to weakly solve, and it can be solved much faster than checkers.
3. 10x10 draughts is more complex than currently "known" in the literature. The complexity is closer to chess than to checkers. Weakly solving 10x10 draughts is far away.
4. 12x12 draughts is more complex than chess, and very tactical (is my impression; I lost all test games).
5. 14x14 draughts is more complex than Go, and very strategic (is my impression; I won a few test games).
More complex means a game has a higher GTC, not necessarily it is more difficult. Especially 14x14 draughts will be a tough game for computers. Methods that are known to be highly successful for 8x8 checkers and 10x10 draughts will start to break down. There is virtually no use for an opening book and endgame databases, search depth will be strategically insignificant (compared to game length), and evaluation becomes increasingly difficult (to learn). So maybe AlphaGo-like techniques start to get useful here. And we don't have to get bored in the years to come.
Jan-Jaap
References[1] Schaeffer & Lake: Solving the Game of Checkers (1996)
[2] Allis: Searching for Solutions in Games and Artificial Intelligence (1994)
[3] Schaeffer: Game over, Black to play and draw in Checkers (2007)
[4]
https://en.wikipedia.org/wiki/Game_complexity[5]
http://www.turbodambase.nl/tdamhome.php[6]
http://www.laatste.info/bb3/viewtopic.php?f=53&t=4013&start=698