Internet engine matches

Discussion about the development of draughts in the age of computers and the Internet.
BertTuyt
Posts: 1328
Joined: Wed Sep 01, 2004 19:42

Re: Internet engine matches

Post by BertTuyt » Wed Jul 29, 2015 18:15

Maybe in several years we will only remember the answer, which is 17, but will have forgotten the question.
We are lucky that the Elo difference is not 42.

Bert

Fabien Letouzey
Posts: 285
Joined: Tue Jul 07, 2015 07:48
Real name: Fabien Letouzey

Re: Internet engine matches

Post by Fabien Letouzey » Wed Jul 29, 2015 20:24

Rein Halbersma wrote: So Kingsrow is only about 17 ELO points better than Scan.
I guess 17 Elo is a lot in draughts, although this would have little impact in a tournament.

I also assume that King's Row is not using large-scale machine learning, so this is important: ML is an alternative rather than a must-have as it was in Othello. It seems that one can use either maths or knowledge/intuition + a lot of testing; I find that interesting.

Rein Halbersma
Posts: 1635
Joined: Wed Apr 14, 2004 16:04

Re: Internet engine matches

Post by Rein Halbersma » Wed Jul 29, 2015 22:09

Fabien Letouzey wrote:
Rein Halbersma wrote: So Kingsrow is only about 17 ELO points better than Scan.
I guess 17 Elo is a lot in draughts, although this would have little impact in a tournament.
Actually, it's a bit more complicated. The normal Elo system uses a binomial logistic distribution to model predicted scores. In draughts, the drawing margin is so high that a trinomial logistic distribution is more appropriate. Then if F[x] = 1/(1+10^(-x/400)) (the usual Elo formula), you have to fit F[-draw+delta] == WIN_RATE and F[-draw-delta] == LOSS_RATE. Plugging in the numbers, I (well, Mathematica...) find that draw = 438 and delta = 62 for the above match of 981 games.

So in effect, the high drawing margin is hiding a much greater strength difference (62 points instead of the apparent 17); with a drawing margin around 0, the full difference would show in the score. The drawing margin of over 400 points is really killing the game.
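
The trinomial fit described above can be sketched in a few lines of Python. This is only a rough sketch, not the Mathematica session mentioned in the post: the function names (`logistic`, `logit`, `outcome_rates`, `fit_draw_delta`, `apparent_elo`) are made up for illustration, and it assumes the win and loss rates of the stronger side are modelled as F(delta - draw) and F(-delta - draw), since that is the reading that reproduces the 17-point figure from draw = 438 and delta = 62.

```python
import math


def logistic(x):
    """Usual Elo curve F(x) = 1 / (1 + 10^(-x/400))."""
    return 1.0 / (1.0 + 10.0 ** (-x / 400.0))


def logit(p):
    """Inverse of logistic(): the rating gap x for which F(x) = p."""
    return 400.0 * math.log10(p / (1.0 - p))


def outcome_rates(draw, delta):
    """Win, draw and loss rates of the stronger side (assumed trinomial model)."""
    win = logistic(delta - draw)
    loss = logistic(-delta - draw)
    return win, 1.0 - win - loss, loss


def fit_draw_delta(win_rate, loss_rate):
    """Solve F(delta - draw) = win_rate and F(-delta - draw) = loss_rate."""
    a, b = logit(win_rate), logit(loss_rate)
    return -(a + b) / 2.0, (a - b) / 2.0  # (draw, delta)


def apparent_elo(draw, delta):
    """Elo gap implied by the average score, counting a draw as half a point."""
    win, tie, _ = outcome_rates(draw, delta)
    return logit(win + 0.5 * tie)


if __name__ == "__main__":
    win, tie, loss = outcome_rates(438, 62)
    print(f"win {win:.3f}, draw {tie:.3f}, loss {loss:.3f}")
    print(f"apparent Elo gap: {apparent_elo(438, 62):.1f}")
    print("refit (draw, delta):", fit_draw_delta(win, loss))
```

With draw = 438 and delta = 62 the apparent gap comes out close to 17 points, while `apparent_elo(0, 62)` returns the full 62: under this model the drawing margin alone accounts for the compression.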

As a consequence, I don't agree with Bert's conjecture that draughts programs are near perfect play. Even with a much smaller difference, I still think there is a lot of missing knowledge in draughts programs that leads to suboptimal positional play. It's just not exposed because every opponent is also missing that knowledge. A high drawing margin only means that programs are equally efficient in their current search/knowledge, not with respect to a hypothetical perfect play standard.
Fabien Letouzey wrote: I also assume that King's Row is not using large-scale machine learning, so this is important: ML is an alternative rather than a must-have as it was in Othello. It seems that one can use either maths or knowledge/intuition + a lot of testing; I find that interesting.
I wonder what the limits of machine learning are. It seems ideally suited to exploit local patterns (in particular locks, perhaps breakthroughs), but global concepts (total terrain advancement, tempo development etc.) seem hard to fit with overlapping patterns. What are your thoughts on it?

Fabien Letouzey
Posts: 285
Joined: Tue Jul 07, 2015 07:48
Real name: Fabien Letouzey

Re: Internet engine matches

Post by Fabien Letouzey » Thu Jul 30, 2015 07:29

Rein Halbersma wrote: I wonder what the limits of machine learning are. It seems ideally suited to exploit local patterns (in particular locks, perhaps breakthroughs), but global concepts (total terrain advancement, tempo development etc.) seem hard to fit with overlapping patterns. What are your thoughts on it?
One big limit of supervised learning is that it ignores search/eval interaction. It's possible that reinforcement learning doesn't share this problem.

Actually I think the main strength is in the little things: a higher-order PST. Patterns are not just approximations of the standard features. I make a connection with human vision.

Global concepts are out of reach of learning only if they are highly non-linear, like left/right balance. Tempi, for instance, can be calculated exactly. An example of a non-linear global concept would be position type.

Regarding how to improve it, there are many directions. Michel went on to use bigger patterns, and Michael Buro was also following that road in the GLEM paper. Using more non-linearity is another way, approaching ANNs in structure. See for instance Hannibal in Othello (http://satirist.org/learn-game/systems/ ... nibal.html). And then there's how to generate and weight the examples (in the supervised case). It has a big impact because statistical methods are very sensitive to correlation.

MichelG
Posts: 244
Joined: Sun Dec 28, 2003 20:24

Re: Internet engine matches

Post by MichelG » Thu Jul 30, 2015 09:32

Rein Halbersma wrote: I wonder what the limits of machine learning are. It seems ideally suited to exploit local patterns (in particular locks, perhaps breakthroughs), but global concepts (total terrain advancement, tempo development etc.) seem hard to fit with overlapping patterns. What are your thoughts on it?
If you look at the current developments in machine learning and vision (with neural nets, you can identify that a dog of a certain breed is catching a frisbee), it is easy to imagine that global concepts can be learned as well. Whether (slow) complex neural nets would work better than the simple but fast local patterns is an interesting question.

BertTuyt
Posts: 1328
Joined: Wed Sep 01, 2004 19:42

Re: Internet engine matches

Post by BertTuyt » Sun Aug 02, 2015 21:34

Rein Halbersma wrote: I wonder what the limits of machine learning are. It seems ideally suited to exploit local patterns (in particular locks, perhaps breakthroughs), but global concepts (total terrain advancement, tempo development etc.) seem hard to fit with overlapping patterns. What are your thoughts on it?
I'm convinced that machine learning will eventually surpass any form of human learning. And I'm really glad that programs like Scan and Dragon showed us that this is the way forward.
In the early days of computer draughts, Truus (Stef Keetman) and Flits (Adri Vermeulen) dominated the field, followed a little later by Buggy (Nicolas Guibert). These three authors had in common that, next to their programming skills, they were very strong at draughts (all three were highly ranked players). Nowadays this is different: I think Klaas Bor and Ton Tillemans are very good draughts players themselves, but this is no longer a precondition for creating the best program.

So I really hope that others in computer draughts will follow the ML path, and share their results and methods.
Anyway, in my case, I will stop all hand-tuning of the evaluation function of Damage, and will focus only on ML.

Bert

Krzysztof Grzelak
Posts: 782
Joined: Thu Jun 20, 2013 17:16
Real name: Krzysztof Grzelak

Re: Internet engine matches

Post by Krzysztof Grzelak » Tue Aug 04, 2015 10:22

Match KINGSROW - SCAN (3-move ballots - 988 games)

Kingsrow 1.56 vs. Scan 2.0 96 wins, 26 losses, 863 draws, 3 unknowns

Kingsrow

Opening Book = Best Moves
HashTable Size = 128 MB
DB cache Size = 10000 MB 6P
CPU 4 = 8 cores
Pondering = Off
Time = 5 Min / 80 Moves

Scan

Opening Book = book-margin = 0
DB cache Size = 2000 MB, 6P
CPU 4 = 8 cores
Pondering = Off
Time = 5 Min / 80 Moves

The match was played on a computer with the following hardware.

Processor - Intel Core I7 2670QM, 2.2GHz
Hard disc - SSD Samsung 840 Evo 1 TB
RAM - 16 GB DDR3 1333
System - Windows 7 Home Premium 64 bit Service Pack 1 PL


This was the last test of the programs Kingsrow and Scan. Thanks to Ed Gilbert and Fabien Letouzey.
Attachments
dxpgames.pdn
(1.02 MiB) Downloaded 51 times

Ed Gilbert
Posts: 768
Joined: Sat Apr 28, 2007 14:53
Real name: Ed Gilbert
Location: Morristown, NJ USA

Re: Internet engine matches

Post by Ed Gilbert » Wed Aug 05, 2015 20:16

Hi Krzysztof,

Thank you for posting these match results, but I think there might be some problem with your configuration of scan. I just now got scan running (none of my computers support popcount so I had to download VS2015 and recompile it using a software emulation for popcount). I'm getting results different than yours, with scan looking stronger than kingsrow. Can you post your scan.ini file here, and also check that you have all of its endgame db files? Maybe Fabien or others that are more familiar with scan can suggest other checks.

-- Ed

mtaktikos
Posts: 6
Joined: Wed Aug 05, 2015 17:56
Real name: Michael Taktikos

Re: Internet engine matches

Post by mtaktikos » Wed Aug 05, 2015 20:42

Ed Gilbert wrote: I just now got scan running (none of my computers support popcount so I had to download VS2015 and recompile it using a software emulation for popcount).
-- Ed
Hi Ed, today I succeeded in compiling Mobydam for non-popcnt computers (the compile is shared in the Mobydam thread). Would you be so kind as to share your non-popcnt compile of Scan with us?

Regards,
Michael Taktikos

Ed Gilbert
Posts: 768
Joined: Sat Apr 28, 2007 14:53
Real name: Ed Gilbert
Location: Morristown, NJ USA

Re: Internet engine matches

Post by Ed Gilbert » Wed Aug 05, 2015 21:15

http://edgilbert.org/InternationalDraug ... ources.zip

New files and changed files (for Windows compile). Also add bitcount.cpp to the project.

Krzysztof Grzelak
Posts: 782
Joined: Thu Jun 20, 2013 17:16
Real name: Krzysztof Grzelak

Re: Internet engine matches

Post by Krzysztof Grzelak » Wed Aug 05, 2015 21:26

Ed Gilbert wrote: Hi Krzysztof,

Thank you for posting these match results, but I think there might be some problem with your configuration of scan. I just now got scan running (none of my computers support popcount so I had to download VS2015 and recompile it using a software emulation for popcount). I'm getting results different than yours, with scan looking stronger than kingsrow. Can you post your scan.ini file here, and also check that you have all of its endgame db files? Maybe Fabien or others that are more familiar with scan can suggest other checks.

-- Ed
Ed, I am using the 6-piece endgame db. This is how my ini file is saved:

book = true
book-margin = 0
threads = 4
tt-size = 6
bb-size = 6

dxp-server = true
dxp-host = 127.0.0.1
dxp-port = 27531
dxp-initiator = false
dxp-time = 5
dxp-moves = 80
dxp-board = true
dxp-search = true

Ed Gilbert
Posts: 768
Joined: Sat Apr 28, 2007 14:53
Real name: Ed Gilbert
Location: Morristown, NJ USA

Re: Internet engine matches

Post by Ed Gilbert » Wed Aug 05, 2015 21:34

Krzysztof Grzelak wrote: Ed, I am using the 6-piece endgame db. The file you asked for is below.

http://www83.zippyshare.com/v/eNG6DLYf/file.html
To access the file I must first install FreeGamesZone on my PC, which I do not want to do. The scan.ini file should be just a few lines. Can you copy/paste to here?

I am particularly interested to see the "threads =" entry. You wrote,
CPU 4 = 8 cores
Does that mean that your scan.ini entry is "threads = 8"?

-- Ed

Ed Gilbert
Posts: 768
Joined: Sat Apr 28, 2007 14:53
Real name: Ed Gilbert
Location: Morristown, NJ USA

Re: Internet engine matches

Post by Ed Gilbert » Wed Aug 05, 2015 21:43

tt-size = 6
This is the problem. The transposition table is much too small. Change it to

tt-size = 24

and I think you will get much different results.

-- Ed

Krzysztof Grzelak
Posts: 782
Joined: Thu Jun 20, 2013 17:16
Real name: Krzysztof Grzelak

Re: Internet engine matches

Post by Krzysztof Grzelak » Wed Aug 05, 2015 21:57

The entry "CPU 4 = 8 cores" means that the program uses 4 CPUs, while the computer supports 8 cores in total. I think that tt-size = 24 is too much. Fabien uses that figure but uses all 4 endgame db. I once changed it to 30, and the program then thought for a long time at the end of the game.

Ed Gilbert
Posts: 768
Joined: Sat Apr 28, 2007 14:53
Real name: Ed Gilbert
Location: Morristown, NJ USA

Re: Internet engine matches

Post by Ed Gilbert » Wed Aug 05, 2015 22:05

The number of transposition table entries is 2^tt-size, that is, 2 raised to the power of tt-size. If tt-size is 6, the number of entries is 2^6 = 64. If tt-size is 24, then the number of entries is 2^24 = 16.7 million. Each tt entry is 16 bytes, so 16.7 million entries is 256 MB. I don't know how Scan behaves with different tt sizes, but clearly 6 is much too small. I would guess a value of 23 would be good for matches of 3 to 5 minutes. It's probably not critical (but 6 is very bad).

-- Ed
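
The arithmetic in the post above is easy to check with a short script. The sketch below assumes, as stated there, that the table holds 2^tt-size entries of 16 bytes each; the helper names are hypothetical and nothing here is taken from Scan's source code.

```python
def tt_entries(tt_size):
    """Number of transposition table entries for a given tt-size exponent."""
    return 1 << tt_size


def tt_bytes(tt_size, entry_bytes=16):
    """Table size in bytes; 16 bytes per entry is the figure quoted in the post."""
    return tt_entries(tt_size) * entry_bytes


if __name__ == "__main__":
    for size in (6, 23, 24):
        mib = tt_bytes(size) / 2**20
        print(f"tt-size = {size}: {tt_entries(size)} entries, {mib:g} MiB")
```

Under these assumptions tt-size = 6 gives 64 entries (1 KiB in total), tt-size = 23 gives 128 MiB, and tt-size = 24 gives 16,777,216 entries (256 MiB).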
