Flits self-learning mode

Discussion about development of draughts in the time of computer and Internet.
Fabien Letouzey
Posts: 295
Joined: Tue Jul 07, 2015 07:48
Real name: Fabien Letouzey

Re: Flits self-learning mode

Post by Fabien Letouzey » Sat May 23, 2020 20:05

For me, the key is how much Ed's code is actually contributing to Bert's evaluation weights. From the following quote from Bert (that I have no reason to doubt), it's focusing on the maths part. I feel that it's fine in this case, and this also goes in the direction Rein is advocating, that many 3rd party libraries are also providing the maths part to everybody anyway. It's not like they have "pattern learning" as an algorithm (maybe one day).
The tool came with source, so I could change the input. I wanted the optimization tool to have no knowledge about the evaluation, so I separated them. All the knowledge about the evaluation is in the tool I made (called cpn), which reads the .pdn files, derives the relevant positions, and constructs a feature vector for every position. I have 2 types of features: pattern features (where I only provide the pattern index to the optimization tool) and property features, which have a numerical value.
(BTW I also insist on separating them, which is why I assumed from the beginning that "optimizer" could only refer to the maths part, but it was only a guess).
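
A minimal sketch of the separation described in the quote, to make the interface concrete. All names here (`FeatureVector`, `evaluate`, the result encoding) are hypothetical and not taken from cpn or from Ed's optimizer; the point is only that the optimizer side sees pattern indices and numeric property values, never a board.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class FeatureVector:
    # One index per pattern: which instance of pattern p occurs in the position.
    pattern_indices: List[int]
    # Numeric property features (for example a material difference); assumed here.
    properties: List[float]
    # Training target; the encoding -1 / 0 / +1 is an assumption.
    result: float

def evaluate(fv, pattern_weights, property_weights):
    """Linear evaluation: one table lookup per pattern plus weighted properties."""
    score = 0.0
    for p, idx in enumerate(fv.pattern_indices):
        score += pattern_weights[p][idx]
    for w, x in zip(property_weights, fv.properties):
        score += w * x
    return score

# Toy setup: two patterns of 2 squares each (3**2 = 9 instances), one property.
pattern_weights = [[0.0] * 9, [0.0] * 9]
pattern_weights[0][4] = 0.5
pattern_weights[1][7] = -0.25
property_weights = [0.1]

fv = FeatureVector(pattern_indices=[4, 7], properties=[2.0], result=1.0)
score = evaluate(fv, pattern_weights, property_weights)
```

With this split, everything game-specific lives in the code that produces `FeatureVector` records, and the fitting code could in principle be swapped without touching it.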

If, in a parallel universe, Ed's code were actually doing everything (convert draughts games/positions to a final weight vector), then I would feel it's (way) too much the same as Kingrow; fine for personal experiments, though.

While I felt uneasy after congratulating Bert perhaps a bit quickly (not being sure what Ed's code was doing), from this quote I now see that Bert has done the work I expect of any game programmer. I'm not sure that the missing gradient piece is so important anymore (and it might become a common developer tool in the future).

Although I don't use 3rd party libraries, I'll take a shot in the dark and give Jan-Jaap a possibly-missing piece of the puzzle. At first sight, it seems far-fetched that those libraries could help learning patterns at all. However, from what I consider a historical coincidence, natural language processing (NLP) also requires sparse vectors (or at least used to). For this reason, possibly alone, ML libraries usually have some way to cope with sparse features. Rein found mention of "wide learning" (if I remember correctly) in TF for instance.

Fabien.

Rein Halbersma
Posts: 1664
Joined: Wed Apr 14, 2004 16:04
Contact:

Re: Flits self-learning mode

Post by Rein Halbersma » Sat May 23, 2020 21:16

Ed Gilbert wrote:
Sat May 23, 2020 19:08
The gradient descent code itself is probably best done using a professional library.
Rein, I'm sure you know much more about this than I do, but from my point of view it was easier to write my own. I looked at a few of the big packages, and I was intimidated by what looked like a non-trivial exercise to figure out how to use them. I was not familiar with some of the terminology. In the past some of my biggest headaches have been trying to understand why some big black-box software doesn't do what I want it to, like with the GDI interface in Windows, and some other Windows APIs that can be hard to use and are not always very well documented. And gradient descent seemed simple enough. In retrospect, gradient descent was a little more difficult than I first estimated, but I still think I made the right decision. What difficulties I had were primarily getting the right cost function, and finding the right way to create training games. My optimizer can converge on all the weights using 200M training positions in about 1 hour, using 8 threads. Would TensorFlow be able to do that? I don't know. But at least I know what my code is doing, and I don't have to fight with it.

I can also observe that Fabien wrote his own GD optimizer, rather than use an existing library, and he seems to have a lot of experience with ML. He calls it a "rite of passage", and he may be right. If these ML libraries were the easiest way to create eval weights, why is it that no draughts programmer has used them yet?
I didn't say a library is the easiest way, but it is most definitely the most accurate way. One reason it isn't easy is that the pattern weights are a very special case not easily adapted to existing ML frameworks. The way most ML libraries work is that you have either continuous variables (e.g. material counts) or binary features (a specific pattern instance). With the Scan-like patterns, each N-square pattern has 3^N values. Normal ML libraries would expect 3^N columns in your data file with 0/1 entries (called "dummy" variables) indicating whether that specific pattern instance is present. They then compute the cost function as a sum of squares of linear combinations of weights times the dummies, so e.g. (result - sum_i weight_i * dummy_i)^2, with i running from 1 to 3^N for each pattern.
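
To illustrate the dummy-variable view, here is a small sketch showing that the dense sum over 3^N dummies reduces to a single table lookup. The base-3 square coding (0 = empty, 1 and 2 = the two colours) and the helper names are assumptions made for the example, not Scan's actual encoding.

```python
def pattern_index(squares):
    """Base-3 index of a pattern instance; square coding 0/1/2 is assumed."""
    index = 0
    for s in squares:
        index = index * 3 + s
    return index

def one_hot(index, size):
    """Dense dummy-variable row: all zeros except a single 1."""
    row = [0] * size
    row[index] = 1
    return row

N = 4                       # squares in the pattern (toy size)
size = 3 ** N               # 81 possible instances
weights = [0.01 * i for i in range(size)]   # arbitrary illustrative weights

squares = [1, 0, 2, 1]      # one concrete pattern instance
idx = pattern_index(squares)

# What a generic library computes: sum_i weight_i * dummy_i over all columns.
dense = sum(w * d for w, d in zip(weights, one_hot(idx, size)))
# What is actually needed: one lookup.
sparse = weights[idx]
```

Both expressions give the same number, but the dense form touches every column while the sparse form touches one weight per pattern.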

With N = 8 and a dozen patterns and 200M positions, your data file explodes and you won't be able to hold it in RAM. <EDIT>: not to mention that with P patterns, you only want to load P different weights, not run the sum over all pattern instances that are not present! </EDIT> A few years ago (remember TensorFlow was released at the end of 2015, a few months after Scan 1.0!) doing batched I/O for large data sets was all very cumbersome. Nowadays, TensorFlow or Keras (a layer on top of TensorFlow) have user-friendly batched I/O operations that will take care of terabytes of data. I should go back and check this myself.
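
A back-of-the-envelope calculation of that explosion, using the figures from the post (N = 8, a dozen patterns, 200M positions). The storage sizes are assumptions: one byte per 0/1 dummy in the dense layout, and one 16-bit index per pattern in the sparse layout (3^8 = 6561 fits in 16 bits).

```python
N = 8                    # squares per pattern
patterns = 12            # "a dozen patterns"
positions = 200_000_000  # 200M training positions

columns = patterns * 3 ** N               # dummy columns per position
dense_bytes = positions * columns         # assumed: 1 byte per 0/1 entry
sparse_bytes = positions * patterns * 2   # assumed: one 16-bit index per pattern

dense_tb = dense_bytes / 1e12             # terabytes
sparse_gb = sparse_bytes / 1e9            # gigabytes
```

Under these assumptions the dense file has 78,732 columns and needs roughly 15.7 TB, while the index-only file needs about 4.8 GB.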

So the way to use ML libraries for sparse patterns is to read in the pattern indices, and attach the appropriate pattern instance inside the cost function to the particular weight. So e.g. something like (result - weight[index])^2. For most ML libraries that I know, this requires a non-trivial amount of coding the cost function and especially encoding the gradient (the gradient problem is solvable by using a library that can use automatic differentiation). In TensorFlow, they have a special feature called "wide learning" that does the index-bookkeeping for you. But I have little experience with that library and extracting the weights is a bit of a black art. TensorFlow expects you to run the fitted model also from within TensorFlow, whereas for draughts you just want the weights file, not call TensorFlow during a search.
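
A minimal hand-written sketch of the index-based cost function and its gradient, in plain Python with toy sizes. It is not Ed's, Fabien's or any library's implementation; the function names, learning rate and synthetic data are all assumptions chosen to keep the example self-contained.

```python
import random

def sgd_epoch(data, weights, lr):
    """One pass of stochastic gradient descent on the squared error.

    data: list of (pattern_indices, result); weights: one table per pattern.
    """
    for indices, result in data:
        pred = sum(weights[p][i] for p, i in enumerate(indices))
        err = pred - result
        # The derivative of (pred - result)^2 is 2 * err for each weight that
        # was looked up and 0 for all others, so only P weights are updated.
        for p, i in enumerate(indices):
            weights[p][i] -= lr * 2.0 * err

def mean_squared_error(data, weights):
    total = 0.0
    for indices, result in data:
        pred = sum(weights[p][i] for p, i in enumerate(indices))
        total += (pred - result) ** 2
    return total / len(data)

random.seed(0)
P, SIZE = 3, 9   # 3 patterns with 9 instances each (toy sizes)
true_w = [[random.uniform(-1, 1) for _ in range(SIZE)] for _ in range(P)]

# Synthetic, noise-free training data generated from the hidden weights.
data = []
for _ in range(2000):
    idx = [random.randrange(SIZE) for _ in range(P)]
    data.append((idx, sum(true_w[p][i] for p, i in enumerate(idx))))

weights = [[0.0] * SIZE for _ in range(P)]
before = mean_squared_error(data, weights)
for _ in range(50):
    sgd_epoch(data, weights, lr=0.05)
after = mean_squared_error(data, weights)
```

The fitted tables are just nested lists, so writing them out as a weights file for use during search needs no ML runtime at all.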

In any case, very specific feature requirements and perhaps a lack of familiarity with high-quality numerical software might be the main reasons for lack of use.
Last edited by Rein Halbersma on Sat May 23, 2020 21:34, edited 1 time in total.

Rein Halbersma
Posts: 1664
Joined: Wed Apr 14, 2004 16:04
Contact:

Re: Flits self-learning mode

Post by Rein Halbersma » Sat May 23, 2020 21:30

Rein Halbersma wrote:
Sat May 23, 2020 21:16
In any case, very specific feature requirements and perhaps a lack of familiarity with high-quality numerical software might be the main reasons for lack of use.
Another reason might be the cognitive load from learning new tools and also the tight coupling of different parts of a software program that is a common pitfall. Take e.g. your own Kingsrow GUI. It looks beautiful and works like a charm, and I have nothing to criticize about it. But I bet now that you have written a KR-hub version, that you wish you had decoupled your engine from your GUI from the start, am I right? You could still do that: rewrite your existing GUI with the same look and feel and let it load KR-hub. If you would succeed in that, you have a GUI that others can plug their hub-engines into (I'd love to have Scan in the KR GUI!), and others like Klaas Bor can plug the KR-engine into their own GUI. It would also make it a lot easier to port KR to Linux e.g., and then others can write a Hub-GUI there (use Fabien's GUI or Wieger's Dambo applet).

I have fallen victim to the trap of tight coupling myself as well. I'm now starting a Stratego program, and I should have been able to re-use many components from my draughts library. Except I wasn't, because e.g. the board code was completely coupled to draughts specific stuff. I have now factored out all board geometry stuff into a separate library called Tabula (https://github.com/rhalbersma/tabula). Using a more general approach allows me to create draughts boards, Stratego boards, or even chess, Shogi and Othello boards. I'm now also porting the legal move tables to this library, so that I can generate legal moves for all kinds of pieces (draughts man/king, chess pieces, Stratego pieces) by sliding/jumping over the generalized board framework. For my DCTL library, I only need to keep the jump-generation and majority capture precedence rules since that is very specific to draughts. All sliding / moving pieces can benefit from a general library. I guess other people trying to support multiple boards for draughts have had the same experience that hardcoded square numberings are hard to fix.

I suspect that as people develop programs, they start by adding their own small pieces of specialized code, rather than learning new libraries and also having to maintain these dependencies. And once the initial investment in your own code has been done, it pays off to keep adding to it. Whereas if you look at the end product, in hindsight it might have been a better idea to make the big upfront investment. These trade-offs are very hard to get right. Heck, I even went the other way in my Tabula library, removing a dependency on a very high-quality Boost library for 80 lines of my own code. But I still kept the same naming conventions and API for everything that I replaced from the Boost library, so that I can easily go back to it should I ever need to (just replace my own tabula:: namespace identifier with boost::).

Ed Gilbert
Posts: 792
Joined: Sat Apr 28, 2007 14:53
Real name: Ed Gilbert
Location: Morristown, NJ USA
Contact:

Re: Flits self-learning mode

Post by Ed Gilbert » Sat May 23, 2020 22:29

Another reason might be the cognitive load from learning new tools and also the tight coupling of different parts of a software program that is a common pitfall. Take e.g. your own Kingsrow GUI. It looks beautiful and works like a charm, and I have nothing to criticize about it. But I bet now that you have written a KR-hub version, that you wish you had decoupled your engine from your GUI from the start, am I right?
Yes, definitely. Hub is a nice clean interface, I like it a lot. I wish I had the foresight years ago to create something like it.
You could still do that: rewrite your existing GUI with the same look and feel and let it load KR-hub. If you would succeed in that, you have a GUI that others can plug their hub-engines into (I'd love to have Scan in the KR GUI!)
Yes, I have considered doing that, mainly to make my own developing easier by not having to support 2 interfaces into the engine. As a general purpose GUI, Kingsrow falls far short I'm afraid. Other engines will want different engine options, for example. Writing a good GUI is a whole lot of work, maybe more than writing a strong engine. We need some talented UI developers to come along and write something for draughts, similar to the way they have been developed for chess.

Krzysztof Grzelak
Posts: 886
Joined: Thu Jun 20, 2013 17:16
Real name: Krzysztof Grzelak

Re: Flits self-learning mode

Post by Krzysztof Grzelak » Sun May 24, 2020 10:50

Ed Gilbert wrote:
Sat May 23, 2020 22:29
Yes, definitely. Hub is a nice clean interface, I like it a lot. I wish I had the foresight years ago to create something like it.
Hub is not the best solution for draughts programs.

Ed Gilbert
Posts: 792
Joined: Sat Apr 28, 2007 14:53
Real name: Ed Gilbert
Location: Morristown, NJ USA
Contact:

Re: Flits self-learning mode

Post by Ed Gilbert » Sun May 24, 2020 13:08

Recreating another engine's book from black box reverse engineering seems entirely above board to me. So not looking at the binary files or doing de-compilation, but just playing 1.5M openings and logging when the reply is instant should give you the entire variation tree but without the eval scores per node. Doing your own dropout expansion on this variation tree and computing your own engine's eval scores on it and trying to find best responses to the Kingsrow book, to me is part of the competitive process.
That's exactly what I was referring to when I said a book could be copied with a little bit of programming. If I'm correctly interpreting the replies so far, Bert, JJ, and I are opposed to it, Rein doesn't see a problem. I could see this leading to some kind of book wars, where at each iteration one program one-ups the opponent's book, but only if those books are publicly available. Maybe I'll run some tests to see just how much Elo difference a book makes. I'm guessing 1 or 2 Elo at most, but who knows?

-- Ed

Rein Halbersma
Posts: 1664
Joined: Wed Apr 14, 2004 16:04
Contact:

Re: Flits self-learning mode

Post by Rein Halbersma » Sun May 24, 2020 13:56

Ed Gilbert wrote:
Sun May 24, 2020 13:08
That's exactly what I was referring to when I said a book could be copied with a little bit of programming. If I'm correctly interpreting the replies so far, Bert, JJ, and I are opposed to it, Rein doesn't see a problem. I could see this leading to some kind of book wars, where at each iteration one program one-ups the opponent's book, but only if those books are publicly available. Maybe I'll run some tests to see just how much Elo difference a book makes. I'm guessing 1 or 2 Elo at most, but who knows?
Hi Ed,

It's not about copying the opening book, it's about countering another program. Suppose Kingsrow were only available from a website, but with HUB or DXP API to play against. And suppose you would mask the response time from book moves to make them look like Kingsrow was actually thinking about each move. Suppose I play 1M games against this Kingsrow version, and I build a dropout expansion book from these 1M games to find better moves the next time I face Kingsrow in a tournament. Would you consider anything wrong with that?

And I posed this question earlier in this thread: would it be wrong for you to pool all games you ever played against Scan, Truus, Flits, Damage and do dropout expansion on each particular opponent? To me, what you call a "book war" is just part of the natural competitive process.

I fully agree that copying a book verbatim to use it as your own book in tournaments or to enhance the sales of your own program is a lame move. But using published play of an opponent has always been part of competition (both for humans and machines). There are several passages in One Jump Ahead where the Chinook team is searching for "cooks" against Tinsley's published play.

Rein

Rein Halbersma
Posts: 1664
Joined: Wed Apr 14, 2004 16:04
Contact:

Re: Flits self-learning mode

Post by Rein Halbersma » Sun May 24, 2020 14:10

Ed Gilbert wrote:
Sun May 24, 2020 13:08
I'm guessing 1 or 2 Elo at most, but who knows?
My hunch is that it is indeed very little Elo-wise. And if a book does gain some Elo for a program that had none before, it is probably not because a Kingsrow book would suggest radically different moves than the program would make on its own, but mainly because a book pushes some part of the calculations offline, so that you save thinking time during games. In that sense, copying a book is not really copying move quality but copying time spent calculating. A copycat programmer should at the very least contribute to your electricity bill :-)

Ed Gilbert
Posts: 792
Joined: Sat Apr 28, 2007 14:53
Real name: Ed Gilbert
Location: Morristown, NJ USA
Contact:

Re: Flits self-learning mode

Post by Ed Gilbert » Sun May 24, 2020 15:33

It's not about copying the opening book, it's about countering another program. Suppose Kingsrow were only available from a website, but with HUB or DXP API to play against. And suppose you would mask the response time from book moves to make them look like Kingsrow was actually thinking about each move. Suppose I play 1M games against this Kingsrow version, and I build a dropout expansion book from these 1M games to find better moves the next time I face Kingsrow in a tournament. Would you consider anything wrong with that?
I wouldn't do it. I would only do dropout expansion with my own engine. Anyway, now I'm curious to see if the book makes any measurable difference. I suspect that the book in a program like Scan makes very little difference, because Scan makes very few errors, so the book is not going to prevent many errors, only buy a little bit of extra thinking time. For Flits it might be different. In my short test, Flits was losing half the games. If a book could prevent say 20% of those losses, it would be a significant Elo improvement. But it would still lose a lot of games.
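
A rough worked version of that estimate, using the standard logistic Elo formula. The post only says that Flits lost half the games; the assumption that every other game was a draw, and that each prevented loss becomes a draw, is made up here purely to get a number.

```python
import math

def elo_diff(score):
    """Elo difference implied by an expected score (logistic Elo model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

# Assumption (not from the post): every game that is not lost is drawn.
loss_before = 0.50
score_before = 0.5 * (1.0 - loss_before)      # 0.25 points per game

# The book prevents 20% of the losses; each one is assumed to become a draw.
loss_after = loss_before * (1.0 - 0.20)       # 0.40
score_after = 0.5 * (1.0 - loss_after)        # 0.30 points per game

gain = elo_diff(score_after) - elo_diff(score_before)
```

Under these assumptions the gain comes out at about 44 Elo, with the program still well over a hundred Elo behind its opponent, which matches the remark that it would still lose a lot of games.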

Sidiki
Posts: 171
Joined: Thu Jan 15, 2015 16:28
Real name: Coulibaly Sidiki

Re: Flits self-learning mode

Post by Sidiki » Sun May 24, 2020 16:22

Hi Ed

Can we know the maximal depth of your book, and whether it's generated by self-play or by a special engine dedicated to it?

I think that, as Krzysztof mentioned, after the end of a book each program has its own eval and will play with its "own" strength. Can an opening book contain the source code or eval algorithm of an engine?

The Flits book sent to you, Jaap and Bert continues even when Kingsrow's book finishes.

Once again, it's for fun, and sorry if this gives someone else ideas for countering Scan, Kingsrow, Maximus, Damage or Dragon.

Speaking of Dragon, version 4.6.2 is available.

Friendly,

Sidiki.

jj
Posts: 180
Joined: Sun Sep 13, 2009 23:33
Real name: Jan-Jaap van Horssen
Location: Zeist, Netherlands

Re: Flits self-learning mode

Post by jj » Sun May 24, 2020 18:32

Fabien Letouzey wrote:
Sat May 23, 2020 20:05
Although I don't use 3rd party libraries, I'll take a shot in the dark and give Jan-Jaap a possibly-missing piece of the puzzle. At first sight, it seems far-fetched that those libraries could help learning patterns at all. However, from what I consider a historical coincidence, natural language processing (NLP) also requires sparse vectors (or at least used to). For this reason, possibly alone, ML libraries usually have some way to cope with sparse features. Rein found mention of "wide learning" (if I remember correctly) in TF for instance.
Fabien, thanks for your shot in the dark. Not sure which puzzle you mean though. I also don't use 3rd party libraries.

Just to establish some rules, would you say it is allowed if I take Scan's eval, change some features but still use your eval file, put that in my own search and use that to play my first million training games?

Jan-Jaap

Fabien Letouzey
Posts: 295
Joined: Tue Jul 07, 2015 07:48
Real name: Fabien Letouzey

Re: Flits self-learning mode

Post by Fabien Letouzey » Mon May 25, 2020 13:10

jj wrote:
Sun May 24, 2020 18:32
Just to establish some rules, would you say it is allowed if I take Scan's eval, change some features but still use your eval file, put that in my own search and use that to play my first million training games?
I consider that using ML at all, especially when home-written, is enough to justify some personal contribution (i.e. the process itself will affect the output). You and I would only start from 0 knowledge, for various reasons, but I don't see why everybody should be constrained by this.

AlphaGo, like all Go programs of that time, used human games to learn shapes (policy network). Does that taint the result? Sure, and they worked hard afterwards to remove that dependency. Should it be banned? Why?

Of course, who contributed to the data should be mentioned.

Fabien.

jj
Posts: 180
Joined: Sun Sep 13, 2009 23:33
Real name: Jan-Jaap van Horssen
Location: Zeist, Netherlands

Re: Flits self-learning mode

Post by jj » Mon May 25, 2020 19:22

Fabien Letouzey wrote:
Mon May 25, 2020 13:10
Of course, who contributed to the data should be mentioned.
I'm not saying everybody should be constrained to zero knowledge and that all dependencies should be banned. I suppose it is normal to train on games played by previous versions of your engine, on games played against other available engines, etc. I do like the zero knowledge approach myself.

Personally, I think that if engine B is solely trained on games by engine A that is a borderline case. Then you can say that B is bootstrapped by A, maybe even that A acts as a blueprint for B. If A is a bad engine then your ML results will be poor, but if A is a world-class engine you will have more success. This success is borrowed from A. (A separate matter is if B also uses A's optimization code.)

I am surprised that this would be allowed, i.e. to call the result your original work. But if the community and in particular the lead programmer think it is fair then who am I to complain? Since the topic of fairness was brought up I wanted to include ML in the discussion.

So to make it concrete: should Bert mention Ed as co-author of Damage now or is it sufficient to mention him in the fine print?

Fabien Letouzey
Posts: 295
Joined: Tue Jul 07, 2015 07:48
Real name: Fabien Letouzey

Re: Flits self-learning mode

Post by Fabien Letouzey » Tue May 26, 2020 08:22

jj wrote:
Mon May 25, 2020 19:22
...

I am surprised that this would be allowed, i.e. to call the result your original work. But if the community and in particular the lead programmer think it is fair then who am I to complain? Since the topic of fairness was brought up I wanted to include ML in the discussion.

So to make it concrete: should Bert mention Ed as co-author of Damage now or is it sufficient to mention him in the fine print?
I'm not sure that writing down formal rules like this is possible; it feels like trying to predict all the possible futures. I only answered because you insisted on counting votes. In other communities, TDs decide on the rules for instance; programmers are not all-knowing and should not be all-deciding.

Here are a few things that I suggest you take into account before making a proposal:
- beginner friendliness (the draughts community is aging)
- what is already done in other games: people using the same game file or generating games, when not outright eval scores, using Stockfish (I'm just showing examples, not saying those are OK); I guess this is connected to the "beginner" branch
- there are A0-like libraries around: just input your game rules and press "start" (and cough up hundreds of dollars, but who's counting?)
- think hard about "original work"; I don't consider the current draughts engines "original" enough for healthy competition (hopefully one day), but your rules would make them so, IMO by construction

> So to make it concrete: should Bert mention Ed as co-author of Damage now or is it sufficient to mention him in the fine print?

I take into account the "distance" in the dependency graph. Each transformation like learning and search affects the data, making it less and less like the original (assuming those steps are not copied from the same source, of course). To answer your question, I feel that fine print would be enough in this case (and then each TD can decide whether there is a conflict of interest). Make sure to ask others.

Rein Halbersma
Posts: 1664
Joined: Wed Apr 14, 2004 16:04
Contact:

Re: Flits self-learning mode

Post by Rein Halbersma » Tue May 26, 2020 11:32

Fabien Letouzey wrote:
Tue May 26, 2020 08:22
- think hard about "original work"; I don't consider the current draughts engines "original" enough for healthy competition (hopefully one day), but your rules would make them so, IMO by construction
I agree that Scan, Kingsrow, Maximus, Damage etc., which now use the Scan-like patterns + MTD-f / PVS search, opening books, Kingsrow-like endgame dbs and bitboard infrastructure, are all very close in the design space. There are no high-performance AlphaZero draughts programs yet.

In any case, insisting on zero knowledge bootstrapping and self-written or public library optimizers will not generate diversity by itself. Using the same patterns + search will eventually lead to very similar programs. Maybe doing RL instead of SL can squeeze out a bit of extra Elo, but I doubt it.

And the question is, is that bad? Back in 2007, Ed pushed the frontier by his huge dbs and opening book. Gerard/Michel/Bert caught up with the dbs. Then you came along with the patterns. Again, others caught up, at different rates of progress. There will be other innovations, which will be copied by everyone to catch up. Until the next innovation.

Is this any different in chess? How different are Stockfish/Komodo/Houdini?
