Breakthrough Draughts

 Posts: 1669
 Joined: Wed Apr 14, 2004 16:04
 Contact:
Re: Breakthrough Draughts
Ah OK, thanks I understand. From what I learnt from Ed's description, he generates the full results for all positions, including captures, and then after generation he sets the capture positions to the most common value (e.g. a draw) and then he compresses the databases. With that method, you can still generate capture statistics just before compression.TAILLE wrote:I don't quite understand your question.Rein Halbersma wrote:
Hi Gerard,
Is it possible for you to modify your routine so as to print the capture positions after the last generation pass? And then drop them for the final compression stage? Or would this be a too invasive change of your generator?
Rein
Basically my routine looks like:
for (db = first db; db < last db; db = following db)
{
for (p = first position of the db; p < last position; p = following position)
{
if (p == capture position) continue;
else {generate successors; evaluate p; save result in memory db}
}
compress memory db and store on disk;
}
Of course I added some complexity in order to have a multithread routine
Re: Breakthrough Draughts
I did a noncapture count of my databases, and the numbers matchTAILLE wrote:Hi Michel,
Good news to see your interesting by this beautitul draughts variant.
As with Bert I have the same difficulty with you : I did not generate the capture positions. Bert managed to add statitistics on non capture position in order to compare our figures but I do not know if it is easy for your implementation to do the same.
FYI my figures are
1x1 NCW1= 1 028 NCL1= 837 NCW2=  NCL2= 
2x1 NCW1= 30 993 NCL1= 6 865 NCW2= 11 268 NCL2= 26 595
2x2 NCW1= 424 883 NCL1= 285 789 NCW2=  NCL2= 
3x1 NCW1= 446 168 NCL1= 55 534 NCW2= 113 391 NCL2= 388 438
3x2 NCW1= 6 778 061 NCL1= 1 948 633 NCW2= 3 775 435 NCL2= 4 952 702
3x3 NCW1= 63 546 643 NCL1= 35 985 469 NCW2=  NCL2= 
4x1 NCW1= 4 518 137 NCL1= 363 194 NCW2= 927 678 NCL2= 3 954 768
4x2 NCW1= 67 423 737 NCL1= 11 426 249 NCW2= 28 548 951 NCL2= 50 312 655
4x3 NCW1= 652 076 102 NCL1= 184 976 683 NCW2= 438 218 089 NCL2= 398 830 286
4x4 NCW1= 4 486 519 068 NCL1= 2 080 557 566 NCW2=  NCL2= 
How big a database did you build?
Michel
Re: Breakthrough Draughts
Hi Michel,MichelG wrote:I did a noncapture count of my databases, and the numbers matchTAILLE wrote:Hi Michel,
Good news to see your interesting by this beautitul draughts variant.
As with Bert I have the same difficulty with you : I did not generate the capture positions. Bert managed to add statitistics on non capture position in order to compare our figures but I do not know if it is easy for your implementation to do the same.
FYI my figures are
1x1 NCW1= 1 028 NCL1= 837 NCW2=  NCL2= 
2x1 NCW1= 30 993 NCL1= 6 865 NCW2= 11 268 NCL2= 26 595
2x2 NCW1= 424 883 NCL1= 285 789 NCW2=  NCL2= 
3x1 NCW1= 446 168 NCL1= 55 534 NCW2= 113 391 NCL2= 388 438
3x2 NCW1= 6 778 061 NCL1= 1 948 633 NCW2= 3 775 435 NCL2= 4 952 702
3x3 NCW1= 63 546 643 NCL1= 35 985 469 NCW2=  NCL2= 
4x1 NCW1= 4 518 137 NCL1= 363 194 NCW2= 927 678 NCL2= 3 954 768
4x2 NCW1= 67 423 737 NCL1= 11 426 249 NCW2= 28 548 951 NCL2= 50 312 655
4x3 NCW1= 652 076 102 NCL1= 184 976 683 NCW2= 438 218 089 NCL2= 398 830 286
4x4 NCW1= 4 486 519 068 NCL1= 2 080 557 566 NCW2=  NCL2= 
How big a database did you build?
Michel
I generated also 5x1, 5x2, 5x3, 5x4, 5x5, 6x1, 6x2, 6x3 and 6x4.
My goal was to reach 6x6 but during the 6x5 generation I noted my db reached 300Gb and the 6x5 db was just half generated. I stopped it because I conclude I will not be able to generate the 6x6 db due to the space needed on the disk. In addition the time needed becomes also quite large.
Now I make a pause on the generation process and I work on the eval function.
Later I expect to find a new compression mechanism in order to solve disk space issue.
Gérard
Re: Breakthrough Draughts
A small update from my part.
I have added the breakthrough rules to dragon.
First i played with the original evaluation function (from international draughts) against scan 3.0. This scores about 10%.
Currently i am learning a new set of evaluation patterns for breakthrough. The initial results are encouraging; after about 80 hours of learning, dragon wins about 33%. (using 15field patterns)
Before increasing the learning time, there is some more research to do. For example selecting the right positions to feed the learner, and finding the best hyperparameters.
Dragon now uses 7 piece endgame databases, but it seems that these increase strength only by a relatively small amount.
Michel
I have added the breakthrough rules to dragon.
First i played with the original evaluation function (from international draughts) against scan 3.0. This scores about 10%.
Currently i am learning a new set of evaluation patterns for breakthrough. The initial results are encouraging; after about 80 hours of learning, dragon wins about 33%. (using 15field patterns)
Before increasing the learning time, there is some more research to do. For example selecting the right positions to feed the learner, and finding the best hyperparameters.
Dragon now uses 7 piece endgame databases, but it seems that these increase strength only by a relatively small amount.
Michel

 Posts: 937
 Joined: Thu Jun 20, 2013 17:16
 Real name: Krzysztof Grzelak
Re: Breakthrough Draughts
I'm sorry to ask you Michel, I understand that you have added updates on your homepage program.
Re: Breakthrough Draughts
Hi Michel,MichelG wrote:A small update from my part.
I have added the breakthrough rules to dragon.
First i played with the original evaluation function (from international draughts) against scan 3.0. This scores about 10%.
Currently i am learning a new set of evaluation patterns for breakthrough. The initial results are encouraging; after about 80 hours of learning, dragon wins about 33%. (using 15field patterns)
Before increasing the learning time, there is some more research to do. For example selecting the right positions to feed the learner, and finding the best hyperparameters.
Dragon now uses 7 piece endgame databases, but it seems that these increase strength only by a relatively small amount.
Michel
I continue to work on breakthrough draugths. I am trying to understand reinforcement learning, I confess it is quite hard for me but I am really motivated though my progress are rather slow!
Could I ask you one or two questions Michel? I tried my first reinforcement learning on a very simple base : what is the value of a material advantage ? In my classical evaluation I decided that the value of one man was +100 and the value of a win was equal to the value of 20 men i.e. +2000. With my first experiments with reinforcement learning and keeping the value +2000 for a win the programm told me after a reinforcement learning that the value of an advantage of one man is a value near from +1000, the value of an advantage of 2 men is a value near form 1600 etc. and the value of en advantage of 10 men is very near from +2000.
Clearly the values I estimated represent the probability to win and obvioulsly it could not be a linear function. Do you also have a estimation function representing the probablity of winning (this sounds a great simplication due to the fact that the draw does not exist) and do you have also such nonlinear results?
Gérard
Gérard
Re: Breakthrough Draughts
Hi Gérard,
The learning that i implemented, uses the logistic function to map the sum of features to a winning rate:
winrate = 1/(1+exp(f))
The winrate is the number of points you can expect to gain from the position (loss=0, draw=0.5, win=1.0), f is the sum of the feature weights. The learner finds the weight_of_feature values that minimises sumof(expected_winrate  actual_score) for all positions.
If you apply this to the learning mechanism, you end up with a sum of features (f) for each position, and from that you can calculate the win rate.
For example, suppose you have 2 features:
feature1 = number of my man  number of opponent man:
feature2 = my_mobility  oppponent_mobility
It might figure out the following:
weigth_of_feature1= 0.9
weigth_of_feature2= 0.1
and f=1 * 0.9 + 2 * 0.1 = one point one
In a position where you are 1 man ahead and have 2 mobility difference, the winrate becomes:
winrate = 1/(1+exp((1 * 0.9 + 2 * 0.1) )) = 0.75
In this hypothetical evaluation function, the winrate gets closer to 1 as the advantage gets bigger.
Note that the feature weights are close to zero. In dragon, i make these integers for faster processing by multiplying them by 1000 or so. So weigth_of_feature1= 900, weigth_of_feature2= 100 and winrate = 1/(1+(exp(0.001*f)))
Also, you don't need to actually compute the winrate. Since the logistic function is monotone, you can use use the sum of features as score function.
If you want to learn more, i think it is best to read some articles on logistic regression techniques.
One fun thing i noticed when experimenting, is that my evaluation function can work without counting the mandifference, and be based purely on patterns alone. It's only a minor disadvantage.
Michel
The learning that i implemented, uses the logistic function to map the sum of features to a winning rate:
winrate = 1/(1+exp(f))
The winrate is the number of points you can expect to gain from the position (loss=0, draw=0.5, win=1.0), f is the sum of the feature weights. The learner finds the weight_of_feature values that minimises sumof(expected_winrate  actual_score) for all positions.
If you apply this to the learning mechanism, you end up with a sum of features (f) for each position, and from that you can calculate the win rate.
For example, suppose you have 2 features:
feature1 = number of my man  number of opponent man:
feature2 = my_mobility  oppponent_mobility
It might figure out the following:
weigth_of_feature1= 0.9
weigth_of_feature2= 0.1
and f=1 * 0.9 + 2 * 0.1 = one point one
In a position where you are 1 man ahead and have 2 mobility difference, the winrate becomes:
winrate = 1/(1+exp((1 * 0.9 + 2 * 0.1) )) = 0.75
In this hypothetical evaluation function, the winrate gets closer to 1 as the advantage gets bigger.
Note that the feature weights are close to zero. In dragon, i make these integers for faster processing by multiplying them by 1000 or so. So weigth_of_feature1= 900, weigth_of_feature2= 100 and winrate = 1/(1+(exp(0.001*f)))
Also, you don't need to actually compute the winrate. Since the logistic function is monotone, you can use use the sum of features as score function.
If you want to learn more, i think it is best to read some articles on logistic regression techniques.
One fun thing i noticed when experimenting, is that my evaluation function can work without counting the mandifference, and be based purely on patterns alone. It's only a minor disadvantage.
Michel
Re: Breakthrough Draughts
Hi Michel,MichelG wrote:Hi Gérard,
The learning that i implemented, uses the logistic function to map the sum of features to a winning rate:
winrate = 1/(1+exp(f))
The winrate is the number of points you can expect to gain from the position (loss=0, draw=0.5, win=1.0), f is the sum of the feature weights. The learner finds the weight_of_feature values that minimises sumof(expected_winrate  actual_score) for all positions.
If you apply this to the learning mechanism, you end up with a sum of features (f) for each position, and from that you can calculate the win rate.
For example, suppose you have 2 features:
feature1 = number of my man  number of opponent man:
feature2 = my_mobility  oppponent_mobility
It might figure out the following:
weigth_of_feature1= 0.9
weigth_of_feature2= 0.1
and f=1 * 0.9 + 2 * 0.1 = one point one
In a position where you are 1 man ahead and have 2 mobility difference, the winrate becomes:
winrate = 1/(1+exp((1 * 0.9 + 2 * 0.1) )) = 0.75
In this hypothetical evaluation function, the winrate gets closer to 1 as the advantage gets bigger.
Note that the feature weights are close to zero. In dragon, i make these integers for faster processing by multiplying them by 1000 or so. So weigth_of_feature1= 900, weigth_of_feature2= 100 and winrate = 1/(1+(exp(0.001*f)))
Also, you don't need to actually compute the winrate. Since the logistic function is monotone, you can use use the sum of features as score function.
If you want to learn more, i think it is best to read some articles on logistic regression techniques.
One fun thing i noticed when experimenting, is that my evaluation function can work without counting the mandifference, and be based purely on patterns alone. It's only a minor disadvantage.
Michel
First of all thank for your answer.
I fully understand your "winrate = 1/(1+exp(f))" function. Because this function is monotone it is clear you do not need to calculate it explicitly; you can use only the f fonction as an eval function, without calculating the winrate. I am only surprise by seeing your function is non linear.
Let's take the breakthrough draughts. We all know that the initial position is either a winning one or a losing one. Suppose now that after 1000 games you observed 600 wins and 400 loss. That means that the winrates is 60%. In my implementation where a win is +2000 and a loss is 2000 the value of the intitial position will be about +400 which is the linear value of 60% over the range [2000, +2000].
Because you function f is not a linear function of the winrate I have some difficulty to see its real meaning though I see clearly that you can easily derive the winrate from this f function.
Gérard
Re: Breakthrough Draughts
I don't think there is any 'human' meaning to the function. It just maps infinite to +infinite to the [0,1] range, and does so with some particular nice properties.
One thing to consider is that when for example you are 5 pieces ahead and say f=2.5, winning another piece doesn't increase the winrate by much because the derivative of the logistic function is small for high f. This correspondence allows simple relations between for example the piecedifference and score.
In your example, a 60% win rate would give f=0.405. Mathematically, you can't map this a another range, like [2000, 2000], because f could in theory be infinitely large.
In practice, this is not the case though; f is usually in the range [5, +5] because otherwise the game is over within a few moves.
In dragon, i multiply f by 500, and limit it to the range [29000, +29000]. With f=58, the eval says it has a 0.9999.. winrate anyway.
Michel
One thing to consider is that when for example you are 5 pieces ahead and say f=2.5, winning another piece doesn't increase the winrate by much because the derivative of the logistic function is small for high f. This correspondence allows simple relations between for example the piecedifference and score.
In your example, a 60% win rate would give f=0.405. Mathematically, you can't map this a another range, like [2000, 2000], because f could in theory be infinitely large.
In practice, this is not the case though; f is usually in the range [5, +5] because otherwise the game is over within a few moves.
In dragon, i multiply f by 500, and limit it to the range [29000, +29000]. With f=58, the eval says it has a 0.9999.. winrate anyway.
Michel
Re: Breakthrough Draughts
MichelG wrote:I don't think there is any 'human' meaning to the function. It just maps infinite to +infinite to the [0,1] range, and does so with some particular nice properties.
One thing to consider is that when for example you are 5 pieces ahead and say f=2.5, winning another piece doesn't increase the winrate by much because the derivative of the logistic function is small for high f. This correspondence allows simple relations between for example the piecedifference and score.
In your example, a 60% win rate would give f=0.405. Mathematically, you can't map this a another range, like [2000, 2000], because f could in theory be infinitely large.
In practice, this is not the case though; f is usually in the range [5, +5] because otherwise the game is over within a few moves.
In dragon, i multiply f by 500, and limit it to the range [29000, +29000]. With f=58, the eval says it has a 0.9999.. winrate anyway.
Michel
Hi Michel,
Maybe I am wrong but it appears to me that by calculating your f function you consider that the piecedifference is associated only one feature. Is it true?
FYI in my implementation each piecedifference value (from 1 to 20) is associated to one feature among 20. That way the learning process can evaluate precisely a piecedifference.
The advantage is the following: suppose you have the choice between a move winning a man (but with a poor resulting positionnal position) and a move allowing to reach a good positionnal position. The choice will really depend of the current piecedifference: with a very small piecedifference Damy will certainly choose to win a man but with a bigger piecediffernce Damy will choose the positional move which sounds clever for a human point of view doesn't it?
Gérard
Re: Breakthrough Draughts
Ah, i understand how you implemented the piece difference feature. I can see the advantages of doing it that way.TAILLE wrote:
Hi Michel,
Maybe I am wrong but it appears to me that by calculating your f function you consider that the piecedifference is associated only one feature. Is it true?
FYI in my implementation each piecedifference value (from 1 to 20) is associated to one feature among 20. That way the learning process can evaluate precisely a piecedifference.
The advantage is the following: suppose you have the choice between a move winning a man (but with a poor resulting positionnal position) and a move allowing to reach a good positionnal position. The choice will really depend of the current piecedifference: with a very small piecedifference Damy will certainly choose to win a man but with a bigger piecediffernce Damy will choose the positional move which sounds clever for a human point of view doesn't it?
In dragon, there is only 1 feature for piecedifference and it is indeed linear with the amount of difference.
The problem i see with having 20 features, is that you might have very little example positions for certain differences. For instance if you value +1 piece at 400, +2 pieces at 600, etc, +14 pieces might still be 0, because +14 does not happen in your learning examples.
Then, when the engine does happen to stumble upon +14, it will value it wrong because it has never seen a +14 before.
There is a compromise you can make: +1 is a feature, +2 is a feature, and then there is a third feature which is linear to the amount of difference above 2.
Currently dragon only has 3 fixed features plus the patterns. I haven't really looked into designing the fixed features yet; they appear to be not very important. I plan to do some more experiments with it later though.
Michel
Re: Breakthrough Draughts
In order to solve the problem you mentionned I implemented two points:MichelG wrote:Ah, i understand how you implemented the piece difference feature. I can see the advantages of doing it that way.TAILLE wrote:
Hi Michel,
Maybe I am wrong but it appears to me that by calculating your f function you consider that the piecedifference is associated only one feature. Is it true?
FYI in my implementation each piecedifference value (from 1 to 20) is associated to one feature among 20. That way the learning process can evaluate precisely a piecedifference.
The advantage is the following: suppose you have the choice between a move winning a man (but with a poor resulting positionnal position) and a move allowing to reach a good positionnal position. The choice will really depend of the current piecedifference: with a very small piecedifference Damy will certainly choose to win a man but with a bigger piecediffernce Damy will choose the positional move which sounds clever for a human point of view doesn't it?
In dragon, there is only 1 feature for piecedifference and it is indeed linear with the amount of difference.
The problem i see with having 20 features, is that you might have very little example positions for certain differences. For instance if you value +1 piece at 400, +2 pieces at 600, etc, +14 pieces might still be 0, because +14 does not happen in your learning examples.
Then, when the engine does happen to stumble upon +14, it will value it wrong because it has never seen a +14 before.
There is a compromise you can make: +1 is a feature, +2 is a feature, and then there is a third feature which is linear to the amount of difference above 2.
Currently dragon only has 3 fixed features plus the patterns. I haven't really looked into designing the fixed features yet; they appear to be not very important. I plan to do some more experiments with it later though.
Michel
1) I initialised the value of my 20 features to a value near from +2000
2) As soon as I want to modify a value I take into account a handwritten constraint saying that these values have to be ordered.
Concerning the fixed features my feeling is that these features, like piecedifference, have not a linear behavior. As a consequence I will try to have same approach by dividing the feature in 20 or more features.
For the moment I have identified also 3 fixed features: piecedifference, tempodifference and rightleft balance.
I am also analysing what could be the relevant patterns to consider (I do not like very much the patterns used by Fabien but we can never be sure can we?)
Gérard
Re: Breakthrough Draughts
5 years ago, I wrote an optimizer for finding the best patterns. However, it takes a long time to do so since you have to go through the learning procedure hundreds of times.TAILLE wrote:
In order to solve the problem you mentionned I implemented two points:
1) I initialised the value of my 20 features to a value near from +2000
2) As soon as I want to modify a value I take into account a handwritten constraint saying that these values have to be ordered.
Concerning the fixed features my feeling is that these features, like piecedifference, have not a linear behavior. As a consequence I will try to have same approach by dividing the feature in 20 or more features.
For the moment I have identified also 3 fixed features: piecedifference, tempodifference and rightleft balance.
I am also analysing what could be the relevant patterns to consider (I do not like very much the patterns used by Fabien but we can never be sure can we?)
So i only did this based on a small amount of examples, on 10field patterns.
When compared to the original (squarelike) patterns, there definitely is an improvement. In one test, the winrate went from 60% to 62%.
I don't yet have a mechanism to optimize the bigger (15 fields and up) patterns. The problem is that it takes so much CPU time, i feel that it is better to spent this CPU time on finding more training examples. I will try and optimize 12field patterns soon.
Below is one of the patterns that came out of the optimizer (out of 24)
Code: Select all
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 1 1 0 0
0 1 1 0 0
0 1 1 1 0
0 0 1 0 0
0 0 1 0 0
0 1 0 0 0
0 0 0 0 0

 Posts: 798
 Joined: Sat Apr 28, 2007 14:53
 Real name: Ed Gilbert
 Location: Morristown, NJ USA
 Contact:
Re: Breakthrough Draughts
Hi Michel,
 Ed
I never tried that experiment, but it is not intuitive to me that it would only be a minor disadvantage. Without an eval term for man difference, I would expect the engine might toss a piece just to gain some minor positional advantage. Do you understand why this doesn't seem to be a problem?MichelG wrote: One fun thing i noticed when experimenting, is that my evaluation function can work without counting the mandifference, and be based purely on patterns alone. It's only a minor disadvantage.
Michel
 Ed
Re: Breakthrough Draughts
My guess is that the patterns themselves can do the count. Maybe patterns with more man get higher values and it is regulated that way, albeit with some loss of accuracy.Ed Gilbert wrote:Hi Michel,
I never tried that experiment, but it is not intuitive to me that it would only be a minor disadvantage. Without an eval term for man difference, I would expect the engine might toss a piece just to gain some minor positional advantage. Do you understand why this doesn't seem to be a problem?
 Ed
In one experiment, i trained my eval on 16m positions and got a score of 0.7 (out of 1.0) against some reference eval function. Removing the piececount lowered it to 0.67.
I also tried eval features for 1 piece, +1 piece, 2 pieces, +2 pieces, somewhat similar to Gérard plans on this. That added about 45 elo points (e.g. barely measureable after 50k games), but it adds memory and time requirements for the optimizer.
So as a very rough gideline (in breakthrough):
 each doubling of the amount of training positions adds about 4060 points of elo.
 Fixed feature: piececount : about 3050 elo points
 Fixed feature: specific piececount: about 45 elo points (compared to linear piececount)
 Fixed feature: tempo difference: so small it is not measureable after 50k games.
 Using an optimised set of patterns instead of squares: 2040 elo.
For international draughts, just divide elo points by 10.
Michel