World Draughts Forum

It is currently Wed Jan 17, 2018 15:28

All times are UTC+01:00




Post new topic  Reply to topic  [ 203 posts ]  Go to page Previous 110 11 12 13 14
Author Message
PostPosted: Fri Dec 29, 2017 12:29 
Offline

Joined: Sat Apr 28, 2007 14:53
Posts: 732
Real name: Ed Gilbert
Location: Morristown, NJ USA
Quote:
My guess is that the patterns themselves can do the count. Maybe patterns with more man get higher values and it is regulated that way, albeit with some loss of accuracy.

Thanks, that sounds like a reasonable explanation. I had added a tempo eval and seen only a tiny improvement. Now I understand why.

-- Ed


Top
   
PostPosted: Sat Jan 06, 2018 22:32 
Offline

Joined: Thu Apr 26, 2007 18:51
Posts: 822
Location: FRANCE
MichelG wrote:
Ed Gilbert wrote:
Hi Michel,

I never tried that experiment, but it is not intuitive to me that it would only be a minor disadvantage. Without an eval term for man difference, I would expect the engine might toss a piece just to gain some minor positional advantage. Do you understand why this doesn't seem to be a problem?

-- Ed

My guess is that the patterns themselves can do the count. Maybe patterns with more man get higher values and it is regulated that way, albeit with some loss of accuracy.

In one experiment, i trained my eval on 16m positions and got a score of 0.7 (out of 1.0) against some reference eval function. Removing the piececount lowered it to 0.67.

I also tried eval features for -1 piece, +1 piece, -2 pieces, +2 pieces, somewhat similar to Gérard plans on this. That added about 4-5 elo points (e.g. barely measureable after 50k games), but it adds memory and time requirements for the optimizer.

So as a very rough gideline (in breakthrough):
- each doubling of the amount of training positions adds about 40-60 points of elo.
- Fixed feature: piececount : about 30-50 elo points
- Fixed feature: specific piececount: about 4-5 elo points (compared to linear piececount)
- Fixed feature: tempo difference: so small it is not measureable after 50k games.
- Using an optimised set of patterns instead of squares: 20-40 elo.

For international draughts, just divide elo points by 10.

Michel


Hi Michel,

Concerning tempo difference do you also consider the patterns themselves can do the count or do you think that this feature is irrelevant?

_________________
Gérard


Top
   
PostPosted: Sun Jan 07, 2018 14:59 
Offline

Joined: Sun Dec 28, 2003 20:24
Posts: 244
TAILLE wrote:

Hi Michel,

Concerning tempo difference do you also consider the patterns themselves can do the count or do you think that this feature is irrelevant?


I think the question is whether you can estimate the tempo difference by using patterns alone. The answer is a definite yes here. In fact, you can create the weights by hand if you would want to.

I believe, for the fixed features to be of much use, it should not be possible to have a pattern predict their value.

There is 1 exception: if it the fixed feature is very important, like piece-difference, adding it makes it easier for the weights to converge on the right values, especially if you are using L2 regularization; all energy goes into 1 weight instead of thousands.

Here are some examples of features that are not predicable by patterns:
piece_difference * tempo_difference
tempo_difference > constant
tempo_difference cubed
distance_to_promotionLine_difference
etc

I tried some of these, but i have yet not found a particular useful fixed feature.

Michel


Top
   
PostPosted: Sun Jan 07, 2018 19:46 
Offline

Joined: Thu Apr 26, 2007 18:51
Posts: 822
Location: FRANCE
MichelG wrote:
TAILLE wrote:

Hi Michel,

Concerning tempo difference do you also consider the patterns themselves can do the count or do you think that this feature is irrelevant?


I think the question is whether you can estimate the tempo difference by using patterns alone. The answer is a definite yes here. In fact, you can create the weights by hand if you would want to.

I believe, for the fixed features to be of much use, it should not be possible to have a pattern predict their value.

There is 1 exception: if it the fixed feature is very important, like piece-difference, adding it makes it easier for the weights to converge on the right values, especially if you are using L2 regularization; all energy goes into 1 weight instead of thousands.

Here are some examples of features that are not predicable by patterns:
piece_difference * tempo_difference
tempo_difference > constant
tempo_difference cubed
distance_to_promotionLine_difference
etc

I tried some of these, but i have yet not found a particular useful fixed feature.

Michel


Hi Michel,

It seems I have a different view (but I have not your experience in reinforcement learning !)

I agree with you that the tempo_difference can be estimated by the patterns alone but does that mean that we should avoid to use such fixed feature?
My view is the following : because the various patterns could be highly correlated I surely need some regulation to help convergence. Now the point is the following : tempo_difference is a global feature with no major correlation with other features. I think in this case that by handling the tempo-difference with a separate feature I can handled this separate feature without regulation hoping to reach a better convergence with patterns handled with regulation.

Of course I have to run a lot of tests to confirm this approach!

_________________
Gérard


Top
   
PostPosted: Fri Jan 12, 2018 20:32 
Offline

Joined: Sun Dec 28, 2003 20:24
Posts: 244
TAILLE wrote:

Hi Michel,

It seems I have a different view (but I have not your experience in reinforcement learning !)

I agree with you that the tempo_difference can be estimated by the patterns alone but does that mean that we should avoid to use such fixed feature?
My view is the following : because the various patterns could be highly correlated I surely need some regulation to help convergence. Now the point is the following : tempo_difference is a global feature with no major correlation with other features. I think in this case that by handling the tempo-difference with a separate feature I can handled this separate feature without regulation hoping to reach a better convergence with patterns handled with regulation.

Of course I have to run a lot of tests to confirm this approach!

I agree it can potentially help with regularization.

But suppose you can express the tempo-difference of a position in patterns. (which can be done in an exact manner depending on how your patterns are organized)

Then you can add the value of tempo-difference-patterns to the rest of your patterns and discard your original tempo-weight estimate:

oldEval=tempoWeight*tempo + sum(1,N)[patternWeightEval(x)]
and
tempoWeight*tempo = sum(1,N)[patternWeightTempo(x)]
this means
oldEval=sum(1,N)[(patternWeightEval(x)+patternWeightTempo(x)]

Which saves you time in the eval function, since you don't need to compute the tempo anymore, and you can add these weights offline.

Michel


Top
   
PostPosted: Fri Jan 12, 2018 23:29 
Offline

Joined: Thu Apr 26, 2007 18:51
Posts: 822
Location: FRANCE
MichelG wrote:
TAILLE wrote:

Hi Michel,

It seems I have a different view (but I have not your experience in reinforcement learning !)

I agree with you that the tempo_difference can be estimated by the patterns alone but does that mean that we should avoid to use such fixed feature?
My view is the following : because the various patterns could be highly correlated I surely need some regulation to help convergence. Now the point is the following : tempo_difference is a global feature with no major correlation with other features. I think in this case that by handling the tempo-difference with a separate feature I can handled this separate feature without regulation hoping to reach a better convergence with patterns handled with regulation.

Of course I have to run a lot of tests to confirm this approach!

I agree it can potentially help with regularization.

But suppose you can express the tempo-difference of a position in patterns. (which can be done in an exact manner depending on how your patterns are organized)

Then you can add the value of tempo-difference-patterns to the rest of your patterns and discard your original tempo-weight estimate:

oldEval=tempoWeight*tempo + sum(1,N)[patternWeightEval(x)]
and
tempoWeight*tempo = sum(1,N)[patternWeightTempo(x)]
this means
oldEval=sum(1,N)[(patternWeightEval(x)+patternWeightTempo(x)]

Which saves you time in the eval function, since you don't need to compute the tempo anymore, and you can add these weights offline.

Michel


Hi Michel,

I understand but I do not see clearly how the tempo_difference can be estimated by the patterns.
Let's take a simple example. Suppose you are dealing only with classical game. With a separate feature and by dividing the tempo_difference in for example 20 possible values from -10 to +9 you might find that:
it is disadvantage to have a tempo_difference between -10 and -6
it is an advantage to have a tempo_difference between -5 and -1
it is disadvantage to have a tempo_difference between 0 and 4
it is an advantage to have a tempo_difference between 5 and 9
How could you code this in the pattern's values?

_________________
Gérard


Top
   
PostPosted: Sat Jan 13, 2018 17:52 
Offline

Joined: Sun Dec 28, 2003 20:24
Posts: 244
TAILLE wrote:

Hi Michel,

I understand but I do not see clearly how the tempo_difference can be estimated by the patterns.
Let's take a simple example. Suppose you are dealing only with classical game. With a separate feature and by dividing the tempo_difference in for example 20 possible values from -10 to +9 you might find that:
it is disadvantage to have a tempo_difference between -10 and -6
it is an advantage to have a tempo_difference between -5 and -1
it is disadvantage to have a tempo_difference between 0 and 4
it is an advantage to have a tempo_difference between 5 and 9
How could you code this in the pattern's values?

You can't...

Using patterns, you could only encode a linear function of the tempo, e.g. value=W*tempo

That is the both the strength and limitation of the entire approach; it only encodes linear functions. If you want non-linearity, you need to add features or structure manually. And that can get very complex fast. Maybe it is an advantage to have between +4 and +8 tempo when you are 1 man ahead, and have 2 pieces on row 6.

There is one way around it; use deep neural nets. Those can learn the non-linear relations as well. However the neural nets come with a big disadvantage; it takes a huge amount of computing power to evaluate them.

Google seems to be getting away with that in go and chess, but that might be more difficult for us simple programmers.

Michel


Top
   
PostPosted: Sat Jan 13, 2018 19:29 
Offline

Joined: Thu Apr 26, 2007 18:51
Posts: 822
Location: FRANCE
MichelG wrote:
TAILLE wrote:

Hi Michel,

I understand but I do not see clearly how the tempo_difference can be estimated by the patterns.
Let's take a simple example. Suppose you are dealing only with classical game. With a separate feature and by dividing the tempo_difference in for example 20 possible values from -10 to +9 you might find that:
it is disadvantage to have a tempo_difference between -10 and -6
it is an advantage to have a tempo_difference between -5 and -1
it is disadvantage to have a tempo_difference between 0 and 4
it is an advantage to have a tempo_difference between 5 and 9
How could you code this in the pattern's values?

You can't...

Using patterns, you could only encode a linear function of the tempo, e.g. value=W*tempo

That is the both the strength and limitation of the entire approach; it only encodes linear functions. If you want non-linearity, you need to add features or structure manually. And that can get very complex fast. Maybe it is an advantage to have between +4 and +8 tempo when you are 1 man ahead, and have 2 pieces on row 6.

There is one way around it; use deep neural nets. Those can learn the non-linear relations as well. However the neural nets come with a big disadvantage; it takes a huge amount of computing power to evaluate them.

Google seems to be getting away with that in go and chess, but that might be more difficult for us simple programmers.

Michel


Hi Michel,

I agree with all what you said Michel. In particular I think also that patterns can only encode linear functions.
In addition I do not want to handle explicitly non-linear function. For that reason I will continue to explore having several features for dealing with tempo_difference depending of the tempo_difference itself and depending of the type of game (typically classical or non-classical game).

_________________
Gérard


Top
   
Display posts from previous:  Sort by  
Post new topic  Reply to topic  [ 203 posts ]  Go to page Previous 110 11 12 13 14

All times are UTC+01:00


Who is online

Users browsing this forum: No registered users and 4 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Jump to:  
Powered by phpBB® Forum Software © phpBB Limited