|
|
|
||||||
It seems counter-intuitive to think that endgame tablebases can hurt the performance of a chess engine. There may be arguments about how much a full set of 3-4-5 men tablebases help ,50 elo according to some, as low as 20 elo or even nothing according to others (Results by Robert Allgeuer and Les Fernandez using Crafty and Yace show no significant difference in playing strength.) depending on how much endgame knowledge is programmed in already but it seems that they should never hurt. Later you will see that in fact, in some positions, use of tablebases actually lead to weaker play, or may cause the engine to spend more time before finding the best move.
Even if you disregard such positions as rare freak positions, the small gains when using tablebases, cries out for some explaining, since on the surface tablebases should improve play by quite a bit since chess engines can use them in search and not only when the position appears on the board. It seems to me that perhaps some factors are holding back the chess engines which offset or even more compensate the innate advantages of using tablebases. Currently, we have only complete tablebase sets for 5 men and below, so we shall be discussing only those.Almost everyone agrees the current set of pawnless 6 men tablebases available are probably not worth using. Here are some possibilities (from least serious to most serious) on how tablebases can hurt performance.
The first possibility is a result of playing with incomplete tablebases. This is very well known and easy to avoid both by using all tablebases, or by programming work-arounds by the programmer. See this long explanation on how this works.
Another possibility that might occur results from the fact that the Nalimov tablebases used do not take into account castling.
In this position, Queen side castling is still available for white. Crafty 18.13 (in Winboard) with relevant 3,4,5 men tablebases will play 1..Kxd7? Strangely it thinks that 2 0-0-0+ with mate in 17 (for white) is the best line!In reality without endgame tablebases Crafty will play 1.. Kd8 where 2.Ra8+ (there is nothing better given the threat of h1=Q mate) Kxd7 and it's a simple tablebase mate in 15 for black.So the given position is actually a forced mate in 17 for black!
In my view such positions are trivial and very rare (when was the last time you reached an endgame with castling rights?] and not much of a problem to worry about. Besides it's possible for programmers to just add a small workaround (by checking for castling rights) to avoid this problem.
A much more serious problem in my view is that Chess engines don't take into account the fifty move rule.
In the above position with the endgame tablebase KBRKR enabled, at short time controls, Crafty wants to play 1.Rxf5 Rxf5 with a mate in 66. While the tablebase position after 1.Rxf5 Rxf5 is a forced mate in 65, unfortunately, under FIDE rules, black can claim a draw under the 50 moves rule.This hold true for all tablebase wins that require more than 50 moves (assuming no captures,pawn moves of course) to force a win.Note: in the past FIDE had allowed special exceptions to the 50 moves rule for certain cases (eg KNNKP ), but eventually removed them all.
But does White have better or is the game drawn anyway? In
fact, if you leave Crafty think long enough it starts to see
shorter mates from a mate in 46 to 23 (with 1.Kd8) to 1.Kc6 and
declares a mate in 20. Gandalf actually finds a mate in 17 (with
1.Kd6), so we know that it is at least a mate in 17!* Many
commercial Chess engines though stop searching once a mate is
found.
5k2/3K4/8/R4b2/1R6/8/3B4/5r2 w - - 0 1
* Note, the PV line if you are interested is 1.Kd6 Kg7 2.Rf4! Rxf4 3.Bxf4,this is the shortest mate I have seen (assuming it is correct) so far though there could be a shorter one.
Notice though this is a problem because Nalimov endgame tablebases is a Distance to Mate (DTM) tablebase. Distance to Conversion (DTC) tablebase theoretically could solve this problem, just by checking that mates that end in the same tablebase do not exceed 50 moves.) However the Thompson tablebases current doesn't account for this.
DTZ (distance to zeroing the fifty move counter) tables have being proposed but this is not without problems. For a more technical discussion , take a look at Guy Haworth's post on ICCJA articles . http://chess.jaet.org/endings/ now provides access to searching DTZ and DTZ50 tablebases.
There doesn't seem to be any obvious things a Chess engine author can do (besides building his own tablebases) that is 100% fool-proof, except to continue searching when endgame table mates (and how does the engine know that the tablebase win fails to the 50 moves rule?) longer than 50 are found. This is only a partial solution, however this assumes that there is a quick win within the search of the engine and that the engine has enough time to find it.
This results mainly from positions when one side with tablebases resigns too early when it sees a loss in the tbs when the other side (human or program without tbs) might not be able to win anyway. Other similar problems may be caused earlier up in the search (depending on how much egtb probing takes place), where the side with tbs avoids a promising and perhaps 99% winning move because the opponent can force to a draw (with perfect defence) but the position is actually really difficult to handle without the use of tablebases. Similarly the refusal to go to a "certain" (if you have tbs) lost endgame, preferring to give up material which is obviously but not certainly lost (from the view point of the program) instead is the flip side of the same problem.
Crafty has a swindling mode, that causes it to try to win theoretical drawn positions or draw theoretically won positions, but it only seems to work when we reach the actual endgame position.
Strictly speaking though, this isn't a real problem caused by tablebases if you are interested in 100% objective play. The inability to "play the odds" or swindle, is something all chess programs have regardless of whether egtbs are used or not.
Probing the tablebases slows down the engine because it needs to access the hard-disk (despite tb hashes) this can be very costly depending on how much the engine probes in search1. This is most difficult to assess. We know that probing tablebases lowers the speed in nodes/seconds and usually the depth displayed in the principle variation. This isn't a problem though since even with a smaller depth, it plays better by finding stronger moves. The question is does the gain from probing and hence getting accurate results on some branches offset the overall slower speed? This overall lower speed means much shorter depth in non terminal tablebase positions. If the probes are on done on useless positions that are never reached or can be avoided even with tablebases it might hurt a lot.
For example in the 4+1 tbs,when chess engines reach a position in search say KRRNK this is quickly assessed as a easy win on purely material grounds which is sufficient information already,there is no need to waste time probing the tablebase to get the exact mate score. This is a clear cut situation and Yace does not use them. Still, a very small number of such positions can be difficult for chess engines without tablebases. For example, some rare cases like KBPPK with wrong rook pawn. However the right knowledge (which Yace has) can cure this problem." It becomes less clear when we talk about some K+2 piece versus K+1 piece. For example KRR versus KP is probably not necessary to use tablebases but what about KRB versus KR? The latter is probably a draw and you can hard code the engine to recognise this, but there are a rare positions that will cause a loss, and you need to defend accurately against other tbs equipped engines.
This question might vary from engine to engine depending on how knowledge they already have, how much they probe tablebases etc and can be settled only by formal tests.
Robert Allgeuer has recently [22 July 2003]
posted a very interesting experiment.
Yace with various settings (tablebases, bitbases, different degrees of probing) was used with a
endgame cache of 8M against a variety of opponents. In summary he found that
1) Overall there was no significant difference in strength, among the different versions of Yace
using tablebases ,bitbases, or none at all.
2) However,he then "divided the games into three classes: games that have reached positions with 5 or fewer pieces (i.e. they reached positions contained in the 5 piece EGTBs),
games that are 60 or fewer moves long
(i.e. games that probably have already been decided during middlegame) and finally
those games that are longer than 60 moves, but did not reach positions with 5
or fewer pieces. The idea is that the last group of games are those
that were with a high probability decided during the endgame, where
probing tablebases and bitbases during the search plays an important role."
3)He found that using only games that reached positions with 5 or fewer pieces, there was a significant difference in strength." Yace versions without tablebase support had considerably weaker ratings than their overall rating, while engines with tablebase support lead the pack. Also less aggressively probing Yace versions have a lower rating than more aggressively probing ones. For Yace the difference between the version with no tablebase and bitbase support and full support is 115 ELO points and statistically significant." This looked logical, since such positions would allow endgame tablebases to be used to full effect without much ill-effects.
4. More importantly he found that taking into account only games longer than 60 moves that did not reach positions with 5 or fewer pieces, "Yace probing bitbases and tablebases results in a statistically significant decrease of its rating by 94 ELO points for such games."
5. A minor result was that bitbases didn't seem to make any difference, mainly because they covered only 3-4 pieces.
What does this mean? "Obviously (and logically) engines probing tablebases tend to play into won 5 piece positions (and avoid lost 5 piece positions), hence their high rating in such games. However, in the preceding endgame phase with still more pieces on the board tablebases somehow must be a sort of liability, so that engines using them lose games during this phase over proportion. Apparently statistically for each game they successfully maneuver into a won 5 piece position, they also lose another one during the preceding endgame, so that overall this effect more or less cancels out the advantages of tablebases."
Well put! This experiment clearly teases out the 2 different effects of using endgame tablebases. The stages where the chess engine probes but does not enter the 5 piece tablebase shows the cost of probing them and the games where the chess engine enters the positions covered the tablebases shows positions where they are 100% gain.
In a article published by Mr. Günter Rehburg, in August 2001 CSS computer chess magazine using 10 Nunn endgame positions and 7 commercial engines, score for the side using all the 5 men endgame tablebases versus the side with none was 70.5-69.5
However, it was noticed that in some positions, the side with no endgame tablebases fared better! Removing just one of the tablebases position (where the tablebase side won 10.5 to 3.5), leads to a score of 60.0 to 66.0! This means in the overall score for 9 positions, the none tablebase side did better!Remember this is a Nunn test, so conditions are exactly equal exact for the tablebase use.
The best scoring position for the none tablebase side is the one on the right.
The score was 9.5 for the none tbs using side to 4.5.
3R4/1p6/2b5/2P1k2p/p3p2P/P6r/1P2KB2/8 b - - 0 1
(Thanks to Mike Scheidl for description of article contents)
Of course, as we can see above (where leaving out 1 position changes the total score significantly), the choice of positions to study can has a big effect on the results. Still, it is significant that the above experiment does show that for some engines and positions, the use of all 3 to 5 men tablebases actually lowers performance rather then being merely useless.
In my view more tests needs to be done, on more positions and using more engines. I don't have the breakdown results for each engine, so it's possible that some Chess engines with better tuning might actually show superior results when using tablebases for all if not most positions.
Secondly, it compares the use of 5 men tablebases against the use of none use of tablebases. Chances are, the use of a subset of tablebases (while avoiding the incomplete table base problem of course) might be optimal. It's likely use of 4 men tablebases is "safe" , and maybe the commonly reached KRPKR endgame, when chess engines still can't handle properly today.
One thing to consider is that all the 5 factors considered above (with the exception of maybe the incomplete tablebase problem) that weaken play, actually result only when the tablebases are probed during searches. It's obvious that if the tablebases are used only when the actual position is reached (in other words, it takes control only when the position is reached, but doesn't supply the evaluation of a possible position), there would be no problems at all.
As such Mike Scheidl has suggested that it would be interesting if a Chess engine could allow the user to decide which tablebase sets should be probed during searches (ideally those that the engine cannot handle or evaluate correctly using normal search and knowledge) and which to be used only when the positions are reached.(everything else).
It also has be suggested that bitbases (which stores results either win/loss but not length of win and hence indirectly the optimal path ) can be helpful since it allows the engine to have correct evaluations at almost no cost since the bitbases are small enough to be held in memory, unlike accessing full tablebases which are generally slower. Still, it's one thing for an engine to know that a given position is won, another to actually find the moves. Also see above experiment which suggests bitbases might not be helpful.
It's interesting to speculate how the relative position will shift again when all the 6 men tablebases are complete. Will then the use of all 6 tablebases be optimal?
I hope I haven't frightened you away from using tablebases. In my view, all the factors above except the last (and the incomplete tablebase problem) is unlikely to be significant and on the balance even the effect of the last factor is hard to evaluate without much more tests. I personally will continue to use the full tablebase set.
Addendum [21-01-2002], Endgame tablebases generally use 1 byte to represent each position so it can represent values of up to either about 127 (nalimov) or 255 (De Koening). Hence this can be a problem if there is a forced mate longer than that. So far at least with the 5 piece tablebase set, this is not a problem because the longest mate is exactly 127.6 Piece tablebase sets will begin to use 2 bytes to represent each position. There is also a small chance that the method used to generate the tablebases are flawed, although there is some limited verification between different types of tablebases.
Sincerely
Aarontay
30-06-2002
My PGP - DH/DSS 4096/1024 Key
Acknowledgements : The majority of above discussion resulted from discussion by various posters in CCC,RGCC as well as a interesting email conversation with Mike Scheldi. Any errors however remains mine.
|
Home | Articles | Links | Guestbook | Poll | Email | Bookmark| |
|