Winboard and Chess engines FAQ

Section C - Opening books and Endgame tablebases

Section C

[C] Opening books, Endgame Tablebases and Hashtables

[C.1] What is a opening book?
[C.2] What is the book learning?
[C.3] How do I turn the opening book on?
[C.4] Is there a way to set uniform opening books for all Chess engines through a GUI book?
[C.5] Where can I download more opening books?
[C.6] What are endgame Tablebases [egtbs]? What is the difference between Nalimov,and Edward Tablebases?
[C.7] How many Tablebases do I need? Can I download only some of them?
[C.8] How do I turn the endgame tablebases on?
[C.9] Where can I get (buy,generate,download) more endgame tablebases?
[C.10] Okay. I followed the instructions but why is the endgame Tablebases still not working?
[C.11] There's a bug in the program, Why does Amy, TCB etc refuses to promote the pawn?
[C.12] What are Transposition/Hash tables?
[C.13] How much memory should I allocate in transposition table?

[C] Opening books, Endgame Tablebases and Hashtables

[C.1] What is a opening book?

Just as human players memorise opening moves, Chess programs have opening books which is a database of stored moves and positions. This allows the Chess program to move instantly when a move is available in the book. Needless to say, this is a very great advantage that allows the program to save time on thinking. On the other hand, poor Opening books can blindly lead the program to poor positions that results in a quick loss.

Most Winboard Engines possess opening books. Nearly all are external files [typically with bk.extensions] separate from their programs, while a few like Crux have internal opening books.

[C.2] What is book learning?

Simply put, while there are various forms of book learning they are generally used to let Chess programs learn from their losses and discourages them from repeating openings lines that constantly lose. "Aggressive" book learning not only avoids lines that have lost, but also repeat lines that have won in the future.

One form of book learning involves statistical analysis. The engine keeps track of it's win/loss ratio with various openings and avoids openings with poor results. The weakness with this form of learning is that often the result of a game is not related to the opening. Against a superior opponent, the engine would quickly "learn" that no opening was viable!

Another form of book learning, involves keeping a record of the engine's evaluation once it is out of book. For example, if the engine calculates and finds that the score of the position once out of book is -3 (or a piece down), it can conclude that this is a bad opening line to be avoided. Of course, in many cases the evaluation is less clear cut (say -1.x), is this really a bad position or one that the engine doesn't understand (perhaps a gambit with compensation beyond the engine's horizon)? Either way , arguably, the opening line should be avoided. Crafty tries to reduce this horizon effect by averaging the score of the first 10 moves out of book to give a more accurate assessment.

Not all Winboard programs have book learning. Some advanced programs like Crafty not only write into the book file but also produce a "learn file" that reflects the amount and type of lines "learnt", allowing you to exchange and import such learning.

[C.3] How do I turn the opening book on?

For most programs , this merely involves, downloading the opening book into the same directory has the execute file followed by a modification of the Chess programs's ini file [see above]. Ensure that Opening book is set "On" and you are done.

[C.4] Is there a way to set uniform opening books for all Chess engines through a GUI book?

Currently, there is no uniform set opening book format and each engine (with a few exceptions) uses their own opening book format.However, some users may want to ensure all engines use exactly the same opening book to ensure a level playing field. While most engines allow you to generate your opening books from PGN files, different book building options means that it's impossible to create exactly the same opening books even if built from the same PGN file. This doesn't even take into account the trouble of generating separate opening books from the same PGN file several times.

It's also possible with some effort to convert some popular opening formats used by commercial engines to each other. For example a converter that converts Chessmaster formated books from OPK to PGN is available. Also see Mogen Larsen's article on the use of 2 opening books converters FGCRWNEW and fbk2rbm.

For the benefit of many users who want to convert chessmaster books (obk format) to Fritz books (ctg format) directly let me explain further.

In Fritz there is an option to import older books. This apparently includes books in the OBK,MVS and FBK (old fritz book) formats. However unfortunately for some unknown reason while you can import books in MVS and FBK formats this doesn't work for OBK.

What if you still want to convert Chessmaster OBK books to CTG? You then need to use FGCRWNEW which can convert opening books in FBK,CTG,MVS,Wchess and genius to each other. What you do is to convert OBK to fbk books. The use Fritz itself to import the books in FBK format. As far as i can tell though this method works only for Chessmaster 4000 to 8000. Even though chessmaster 9000's books are still named OBK, there apparently is some change in format , and FGCRWNEW doesn't work on them.

It seems that there is a way around this problem though.Drexel,Michael from CCC writes

"You need a Hex Editor to change the first line in CM9000.OBK
from:
42 4F 4F 21 6E 59 04 00  00 00 00 00 0C DC ...
-- -- -- --              -- -- -- --
  modify                    delete

to:

55 47 57 53 6E 59 04 00  0C DC ..."

Is there a easier way?

A GUI controlled opening book which all engines can use (as in Fritz for example) would makes thing a lot easier.(Another way is to use Nunn tests,but you need batch files for this in Winboard, see Section [F.2]) Currently though the Winboard protocol (unlike UCI and hence some UCI engines like Shredder require the GUI to provide a opening book) does not have specific provision for separating the opening book from the engine. One idea has being to provide "book engines" which play the openings for the main engines before passing control over to the main engine.

This has being discussed before on the Winboard protocol mailing list and a possible modification to allow Xboard/Winboard to handle the switch between a "book engine" and the "main engine" may be introduced in Winboard protocol 3. Here's the thread on book engines.

For now, though there is one book engine that handles the switch by itself - Bookthinker (which comes with the Winboard engine Thinker) . Bookthinker is a book engine that plays the opening before handing control over to the engine. The way to do it is as follows.

Download and install Thinkboard (engine and interface), then transfer the files bookthinker.exe (the book engine) and thinker.dat (the actual book file) into the directory with the engine you want to use it with. For this example, I will be using bookthinker with List - a Winboard engine with no opening book that is in c:\chesseng \list
While thinker.dat cannot be renamed, you are free to rename bookthinker.exe. In my example I have renamed it to listbook.exe
For example mine now reads "Listbook list" /fd=c:\chesseng\list because I have renamed bookthinker.exe to listbook.exe
You can also create custom opening books for use with bookthinker.

A problem is that while Bookthinker allows you to create your own book from a file of PGN games, there is no way to prioritize moves using NAGS like ! or ?, so you should clean the pgn file of such symbols because using it to create a book.

However Winboard engines that do not support force command (only a few of them) will not work with Bookthinker. There is no way currently to send command line or parimeters to engines using Winboard because Winboard.ini does not support quotes. Hence engines that send init strings or need command line parameters probably will not have problems too. Some engines may also have time allocation problems.

For some engines not sending the command parameters will have minor effects (eg sending book learn commands,hash table sizes) ,for others it can be a big problem. For example The King will be almost useless since the OPK string will not be sent. This is a pity since The King when used as a Winboard engine relies on a external GUI book. Nejemet will also have a problem since the /xb command is not sent.

So what are the solutions? Because 90% of all of such problems involve using bookthinker with Chessmaster/The King, here we shall assume that you want to run Chessmaster with a opening book outside the Chessmaster GUI.

First you can use another interface that provides support for Winboard (or even UCI if you want to play with Polyglot!) and provides a opening book. Arena is the obvious one. Either use Arena's built in opening book (which has some flaws because it doesn't support transposition in older versions of Arena) or use it with bookthinker as described here . But what if you don't like Arena?
The problem with quoting, only occurs if you run engines via the Winboard.ini file. If you prefer to run them using the command line options, then you create a shortcut as described here .
Much of the problem with The King can be avoided if you use the patch to disable the OPK check of course.

Jason Kent has created a package to make use of The King (all versions) with FICS (direct link 1006K). It includes support with Bookthinker but you need to read the instructions (direct link to text file) carefully. It avoids the problem with bookthinker by disabling the opk check.

The last method favoured by me would involve using inbetween. See my section on Chessmaster and bookthinker.

It's probably wise to turn off any other opening books that the main Chess engine may possess when using them with Bookthinker.

Though Shredder is a UCI engine, you can use it via a uci2wb adaptor in Winboard. It will have no opening book though, so you will need bookthinker. Volker Pittlik describes here now to use Shredder as a Winboard engine with bookthinker.
However, there are some timing problems, see here . The alternative adaptor PolyGlot might work better though,particularly version 1.3 which provides opening book support,so you no longer need bookthinker.

[C.5] Where can I download more opening books?

If the opening book that comes with the program is not enough for you , you can attempt to create your own opening books using the opening book editor/utility that come with some programs.

Here's a short list of programs that come with the option of making your own books. Amy , Bringer , Crafty , Green Light Chess , Goliath , Quark , Yace etc. Leo's Winboard information page has a longer list. (Look at the category ,"create book")

The few programs that do not allow you to build your own opening books are Anmon , Francesa and SOS ,for those you are generally limited to the default/author provided ones.

Building your own opening book generally involves supplying a file of GM games, to draw lines from . Dann Corbit's p2600u.zip is a nice file with high quality games between players rated 2600 and above.You might also want to refer to A beginner's guide to building Opening books .

Self made opening books for some popular programs can also be found on Mogens Larsen's home page You can also find opening books by Arturo Ochoa.

[C.6] What are endgame tablebases[egtbs]? What is the difference between Nalimov and Edward tablebases?

Please note, I have moved everything except for the bare minimal regarding Endgame tablebases to a new Web page on endgame tablebases.

Basically endgame tablebases are database files of stored endgame positions [calculated using retrospective analysis] that can be used by Chess programs to play perfect Chess in the endgame as long as the available file is available. For example, if you have the file KNNKP, Chess programs that can use endgame tablebases, will play that endgame perfectly. As such endgame tablebase files are specific to only one kind of endgame. You don't automatically get the 3 men tablebases even if you have all 4 men tablebases for example. This can lead to problems , see Section A.9

There are various kinds of Tablebase formats, including Ken Thompson,Steven J. Edwards And Eugene Nalimov Tablebases. Dr Robert Hyatt [Author of Crafty] explains the differences between them as follows [in a posting to rec.games.chess.comp 26/10/2000]

"Edwards (Tablebase): Distance to mate values stored. The main problem with these is that they are larger than the others.

Nalimov (Tablebase): Distance to mate values just like Edwards, but Eugene's files are compressed, and they may be used in the compressed form, with no penalty of any kind. Rather than way over 30 gigabytes for all of the 3-4-5 piece files, you end up with about 7.5 gigs.

Thompson (Tablebase): Distance to conversion (capture which takes you to a smaller ending class). These are difficult to use in their compressed form, from inside a chess engine. They also provide different info than Eugene's database... i.e. it tells you something, but it doesn't differentiate between losing and drawing as the Nalimov files do.

Best choice: Nalimov. Nearly every chess engine supports those... "

Nalimov tablebases are nearly "perfect" since they take into account en Passant. However, they don't take into account castling, but this flaw is probably of interest only to Chess problem Hobbyists.

In general, though almost all modern Chess programs [including most Winboard programs] use Nalimov Tablebases partly because they are non-propriety and partly because they are more efficient. Also some of the 6 pieces Tablebases are now available in the Nalimov format. The Nalimov tablebases comes in 2 forms, uncompressed and compressed. The compressed ones end with the extension "emd".

Most of the modern Chess programs(Crafty supports the use of the compressed Nalimov format from versions 16.5 onwards) can use the tablebases in compressed form by uncompressing and using them on the fly as needed. One exception I'm aware of is the Esc , a Winboard program released on 4 Feb 2001 which uses only the uncompressed form. If you generated the Nalimov tablebases yourself (see Section [A.8] ), they will already in the uncompressed form before applying datacomp.exe. Datacomp can also be used in reverse to get uncompressed Nalimov tablebases from compressed ones.

[C.7] How many tablebases do I need? Can I download only some of them?

The more tablebase files you install, the stronger the program will be. However, a full set of 3,4 and 5 Tablebase files takes about 8 GB of Hard disk space! Most people download a full set of 4 Piece databases and select only a few of the 5 pieces.

Generally the 5 piece endgames with rooks are reached most frequently and should be downloaded . However, there are some pitfalls that you should be aware of.

If you want to use the KRPKR tablebase [and assuming you have all the 3 and 4 tablebases], make sure that you have the following endgame tablebases as well, KQRKR, KRBKR, KRRKR, KRNKR, . This is to ensure that the promotion cases are included.

If you lack say the KQRKR tablebase, some programs refuse to queen the pawn in the KRPKR even if that would lead to a win, because such a move, would cause the program to drop from a position flagged as "win", to a position that is uncertain since they lack the relevant endgame table.

A similar problem can result if you download only the KQPKQ tablebase without KQQKQ , KQRKQ etc..

Some programs like Crafty and Yace are "smart" enough to avoid this problem, but most like Amy , or The Crazy Bishop cannot handle this.

Also take note that not all programs support all the Tablebases. Yace for example, currently does not use the 4+1 tablebases [ King and 3 pieces/pawns on a side versus a alone King] , because such positions are easily won most of the time barring rare cases [Like double rook pawns, wrong coloured bishop and King vers King]. Also I think only Crafty currently, supports 6 piece tablebases. Szots Gabor also reports that Wildcat , Capture and perhaps Inimichess cannot handle incomplete 5 men tablebase

[C.8] How do I turn the endgame tablebase on?

For most Chess engines, you only need to add a path to the tablebase directory in the configuration files of the relevant Chess engine. For example, you add only add the line tbpath=c:\chesseng\tb into the Crafty.rc file if you want Crafty to use tablebases. Just change the path to where-ever your tablebases are. Mine is at c:\chesseng\tb.

[C.9] Where can I get (buy,generate,download) more endgame tablebases?

You have basically 3 choices. You can

Buy them [More information]
Download them [More information]
Generate them yourself [More information]

[C.10] Okay. I followed the instructions but why is the endgame tablebase still not working?

There are some possible reasons

You specified the wrong pathway for the endgame tablebases.
File is corrupted or in the wrong form , they should be in the form KXXKXX.nbw.emd or KXXKXX.nbb.emd [for compressed Nalimov]
One of the files either KXXKXX.nbw.emd or KXXKXX.nbb.emd is missing. You need both of them.
You are using the wrong type of endgame table bases. E.g. using Nalimov for Bionic .

[C.11] There's a bug in the program, Why does Amy,TCB etc refuses to promote the pawn?

Most probably this is due to the lack of a complete set of tablebases covering the promotion cases. See Section C7

[C.12] What are Transposition/Hash tables?

When a Chess engine begins analysing a position, it will often "try" out moves in different orders but which reach the same position. As the name "transposition table" implies, the program stores such positions in memory with their evaluations, so it can save time whenever it comes upon the same position that has being reached before albeit with a different sequence of moves. Most chess engines store a 'hash' of the position and hence transposition tables are also commonly known as hashtables. See also a more detailed explanation on the subject by Dann Corbit at the Winboard forum.

There are various types of Transposition Tables used depending on the program. Firstly, there are endgame hashtables [actually caches would be more accurate], that "works similarly to a disk cache ,and just avoids many disk accesses, and therefore can speed up the program in late endgames." [ Dieter Buerssner, Author of Yace ].

Some programs [e.g. Goliath ] have only one main hashtable besides the egtb Hash while others [e.g. Crafty] use a Pawn Hash table besides the main hash table . Bringer has a total of three [four including one for endgame tables] by using Pawn, evaluation and Position hashtables.

[C.13] How much memory should I allocate in transposition tables?

General advice

Allocating more memory to Transposition tables tends to improve quality of play. At longer time controls and with a faster processor, the hash table fills up quickly and it helps to have a bigger transposition table so that entries can be retained without being substituted out. This help speeds up search of course, as you don't have to waste time searching moves that are already stored in the transposition table. However, at very fast time controls, transposition tables aren't filled up quickly enough and allocating large transposition tables might not be too helpful though it should not hurt. That said however, you should always ensure that you avoid setting a hash table that is too big, such that disk swapping occurs because you don't have enough RAM.When that happens access to transposition tables will be a lot slower, since you are accessing the physical hard disk rather than memory.

How much memory should you assign to avoid disk swapping? The idea roughly speaking is to take into account the RAM used by your operating system plus overheads and allocate most of the rest to transpostion tables. The general advice is that you should allocate up to half of your system's total memory available to the Chess engine.The rest is used up by the Windows operating system.This is to avoid disk hashing. If you are doing engine versus engine matches on one machine, the total memory allocated to both programs should be half that of your system's total memory. This also assumes you aren't running too many other resource hungry applications at the time of course.

I must add that this advice is limited only to people with low [say below 256 MB] amounts of RAM. Given that the amount of memory resources needed by Windows is somewhat fixed ,if you have large amounts of RAM, you do not need to follow the "50% rule" above. For example, a 512 Mb Ram machine can probably support a total of 420Mb for the tournament without any hashing occurring.

Even if disk swapping does not occur, some engines (certain versions of Crafty, Chessmaster8K for example) clear hash tables each move, which can lead to a few seconds delay for large transposition tables. (The patched CM8K and CM9K don't suffer from this).This could be bad for lightning time controls of course.

To sum up. Make your transposition tables as big as possible while keeping in mind the following

Ensure that the transposition table doesn't cause disk swapping
If the engine used clear hash entries every move, there might be some slow downs with a larger transposition table, since the time taken to do this is dependent on the size of the transposition tables. This can cost a couple of seconds each move and can be bad for very fast time controls.
In general larger transposition tables are more likely to be useful in longer time controls.

The above advice is not controversial.(Altough see Other considerations below) What is controversial is how much gain results from a larger transposition table. There appears to be no clear answer, because engines differ according to the implementation of transposition table and search techniques, but currently (March 2003), it appears that increasing transposition table size beyond 64-128 MB RAM leads to minimum gains.

Formulas for size of transposition tables

There are a few forumulas floating around that are supposed to help find the optimal transposition table size, but they are best looked upon as a guide and not used as a hard and fast rule.

Steve Lopez (writing for chessbase) recommends using memory in hash table of 2* processor speed (Mhz) * average seconds per move (s) which yields a hashtable size in kilobytes which you can then convert to megabytes by dividing by 1000. For example if you running a blitz match where the engine has 5 minutes to play 40 moves, the recommended hashtable size on a 1 Ghz computer according to the above formula = 2 * 1000 * (5*60/40) =15,000 Kb or 15 Mb.

However, such a formula leads to the same Transposition table size for each engine without regard to how fast (in Nodes per second) each engine searches. John Merlino writes that for Chessmaster, the formula they use to decide the size of the transposition table is as follows
2*Nodes per second * average seconds per move.

Although this is recommended for Chessmaster only, it seems to me such a formula is better suited (compared to Lopez's formula) for adapting to most engines, since it directly measures the number of nodes searched and hence the size of the Transposition table needed. Still this is but a rough guide. I've found that this formula usually leads to a lower figure than Steve Lopez's formula,but whether it's more accurate or not I don't know.

Crafty 19.3 onwards itself includes a feature called "Adaptive hash size" which automatically assigns size of the hash tables
according to the node per seconds, time per move within limits you set .

Other considerations

Some writers out there , typically people who are not chess programmers, have promoted the view that larger hash tables can hurt based on the misunderstanding that it takes longer to 'search' a larger hash table than a smaller one. This is wrong, since a hash table lookup or probe takes exactly the same amount of time whatever the size of the hash table.Roughly speaking, there is an exact 'address' which they can look it up immediately as compared to looking through the whole hash table for a match which is much slower.

Typically they show 'evidence' for this view in two ways.

Firstly they observe that using larger the transposition table results in lower NPS (node per second) search figures. In other words the engine reports it searches less moves per second. This is in fact the same faulty argument made by Steve Lopez using Fritzmarks (which is made on Nodes per seconds). The fact that larger hash tables lead to lower NPS/Fritmark is a well known effect, however it does not however mean the engine is weaker. The NPS figure shown is fewer, because it no longer has to search nodes which are already storied in the transposition table. In fact, 99% out of the time, the engine with more hash will find the right solution to a test position faster than the same engine with lower hash despite lower NPS. This shows that NPS does not tell the full story.
Secondly, it has being observed that for some positions, solving tests might in fact be slower when you use more hash. This one is much harder to explain. The information that is stored in the transposition table, will tend to affect the shape of the search tree and the order in which moves are searched. It may be that the information in the transposition table may sometimes cause the engine to spend more time looking at the less important part of the tree before refuting the move. Such positions are termed pathological and while not being all that rare tend to be swamped out in most cases in positions where more information reduces the search tree. In any case, this effect ('hash luck') has nothing to do with slows down due to searching larger transposition tables.

A more difficult question would be to decide how much memory to allocate among each transposition table (if there are more than 2 types and/or endgame caches]. There are I believe though no hard and fast rules to allocate memory. Therefore these must be handled by trial and error, and on a engine by engine basis.

Note that depending on the program, you might not have full control for the allocation process. Some programs [e.g. Francesa ] use only one fixed hash upon compiling and cannot be adjusted at all. While others might allow almost any finite allocation of memory. Lastly, some programs like Crafty lie in between, by allowing you to adjust the size of Hash Tables but in discrete increments.

Therefore it may not be possible to be totally "fair" in the allocation of hash for engine versus engine matches on the same machine.