Giraffe new engine

by MZ Thu Sep 03, 2015 9:16 am

Giraffe, a new engine based on deep learning ... from talkchess and author ...
Hello!
I have been working on a new engine for the past few months as my Master's thesis, and I think it's about time for the first release, since while it's not very strong, it sure is entertaining!
The goal is to create a chess engine that learns how to play chess using temporal-difference reinforcement learning, with as little hand-coded chess knowledge as possible.
The first stage of the project is to replace the evaluation function with a deep neural network, and use TDLeaf[1] to train it. That has been done, and is where the project is at right now.
The current evaluation architecture is a neural network with 3 hidden layers and about 270,000 parameters. It is bootstrapped using a material-only eval, and trained by looking at millions of random positions from the CCRL dump. Essentially, for each position it first does a search, and then plays against itself for a few more moves. Then it adjusts the model so that the eval of the original position (more precisely, the leaf of the search from the original position) gets closer to the evaluations of the subsequent positions.
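To make the training step above more concrete, here is a minimal sketch of a TD(lambda)-style update in Python. It is not Giraffe's actual code: the linear evaluator, the feature vectors, and the names `evaluate`, `td_error`, `td_update`, `lam` and `lr` are all assumptions made for illustration (the real engine uses a deep network, and the leaf positions would come from searches during self-play).

```python
def evaluate(weights, features):
    """Toy linear evaluator (an assumption; the real eval is a neural net)."""
    return sum(w * f for w, f in zip(weights, features))


def td_error(leaf_values, lam):
    """Discounted sum of successive differences between leaf evaluations.

    leaf_values[0] is the eval of the search leaf for the original position,
    leaf_values[1:] are evals of the leaves for the following self-play moves.
    """
    total = 0.0
    for j in range(len(leaf_values) - 1):
        delta = leaf_values[j + 1] - leaf_values[j]
        total += (lam ** j) * delta
    return total


def td_update(weights, leaf_features, lam=0.7, lr=0.01):
    """Nudge the eval of the first leaf toward the evals that follow it."""
    values = [evaluate(weights, f) for f in leaf_features]
    err = td_error(values, lam)
    first_leaf = leaf_features[0]
    # For a linear model the gradient of the eval w.r.t. the weights
    # is just the feature vector of the first leaf.
    return [w + lr * err * f for w, f in zip(weights, first_leaf)]


if __name__ == "__main__":
    weights = [1.0, 2.0]
    leaves = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical feature vectors
    print(td_update(weights, leaves))
```

With `lam` set to 1 the error telescopes to the difference between the last and the first evaluation; smaller values give more weight to the positions immediately after the original one.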
I am using the Strategic Test Suite to test its positional knowledge. With the initial neural net (material-only) it scores about 4500/15000 at 0.3s per move. After about 8 hours of training (160 CPU-hours), it now scores about 6100/15000. I have played a few games against it, and it's clear that it has learned a thing or two about positions!
The main problem with the evaluation right now is that I am only using one evaluation function for all phases of the game. That's why it tends to over-extend pawns and get the queen out too early in the opening, and it probably doesn't know how to play endgames (that hasn't been tested). Statistically speaking, most training positions are from the middlegame, so the current evaluation function is pretty heavily tuned for the middlegame.
I have some ideas on how to fix that, and that's what I am working on right now.
A few things that may be of interest:
Eval scores are not in centipawns. They are probability-based, and nowhere near linear in material. 10000 means it thinks white will win for sure, -10000 means it thinks black will win for sure, etc. If the search actually finds a mate it will output (30000 - moves) or (-30000 + moves), like normal engines. The score from the start position is about 2000. Don't be alarmed :). It means the engine thinks white has roughly a 60% chance of winning (whether that actually makes sense or not is another matter).
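For anyone reading the engine output, the score convention above could be decoded roughly as follows. This is a sketch under one loud assumption: that the mapping between score and white's winning probability is linear, which is consistent with 10000 meaning a sure win and 2000 meaning roughly 60%, but is not confirmed by the description. The function names and the rule that anything beyond +/-10000 is a mate score are also assumptions.

```python
WIN_SCORE = 10000   # "white will win for sure"
MATE_SCORE = 30000  # base value used for mate scores


def score_to_white_win_prob(score):
    """Map an eval score to white's win probability (linear map is assumed)."""
    clamped = max(-WIN_SCORE, min(WIN_SCORE, score))
    return 0.5 + clamped / (2 * WIN_SCORE)


def mate_score(moves, white_mates=True):
    """Encode a mate as (30000 - moves) or (-30000 + moves)."""
    return MATE_SCORE - moves if white_mates else -MATE_SCORE + moves


def describe(score):
    """Turn a raw score into a short human-readable string."""
    if abs(score) > WIN_SCORE:  # assumed: only mate scores exceed 10000
        moves = MATE_SCORE - abs(score)
        side = "white" if score > 0 else "black"
        return f"{side} mates in {moves}"
    return f"white win probability ~{score_to_white_win_prob(score):.0%}"


if __name__ == "__main__":
    for s in (2000, -10000, mate_score(3), mate_score(5, white_mates=False)):
        print(s, "->", describe(s))
```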
Searches are slow (about 5 seconds to depth 7), because eval is slow. This is intentional. The search is also very basic, with only alpha-beta, PVS, null move, killer moves, and a transposition table. I am intentionally not including any forward pruning (besides null move) or move-count-based stuff (LMR), because I am planning to use neural nets to make those kinds of decisions later on. Profiling shows that it spends 97% of the time in eval, which is pretty nice because that means I don't have to bother optimizing anything else at all...
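For readers unfamiliar with PVS, the core idea can be sketched on a toy game tree. This is a generic textbook-style negamax formulation in Python, not Giraffe's search: the nested-list tree, integer leaf scores, and the names `pvs` and `search` are assumptions, and null move, killer moves and the transposition table are left out to keep it short.

```python
INF = float("inf")


def pvs(node, alpha, beta, color):
    """Principal variation search (negamax form) on a nested-list tree.

    Internal nodes are non-empty lists of children; leaves are integer
    scores from the first player's point of view. color is +1 when the
    first player is to move and -1 otherwise.
    """
    if not isinstance(node, list):
        return color * node
    first = True
    for child in node:
        if first:
            # Search the first (assumed best) move with the full window.
            score = -pvs(child, -beta, -alpha, -color)
            first = False
        else:
            # Cheap null-window probe: can this move beat alpha at all?
            score = -pvs(child, -alpha - 1, -alpha, -color)
            if alpha < score < beta:
                # It can, so re-search with a wider window for the real score.
                score = -pvs(child, -beta, -score, -color)
        alpha = max(alpha, score)
        if alpha >= beta:
            break  # beta cutoff
    return alpha


def search(tree):
    """Search from the root with the first player to move."""
    return pvs(tree, -INF, INF, 1)


if __name__ == "__main__":
    print(search([[3, 5], [6, 9], [1, 2]]))
```

The payoff of the null-window probes depends on move ordering: when the first move searched really is the best one, the remaining moves are refuted cheaply and few re-searches are needed.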
There is no support for opening books or EGTBs. This is intentional. I want it to learn how to play openings and end games.
No support for pondering or SMP yet. This is... not intentional. Just lack of time :).
Sorry, the xboard protocol part hasn't been tested very much, since I do all testing/training using my own tools. I just tried it with xboard and Arena, though, and it does seem to work.
I have absolutely no idea how strong it is! It's much stronger than me (human) for sure, but that's not saying much.
It would be highly appreciated if someone wants to give it a try and let me know what you think!
Download:
[You must be registered and logged in to see this link.]
It is closed source for now while the thesis is ongoing, but will be released under the GPL (or another open source license) at the conclusion of the thesis in October, if not earlier.
It is Windows-only for now even though I do all my testing and training on Linux, because releasing binaries for Linux is not very straightforward (glibc versions, etc.), and I don't think many people here use Linux as their primary OS anyway.
Acknowledgements:
Professor Duncan Gillies, thesis advisor
Imperial College High Performance Computing Service, for providing all the computational power required for this project
Libraries/Code Borrowed:
Eigen linear algebra library
Pradyumna Kannan's magic move generator