A Python CLI tool to filter chess positions from large evaluation datasets (.jsonl, .jsonl.zst) by centipawn evaluation and active colour.
The initial motivation was to curate a .epd file of 500 balanced starting positions (15cp - 25cp) for SPRT testing of deterministic engines. The original dataset was the Lichess cloud evaluations database, but the tool should work on any dataset that follows the schema below and is a supported file type.
{
  "fen": // the position FEN only contains pieces, active color, castling rights, and en passant square.
  "evals": [ // a list of evaluations, ordered by number of PVs.
    {
      "knodes": // number of kilo-nodes searched by the engine
      "depth": // depth reached by the engine
      "pvs": [ // list of principal variations
        {
          "cp": // centipawn evaluation. Omitted if mate is certain.
          "mate": // mate evaluation. Omitted if mate is not certain.
          "line": // principal variation, in UCI_Chess960 format.
        }
      ]
    }
  ]
}

With pip:
git clone https://github.com/Brooklyn-Dev/chess-position-filter.git
cd chess-position-filter
python -m venv .venv
source .venv/bin/activate # On Windows: .venv/Scripts/activate
pip install -e .

With uv:
git clone https://github.com/Brooklyn-Dev/chess-position-filter.git
cd chess-position-filter
uv venv
source .venv/bin/activate # On Windows: .venv/Scripts/activate
uv pip install -e .
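
To show how a record in the schema above maps onto the two filter criteria (centipawn range and active colour), here is a minimal sketch. It is not the tool's actual implementation: the function names `active_colour`, `best_cp` and `keep_position` are hypothetical, the sample record is made up for illustration, and the choices to use the first PV of the deepest evaluation and to compare the raw `cp` value are assumptions.

```python
import json
from typing import Optional


def active_colour(fen: str) -> str:
    """Return 'w' or 'b', the second space-separated field of the FEN."""
    return fen.split()[1]


def best_cp(record: dict) -> Optional[int]:
    """Centipawn score of the first PV of the deepest eval (assumed policy).

    Returns None when there is no eval, no PV, or the PV carries a mate
    score instead of a centipawn score.
    """
    evals = record.get("evals") or []
    if not evals:
        return None
    deepest = max(evals, key=lambda e: e.get("depth", 0))
    pvs = deepest.get("pvs") or []
    if not pvs:
        return None
    return pvs[0].get("cp")


def keep_position(record: dict, min_cp: int, max_cp: int,
                  colour: Optional[str] = None) -> bool:
    """True if the record's score lies in [min_cp, max_cp] and, when a
    colour is given, that colour is the side to move."""
    cp = best_cp(record)
    if cp is None:
        return False
    if colour is not None and active_colour(record["fen"]) != colour:
        return False
    return min_cp <= cp <= max_cp


# One illustrative JSONL line (values invented for this example).
line = (
    '{"fen": "rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq -", '
    '"evals": [{"knodes": 1000, "depth": 30, '
    '"pvs": [{"cp": 20, "line": "e7e5 g1f3"}]}]}'
)
record = json.loads(line)
print(keep_position(record, 15, 25, colour="b"))  # True
```

Records whose chosen PV carries a `mate` score have no `cp` key, so the sketch simply skips them.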