add npm ignore, update package json and move test files (#55)
* prepping npm pkg

* add npm ignore, update package json and move test files

* bump version

* fix readme typo

* cleanup

* version chg

* update pkg lock

* delete bad merges
EllAchE authored Apr 1, 2024
1 parent 4294a7d commit 39dec44
Showing 9 changed files with 166 additions and 131 deletions.
9 changes: 9 additions & 0 deletions .npmignore
@@ -0,0 +1,9 @@
.github
.vscode
cdv
data
tests
src/visuals
tsconfig.json
package-lock.json
src
27 changes: 20 additions & 7 deletions README.md
@@ -1,11 +1,12 @@
# Overview

Welcome! We (@EllAchE and @bennyrubanov) are chess amateurs who also have interests in statistics, programming and silliness. This repo brings those interests together. Here, we find and visualize some of the sillier (and perhaps less useful) chess statistics no one has bothered to calculate before. This project is open source, so we encourage contributions and suggestions! 😊
Welcome! We (@EllAchE and @bennyrubanov) are chess amateurs who also have interests in statistics, programming and silliness. This repo brings those interests together. Here, we find and visualize some of the sillier (and perhaps less useful) chess statistics no one has bothered to calculate before. This project is open source (MIT License), so we encourage contributions and suggestions! 😊

## Preliminary Results
These results have yet to be visualized, and the quantity of games analyzed will be increased, but here is a result on 450k games from November 2013!

[Results](https://bennyr.notion.site/450k-games-analysis-external-facing-8aeb101453c64cfeaef1130ae10e68e3?pvs=4)
The project is under development, with the goal of one day analyzing the entire Lichess games database. In the meantime, we have already analyzed a subset of the data. Here's a deep dive into that subset: [Analyzing 5 billion chess games](https://elehche.com/Analyzing-5-billion-chess-games-f10f3f6125144f6f9978a431f16b3e70)

And if that's not enough, here are the results from 450k games from November 2013: [Results](https://bennyr.notion.site/450k-games-analysis-external-facing-8aeb101453c64cfeaef1130ae10e68e3?pvs=4)

# Methodology

@@ -15,9 +16,10 @@ Data is sourced from the public [Lichess games database](https://database.liches

## Credits

We have taken advantage of some of the helpful methods in [chess.js](https://github.com/jhlywa/chess.js/blob/master/README.md) to save ourselves a little time.
We have taken advantage of some of the helpful methods in [chess.js](https://github.com/jhlywa/chess.js/blob/master/README.md) to save ourselves a little dev time.

## Definitions

- Unambiguous Piece (UAP): e.g. pawn that started on a2
- Ambiguous pieces: pawn, bishop, knight, rook, queen, king

@@ -26,6 +28,7 @@ We have taken advantage of some of the helpful methods in [chess.js](https://git
Current implementation is **bolded** where multiple options exist:

### Kills/Deaths/Assists

- If two pieces simultaneously checkmate a king each is credited with 0.5 kills/mates (not currently implemented)
- A checkmate is considered a "death" for the king and a "kill" for the mating piece
- An "assist" is counted for a piece if: the move before the checkmate is that piece's move, and it is a "check"
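
The mate and assist rules above can be sketched as a scan over SAN move strings, where a trailing `#` marks checkmate and a trailing `+` marks check. This is only an illustration under one assumption: "the move before the checkmate" is read as the mating side's previous move (two plies earlier). `findMateAndAssist` and `MateAndAssist` are hypothetical names, not part of this repo.

```typescript
// Hypothetical helper for illustration; not the repo's implementation.
interface MateAndAssist {
  mateIndex: number; // ply index of the checkmating move
  assistIndex: number | null; // ply index of the assisting check, if any
}

function findMateAndAssist(sanMoves: string[]): MateAndAssist | null {
  const mateIndex = sanMoves.length - 1;
  // SAN marks checkmate with a trailing '#'; a mate can only be the last move.
  if (mateIndex < 0 || !sanMoves[mateIndex].endsWith('#')) {
    return null;
  }
  // Assumption: the "move before the checkmate" is the mating side's
  // previous move, i.e. two plies earlier.
  const prevIndex = mateIndex - 2;
  const isAssist = prevIndex >= 0 && sanMoves[prevIndex].endsWith('+');
  return { mateIndex, assistIndex: isAssist ? prevIndex : null };
}

// A three-ply fragment (not a full game): check, king move, mate.
// Yields mateIndex 2 and assistIndex 0.
console.log(findMateAndAssist(['Rd7+', 'Kh8', 'Qg7#']));
```

Crediting the kill or assist to a specific unambiguous piece would additionally require tracking which piece made each move, which this sketch leaves out.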
@@ -35,29 +38,33 @@ Current implementation is **bolded** where multiple options exist:
### Distances

#### Knight

- **knight move counts as 2: one diagonal and one hor/vert**. An alternative would be to count as 3: 2 horizontal/vertical + 1 hor/vert
- To calculate when a knight "hops" a piece, we do the following:
  - A knight can take 2 paths to its destination; if either of those paths is clear, we assume it takes the "easier path" and does not hop a piece.
  - If there is no clear path, we count ALL pieces blocking both paths, then divide the aggregate by 2. There are two reasons for this:
    - We want deterministic outputs from processing games.
    - We want to avoid selection bias.
      It would be possible to use a deterministic rule (e.g. short distance first on odd moves) to determine the knight's path; however, that or similar decisions could introduce bias when considering common opening patterns. A randomness rule (e.g. generating a random number to choose the path) would avoid this but would lead to nondeterministic results.
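
A minimal sketch of the hop rule described above. It assumes the "2 paths" are the two L-shaped routes (long leg first, or short leg first) and that only the intermediate squares count as blockers; the README does not spell this out. `BoardSquare`, `knightPaths` and `knightHops` are hypothetical names, not the repo's implementation.

```typescript
// Hypothetical types and helpers for illustration only.
interface BoardSquare {
  file: number; // 0 = a-file ... 7 = h-file
  rank: number; // 0 = rank 1 ... 7 = rank 8
}

const squareKey = (s: BoardSquare): string => `${s.file},${s.rank}`;

// Returns the intermediate squares of the two L-shaped routes:
// [long leg first, short leg first].
function knightPaths(from: BoardSquare, to: BoardSquare): [BoardSquare[], BoardSquare[]] {
  const dx = to.file - from.file;
  const dy = to.rank - from.rank;
  const ax = Math.abs(dx);
  const ay = Math.abs(dy);
  if (!((ax === 1 && ay === 2) || (ax === 2 && ay === 1))) {
    throw new Error('not a knight move');
  }
  const sx = Math.sign(dx);
  const sy = Math.sign(dy);
  if (ay === 2) {
    // The long leg changes the rank by two.
    return [
      [
        { file: from.file, rank: from.rank + sy },
        { file: from.file, rank: from.rank + 2 * sy },
      ],
      [
        { file: from.file + sx, rank: from.rank },
        { file: from.file + sx, rank: from.rank + sy },
      ],
    ];
  }
  // The long leg changes the file by two.
  return [
    [
      { file: from.file + sx, rank: from.rank },
      { file: from.file + 2 * sx, rank: from.rank },
    ],
    [
      { file: from.file, rank: from.rank + sy },
      { file: from.file + sx, rank: from.rank + sy },
    ],
  ];
}

// 0 if either path is clear; otherwise the blockers on both paths divided by 2.
function knightHops(from: BoardSquare, to: BoardSquare, occupied: Set<string>): number {
  const [pathA, pathB] = knightPaths(from, to);
  const blockers = (path: BoardSquare[]): number =>
    path.filter((s) => occupied.has(squareKey(s))).length;
  const countA = blockers(pathA);
  const countB = blockers(pathB);
  if (countA === 0 || countB === 0) {
    return 0;
  }
  return (countA + countB) / 2;
}
```

For example, with Ng1-f3 from the starting position, one path crosses g2 (occupied) and g3 (empty) and the other crosses f1 and f2 (both occupied), so under these assumptions the move counts as (1 + 2) / 2 = 1.5 hops.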

#### Bishop

- **Diagonal moves count as 1**. An alternative would be diagonal moves count as 2: 1 horizontal + 1 vertical
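
To make the two counting options concrete: when a diagonal step counts as 1, a move's distance is the Chebyshev distance max(|dx|, |dy|); when it counts as 2, the distance is the Manhattan distance |dx| + |dy|. The sketch below uses hypothetical names (`Coord`, `chebyshevDistance`, `manhattanDistance`) and is not taken from the repo.

```typescript
// Hypothetical helpers for illustration only.
interface Coord {
  file: number; // 0 = a-file ... 7 = h-file
  rank: number; // 0 = rank 1 ... 7 = rank 8
}

// Diagonal step counts as 1 (the bolded option).
function chebyshevDistance(from: Coord, to: Coord): number {
  return Math.max(Math.abs(to.file - from.file), Math.abs(to.rank - from.rank));
}

// Diagonal step counts as 2: one horizontal plus one vertical (the alternative).
function manhattanDistance(from: Coord, to: Coord): number {
  return Math.abs(to.file - from.file) + Math.abs(to.rank - from.rank);
}

// Bishop c1 -> h6 is five diagonal steps.
const c1: Coord = { file: 2, rank: 0 };
const h6: Coord = { file: 7, rank: 5 };
console.log(chebyshevDistance(c1, h6)); // 5
console.log(manhattanDistance(c1, h6)); // 10
```

The same two formulas give 2 and 3 respectively for any knight move, which matches the two knight options listed earlier.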

#### Castling

- Castling counts as a move for a rook as well as the king
- The distance a rook covers during a castle move is also tracked
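
As a rough sketch of the castling rule: in standard chess the king moves two squares either way, while the rook moves two squares when castling kingside (h-file to f-file) and three when castling queenside (a-file to d-file). The tally shape and the piece labels (`king`, `h-rook`, `a-rook`) below are hypothetical and only illustrate counting the move and distance for both pieces.

```typescript
// Hypothetical names for illustration; not the repo's data model.
type CastleSide = 'kingside' | 'queenside';

interface PieceTally {
  moves: number;
  distance: number;
}

// Squares travelled by each piece during castling (standard chess).
function castlingDistances(side: CastleSide): { king: number; rook: number } {
  return side === 'kingside' ? { king: 2, rook: 2 } : { king: 2, rook: 3 };
}

function recordMove(tally: Map<string, PieceTally>, piece: string, distance: number): void {
  const entry = tally.get(piece) ?? { moves: 0, distance: 0 };
  entry.moves += 1;
  entry.distance += distance;
  tally.set(piece, entry);
}

// Castling counts as one move for the king AND one move for the rook.
function recordCastle(tally: Map<string, PieceTally>, side: CastleSide): void {
  const d = castlingDistances(side);
  recordMove(tally, 'king', d.king);
  recordMove(tally, side === 'kingside' ? 'h-rook' : 'a-rook', d.rook);
}
```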

### Promotions

- After an unambiguous piece is promoted:
  - It is treated as the new piece
  - **It is treated as the original piece (e.g. if the pawn that started the game on a2 is promoted to a queen, it is still treated as the pawn on a2 for the purposes of calculating distance functions, etc.)**
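
One way to picture the bolded promotion rule is a tracker that maps each occupied square to the identity of the unambiguous piece standing on it, and simply carries that identity along on every move, promotion included. `UapTracker` and the `Pa2` label are hypothetical; the repo may track pieces differently.

```typescript
// Hypothetical tracker for illustration; not the repo's implementation.
class UapTracker {
  private bySquare: Map<string, string>;

  // initial: pairs of [square, unambiguous piece id], e.g. ['a2', 'Pa2'].
  constructor(initial: [string, string][]) {
    this.bySquare = new Map<string, string>(initial);
  }

  // Moves the piece and returns its id. The promotion argument is
  // deliberately ignored: a promoted pawn keeps its original identity.
  move(from: string, to: string, _promotion?: string): string {
    const id = this.bySquare.get(from);
    if (id === undefined) {
      throw new Error(`no piece on ${from}`);
    }
    this.bySquare.delete(from);
    this.bySquare.set(to, id); // overwrites any captured piece on `to`
    return id;
  }

  pieceOn(square: string): string | undefined {
    return this.bySquare.get(square);
  }
}

// The pawn that started on a2 promotes on a8 and is still credited as 'Pa2'.
const tracker = new UapTracker([['a7', 'Pa2']]);
tracker.move('a7', 'a8', 'q');
console.log(tracker.pieceOn('a8')); // Pa2
```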

# Priority (& Silly) Questions to answer

Kills/Deaths/Assists

Thesis: beginners get forked by knights and lose a lot of high-value pieces. This will be answered by the K/D ratio broken down by piece value.

@@ -71,17 +78,20 @@ thesis: beginners get forked by knights and lose a lot of high level pieces. Wil
- heat map of which squares are "battleground", i.e. have the most captures

Distances/Moves

- average distance each piece has traveled
- furthest distance a piece has traveled in a single game
- average number of moves by piece
- the game with the furthest collective distance moved
- the game with the most moves played

Promotions

- how often a piece is promoted to different pieces (q, n, b, r)
- how often each unambiguous pawn promotes

Openings/Endings/Wins/Losses

- how often do games end with threefold repetition? Stalemate? Draws in general? Insufficient material? Loss on time? Lack of pawn advancement?
- top 5 most used/popular openings
- number of times various openings (e.g. bongcloud 😁) are played ♙
@@ -90,6 +100,7 @@ Openings/Endings/Wins/Losses
- which side wins more often for each of the top 5 most used/popular openings

Dataset facts

- number of games analyzed
- average rating of players
- quantity of games played by time control category (e.g. bullet/blitz/rapid/classical)
@@ -100,6 +111,7 @@ Dataset facts
- largest rating diff between players in games played

Miscellaneous

- most queens to appear in a game
- en passant count
- most pieces hopped over by a knight
@@ -123,6 +135,7 @@ Miscellaneous
- how many pieces on average does a Queen take before it gets taken down

# Planned Roadmap Items

- Segment by ELO rating ranges
- Segment all relevant questions by ambiguous pieces (e.g. pawn, bishop, knight, rook, queen, king) vs unambiguous pieces (e.g. pawn that started on a2)
- Segment by wins and losses
File renamed without changes.
4 changes: 2 additions & 2 deletions package-lock.json

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

4 changes: 2 additions & 2 deletions package.json
@@ -1,7 +1,7 @@
 {
   "name": "chessanalysis",
-  "version": "0.1.0",
-  "private": true,
+  "version": "0.1.2",
+  "main": "dist/src/index.js",
   "dependencies": {
     "async": "^3.2.5",
     "async-mutex": "^0.4.1",
124 changes: 5 additions & 119 deletions src/index.ts
@@ -1,119 +1,5 @@
-import { Chess } from '../cjsmin/src/chess';
-import { gameChunks } from './fileReader';
-import { CaptureLocationMetric } from './metrics/captures';
-import { convertToVisual } from './visuals/convertToVisual';
-import { KDRatioMetric, MateAndAssistMetric, KillStreakMetric } from './metrics/captures';
-import { MoveDistanceMetric } from './metrics/distances';
-import { MetadataMetric } from './metrics/misc';
-import {
-  GameWithMostMovesMetric,
-  PieceLevelMoveInfoMetric,
-  MiscMoveFactMetric,
-} from './metrics/moves';
-import { PromotionMetric } from './metrics/promotions';
-import * as fs from 'fs';
-import * as path from 'path';
-import * as lockfile from 'proper-lockfile';
-
-/**
- *
- * @param path
- * @returns
- */
-export async function main(path: string) {
-  console.time('Total Execution Time');
-  await gameIterator(path);
-  console.timeEnd('Total Execution Time');
-  return results;
-}
-
-let results = {
-  'Number of games analyzed': 0,
-}
-
-/**
- * Metric functions will ingest a single game at a time
- * @param metricFunctions
- */
-async function gameIterator(path) {
-  const cjsmin = new Chess();
-
-  const gamesGenerator = gameChunks(path);
-  const kdRatioMetric = new KDRatioMetric();
-  const killStreakMetric = new KillStreakMetric();
-  const mateAndAssistMetric = new MateAndAssistMetric();
-  const promotionMetric = new PromotionMetric();
-  const moveDistanceMetric = new MoveDistanceMetric();
-  const gameWithMostMovesMetric = new GameWithMostMovesMetric();
-  const pieceLevelMoveInfoMetric = new PieceLevelMoveInfoMetric();
-  const metadataMetric = new MetadataMetric(cjsmin);
-  const miscMoveFactMetric = new MiscMoveFactMetric();
-  const metrics = [
-    metadataMetric,
-    kdRatioMetric,
-    killStreakMetric,
-    mateAndAssistMetric,
-    promotionMetric,
-    moveDistanceMetric,
-    gameWithMostMovesMetric,
-    pieceLevelMoveInfoMetric,
-    miscMoveFactMetric,
-  ];
-
-  let gameCounter = 0;
-  for await (const { moves, metadata } of gamesGenerator) {
-    gameCounter++;
-    if (gameCounter % 400 == 0) {
-      console.log('number of games ingested: ', gameCounter);
-    }
-
-    for (const metric of metrics) {
-      // with array creation
-      const historyGenerator = cjsmin.historyGeneratorArr(moves);
-      metric.processGame(Array.from(historyGenerator), metadata);
-    }
-  }
-  results['Number of games analyzed'] = gameCounter;
-  let metricCallsCount = 0;
-  for (const metric of metrics) {
-    metricCallsCount++;
-    results[metric.constructor.name] = metric.aggregate()
-  }
-}
-
-// for use with running index.ts with test sets and print to console
-// if (require.main === module) {
-//   main(`data/11.11.23 3 Game Test Set`).then((a) => {});
-// }
-
-// for use with running index.ts with test sets & writing to results.json
-if (require.main === module) {
-  main(`data/11.11.23 3 Game Test Set`).then(async (results) => {
-    const now = new Date();
-    const milliseconds = now.getMilliseconds();
-
-    const analysisKey = `analysis_${now.toLocaleString().replace(/\/|,|:|\s/g, '_')}_${milliseconds}`;
-    const resultsPath = path.join(__dirname, 'results.json');
-
-    let existingResults = {};
-    if (fs.existsSync(resultsPath)) {
-      const fileContent = fs.readFileSync(resultsPath, 'utf8');
-      if (fileContent !== '') {
-        existingResults = JSON.parse(fileContent);
-      }
-    }
-
-    existingResults[analysisKey] = results;
-
-    // Use lockfile to prevent concurrent writes
-    const release = await lockfile.lock(resultsPath);
-    try {
-      fs.writeFileSync(resultsPath, JSON.stringify(existingResults, null, 2));
-    } finally {
-      release();
-    }
-
-
-    console.log(`Analysis ${analysisKey} written to ${resultsPath}.`)
-  });
-}
+export * as cjsmin from '../cjsmin';
+export * from './constants';
+export * as metrics from './metrics';
+export * from './types';
+export * from './utils';
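
The removed runner called `metric.processGame(history, metadata)` once per game and `metric.aggregate()` at the end, keying results by `metric.constructor.name`. A self-contained sketch of that pattern follows. The `Metric` interface, `AverageMovesMetric`, `runMetrics` and the parameter types are assumptions made for illustration; the real metric classes in `./metrics` may have different signatures.

```typescript
// Hypothetical contract inferred from how the removed runner used metrics.
interface Metric<R> {
  processGame(history: string[], metadata: string[]): void;
  aggregate(): R;
}

interface GameInput {
  moves: string[];
  metadata: string[];
}

// Toy metric: average number of moves (plies) per game.
class AverageMovesMetric implements Metric<{ games: number; averageMoves: number }> {
  private games = 0;
  private moves = 0;

  processGame(history: string[]): void {
    this.games += 1;
    this.moves += history.length;
  }

  aggregate(): { games: number; averageMoves: number } {
    const averageMoves = this.games === 0 ? 0 : this.moves / this.games;
    return { games: this.games, averageMoves };
  }
}

// Feeds every game to every metric, then collects one result per metric class.
function runMetrics(games: GameInput[], metrics: Metric<unknown>[]): Record<string, unknown> {
  const results: Record<string, unknown> = {
    'Number of games analyzed': games.length,
  };
  for (const game of games) {
    for (const metric of metrics) {
      metric.processGame(game.moves, game.metadata);
    }
  }
  for (const metric of metrics) {
    results[metric.constructor.name] = metric.aggregate();
  }
  return results;
}
```

Keying results by class name means two instances of the same metric class would overwrite each other, so this sketch assumes one instance per class, as in the removed code.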