refactor tournament wins/draws/losses, adjust displays #30

maconard · 2023-11-21T23:00:43Z

TLDR:

Change the definition of "winning" to exclude draws
Use K=8 for Elo calculations in tournaments, which should make ratings more stable and slower to change
Adjust the tournament rankings summary tables with new fields and spacing
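
For intuition on why a small K makes ratings move slowly, here is a minimal sketch of the standard two-player Elo update. It is not the tournament's actual code: how tilewe applies Elo to games with more than two players is not shown in this PR, and the function names `expected_score` and `elo_update` are hypothetical.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expectation for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a: float, rating_b: float, score_a: float, k: float = 8.0):
    """Return the new (rating_a, rating_b) after one game.

    score_a is 1.0 for a win, 0.5 for a draw, 0.0 for a loss.
    k=8 mirrors the value named in this PR; the pairwise form is an assumption.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Two equally rated players: a win moves each rating by k / 2.
print(elo_update(0.0, 0.0, 1.0))          # K=8
print(elo_update(0.0, 0.0, 1.0, k=32.0))  # a larger K for comparison
```

With equal ratings the expected score is 0.5, so a single win shifts each rating by K/2: 4 points at K=8 versus 16 points at K=32.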

More specifically on "winning": "wins" are now counted only for a player who wins outright, without tying another player. "Draws" are counted for players tied for the highest score, and "losses" for anyone without the highest score.
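
The rule above can be sketched as a small standalone function. This is only an illustration of the definition: `classify_game` and its string labels are hypothetical, and the real change updates per-engine counters inside tilewe/tournament.py.

```python
def classify_game(scores: list[int]) -> list[str]:
    """Label each player's result in one game as 'win', 'draw', or 'loss'."""
    best = max(scores)
    winners = [p for p, s in enumerate(scores) if s == best]

    # Everyone without the highest score loses.
    outcomes = ["loss"] * len(scores)
    if len(winners) == 1:
        # A single top scorer wins outright.
        outcomes[winners[0]] = "win"
    else:
        # Several players share the top score, so each of them draws.
        for p in winners:
            outcomes[p] = "draw"
    return outcomes


print(classify_game([76, 70, 67, 60]))  # one outright winner
print(classify_game([70, 70, 60, 50]))  # two players tied for first
```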

Old Summary

Rank Name                       Elo   Games      Score  Avg Score   Wins  Win Rate
   0 LargestPiece               179  100000    7632607      76.33  65003    65.00%
   1 WallCrawler                -46  100000    7034346      70.34  24815    24.82%
   2 Turtle                     -29  100000    6741864      67.42  14071    14.07%
   3 Random                    -104  100000    5999309      59.99   1533     1.53%

New Summary

Rank Name            Elo   Score  Avg Score   Games    Wins   Draws  Losses   Win %
   0 LargestPiece    125  763278      76.33   10000    6023     434    3543  60.23%
   1 WallCrawler      17  702660      70.27   10000    2140     325    7535  21.40%
   2 Turtle          -34  675260      67.53   10000    1219     207    8574  12.19%
   3 Random         -108  599416      59.94   10000     121      43    9836   1.21%

Rank Name            Elo    Score  Avg Score   Games    Wins   Draws  Losses   Win %
   0 LargestPiece    107  4583269      76.30   60069   36179    2801   21089  60.23%
   1 WallCrawler      21  4225533      70.34   60069   12590    2181   45298  20.96%
   2 Turtle          -39  4053151      67.47   60069    7302    1349   51418  12.16%
   3 Random          -88  3605234      60.02   60069     759     247   59063   1.26%

Summary changes:

Now shows Wins, Draws, and Losses
Groups all game count columns together instead of splitting them around the score columns
"Win %" now follows the new definition of wins, excluding draws
Name, score, and game/win/draw/loss count columns auto-fit to the width of the data so no space is wasted

Results structure changes:

win_counts: now uses the new definition of wins
draw_counts: new property for the number of draws
lose_counts: new computed property, game_counts - win_counts - draw_counts
win_rates: unchanged computed property, but it now uses the new definition of wins
draw_rates: new computed property for the rate of drawing
lose_rates: new computed property for the rate of losing
win_draw_rates: new computed property for the old definition of winning, (win_counts + draw_counts) / game_counts
elo_delta: new computed property for the change in Elo over the tournament
Removed all the redundant getter functions; access the properties and index them directly, since that is all the functions were doing
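
As a rough sketch of how these properties relate to each other, assuming per-engine lists indexed by engine number: the class name `ResultsSketch` and the zero-games guard are assumptions, not the actual tilewe results class. Only the property names and formulas come from the list above.

```python
from dataclasses import dataclass


@dataclass
class ResultsSketch:
    """Hypothetical stand-in for the tournament results data class."""
    game_counts: list[int]
    win_counts: list[int]   # outright wins only
    draw_counts: list[int]  # ties for the highest score
    elo_start: list[float]
    elo_end: list[float]

    @staticmethod
    def _rate(count: int, games: int) -> float:
        # Guarding against zero games is an assumption of this sketch.
        return count / games if games else 0.0

    @property
    def lose_counts(self) -> list[int]:
        return [g - w - d for g, w, d in
                zip(self.game_counts, self.win_counts, self.draw_counts)]

    @property
    def win_rates(self) -> list[float]:
        return [self._rate(w, g) for w, g in zip(self.win_counts, self.game_counts)]

    @property
    def draw_rates(self) -> list[float]:
        return [self._rate(d, g) for d, g in zip(self.draw_counts, self.game_counts)]

    @property
    def lose_rates(self) -> list[float]:
        return [self._rate(l, g) for l, g in zip(self.lose_counts, self.game_counts)]

    @property
    def win_draw_rates(self) -> list[float]:
        # The old definition of winning: outright wins plus draws.
        return [self._rate(w + d, g) for w, d, g in
                zip(self.win_counts, self.draw_counts, self.game_counts)]

    @property
    def elo_delta(self) -> list[float]:
        return [end - start for start, end in zip(self.elo_start, self.elo_end)]
```

Using the LargestPiece row from the 10000-game summary (6023 wins, 434 draws), `lose_counts` works out to 3543, matching the Losses column, and callers read `results.lose_counts[engine]` rather than calling a getter.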

maconard · 2023-11-21T23:04:46Z

tilewe/tournament.py

+        N = self.total_engines
+        len_names = max(5, min(24, max([len(x) for x in self.engine_names]) + 1))
+        len_score = max(6, max([math.floor(math.log10(max(1, self.total_scores[i])) + 1) for i in range(N)]) + 1)
+        len_games = max(7, max([math.floor(math.log10(max(1, self.game_counts[i])) + 1) for i in range(N)]) + 1)


In each width expression, the number in the outer max(...) is effectively the minimum column width. If there is also a min(...), its number is the maximum column width; otherwise the width has no upper bound. The inner max(...) takes the longest value in the data, usually with + 1 for one extra character of padding.
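
The same clamping pattern, pulled out into helpers as a minimal sketch. `digit_count` and `column_width` are hypothetical names that do not appear in the diff; the sample data is taken from the summary tables in the PR description.

```python
import math


def digit_count(n: int) -> int:
    """Number of decimal digits in a non-negative integer (0 counts as 1 digit)."""
    return math.floor(math.log10(max(1, n))) + 1


def column_width(data_len: int, min_width: int, max_width=None, padding: int = 1) -> int:
    """Clamp data_len + padding between min_width and an optional max_width."""
    width = data_len + padding
    if max_width is not None:
        width = min(max_width, width)
    return max(min_width, width)


names = ["LargestPiece", "WallCrawler", "Turtle", "Random"]
total_scores = [4583269, 4225533, 4053151, 3605234]
game_counts = [60069, 60069, 60069, 60069]

len_names = column_width(max(len(x) for x in names), 5, 24)
len_score = column_width(max(digit_count(s) for s in total_scores), 6)
len_games = column_width(max(digit_count(g) for g in game_counts), 7)
print(len_names, len_score, len_games)
```

Here the names column fits the longest name plus padding, the score column grows past its minimum of 6 because the totals have 7 digits, and the games column stays at its minimum of 7.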

maconard · 2023-11-21T23:05:05Z

tilewe/tournament.py

+                            wins[winners[0]] += 1
+                        else:
+                            for p in winners:
+                                draws[p] += 1 


Distinguish between truly winning and drawing

maconard · 2023-11-21T23:05:56Z

tilewe/tournament.py

-        return self.elo_end[engine]
-
-    def get_delta_elo_by_engine(self, engine: int) -> int:
-        return self.elo_end[engine] - self.elo_start[engine]


These all have properties now, so just use results.property[engine] instead of results.function(engine)

maconard · 2023-11-21T23:07:54Z

tilewe/tournament.py

-            out += f"{'Avg Score':>10} {'Wins':>6} {'Win Rate':>9}\n"
-            ranked_engines = sorted(range(N), key=lambda x: -elos[x])
-            for rank, engine in enumerate(ranked_engines):
+            len_name = max(5, min(24, max([len(x.name) for x in self.engines]) + 1))


I'd like to not have this function duplicated, but it uses so much data that passing everything it needs into a static version would be really annoying. So for now I'm happiest with it duplicated: one copy here in the running tournament instance and one in the results data class.

refactor tournament wins/draws/losses, adjust displays

43568dc

maconard added the enhancement New feature or request label Nov 21, 2023

maconard requested a review from nhamil November 21, 2023 23:00

maconard self-assigned this Nov 21, 2023

nhamil approved these changes Nov 22, 2023

nhamil merged commit e50d150 into master Nov 22, 2023
1 check passed

nhamil deleted the conard/pure-wins-data branch November 22, 2023 05:40
