Feature Request - Modelling #9
Comments
Thanks for reaching out @gowerc! First, I LOVE your site. Thank you so much for your work on it. I would love to incorporate some of your ideas into my site. I agree that your logistic regression model is a much better model than my naive win rates. I really appreciate your methods section. I'm going to look to incorporate your model into my win rate calculations, if that's ok with you? Are there any other confounding variables that you think would be appropriate to incorporate? I'm also curious if you think the "Averaged Win Rates" as you put it on your website would be more appropriate? I guess it's a bit up to interpretation. My gut says the logistic regression model suffices. I would also love to add some of those graphs that include all of the civs on one plot. I've been planning something similar for a bit so I will definitely use your site as inspiration for some more graphs. Want me to give you credit somewhere on the site? Want a social link posted in the FAQ or footer or the like?
Thinking about it a bit more, I could do this same thing and add in map as a parameter to get more accurate map win rate results. 🤔 That seems pretty cool.
Also I'm assuming once you fit your model you asked it to predict the civ's win rate given an Elo difference of 0 to get the "overall" win rate? Or am I misunderstanding?
If you need help with any of the methods do feel free to ask!
I don't own logistic regression 😆 so definitely fine with me 😄
(apologies in advance for the wall of text here) A simple list of things that would realistically affect the outcome of a match
But yer it's basically impossible to model the above because there just isn't enough data for each player (plus there are way too many civs). Below are some ideas that I had to try and capture the above (albeit far from perfect):
Just to be clear, both win rate types that I show were created by regression models; the difference is basically how you weight them. In terms of which one is better there is no correct answer, they just show different things. The "Averaged Win Rate" basically shows your civ's expected win rate assuming your opponent is selecting "Random", whilst the normal win rate shows your civ's win rate assuming your opponent is selecting civs based upon the observed pick rates (e.g. they are more likely to be Franks 😄 )
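To make the weighting difference concrete, here is a minimal Python sketch. It assumes the fitted model can give a predicted win probability against each opposing civ. The civ names, probabilities and pick rates are made-up illustration values, and `weighted_win_rate` is a hypothetical helper, not code from either site.

```python
def weighted_win_rate(matchup_win_probs, opponent_weights):
    """Weighted mean of one civ's predicted win probability over opposing civs."""
    total = sum(opponent_weights.values())
    return sum(
        matchup_win_probs[opp] * weight for opp, weight in opponent_weights.items()
    ) / total


# Hypothetical model predictions for one civ at an Elo difference of 0.
win_prob_vs = {"Franks": 0.45, "Britons": 0.52, "Goths": 0.58}

# Hypothetical observed pick rates of the opposing civs.
pick_rates = {"Franks": 0.6, "Britons": 0.3, "Goths": 0.1}

# "Random" opponent: every opposing civ is equally likely.
uniform = {opp: 1.0 for opp in win_prob_vs}

averaged_win_rate = weighted_win_rate(win_prob_vs, uniform)
normal_win_rate = weighted_win_rate(win_prob_vs, pick_rates)

print(f"averaged: {averaged_win_rate:.3f}")
print(f"normal:   {normal_win_rate:.3f}")
```

With these made-up numbers the civ looks better under uniform weighting than under pick-rate weighting, because its worst matchup is also the most frequently picked opponent.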
That would be awesome if you could!! One of the original inspirations for creating my site was I wanted a more visual representation of the data. Your site was amazing for the raw information but (at least personally) I always found plots better for quick visual comparisons.
Oh, no need for this at all. I mean, if you want to, feel free, but I don't need / want any credit. I am just happy to see you are back developing, as the community really benefits from a resource like yours 😄
Thing is, you would have to add it as a civ * map interaction term. Which is perfectly doable, but the problem I found is the model becomes very hard to fit computationally with so many parameters. I was running it on a 32GB RAM machine and still running out of memory; I had to end up reducing the model and downsampling the data in a few cases. I started looking into more memory-efficient implementations but didn't really get anywhere 😢
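A rough way to see why the interaction gets expensive is to count design-matrix columns. The sketch below assumes treatment (dummy) coding and a dense 8-byte-per-value matrix. The civ, map and row counts are hypothetical placeholders, and the software actually used for the fit may store things differently.

```python
def design_columns(n_civs, n_maps, interaction):
    """Columns for won ~ elo_diff + civ + map (+ civ:map) with dummy coding."""
    # intercept + elo_diff + civ main effects + map main effects
    cols = 1 + 1 + (n_civs - 1) + (n_maps - 1)
    if interaction:
        # one extra column per non-reference civ/map combination
        cols += (n_civs - 1) * (n_maps - 1)
    return cols


def dense_gib(n_rows, n_cols, bytes_per_value=8):
    """Memory needed to hold a dense design matrix, in GiB."""
    return n_rows * n_cols * bytes_per_value / 2**30


# Hypothetical sizes, purely for illustration.
n_civs, n_maps, n_rows = 40, 20, 2_000_000

for interaction in (False, True):
    cols = design_columns(n_civs, n_maps, interaction)
    print(f"interaction={interaction}: {cols} columns, "
          f"{dense_gib(n_rows, cols):.1f} GiB dense")
```

The column count grows with the product of civs and maps, so a sparse design matrix or a fit on aggregated counts is the usual way to keep memory in check.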
Ya pretty much this. Looking back over my code, I essentially structured the data as 1 row per player per match with the columns civ, diff in elo, won e.g.
For team games I just used the difference in mean team Elo (though yer, note my above bullet point about how this could be better modified). The model is then
Couple of additional points that came to mind:
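The data layout and model described above (one row per player per match with civ, Elo difference and outcome) can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the rows are synthetic, the civ effects are invented, the model uses one intercept per civ plus a shared Elo slope, and it is fitted by simple gradient ascent. It is not necessarily the formula or software used on ageofstatistics.com.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Invented "true" values used only to generate synthetic rows.
TRUE_EFFECT = {"Franks": 0.3, "Britons": 0.0, "Goths": -0.3}  # logit scale
TRUE_SLOPE = math.log(10) / 400  # logit change per Elo point (standard Elo curve)

# One row per player per match: (civ, elo_diff, won).
rows = []
for civ, effect in TRUE_EFFECT.items():
    for elo_diff in range(-100, 101, 10):
        p = sigmoid(effect + TRUE_SLOPE * elo_diff)
        wins = round(50 * p)  # 50 synthetic matches per (civ, elo_diff) cell
        rows += [(civ, elo_diff, 1)] * wins + [(civ, elo_diff, 0)] * (50 - wins)


def fit_logistic(rows, civs, n_iter=300, lr=2.0):
    """Fit won ~ civ + elo_diff by gradient ascent on the mean log-likelihood."""
    civ_coef = {c: 0.0 for c in civs}
    slope = 0.0  # per 100 Elo points, to keep the features on a similar scale
    n = len(rows)
    for _ in range(n_iter):
        grad_civ = {c: 0.0 for c in civs}
        grad_slope = 0.0
        for civ, elo_diff, won in rows:
            x = elo_diff / 100.0
            err = won - sigmoid(civ_coef[civ] + slope * x)
            grad_civ[civ] += err
            grad_slope += err * x
        for c in civs:
            civ_coef[c] += lr * grad_civ[c] / n
        slope += lr * grad_slope / n
    return civ_coef, slope / 100.0  # slope returned per Elo point


civ_coef, slope_per_elo = fit_logistic(rows, list(TRUE_EFFECT))

# "Overall" win rate per civ: the prediction at an Elo difference of 0.
overall = {civ: sigmoid(coef) for civ, coef in civ_coef.items()}
for civ, rate in overall.items():
    print(f"{civ}: {rate:.3f}")
print(f"slope per Elo point: {slope_per_elo:.5f}")
```

In practice a library routine (for example a GLM with a binomial family) would replace the hand-rolled loop; the point here is only the row layout and the "predict at Elo difference 0" step.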
So I've been playing around with this model and while I think my code is correct, I don't see a large difference in the predicted win rates versus the mean win rate. But perhaps I'm doing something incorrectly and not setting up my model right? Here's the output, where
The very naive win rates right now respectively are:
Hard to truly say without access to the code & data though nothing you've shown above looks obviously wrong. Must admit I am a bit surprised. Will double check what I was seeing with my historic cuts of the data. Some general thoughts:
Doing some quick sanity checks, the theoretical win % of someone with a 25 Elo advantage is 53.59% and according to your model it's coming out as 54.92%, which is very much in the same ballpark, so I would be very surprised if there was a mistake in your code.
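For anyone wanting to reproduce that sanity check: assuming the standard Elo expected-score formula (base 10, scale 400), a 25-point advantage gives the 53.59% quoted above. The sketch also converts the model's 54.92% into an implied per-point logit slope for comparison, under the additional assumption that the civ term contributes nothing at that prediction. The function names are illustrative.

```python
import math


def elo_expected_score(elo_diff):
    """Expected win probability under the standard Elo formula."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def implied_slope(win_prob, elo_diff):
    """Logit change per Elo point implied by a win probability at elo_diff."""
    return math.log(win_prob / (1.0 - win_prob)) / elo_diff


theoretical = elo_expected_score(25)
print(f"theoretical win rate at +25 Elo: {theoretical:.4f}")  # → 0.5359

# Standard Elo corresponds to a logistic curve with slope ln(10) / 400.
print(f"theoretical slope: {math.log(10) / 400:.5f}")
print(f"slope implied by 54.92% at +25 Elo: {implied_slope(0.5492, 25):.5f}")
```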
Heya,
I'm the author/developer of ageofstatistics.com. I no longer have the time to maintain the site (especially after the changes / lack of stability with aoe2.net). I was wondering if you would be open to porting across some of the features to your site now that you are back developing again :) ?
Happy to discuss more, but I think the main one I would want to stress is the use of logistic regression modelling in order to account for Elo + other covariates.
An unfortunate fact about win rates is that if you don't include explanatory variables (i.e. just calculate naive win rates) they are biased towards 50%. Given that Elo is such an influential part of matches, this means most of the win rates you present will be understated, i.e. pulled closer to 50% than they should be (albeit the relative ordering / ranks should be preserved).
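A small numeric illustration of that pull towards 50%, as a sketch rather than a description of the real data: it assumes a logistic model with the standard Elo slope, an invented civ advantage on the logit scale, and two made-up symmetric spreads of Elo differences.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


ELO_SLOPE = math.log(10) / 400  # logit change per Elo point (standard Elo curve)
CIV_EFFECT = 0.4                # invented civ advantage on the logit scale


def naive_win_rate(elo_diffs):
    """Average win rate over matches, ignoring Elo (what a naive tally measures)."""
    return sum(sigmoid(CIV_EFFECT + ELO_SLOPE * d) for d in elo_diffs) / len(elo_diffs)


# Win rate the model would report at an Elo difference of 0.
conditional = sigmoid(CIV_EFFECT)

# Hypothetical symmetric spreads of Elo differences across matches.
wide = naive_win_rate([-300, -150, 0, 150, 300])
narrow = naive_win_rate([-50, -25, 0, 25, 50])

print(f"at Elo diff 0:       {conditional:.4f}")
print(f"naive, wide spread:  {wide:.4f}")
print(f"naive, narrow spread:{narrow:.4f}")
```

The wider the spread of Elo differences, the further the naive figure drifts towards 50%. When matchmaking keeps the differences small, the naive and modelled win rates end up close to each other.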
But yer, if this is of interest to you (plus any other features from my site that you would like to incorporate) I'd be more than happy to chat. If not, then no hard feelings, please feel free to ignore and close this issue :)