The penaltyblog Python package collects code from pena.lt/y/blog for working with football (soccer) data.
penaltyblog includes functions for:
- Scraping football data from sources such as football-data.co.uk, FBRef, ESPN, Club Elo, Understat, SoFifa and Fantasy Premier League
- Modelling of football matches using Poisson-based models, such as Dixon and Coles, and Bayesian models
- Predicting probabilities for many betting markets, e.g. Asian handicaps, over/under and total goals
- Modelling football teams' abilities using Massey ratings, Colley ratings and Elo ratings
- Estimating the implied odds from bookmakers' odds by removing the overround using several methods
- Mathematically optimising your fantasy football team
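To make the link between a Poisson-based goal model and betting-market probabilities concrete, here is a minimal sketch that uses only the Python standard library. It does not use penaltyblog's API: the function names `poisson_pmf` and `market_probabilities` and the goal rates are hypothetical, and the two teams' goals are assumed to be independent, which is simpler than the Dixon and Coles model.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k goals when the expected number of goals is lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def market_probabilities(home_rate: float, away_rate: float, max_goals: int = 15) -> dict:
    """Sum an independent-Poisson scoreline grid into market probabilities.

    The grid is truncated at max_goals per team, so the totals fall very
    slightly short of 1 for realistic goal rates.
    """
    probs = {
        "home_win": 0.0,
        "draw": 0.0,
        "away_win": 0.0,
        "over_2_5": 0.0,
        "under_2_5": 0.0,
    }
    for home_goals in range(max_goals + 1):
        for away_goals in range(max_goals + 1):
            # Probability of this exact scoreline under independence
            p = poisson_pmf(home_goals, home_rate) * poisson_pmf(away_goals, away_rate)

            if home_goals > away_goals:
                probs["home_win"] += p
            elif home_goals == away_goals:
                probs["draw"] += p
            else:
                probs["away_win"] += p

            if home_goals + away_goals > 2:
                probs["over_2_5"] += p
            else:
                probs["under_2_5"] += p
    return probs


# Hypothetical expected goals: 1.5 for the home team, 1.1 for the away team
example = market_probabilities(1.5, 1.1)
for market, probability in example.items():
    print(f"{market}: {probability:.3f}")
```

In a fitted model the two goal rates would come from estimated attack, defence and home-advantage parameters rather than being typed in by hand.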
To install penaltyblog, run:
pip install penaltyblog
To learn how to use penaltyblog, you can read the documentation and look at the examples for:
- Scraping football data
- Predicting football matches and betting markets
- Estimating the implied odds from bookmakers' odds
- Calculating Massey, Colley and Elo ratings
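As a rough illustration of what removing the overround means, the sketch below applies simple multiplicative normalisation to a set of decimal odds. It is written in plain Python and is not penaltyblog's API: the function name `remove_overround_multiplicative` and the odds are hypothetical, and this is only one of several possible methods.

```python
def remove_overround_multiplicative(decimal_odds):
    """Return (implied probabilities, overround) for a list of decimal odds.

    Each raw probability is 1 / odds. Bookmakers build in a margin, so the
    raw probabilities sum to more than 1. Dividing each by that sum rescales
    them to a proper probability distribution.
    """
    raw = [1.0 / odds for odds in decimal_odds]
    book_sum = sum(raw)
    probabilities = [p / book_sum for p in raw]
    overround = book_sum - 1.0
    return probabilities, overround


# Hypothetical home / draw / away odds for a single match
odds = [1.9, 3.5, 4.2]
probabilities, overround = remove_overround_multiplicative(odds)

print(f"overround: {overround:.4f}")
for outcome, p in zip(["home", "draw", "away"], probabilities):
    print(f"{outcome}: {p:.4f}")
```

Multiplicative normalisation spreads the margin in proportion to each outcome's raw probability. Other approaches distribute it differently, which matters most for long-odds outcomes.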
References:
- Mark J. Dixon and Stuart G. Coles (1997) Modelling Association Football Scores and Inefficiencies in the Football Betting Market
- Håvard Rue and Øyvind Salvesen (1999) Prediction and Retrospective Analysis of Soccer Matches in a League
- Anthony C. Constantinou and Norman E. Fenton (2012) Solving the problem of inadequate scoring rules for assessing probabilistic football forecast models
- Hyun Song Shin (1992) Prices of State Contingent Claims with Insider Traders, and the Favourite-Longshot Bias
- Hyun Song Shin (1993) Measuring the Incidence of Insider Trading in a Market for State-Contingent Claims
- Joseph Buchdahl (2015) The Wisdom of the Crowd
- Gianluca Baio and Marta A. Blangiardo (2010) Bayesian Hierarchical Model for the Prediction of Football Results