This repo investigates which predictors are relevant for return forecasting.
The raw data can be downloaded here. It has already been merged with the stock price values using the R code proposed by the Open Source Asset Pricing team.
The cleaned data can also be downloaded directly here.
We used MissForest to impute the missing values. Please note that, for quality and computational
reasons, we kept only the most relevant attributes and records. The cleaned data is therefore considerably smaller:
1.7M records and 131 attributes, versus 6M records and ~200 attributes in the original data.
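To give an idea of what random-forest imputation looks like in practice, here is a minimal sketch. It is not the code used in this repo: it swaps in scikit-learn's `IterativeImputer` with a `RandomForestRegressor` as a MissForest-style stand-in, handles numeric columns only, and runs on toy data. The function name `impute_missforest_style` and all parameter values are assumptions made for illustration.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (required to expose IterativeImputer)
from sklearn.impute import IterativeImputer
from sklearn.ensemble import RandomForestRegressor


def impute_missforest_style(X, n_trees=20, max_iter=5, seed=0):
    """Fill NaNs by iteratively predicting each column from the others
    with a random forest (hypothetical helper, numeric data only)."""
    imputer = IterativeImputer(
        estimator=RandomForestRegressor(n_estimators=n_trees, random_state=seed),
        max_iter=max_iter,
        random_state=seed,
    )
    return imputer.fit_transform(X)


# Toy data standing in for the predictor matrix (not the repo's data).
rng = np.random.default_rng(0)
X_full = rng.normal(size=(200, 4))
X_full[:, 3] = X_full[:, 0] + 0.1 * rng.normal(size=200)  # a correlated column

X_missing = X_full.copy()
mask = rng.random(X_missing.shape) < 0.1  # hide roughly 10% of the entries
X_missing[mask] = np.nan

X_imputed = impute_missforest_style(X_missing)
```

Observed entries are left untouched; only the hidden ones are replaced by model predictions. On the full dataset, the number of trees and iterations would need tuning, since each iteration fits one forest per column with missing values.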
See main.ipynb for the different ML algorithms and their results.
Our final_report.pdf describes the whole project.
Authors: Mina Attia, Arnaud Felber, Milos Novakovic, Rami Atassi & Paulo Ribeiro