A machine learning approach to predicting price swings in the electricity grid of the Netherlands.
This project was made by:
- Nordin el Assassi
- Sakr Ismail
- Simen Veenman
- Steven Dong
- Tycho van Willigen
To run this project, you will need:
- Conda
- This GitHub repository.
- Data provided by Eneco.
Follow these instructions if you wish to run any file from the repository, except `eneco_deliverable.ipynb`.
`eneco_deliverable.ipynb` is a standalone file that also works outside of this repository,
so you may skip the following instructions for it.
However, `eneco_deliverable.ipynb` does require the pickled model and the data to be in the same folder as the notebook:
move `models/main_model/hgbr.pkl` to `notebooks/` and place the data in `notebooks/` as well.
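For reference, a minimal sketch of what that setup looks like once both files sit next to the notebook. The data filename and its columns are assumptions here; the notebook itself contains the real loading and preprocessing code.

```python
# Minimal sketch, assuming hgbr.pkl and a data file with the expected feature
# columns have been placed next to the notebook. "data.csv" and its columns are
# hypothetical; the notebook contains the actual loading and preprocessing code.
import pickle

import pandas as pd

with open("hgbr.pkl", "rb") as f:  # moved here from models/main_model/hgbr.pkl
    model = pickle.load(f)

features = pd.read_csv("data.csv")     # hypothetical data file in notebooks/
predictions = model.predict(features)  # only valid if columns match training
print(predictions[:5])
```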
1. Place the raw data into the folder `data/raw`.
2. Open a terminal in the same folder as this file.
3. Run the make command `make create_environment` (this might raise an error). This will create a new conda environment.
   - NOTE: this might take a very long time due to the number of Python packages that need to be installed. If the process is interrupted, run the make command `make requirements`.
   - If `make create_environment` raised an error, run `pip install python-dotenv` after step 4.
4. Run the conda command `conda activate tweedejaars_project`.
5. Run the pip command `pip install -e .`. This installs the package in an editable state.
6. Run the make command `make data`. This will preprocess the data and generate new features (a sketch of inspecting the output follows these steps).
7. Navigate to the desired file using the Project Organization below.
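A minimal sketch of checking what `make data` produced, assuming the pipeline writes CSV files to `data/processed/`; the exact filenames and format come from the pipeline, so inspect the folder first.

```python
# Minimal sketch, assuming `make data` wrote CSV files to data/processed/.
# The filenames and format are assumptions; check the folder after running the step.
from pathlib import Path

import pandas as pd

processed_dir = Path("data/processed")
csv_files = sorted(processed_dir.glob("*.csv"))  # whatever the pipeline produced
print([p.name for p in csv_files])

df = pd.read_csv(csv_files[0])  # peek at the first processed file
print(df.shape)
print(df.head())
```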
├── Makefile <- Makefile with convenience commands like `make data`.
│
├── README.md <- The top-level README for developers using this project.
│
├── data
│ │
│ ├── interim <- Intermediate data that has been transformed.
│ │
│ ├── processed <- The final, canonical data sets for modeling.
│ │
│ └── raw <- The original, immutable data dump.
│
├── models <- Folder containing trained and serialized models.
│
├── notebooks <- Jupyter notebooks.
│ │
│ ├── legacy <- Folder containing old notebooks.
│ │ Some may no longer run without changing the code manually.
│ │
│ ├── autoencoder.ipynb <- Model for reconstructing data and detecting anomalies.
│ │
│ ├── autoregressive_rnn.ipynb <- Autoregressive RNN model that uses the entire history.
│ │
│ ├── custom_loss.ipynb <- Simple FNN using a custom loss function.
│ │
│ ├── eneco_deliverable.ipynb <- The final deliverable, in the format requested by Eneco.
│ │ It was made by combining all of the source code
│ │ and `main_model.ipynb` into a single large file.
│ │ Using it is not recommended; use `main_model.ipynb` instead.
│ │
│ ├── eneco_model.ipynb <- A recreation of the model used by Eneco.
│ │
│ ├── main_model.ipynb <- The best model, scikit-learn's `HistGradientBoostingRegressor` (a minimal training sketch follows this overview).
│ │
│ ├── markovian_rnn.ipynb <- Markovian RNN model that uses a limited history.
│ │
│ └── price_prediction.ipynb <- Model for predicting `settlement_price_realized`.
│
├── pyproject.toml <- Project configuration file with package metadata for tweedejaars_project
│ and configuration for tools like black.
│
├── environment.yml <- The requirements file for reproducing the analysis environment, e.g.
│ to recreate the conda environment `conda env create -f environment.yml`.
│
├── setup.cfg <- Configuration file for flake8.
│
├── make_dataset.py <- Simple wrapper script to process the data and generate features.
│
└── tweedejaars_project <- Source code for use in this project.
│
├── __init__.py <- Makes tweedejaars_project a Python module.
│
├── config.py <- Configuration of the Python module.
│
├── data <- Scripts to manage data.
│ │
│ ├── dataset.py <- Cleans and processes the data.
│ │
│ ├── features.py <- Creates new features.
│ │
│ ├── fileloader.py <- Scripts for loading and saving.
│ │
│ └── split.py <- Splits the data into training, validation and testing.
│
├── evaluation <- Scripts to evaluate models and features.
│ │
│ ├── adjustment.py <- Script for adjusting predictions.
│ │
│ ├── evaluate_model.py <- High performance code for testing models and features.
│ │
│ └── metrics.py <- Simple metrics and custom metrics.
│
├── utility <- Utility scripts.
│ │
│ └── misc.py <- Script containing miscellaneous functions.
│
└── visualization <- Scripts to visualize the data and metrics.
│
├── analyse.py <- Script containing simple plotting functions.
│
├── plot_metrics.py <- Plot the results of the metrics.
│
└── visualize.py <- Utility visualization functions.
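As referenced at `main_model.ipynb` above, a minimal training sketch with scikit-learn's `HistGradientBoostingRegressor`. The processed filename, feature columns, and target column below are assumptions; the actual training and evaluation code lives in the notebook and in `tweedejaars_project/evaluation`.

```python
# Minimal sketch, not the project's actual training code: fit scikit-learn's
# HistGradientBoostingRegressor on a hypothetical processed dataset.
# The file name and the "target" column are assumptions.
import pandas as pd
from sklearn.ensemble import HistGradientBoostingRegressor
from sklearn.model_selection import train_test_split

df = pd.read_csv("data/processed/features.csv")  # hypothetical processed file
X = df.drop(columns=["target"])
y = df["target"]

# shuffle=False keeps the chronological order, which matters for time-series data.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, shuffle=False)

model = HistGradientBoostingRegressor()
model.fit(X_train, y_train)
print("R^2 on the held-out tail:", model.score(X_test, y_test))
```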