The snakemake configuration for GLUCOSE.
Install snakemake using conda into a new environment called snakemake:
conda install -c conda-forge mamba
mamba create -c bioconda -c conda-forge -n snakemake snakemake-minimal pandas
Then, activate the environment using conda activate snakemake on Mac and Linux, or activate snakemake on Windows.
Now install the other dependencies using pip:
pip install -r requirements.txt
Update the paths in the config/config.yaml file for the datapackage: and model_file: keys. These should both hold relative paths to the combined GUI model datapackage.json and the OSeMOSYS model file (e.g. osemosys_fast.txt).
Each of these paths can then point to the repositories outside of the gui_workflow. So deployment to an HPC will involve:
- Copying a release package of OSeMOSYS to the server and unzipping
- Cloning the gui_osemosys repository to the server
- Cloning this repository to the server
- Updating the config so that the paths point to the correct locations
A YAML file config.yaml must be placed in the config directory.
# Populate the scenarios.csv file with a list of scenario names
# and path (description optional) to the model datapackage
datapackage: config/scenarios.csv
# Tell the workflow which model results to plot
result_params: config/results.csv
agg_results: config/agg_results.csv
# Filetype options: 'csv' or 'parquet' or 'feather'
filetype: csv
# Define the uncertain parameters used to define the Monte Carlo sample
parameters: config/parameters.csv
# Path to the OSeMOSYS model file
model_file: ../osemosys/OSeMOSYS_GNU_MathProg/src/osemosys_fast.txt
# Choose a solver, choices: 'cbc' or 'gurobi'
solver: cbc
# Sampling - how large should the sample be?
replicates: 100
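As a quick sanity check before launching a long run, the options listed in the comments above can be verified programmatically. The following is a minimal sketch, not part of the workflow: `validate_config` is a hypothetical helper, and it assumes the YAML file has already been parsed into a plain dictionary.

```python
# Hypothetical helper (not part of the workflow): checks a parsed config dict
# against the options documented in the config.yaml comments.
ALLOWED = {
    "filetype": {"csv", "parquet", "feather"},
    "solver": {"cbc", "gurobi"},
}
REQUIRED_KEYS = (
    "datapackage", "result_params", "agg_results", "filetype",
    "parameters", "model_file", "solver", "replicates",
)


def validate_config(config):
    """Return a list of problems; an empty list means the config looks fine."""
    problems = []
    for key in REQUIRED_KEYS:
        if key not in config:
            problems.append(f"missing key: {key}")
    for key, allowed in ALLOWED.items():
        if key in config and config[key] not in allowed:
            problems.append(f"{key} must be one of {sorted(allowed)}, got {config[key]!r}")
    replicates = config.get("replicates")
    if replicates is not None:
        is_int = isinstance(replicates, int) and not isinstance(replicates, bool)
        if not is_int or replicates < 1:
            problems.append("replicates must be a positive integer")
    return problems


example = {
    "datapackage": "config/scenarios.csv",
    "result_params": "config/results.csv",
    "agg_results": "config/agg_results.csv",
    "filetype": "csv",
    "parameters": "config/parameters.csv",
    "model_file": "../osemosys/OSeMOSYS_GNU_MathProg/src/osemosys_fast.txt",
    "solver": "cbc",
    "replicates": 100,
}
print(validate_config(example))                         # no problems expected
print(validate_config(dict(example, solver="glpk")))    # one problem: unsupported solver
```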
A CSV file with the following structure should be placed in the config directory and referenced by the parameters: key (config/parameters.csv in the example above):
name,group,indexes,min_value,max_value,dist,interpolation_index,action
CapitalCost,capex,"SIMPLICITY,NGCC",500,1100,unif,YEAR,interpolate
DiscountRate,discountrate,"GLOBAL,NGCC",0.05,0.20,unif,None,fixed
column_name | description |
---|---|
name | the name of the OSeMOSYS parameter file into which the values should be written |
group | the group to which the parameter belongs (groups of like names are moved together) |
indexes | a string of comma-separated entries matching the set elements for the parameter |
min_value | the lower bound of the range from which the parameter is sampled |
max_value | the upper bound of the range from which the parameter is sampled |
dist | the probability distribution - currently, only 'unif' for uniform is supported |
interpolation_index | the index name over which the values will be interpolated |
action | 'interpolate' will interpolate with a straight-line between the start and end years, where the sample value replaces the end year value; 'fixed' will replace all values in the interpolation index with the sampled value |
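To make the two actions concrete, here is a minimal sketch of how a sampled value could be applied to a yearly series, following the descriptions in the table. It is an illustration under stated assumptions, not the workflow's own code: `sample_uniform` and `apply_action` are hypothetical names, the series is assumed to be a mapping from year to value, and the base values are made up.

```python
import random


def sample_uniform(min_value, max_value, rng):
    """Draw one value for dist='unif' between min_value and max_value."""
    return rng.uniform(min_value, max_value)


def apply_action(values, sampled, action):
    """Apply a sampled value to a {year: value} series.

    'fixed' replaces every year with the sampled value.
    'interpolate' keeps the first year's value, sets the last year to the
    sampled value, and fills the years in between on a straight line.
    """
    years = sorted(values)
    if action == "fixed":
        return {year: sampled for year in years}
    if action == "interpolate":
        start, end = years[0], years[-1]
        if start == end:
            return {start: sampled}
        start_value = values[start]
        slope = (sampled - start_value) / (end - start)
        return {year: start_value + slope * (year - start) for year in years}
    raise ValueError(f"unknown action: {action!r}")


base = {2020: 1000.0, 2025: 1000.0, 2030: 1000.0}  # made-up base values
print(apply_action(base, 500.0, "interpolate"))  # {2020: 1000.0, 2025: 750.0, 2030: 500.0}
print(apply_action(base, 0.1, "fixed"))          # {2020: 0.1, 2025: 0.1, 2030: 0.1}

rng = random.Random(0)  # seeded so that the draw is repeatable
print(sample_uniform(500, 1100, rng))
```

With 'interpolate', only the end-year value varies between replicates, so the sampled uncertainty grows over the model horizon; with 'fixed', the whole series shifts.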
A CSV file with the following structure should be placed in the config directory and referenced by the result_params: key (config/results.csv in the example above):
name
ProductionByTechnologyAnnual
This lists the result files which will be generated by the otoole results processing script.
The CSV file referenced by the agg_results: key (config/agg_results.csv in the example above) allows users to specify parts of the OSeMOSYS result files to extract and aggregate, adding the model run as an index.
resultfile,indices,filename
ProductionByTechnologyAnnual,"SIMPLICITY,NGCC,SEC_EL",electricity_from_gas
ProductionByTechnologyAnnual,"SIMPLICITY,RIVER_2,WATIN",water_from_rivers
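The extraction described above can be sketched as follows. This is only an illustration, not the workflow's implementation: `extract` is a hypothetical function, the column layout REGION, TECHNOLOGY, FUEL, YEAR, VALUE is an assumption about the ProductionByTechnologyAnnual result file, the `MODELRUN` column name is invented for the example, and the numbers are made up.

```python
import csv
import io

# Made-up result data; the column layout is an assumption.
RESULT_CSV = """REGION,TECHNOLOGY,FUEL,YEAR,VALUE
SIMPLICITY,NGCC,SEC_EL,2020,10.0
SIMPLICITY,NGCC,SEC_EL,2021,12.0
SIMPLICITY,RIVER_2,WATIN,2020,3.0
"""


def extract(rows, indices, modelrun):
    """Keep rows whose leading columns equal `indices` and tag them with the model run."""
    wanted = indices.split(",")
    selected = []
    for row in rows:
        leading = list(row.values())[: len(wanted)]
        if leading == wanted:
            # Put the model run first so that it can act as an extra index.
            selected.append({"MODELRUN": modelrun, **row})
    return selected


rows = list(csv.DictReader(io.StringIO(RESULT_CSV)))
electricity_from_gas = extract(rows, "SIMPLICITY,NGCC,SEC_EL", modelrun=0)
for row in electricity_from_gas:
    print(row)
```

Repeating the call for every model run and concatenating the outputs would give one table per `filename` entry, indexed by model run.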
A CSV file with the following structure should be placed in the config directory and referenced by the datapackage: key (config/scenarios.csv in the example above):
name,description,path
0,"Interconnector Optimised",../gui_osemosys/combined_model/combined_datapackage/datapackage.json
This file is used to point to master models. Importantly, each of the master models is used as a base for the N replicates defined in config.yaml. If you define 3 master models in this file and N=100, then 300 model runs will be scheduled, but with the same 100 sets of parameter values.
Use these master models to define macro scenarios - e.g. forcing in and out a key technology.
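The arithmetic above (3 master models with N=100 giving 300 runs) can be sketched as a Cartesian product. The scenario names below are hypothetical, and the way the workflow actually labels its runs may differ.

```python
from itertools import product

scenario_names = ["0", "1", "2"]  # hypothetical `name` entries for three master models
replicates = 100                  # the `replicates` value from config.yaml

# Every master model is paired with every replicate.
runs = list(product(scenario_names, range(replicates)))
print(len(runs))  # 300

# Each replicate index identifies one set of sampled parameter values,
# which is shared by all master models.
scenarios_per_replicate = {}
for scenario, replicate in runs:
    scenarios_per_replicate.setdefault(replicate, []).append(scenario)
print(scenarios_per_replicate[0])  # ['0', '1', '2']
```

Because the sample is shared, differences between master models for a given replicate come from the scenario definition and not from the random draw.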
To run the workflow, use the command snakemake --use-conda --cores 4 --resources mem_mb=16000 disk_mb=30000
You can also change parts of the configuration by adding the --config flag, followed by one or more of the config items as key=value pairs. E.g.
snakemake --use-conda --cores 4 --config filetype=parquet replicates=100
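Conceptually, values passed with --config take precedence over those read from config.yaml. The sketch below mimics that merge with a plain dictionary; `apply_overrides` is a hypothetical function, and its value parsing (digits become integers, everything else stays a string) is a simplifying assumption that does not reproduce snakemake's own parsing.

```python
def apply_overrides(config, overrides):
    """Return a copy of `config` with key=value overrides applied on top."""
    merged = dict(config)
    for item in overrides:
        key, _, raw = item.partition("=")
        # Assumption: integer-looking values become ints, the rest stay strings.
        merged[key] = int(raw) if raw.isdigit() else raw
    return merged


file_config = {"filetype": "csv", "solver": "cbc", "replicates": 100}
effective = apply_overrides(file_config, ["filetype=parquet", "replicates=100"])
print(effective)  # {'filetype': 'parquet', 'solver': 'cbc', 'replicates': 100}
```

Keys that are not overridden, such as solver here, keep the value from the file.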
To visualise the workflow, run the following rule: snakemake plot_dag --use-conda --cores 2
This repository follows the snakemake guidelines for reproducibility.
GLUCOSE # The input data lives here
├── datapackage.json
gui_workflow # The gui_workflow repository
├── .gitignore
├── README.md
├── LICENSE.md
├── modelrun
├── workflow
│   ├── rules
│   │   ├── module1.smk
│   │   └── module2.smk
│   ├── envs
│   │   ├── tool1.yaml
│   │   └── tool2.yaml
│   ├── scripts
│   │   ├── script1.py
│   │   └── script2.R
│   ├── notebooks
│   │   ├── notebook1.py.ipynb
│   │   └── notebook2.r.ipynb
│   ├── report
│   │   ├── plot1.rst
│   │   └── plot2.rst
│   └── Snakefile
├── config # Files from this repository live in here
│   ├── config.yaml
│   └── some-sheet.csv
├── results
└── resources