Internal OCF application to run PVNet models and (optionally) PVNet summation models for the UK in a live environment. This involves accessing weather data stored in AWS S3 buckets, processing and loading this data using either our ocf-data-sampler or ocf_datapipes libraries, pulling pre-trained models from HuggingFace, and then producing solar PV power forecasts for the UK by feeding the processed weather data into the model.
The app supports multiple model versions being deployed to live environments, and these can be run with specific configurations which are set via environment variables.
The following environment variables are used in the app:
- `DB_URL`: The URL for the database connection.
- `NWP_UKV_ZARR_PATH`: The path to the UKV NWP data in Zarr format.
- `NWP_ECMWF_ZARR_PATH`: The path to the ECMWF NWP data in Zarr format.
- `SATELLITE_ZARR_PATH`: The path to the satellite data in Zarr format.
- `PVNET_V2_VERSION`: The version of the PVNet V2 model to use. Defaults to the version given above.
- `USE_ADJUSTER`: Option to use the adjuster. Defaults to true.
- `SAVE_GSP_SUM`: Option to save the GSP sum for PVNet V2. Defaults to false.
- `RUN_EXTRA_MODELS`: Option to run extra models. Defaults to false.
- `DAY_AHEAD_MODEL`: Option to use the day-ahead model. Defaults to false.
- `SENTRY_DSN`: Optional link to Sentry.
- `ENVIRONMENT`: The environment this is running in. Defaults to local.
- `USE_ECMWF_ONLY`: Option to use the ECMWF-only model. Defaults to false.
- `USE_OCF_DATA_SAMPLER`: Option to use the OCF data sampler. Defaults to true.
- `FORECAST_VALIDATE_ZIG_ZAG_WARNING`: Threshold for warning on forecast zig-zags. Defaults to 250 MW.
- `FORECAST_VALIDATE_ZIG_ZAG_ERROR`: Threshold for error on forecast zig-zags. Defaults to 500 MW.
Here are some examples of how to set these environment variables:

```bash
export DB_URL="postgresql://user:password@localhost:5432/dbname"
export NWP_UKV_ZARR_PATH="s3://bucket/path/to/ukv.zarr"
export NWP_ECMWF_ZARR_PATH="s3://bucket/path/to/ecmwf.zarr"
export SATELLITE_ZARR_PATH="s3://bucket/path/to/satellite.zarr"
export PVNET_V2_VERSION="v2.0.0"
export USE_ADJUSTER="true"
export SAVE_GSP_SUM="false"
export RUN_EXTRA_MODELS="false"
export DAY_AHEAD_MODEL="false"
export SENTRY_DSN="https://examplePublicKey@o0.ingest.sentry.io/0"
export ENVIRONMENT="production"
export USE_ECMWF_ONLY="false"
export USE_OCF_DATA_SAMPLER="true"
```
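As an illustration of how boolean flags like these are typically read at startup (this is a hedged sketch, not the app's actual code — the helper name `env_bool` is hypothetical):

```python
import os


def env_bool(name: str, default: bool) -> bool:
    """Read a boolean flag from the environment, falling back to a default.

    Common truthy spellings ("true", "1", "yes") are treated as True;
    anything else is treated as False.
    """
    value = os.getenv(name)
    if value is None:
        return default
    return value.strip().lower() in {"true", "1", "yes"}
```

Unset variables fall back to the documented defaults, so only the flags you want to change need exporting.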
We run a number of different validation checks on the data and the forecasts that are made. These are in place to ensure quality forecasts are made and saved to the database.
Before feeding data into the model(s), we check whether the data available is compatible with the data that the model expects.
We check:
- Whether 5-minute and/or 15-minute satellite data is available
- Whether there are any NaNs in the satellite data; if there are, an error is raised
- Whether more than 10% of the satellite data values are zeros; if so, an error is raised
- Whether there are any missing timestamps in the satellite data; gaps of less than 15 minutes are linearly interpolated
- Whether the exact timestamps that the model expects are all available after infilling
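The NaN and zero-fraction checks above can be sketched as follows. This is an illustrative example, not the app's actual implementation; the function name `validate_satellite` is hypothetical, while the 10% threshold is the one described above:

```python
import numpy as np


def validate_satellite(data: np.ndarray, max_zero_fraction: float = 0.1) -> None:
    """Raise a ValueError if satellite data contains NaNs or too many zeros."""
    if np.isnan(data).any():
        raise ValueError("Satellite data contains NaNs")
    zero_fraction = float((data == 0).mean())
    if zero_fraction > max_zero_fraction:
        raise ValueError(f"Satellite data is {zero_fraction:.0%} zeros")
```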
We check:
- Whether the exact timestamps that the model expects from each NWP are available
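A minimal sketch of this timestamp check (illustrative only; the helper name `check_expected_timestamps` is hypothetical):

```python
from datetime import datetime


def check_expected_timestamps(available, expected) -> None:
    """Raise a ValueError if any timestamp the model expects is missing."""
    missing = sorted(set(expected) - set(available))
    if missing:
        raise ValueError(f"Missing NWP timestamps: {missing}")
```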
Just before the batch data goes into the ML models, we check that
- The NWP data is not all zeros. An error is raised if, for any NWP provider, all of its NWP data is zero.
- TODO: openclimatefix/PVNet#324
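The all-zeros check can be sketched like this (an illustrative example, not the app's code; the function name is hypothetical):

```python
import numpy as np


def check_nwp_not_all_zero(nwp_data: dict) -> None:
    """Raise a ValueError if all values for any NWP provider are zero."""
    for provider, array in nwp_data.items():
        if np.all(array == 0):
            raise ValueError(f"All NWP data is zero for provider {provider!r}")
```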
After the ML models have run, we check the following:
- The forecast is not above 110% of the national capacity. An error is raised if any forecast value is above 110% of the national capacity.
- The forecast is not unreasonably large: a warning is raised for any forecast value above 30 GW, and an error for any value above 100 GW.
- If the forecast goes up and then down by more than 500 MW, an error is raised; a warning is raised at 250 MW. This stops zig-zag forecasts.
- TODO: Check positive values in day: #200
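The post-model checks above can be sketched as one function (a hedged illustration, not the app's implementation; `validate_forecast` is a hypothetical name, while the 110%, 30 GW, 100 GW, 250 MW and 500 MW thresholds are the ones documented above):

```python
import numpy as np


def validate_forecast(values_mw, national_capacity_mw,
                      zig_zag_warning_mw=250.0, zig_zag_error_mw=500.0):
    """Return a list of warnings; raise ValueError on hard failures.

    All values are in MW.
    """
    values = np.asarray(values_mw, dtype=float)
    warnings = []
    if np.any(values > 1.1 * national_capacity_mw):
        raise ValueError("Forecast exceeds 110% of national capacity")
    if np.any(values > 100_000):  # 100 GW
        raise ValueError("Forecast exceeds 100 GW")
    if np.any(values > 30_000):  # 30 GW
        warnings.append("Forecast exceeds 30 GW")
    # Zig-zag: a rise immediately followed by a fall (or vice versa).
    diffs = np.diff(values)
    for first, second in zip(diffs[:-1], diffs[1:]):
        if first * second < 0:  # direction reversed
            swing = min(abs(first), abs(second))
            if swing >= zig_zag_error_mw:
                raise ValueError(f"Zig-zag of {swing:.0f} MW exceeds error threshold")
            if swing >= zig_zag_warning_mw:
                warnings.append(f"Zig-zag of {swing:.0f} MW exceeds warning threshold")
    return warnings
```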
To be able to run the tests locally, it is recommended to use conda & pip: follow the steps from the Install requirements section onwards in the Dockerfile, or the install steps in the conda-pytest.yaml file, then run the tests the usual way via `python -m pytest`. Note that on certain Macs you may need to install Python >= 3.11 to get this to work.
It is possible to run the app locally by setting the required environment variables listed at the top of this README; these should point to the relevant data sources and databases for the environment you want to run the app in. You will need to make sure you have opened a connection to the DB, as well as authenticated against any cloud providers where data may be stored (e.g. if using AWS S3, this can be done via the AWS CLI command `aws configure`). A simple notebook has been created as an example.
Thanks goes to these wonderful people (emoji key):
Dubraska Solórzano 💻 |
James Fulton 💻 |
Megawattz 💻 |
Peter Dudfield 💻 |
DivyamAgg24 💻 |
Aryan Bhosale 💻 |
Felix 💻 |
Aditya Sawant 💻 |
Sukhil Patel 💻 |
Ali Rashid 💻 |
Mahmoud Abdulmawlaa 💻 |
Meghana Sancheti 💻 |
Dheeraj Mukirala 📖 |
This project follows the all-contributors specification. Contributions of any kind welcome!