
Msturroc/nuclear_impact_prediction

Nuclear Impact Prediction

XGBoost machine learning models for rapid prediction of whether radioactive fallout from a nuclear power plant incident would impact Ireland, based on meteorological conditions and release parameters.

This code accompanies the paper:

M. Sturrock, R. Ryan, K. Kelleher, "Identification and Verification of Worst-Case Radiological Transport Scenarios for Ireland: A Simulation-Based Approach to Nuclear Emergency Preparedness (2011-2024)", under review at Science of the Total Environment, 2026.

Overview

Site-specific XGBoost binary classifiers are trained on 2.2 million HYSPLIT atmospheric dispersion simulations spanning 2011-2023 and validated on an independent 2024 hold-out dataset. The models predict impact/no-impact on Ireland given current meteorological conditions and release parameters, achieving 85-93% validation accuracy across six nuclear facility sites.
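For readers unfamiliar with how accuracy is derived for an impact/no-impact classifier, the following is a minimal sketch that tallies a confusion matrix from binary labels. The helper names `confusion_counts` and `accuracy` and the toy labels are assumptions made for illustration; they are not taken from this repository and the numbers are not results from the paper.

```julia
# Hypothetical helper: tally a 2x2 confusion matrix for binary impact labels.
# `truth` and `pred` are Bool vectors (true = impact on Ireland).
function confusion_counts(truth::AbstractVector{Bool}, pred::AbstractVector{Bool})
    length(truth) == length(pred) || throw(ArgumentError("length mismatch"))
    tp = count(truth .& pred)        # impact predicted, impact observed
    tn = count(.!truth .& .!pred)    # no impact predicted, none observed
    fp = count(.!truth .& pred)      # impact predicted, none observed
    fn = count(truth .& .!pred)      # impact missed
    return (tp = tp, tn = tn, fp = fp, fn = fn)
end

# Accuracy is the fraction of all cases that were classified correctly.
accuracy(c) = (c.tp + c.tn) / (c.tp + c.tn + c.fp + c.fn)

# Toy labels, invented purely for illustration.
truth = [true, true, false, false, false, true, false, false]
pred  = [true, false, false, false, true, true, false, false]

c = confusion_counts(truth, pred)
println(c)
println("accuracy = ", accuracy(c))   # accuracy = 0.75
```

In a screening context the missed-impact count (`fn`) is usually the figure to watch, since accuracy alone can hide it when no-impact cases dominate.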

The models are intended for potential operational use in emergency decision support systems, enabling rapid screening of atmospheric conditions without running full dispersion simulations.

Sites

Models are trained independently for six nuclear power plants:

| Site | Country | Type |
|---|---|---|
| Wylfa | UK (Wales) | Proposed SMR site |
| Heysham | UK (England) | AGR |
| Hinkley Point C | UK (England) | EPR (under construction) |
| Sizewell B | UK (England) | PWR |
| Flamanville | France | EPR |
| Paluel | France | PWR |

Method

  1. Data: Weather summary statistics (mean, variance, min, max, median, skewness, kurtosis) for Ireland and the overall domain, extracted from ERA5-derived ARL meteorological fields, combined with release parameters (start hour, duration, height, day of year, month).
  2. Training: Exhaustive grid search over XGBoost hyperparameters with 10-fold year-grouped cross-validation, repeated 10 times for stability assessment.
  3. Validation: Independent temporal hold-out (2024 data never seen during training).
  4. Output: Champion model (JSON), performance report with confusion matrix, and feature importance ranking.
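Steps 2 and 3 can be illustrated with a short sketch of year-grouped fold assignment combined with a temporal hold-out. This is not the repository's code: the function name `year_grouped_folds`, the shuffled round-robin assignment of years to folds, the seeds, and the toy sample years are all assumptions. The README does not say how years are distributed across folds or how the 10 repeats differ.

```julia
using Random

# Assign every distinct year to one of k folds, so that all samples from a
# given year land in the same fold. Years are shuffled, then dealt round-robin.
function year_grouped_folds(years::AbstractVector{<:Integer}, k::Integer;
                            rng::AbstractRNG = Random.default_rng())
    distinct = shuffle(rng, unique(years))
    k <= length(distinct) || throw(ArgumentError("need at least k distinct years"))
    fold_of_year = Dict(y => mod1(i, k) for (i, y) in enumerate(distinct))
    return [fold_of_year[y] for y in years]
end

# Toy sample years (hypothetical): three samples per year, 2011-2024.
years = repeat(2011:2024, inner = 3)

# Temporal hold-out: 2024 is set aside and never enters cross-validation.
train_years   = years[years .< 2024]
holdout_years = years[years .== 2024]

# Ten repeats of 10-fold assignment, each with a different (assumed) seed.
repeats = [year_grouped_folds(train_years, 10; rng = MersenneTwister(rep))
           for rep in 1:10]

# Within every repeat and fold, no year appears on both sides of the split.
for folds in repeats, f in 1:10
    val = unique(train_years[folds .== f])
    trn = unique(train_years[folds .!= f])
    @assert !isempty(val) && isempty(intersect(val, trn))
end
```

Grouping by year keeps all simulations from one year on the same side of each split, which avoids scoring a model on weather from a year it has already seen. With 13 training years and 10 folds, three folds hold two years each in this sketch.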

Preprocessing

Raw ERA5 ARL files must be converted to summary-statistics CSV files before training. This step uses ARLReader.jl to read the binary ARL format.

# Generate weather summary statistics from raw ARL files
julia --threads=auto src/preprocess_weather.jl /path/to/Weather_data/ /path/to/summary_weather/

This produces a summary_ireland_YYYYMMDD_N.csv file and a summary_overall_YYYYMMDD_N.csv file for each date and 3-hourly time slice. Each file contains statistics for 26 meteorological variables across surface and 3D levels (10 m, 100 m, 1000 m).
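As an illustration of the seven statistics named in the Method section, here is a minimal sketch that computes them for a single gridded field using only Julia's `Statistics` standard library. The function name `summary_stats`, the returned field names, and the toy grid are assumptions; the actual implementation in src/preprocess_weather.jl may use different conventions (for example, the `skewness` and `kurtosis` functions from StatsBase, which is listed under Dependencies).

```julia
using Statistics

# Seven summary statistics for one field, flattened to a vector.
# Skewness and kurtosis are computed from population (1/n) central moments;
# kurtosis is reported as excess kurtosis. A constant field yields NaN for both.
function summary_stats(x::AbstractArray{<:Real})
    v  = vec(float.(x))
    mu = mean(v)
    m2 = mean((v .- mu) .^ 2)
    m3 = mean((v .- mu) .^ 3)
    m4 = mean((v .- mu) .^ 4)
    return (
        mean     = mu,
        variance = var(v),           # sample variance (n - 1 denominator)
        min      = minimum(v),
        max      = maximum(v),
        median   = median(v),
        skewness = m3 / m2^1.5,
        kurtosis = m4 / m2^2 - 3,
    )
end

# Toy 3x2 "grid" with invented values, standing in for one variable at one time slice.
field = [1.0 2.0; 3.0 4.0; 5.0 15.0]
s = summary_stats(field)
println(s)
```

Reducing each field to a handful of scalars is what lets the classifier consume a fixed-length feature vector regardless of grid size.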

Training

# Set data directories
export NPP_WEATHER_DIR=/path/to/summary_weather/
export NPP_DEP_DIR=/path/to/depositions/

# Train all sites
julia --threads=auto run.jl

# Train specific sites
julia --threads=auto run.jl Wylfa Heysham
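The commands above imply that the entry point reads two environment variables and an optional list of site names. The following is a hypothetical sketch of such handling, not the contents of run.jl: the function `resolve_config`, its error messages, and the rule that an empty argument list means "all sites" are assumptions inferred from the usage shown. The identifiers run.jl accepts for multi-word sites are not stated in this README, so the example uses only single-word names.

```julia
# Hypothetical configuration resolver. `env` is any dictionary-like object
# (pass `ENV` in a real script), `args` the command-line arguments (`ARGS`).
function resolve_config(env::AbstractDict,
                        args::AbstractVector{<:AbstractString},
                        all_sites::AbstractVector{<:AbstractString})
    weather_dir = get(env, "NPP_WEATHER_DIR", nothing)
    dep_dir     = get(env, "NPP_DEP_DIR", nothing)
    weather_dir === nothing && error("NPP_WEATHER_DIR is not set")
    dep_dir === nothing && error("NPP_DEP_DIR is not set")

    # Assumption: no site arguments means "train all sites".
    sites = isempty(args) ? collect(all_sites) : collect(args)
    unknown = setdiff(sites, all_sites)
    isempty(unknown) || error("unknown site(s): " * join(unknown, ", "))

    return (weather_dir = weather_dir, dep_dir = dep_dir, sites = sites)
end

# Illustrative inputs; the paths mirror the placeholders used above.
env = Dict("NPP_WEATHER_DIR" => "/path/to/summary_weather/",
           "NPP_DEP_DIR"     => "/path/to/depositions/")
known = ["Wylfa", "Heysham", "Flamanville", "Paluel"]

cfg_all  = resolve_config(env, String[], known)
cfg_some = resolve_config(env, ["Wylfa", "Heysham"], known)
println(cfg_some.sites)
```

Failing fast on a missing directory variable or a misspelled site name is cheaper than discovering the problem partway through a long grid search.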

Dependencies

Julia 1.10+ with packages: XGBoost, DataFrames, CSV, StatsBase, MLBase, MLUtils, OrderedCollections, Plots, JLD2, JSON, ARLReader.jl.

License

MIT

About

XGBoost models in pure Julia for rapid prediction of radiological impact on Ireland from nuclear power plant incidents. Trained on 2.2M HYSPLIT atmospheric dispersion simulations (2011-2024).
