normet is a Python package for environmental time-series analysis.
It provides tools for:
- Normalisation / deweathering of pollutant concentrations.
- Counterfactual modelling using AutoML backends (FLAML, H2O).
- Synthetic control methods (ASCM, ML-ASCM).
- Uncertainty quantification via bootstrapping and placebo tests.
- Evaluation metrics tailored for environmental data.
The package is designed for air-quality research, causal inference, and policy evaluation.
---
- High-level pipelines for normalisation and synthetic control.
- Rolling weather normalisation for short-term trend analysis.
- Time-series decomposition separating emissions-driven and meteorology-driven variability.
- Multiple backends: FLAML, H2O AutoML.
- Placebo-in-space and placebo-in-time analyses for robustness checks.
- Bootstrap and jackknife-based uncertainty bands.
- Rich evaluation metrics: RMSE, FAC2, IOA, R², etc.
- Parallel execution for large panel datasets.
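As a rough illustration of three of the metrics listed above, the sketch below computes RMSE, FAC2, and IOA in plain Python. It assumes the common textbook definitions (FAC2 as the share of predictions within a factor of two of the observations, IOA as Willmott's original index of agreement); normet's own implementation may use different variants. The function names and sample values are hypothetical.

```python
import math

def rmse(obs, pred):
    """Root-mean-square error between observations and predictions."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))

def fac2(obs, pred):
    """Fraction of predictions within a factor of two of the observations."""
    # Pairs with a zero observation are skipped to avoid division by zero.
    pairs = [(o, p) for o, p in zip(obs, pred) if o != 0]
    return sum(0.5 <= p / o <= 2.0 for o, p in pairs) / len(pairs)

def ioa(obs, pred):
    """Willmott's original index of agreement (0 to 1, higher is better)."""
    o_bar = sum(obs) / len(obs)
    num = sum((p - o) ** 2 for o, p in zip(obs, pred))
    den = sum((abs(p - o_bar) + abs(o - o_bar)) ** 2 for o, p in zip(obs, pred))
    return 1.0 - num / den

# Made-up values, for illustration only.
obs = [10.0, 20.0, 30.0, 40.0]
pred = [12.0, 18.0, 33.0, 90.0]

print("RMSE:", rmse(obs, pred))
print("FAC2:", fac2(obs, pred))
print("IOA: ", ioa(obs, pred))
```

A single large miss (90 against 40) dominates RMSE but counts as only one failure in FAC2, which is one reason to report several metrics together.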
---
Basic installation (core functionality, no AutoML backends):
pip install normet
Optional backends:
- FLAML (lightweight, recommended for most users): conda install flaml -c conda-forge
- H2O (heavier, requires Java): pip install h2o
Install both:
pip install normet[all]
---
A quick example with the :func:`do_all` pipeline:
import pandas as pd
from normet import do_all, modStats
# Example dataset (must contain datetime + target + predictors)
df = pd.read_csv("example.csv")
# Run the pipeline
out, model, df_prep = do_all(
    df,
    value="pm25",
    backend="flaml",
    feature_names=["temp", "wind", "humidity"],
    n_samples=300,
)
# Results
print(out.head())        # Normalised (deweathered) time-series
print(df_prep.head())    # Prepared dataset with splits & features
print(model)             # Trained AutoML model
# Evaluate model performance manually
stats = modStats(df_prep, model)
print(stats)
The pipeline performs:
- Data preparation: parses the datetime column, validates the target and features, imputes missing values, adds date-based covariates, and splits the data into training and testing sets.
- Model training: trains a model using AutoML (FLAML or H2O).
- Normalisation: resamples the weather covariates and estimates the counterfactual ("deweathered") series.
Returned values:
- out: DataFrame with observed and normalised series (and resample outputs if aggregate=False).
- model: the trained AutoML model object.
- df_prep: prepared dataset after preprocessing and splitting.
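The normalisation step can be pictured with a small, dependency-free sketch. It assumes the usual resampling approach to deweathering: for each timestamp the time-related input is held fixed, the meteorological inputs are replaced by rows drawn at random from the observed record, and the predictions are averaged. The `toy_model` and `normalise` functions are hypothetical stand-ins, not normet's API, and the data are synthetic.

```python
import random

def toy_model(t, temp, wind):
    """Hypothetical stand-in for a trained model: trend plus weather effects."""
    return 50.0 - 0.1 * t + 0.8 * temp - 1.5 * wind

def normalise(times, met_rows, predict, n_samples=300, seed=0):
    """Average predictions over randomly resampled meteorological rows."""
    rng = random.Random(seed)
    out = []
    for t in times:
        # Keep the time index fixed; draw weather from the whole record.
        draws = [predict(t, *rng.choice(met_rows)) for _ in range(n_samples)]
        out.append(sum(draws) / n_samples)
    return out

# Synthetic inputs: 100 time steps with random temperature and wind speed.
times = list(range(100))
rng = random.Random(1)
met_rows = [(rng.uniform(0, 30), rng.uniform(0, 10)) for _ in times]

observed = [toy_model(t, temp, wind) for t, (temp, wind) in zip(times, met_rows)]
deweathered = normalise(times, met_rows, toy_model)

print(observed[:3])
print(deweathered[:3])
```

Because the weather inputs are averaged out, the deweathered series follows the underlying trend much more smoothly than the observed one. A larger `n_samples` reduces the remaining sampling noise at the cost of more predictions.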
---
EMI decomposition (emissions-driven component):
from normet.analysis.decomposition import decom_emi
emi = decom_emi(
    df=df,
    value="pm25",
    backend="flaml",
    feature_names=["temp", "wind", "humidity"],
    n_samples=200,
)
print(emi.head())
# Columns include:
# observed, date_unix, day_julian, weekday, hour,
# emi_total, emi_noise, emi_base
MET decomposition (meteorology-driven component):
from normet.analysis.decomposition import decom_met
met = decom_met(
    df=df,
    value="pm25",
    backend="flaml",
    feature_names=["temp", "wind", "humidity"],
    n_samples=200,
)
print(met.head())
# Columns include:
# observed, emi_total, <each meteorological feature>,
# met_total, met_base, met_noise
---
Run augmented synthetic control (ASCM):
from normet.scm import _run_syn
syn = _run_syn(
    df=df_panel,
    date_col="date",
    unit_col="city",
    outcome_col="pm25",
    treated_unit="Beijing",
    cutoff_date="2017-01-01",
    donors=["Shanghai", "Guangzhou", "Chengdu"],
    ascm_backend="ascm",
)
print(syn.head())  # observed, synthetic, effect
Placebo-in-space test:
from normet.scm import placebo_in_space, effect_bands_space
out = placebo_in_space(
    df=df_panel,
    date_col="date",
    unit_col="city",
    outcome_col="pm25",
    treated_unit="Beijing",
    cutoff_date="2017-01-01",
)
bands = effect_bands_space(out, level=0.95)
print(bands.head())
Placebo-in-time test:
from normet.scm import placebo_in_time
out_time = placebo_in_time(
    df=df_panel,
    date_col="date",
    unit_col="city",
    outcome_col="pm25",
    treated_unit="Beijing",
    cutoff_date="2017-01-01",
    ascm_backend="ascm",  # "ascm" or "mlascm"
    n_rep=50,  # number of pseudo cutoffs to test
)
print(out_time.head())
---
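To make the logic of the placebo-in-space test shown earlier more concrete, the sketch below turns a set of placebo effects into a rank-based p-value: the share of units, treated unit included, whose estimated effect is at least as extreme as the treated one. This is the generic permutation-style calculation, not necessarily what placebo_in_space or effect_bands_space compute internally. The function name, donor labels, and effect values are all hypothetical.

```python
def placebo_p_value(treated_effect, placebo_effects):
    """Share of units (treated included) with an effect at least as extreme."""
    effects = list(placebo_effects)
    extreme = sum(abs(e) >= abs(treated_effect) for e in effects)
    return (1 + extreme) / (1 + len(effects))

# Made-up mean post-cutoff effects, for illustration only.
treated_effect = -8.0
placebo_effects = {
    "donor_a": 1.2,
    "donor_b": -0.7,
    "donor_c": 2.5,
    "donor_d": -9.1,
}

p = placebo_p_value(treated_effect, placebo_effects.values())
print(f"placebo p-value: {p:.2f}")
```

With only a handful of donors the smallest attainable p-value is 1 / (number of donors + 1), so a small donor pool limits how strong the evidence from this test can be.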
Uncertainty bands can be constructed using either bootstrap or jackknife methods:
from normet.scm import uncertainty_bands, plot_uncertainty_bands
# Bootstrap version
boot = uncertainty_bands(
    df=df_panel,
    date_col="date",
    unit_col="city",
    outcome_col="pm25",
    treated_unit="Beijing",
    cutoff_date="2017-01-01",
    ascm_backend="ascm",
    method="bootstrap",   # donor/time resampling
    B=200,
)
plot_uncertainty_bands(boot, cutoff_date="2017-01-01")
# Jackknife version
jack = uncertainty_bands(
    df=df_panel,
    date_col="date",
    unit_col="city",
    outcome_col="pm25",
    treated_unit="Beijing",
    cutoff_date="2017-01-01",
    ascm_backend="ascm",
    method="jackknife",   # leave-one-donor-out
    ci_level=0.95,
)
plot_uncertainty_bands(jack, cutoff_date="2017-01-01")
---
- Python >= 3.9
- numpy >= 1.22
- pandas >= 1.5
- scipy >= 1.10
- joblib >= 1.2
- matplotlib >= 3.6
Optional:
- flaml >= 2.1
- h2o >= 3.44
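One way to check which of the packages listed above are present in the current environment is the standard-library importlib.metadata module. The sketch below is only a convenience check: the helper name is hypothetical, it assumes each distribution is registered under the name shown in the list, and it reports versions without comparing them against the minimums.

```python
from importlib import metadata

REQUIRED = ["numpy", "pandas", "scipy", "joblib", "matplotlib"]
OPTIONAL = ["flaml", "h2o"]

def installed_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

for name, version in installed_versions(REQUIRED + OPTIONAL).items():
    print(f"{name}: {version or 'not installed'}")
```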
---
If you use normet in your research, please cite:
Song, C. (2025). normet: Normalisation, Decomposition, and Counterfactual Modelling for Environmental Time-series. University of Manchester. GitHub repository: https://github.com/dsncas/normet
---
This project is licensed under the MIT License.
---
Contributions are welcome! Please:
- Fork the repository.
- Create a feature branch.
- Submit a pull request with a clear description and tests.
Bug reports and feature requests can be submitted via the issue tracker.