Data Processing and Simulation Tools for Networked SIR+

yvs314/epi-net-m

epi-net-m

A Benchmarking Framework for Optimal Control over Network Dynamic Systems based on a Metapopulation Epidemic Model

The framework comprises Data Engineering to generate a Benchmark, a proof-of-concept Numerical Solver for an optimal control problem over Network SIR, and Visualization Tools to map infection and control effort on choropleths. The optimal control problem is a finite-horizon Bolza problem over a nonlinear system with per-node social distancing controls and a time-discounted running cost that is quadratic in control and infections. It generalizes (El Ouardighi, Khmelnitsky, Sethi, 2022) to the network dynamic system case, omitting health infrastructure tracking.

The Data Engineering part is in /tools and is called “Oboe”; it is written in Julia. Oboe loads the tract populations, commute, and air travel data from /data/by-tract; uses these data to generate a travel model aggregated to the given level; and then outputs a <name>-trav.dat matrix containing the total daily travelers between each pair of subdivisions and a <name>-init.csv table containing the “patient zero” initial values for the SIR model. These serve as a benchmark set.

The Benchmark is in /data/by-tract, with some snapshots in Releases; these have between 2 and 9110 nodes. An instance <name> consists of two files:

  • <name>-init.csv, with one row for each of the n nodes, listing the populations and the initial values of susceptible, infected, and recovered;
  • <name>-trav.dat, with an n×n matrix of daily travelers between the nodes.
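
The two-file layout above can be sanity-checked before feeding an instance to a solver. The following Python sketch is not part of this repository; it assumes that <name>-init.csv has a header row and that <name>-trav.dat stores one matrix row per line with entries separated by whitespace or commas. Neither the delimiter nor the column names are confirmed by this README, so the column names are passed through untouched.

```python
import csv
import re
from pathlib import Path


def load_instance(directory, name):
    """Load <name>-init.csv and <name>-trav.dat and check that their sizes agree."""
    directory = Path(directory)

    # One row per node; column names are kept exactly as found in the header.
    with open(directory / f"{name}-init.csv", newline="") as f:
        init_rows = list(csv.DictReader(f))

    # ASSUMPTION: one matrix row per line, entries split by whitespace or commas.
    travel = []
    with open(directory / f"{name}-trav.dat") as f:
        for line in f:
            line = line.strip()
            if line:
                travel.append([float(x) for x in re.split(r"[,\s]+", line)])

    # The instance is consistent only if n table rows match an n-by-n matrix.
    n = len(init_rows)
    if len(travel) != n or any(len(row) != n for row in travel):
        raise ValueError(f"expected a {n}x{n} travel matrix for {n} nodes")
    return init_rows, travel
```

Calling load_instance("data/by-tract", "NW") would return the node table and the travel matrix, or raise ValueError if the two files disagree on n; the directory and instance name here are placeholders.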

The Numerics part is in /m-core and is written in MATLAB. It uses the forward-backward sweep method to solve a two-point boundary value problem (TPBVP) for a network SIR model instantiated with <name>-trav.dat and <name>-init.csv. The entry point is /m-core/sweep.m. 🚧 A Julia version may or may not be under construction 🚧
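
To illustrate the forward-backward sweep idea, here is a minimal Python sketch on a single-node SIR model. It is not a port of sweep.m: the single-node dynamics, the running cost e^(-rt)(c z² + k u²), the absence of a terminal cost, the Euler discretization, and all parameter values are assumptions made for this illustration only.

```python
import math


def simulate(u, beta, gamma, s0, z0, dt):
    """Forward Euler pass: susceptible (s) and infected (z) fractions under control u."""
    s, z = [s0], [z0]
    for ui in u[:-1]:
        s_now, z_now = s[-1], z[-1]
        new_inf = beta * (1.0 - ui) * s_now * z_now  # distancing scales contacts by (1 - u)
        s.append(s_now - dt * new_inf)
        z.append(z_now + dt * (new_inf - gamma * z_now))
    return s, z


def sweep(beta=0.5, gamma=0.2, c=5.0, k=0.5, r=0.01, s0=0.99, z0=0.01,
          horizon=60.0, n=300, u_max=0.9, relax=0.5, tol=1e-4, max_iter=200):
    """Forward-backward sweep for min of the integral of exp(-r t) (c z^2 + k u^2)."""
    dt = horizon / n
    t = [i * dt for i in range(n + 1)]
    u = [0.0] * (n + 1)  # initial guess: null control
    for iteration in range(1, max_iter + 1):
        # 1) Forward pass: integrate the state from t = 0 with the current control.
        s, z = simulate(u, beta, gamma, s0, z0, dt)
        # 2) Backward pass: integrate the costates from lambda(T) = 0 (no terminal cost).
        ls, lz = [0.0] * (n + 1), [0.0] * (n + 1)
        for i in range(n, 0, -1):
            a = beta * (1.0 - u[i])
            gap = ls[i] - lz[i]
            dls = a * z[i] * gap
            dlz = -2.0 * c * math.exp(-r * t[i]) * z[i] + a * s[i] * gap + gamma * lz[i]
            ls[i - 1] = ls[i] - dt * dls
            lz[i - 1] = lz[i] - dt * dlz
        # 3) Control update from dH/du = 0, clipped to [0, u_max] and relaxed.
        change = 0.0
        for i in range(n + 1):
            raw = beta * s[i] * z[i] * (lz[i] - ls[i]) * math.exp(r * t[i]) / (2.0 * k)
            new = relax * u[i] + (1.0 - relax) * min(u_max, max(0.0, raw))
            change = max(change, abs(new - u[i]))
            u[i] = new
        if change < tol:
            break  # otherwise the loop stops at max_iter without a convergence guarantee
    s, z = simulate(u, beta, gamma, s0, z0, dt)  # state consistent with the final control
    return t, s, z, u, iteration
```

Calling t, s, z, u, iterations = sweep() returns the time grid, the state trajectories, the control, and the number of sweeps performed. The relaxation step (relax) is a common damping choice for this kind of iteration; whether sweep.m uses the same update is not stated here.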

The Visualization is in /tools/viz and based on VegaLite's Julia interface. The VegaLite schemas generate county-level choropleths for infection incidence and control effort.

CAVEAT: all data and Jupyter notebooks are stored with Git LFS. If, after cloning the repository or downloading its contents, you see something like this instead of the expected file content

version https://git-lfs.github.com/spec/v1
oid sha256:9e93547e554054a1678f4863fd62bac1577dd6eea6b2efce0d265b16d6e0f438
size 5208

then your Git LFS installation did not work. Get the benchmark from Releases if you are not in the mood for Git LFS.
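
A quick way to find files that are still LFS pointers is to look for the pointer header shown above. This Python sketch is a convenience illustration, not a tool shipped with the repository; the function names are made up for the example.

```python
from pathlib import Path

# First line of every Git LFS pointer file, as in the snippet above.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file starts with the Git LFS pointer header."""
    with open(path, "rb") as f:
        return f.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX


def find_lfs_pointers(root):
    """List files under root (skipping .git) that are unresolved LFS pointers."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and ".git" not in p.parts and is_lfs_pointer(p)
    )
```

If find_lfs_pointers(".") returns a non-empty list at the repository root, those files were not fetched through Git LFS.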


Citation

If you use this software, please cite this repository and (Salii, 2022):

@inproceedings{salii2022benchmarking,
	author={Salii, Yaroslav V.},
	editor={Benito, Rosa Maria and Cherifi, Chantal and Cherifi, Hocine and Moro, Esteban and Rocha, Luis M. and Sales-Pardo, Marta},
	title={Benchmarking Optimal Control for Network Dynamic Systems with Plausible Epidemic Models},
	booktitle={Complex Networks {\&} Their Applications {X}},
	year={2022},
	publisher={Springer International Publishing},
	address={Cham},
	pages={194--206},
	isbn={978-3-030-93413-2}
}

Usage: Data Engineering

Run the data processing routine from /tools as follows:

julia oboe-cli.jl --fips FIPS... --agg AGG --name NAME [--force]
  • FIPS... is a space-separated list of FIPS codes of the states whose data is to be processed, or ALL to process all of the contiguous US.
  • AGG is the desired aggregation level, which can be one of:
    • tra for census tract
    • cty for county
    • ap for airport
    • ste for state
  • NAME is the prefix for output files. It is treated as all-uppercase and must not contain dashes (-) or underscores (_).
  • Include --force to allow overwriting existing output files with the same prefix.
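
When the CLI is driven from a script, the argument rules above can be checked before launching Julia. The Python sketch below only mirrors the rules as documented in this list; it is not Oboe's own validation code, and the helper name oboe_command is hypothetical.

```python
# Aggregation levels as documented above.
AGG_LEVELS = {"tra": "census tract", "cty": "county", "ap": "airport", "ste": "state"}


def oboe_command(fips, agg, name, force=False):
    """Build the argument list for oboe-cli.jl after checking the documented rules."""
    if not fips:
        raise ValueError("give at least one FIPS code, or ALL")
    if agg not in AGG_LEVELS:
        raise ValueError(f"AGG must be one of {sorted(AGG_LEVELS)}")
    if "-" in name or "_" in name:
        raise ValueError("NAME must not contain dashes or underscores")
    # NAME is treated as all-uppercase, so normalize it up front.
    cmd = ["julia", "oboe-cli.jl", "--fips", *fips, "--agg", agg, "--name", name.upper()]
    if force:
        cmd.append("--force")
    return cmd
```

The resulting list can be passed to subprocess.run(cmd, cwd="tools"), with the working directory set to wherever oboe-cli.jl lives in your checkout.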

Examples

# process Washington and Oregon, aggregate to tract level
julia oboe-cli.jl --fips 41 53 --agg tra --name NW

# process California, aggregate by airport, force overwrite
julia oboe-cli.jl --fips 06 --agg ap --name CA --force

# process NY, NJ and CT, aggregate by county, force overwrite
julia oboe-cli.jl --fips 09 34 36 --agg cty --name TRI --force

# process entire contiguous US, aggregate by state
julia oboe-cli.jl --fips ALL --agg ste --name USA

Julia API

To access the Oboe API from within Julia, execute the following:

using FromFile
@from "(path to)/Oboe/Oboe.jl" import Oboe
  • If running from a Julia file or Jupyter notebook, the path should be relative to the location of said file.
  • If running from a REPL, the path should be relative to the working directory of the REPL. In particular, when using VS Code's "execute in REPL" functionality, the working directory is set to the project root, so the path should be ./tools/Oboe/Oboe.jl

This should bring every API function into the Oboe namespace, e.g. Oboe.mkPsgMx.


Usage: Numerics

Open /m-core/sweep.m in MATLAB. Review the parameters section and the requested instance name. Run sweep.m.

Solution export

Export is automated: look into ./out after running sweep.m. Four sets of .csv files are produced, giving the population of each compartment at each node on each simulation day, in both absolute and fractional forms, for optimal control and for null control. The per-node control effort is exported in <name>-frac.csv as uX columns, where X is the number of the simulation day. In addition, a <name>-log.csv is emitted, which describes the forward-backward sweep iterations.
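
The uX columns can be pulled out of <name>-frac.csv with a few lines of Python. This sketch assumes one row per node and a header row in which the control columns are named exactly u followed by the day number; any other column is ignored. The remaining layout of the file is not specified here, so treat this as a starting point.

```python
import csv
import re


def control_by_day(csv_file):
    """Return (days, control), where control[i][j] is node i's effort on days[j]."""
    reader = csv.DictReader(csv_file)
    # Collect the uX columns and order them by day number, not alphabetically.
    day_columns = []
    for column in reader.fieldnames or []:
        match = re.fullmatch(r"u(\d+)", column)
        if match:
            day_columns.append((int(match.group(1)), column))
    day_columns.sort()
    days = [day for day, _ in day_columns]
    # ASSUMPTION: each remaining row of the file describes one node.
    control = [[float(row[column]) for _, column in day_columns] for row in reader]
    return days, control
```

Usage: open the file with open(path, newline="") and pass the file object to control_by_day.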

Figures

  • figStacked.m: stacked plot of z+s+r for a given node
  • figSimplex.m: all the nodes' trajectories in the (z,s) simplex (% infected, % susceptible)
  • figTrajectory.m: all the nodes' trajectories for a given compartment: susceptible s, infected z, or recovered r

Usage: Visualization

See the VegaLite-based routines in /tools/viz.

There are two Pluto.jl-based notebooks, which work for county-level aggregation and provide .svg export:

  • network-explorer.jl displays the designated airports as computed with Oboe, the census tracts, and the per-county populations
  • solution-explorer.jl displays, from .csv solution files,
    • the absolute numbers of infected per county, side by side for optimal control vs. null control
    • the per-county control effort
    • a plot of the average control effort (averaged over populations)
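
One plausible reading of "average over populations" is a population-weighted mean of the per-county control effort on each day. The Python sketch below implements that reading; it is an assumption about the plot, not code taken from solution-explorer.jl.

```python
def weighted_average_control(control, populations):
    """Population-weighted mean control per day.

    control[i][d] is node i's control effort on day d; populations[i] is its population.
    """
    total = float(sum(populations))
    if total <= 0:
        raise ValueError("total population must be positive")
    n_days = len(control[0])
    return [
        sum(p * row[d] for p, row in zip(populations, control)) / total
        for d in range(n_days)
    ]
```

With weights, a large county under strict distancing moves the average more than a small one, which an unweighted mean over counties would hide.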

The VegaLite schemas used in the above two are in vega-specs.jl and can be used independently, e.g. through Jupyter Notebook or Julia-for-VS-Code.


Acknowledgements

The data set and benchmark instances are in part derived from the FluTE data, coupled with U.S. domestic carrier air travel data from the U.S. Bureau of Transportation Statistics and airport information from the OpenFlights repository.

Yaroslav Salii @yvs314 is the principal author, who designed the original version of the Oboe data processing routine, the MATLAB implementation of the Forward-Backward Sweep numerical solution method for the Network Metapopulation SIR Epidemic Model with Social-Distancing Optimal Control, and the VegaLite-based visualizations.

Kara Ignatenko @karaign implemented the Oboe command-line interface, significantly improved the performance of the air travel network generator mkPsgMx, and implemented the FromFile.jl-based modular version of the Oboe data processing routine and the Pluto.jl-based solution explorers.

Rinel Foguen Tchuendom and Shuang Gao @sigmagao jointly contributed an early version of the Euler method solver for the Network Metapopulation SIR Epidemic Model in Kronecker product notation.

This software was written when the authors were with McGill University.