This repository has been archived by the owner on Feb 1, 2024. It is now read-only.

ARIMA-based forecasting of confirmed COVID/ Corona cases for various country-province combinations


atecon/covid_19_forecast


Time-series based 7-days ahead forecasts of (known) confirmed cases of the novel Corona virus COVID-19 (2019-nCoV)

Archived since 2021-01-12: This repo is archived but can still be forked.

Statement: This is just a "hobby" project, and the forecast results should not be taken too seriously, even though forecast performance may be reasonable for some countries and some periods. Don't take the forecast numbers for granted! This project is mainly meant to show what can be achieved (without too much effort) with the open-source statistics and econometrics software gretl (URL: http://gretl.sourceforge.net/).

Minimum gretl version: gretl 2020b.

An automated job retrieves the latest data at 3 am (CET), trains a new model, computes the forecasts and uploads the new forecast plots to this GitHub repo. The overall job finishes in about 15 seconds on a Raspberry Pi 4, which is just an amazing piece of hardware ;-)

Data source

The data are provided by the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE) and can be found here: https://github.com/CSSEGISandData/COVID-19

Some words on the underlying model

I want to keep it simple. If you fancy, you can use the current state of the code as a starting point for developing and evaluating more complex approaches. However, the ARIMA type of model applied here may already provide a reasonable approach for modelling and computing short-term contagion dynamics.

I recently (2020-07-26) switched from simple pre-defined ARIMA models to my official auto_arima package for gretl, which automatically searches for the "best" model. Details on the auto_arima package can be found here: https://github.com/atecon/auto_arima

The code executes a brute-force search for the 'best' ARIMA model specification by optimizing the corrected Akaike information criterion ("aicc"). The following parameter space, which implies 120 different ARIMA models in total, is evaluated:

Default ARIMA parameter space:

string ARIMA_OPTS.INFO_CRIT = "aicc"        # information criterion to optimize
scalar ARIMA_OPTS.min_p = 0                 # autoregressive (AR) order
scalar ARIMA_OPTS.max_p = 4                 # autoregressive (AR) order
scalar ARIMA_OPTS.min_d = 0                 # differencing order
scalar ARIMA_OPTS.max_d = 2                 # differencing order
scalar ARIMA_OPTS.min_q = 0                 # moving average (MA) order
scalar ARIMA_OPTS.max_q = 1                 # moving average (MA) order

scalar ARIMA_OPTS.min_P = 0                 # seasonal autoregressive (AR) order
scalar ARIMA_OPTS.max_P = 1                 # seasonal autoregressive (AR) order
scalar ARIMA_OPTS.min_D = 0                 # seasonal differencing order
scalar ARIMA_OPTS.max_D = 0                 # seasonal differencing order
scalar ARIMA_OPTS.min_Q = 0                 # seasonal MA order
scalar ARIMA_OPTS.max_Q = 1                 # seasonal MA order
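
The figure of 120 models follows directly from the bounds above: each order contributes `max - min + 1` candidate values, and the grid is the product over all six orders. A minimal hansl sketch of that arithmetic is shown below. It uses plain scalars rather than the `ARIMA_OPTS` bundle, and the helper names `n_nonseasonal`, `n_seasonal` and `n_models` are made up for illustration; they are not part of the auto_arima package.

```hansl
# Bounds copied from the default parameter space listed above
scalar min_p = 0
scalar max_p = 4
scalar min_d = 0
scalar max_d = 2
scalar min_q = 0
scalar max_q = 1
scalar min_P = 0
scalar max_P = 1
scalar min_D = 0
scalar max_D = 0
scalar min_Q = 0
scalar max_Q = 1

# Each order range contributes (max - min + 1) candidate values
scalar n_nonseasonal = (max_p - min_p + 1) * (max_d - min_d + 1) * (max_q - min_q + 1)
scalar n_seasonal = (max_P - min_P + 1) * (max_D - min_D + 1) * (max_Q - min_Q + 1)

# Size of the full grid: 5 * 3 * 2 * 2 * 1 * 2 = 120
scalar n_models = n_nonseasonal * n_seasonal
printf "Candidate ARIMA specifications: %d\n", n_models
```

Widening any single range multiplies the grid size, so the run time of a brute-force search grows quickly with the bounds.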

Forecasting method

We compute out-of-sample multi-period interval forecasts. The multi-period forecast is computed recursively. By default, the 90% (Gaussian) forecast interval is shown as well.
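
To illustrate what "recursively computed" and "Gaussian interval" mean, here is a minimal hansl sketch for a plain AR(1) model. This is a deliberate simplification and not the code used in this repo: the actual models are selected by auto_arima and may include differencing, MA and seasonal terms. All parameter values (`mu`, `phi`, `sigma`, `y_prev`) are hypothetical.

```hansl
# Hypothetical AR(1) model: y_t = mu + phi * y_{t-1} + e_t, e_t ~ N(0, sigma^2)
scalar mu = 5.0
scalar phi = 0.9
scalar sigma = 3.0
scalar y_prev = 100       # last observed value (made up)
scalar horizon = 7        # forecast horizon in days

# Two-sided 90 percent interval: 5 percent in each tail of the standard normal
scalar crit = critical(z, 0.05)

scalar fc_var = 0         # forecast-error variance, built up step by step
loop h = 1..horizon
    # Recursive step: the previous forecast replaces the unknown y_{t-1}
    scalar y_hat = mu + phi * y_prev
    # For an AR(1): Var_h = sigma^2 + phi^2 * Var_{h-1}
    fc_var = sigma^2 + phi^2 * fc_var
    scalar se = sqrt(fc_var)
    printf "h = %d: forecast = %8.2f, interval = [%8.2f, %8.2f]\n", h, y_hat, y_hat - crit * se, y_hat + crit * se
    y_prev = y_hat
endloop
```

The interval widens with the horizon because each step adds new shock variance on top of the uncertainty carried over from the previous step.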

The gretl script

The gretl script for setting up relevant things and executing the analysis is ./script/run.inp.

At the beginning of the script, the user can specify the following parameters:

string DIR_WORK = "" 		# <SET_PATH_HERE> e.g. "/home/git_project"
string DATA_URL = "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_19-covid-Confirmed.csv"	# Data source
string INITIAL_DATE = "2020-01-22"			# 1st available observation of CSSE dataset
scalar MINIMUM_CASES = 30					# Minimal number of confirmed cases at latest observation for consideration
string SAVE_PLOT_AS = "png"					# graphic format: "png", "pdf", "eps"

# ARIMA model settings -- see auto_arima package: http://ricardo.ecn.wfu.edu/gretl/cgi-bin/current_fnfiles/auto_arima.gfn
bundle ARIMA_OPTS = null
scalar ARIMA_OPTS.MAX_HORIZON = 7           # max. multi-step OoS forecast horizon
# optimize by information criteria, either aic, aicc, bic or hqc
string ARIMA_OPTS.INFO_CRIT = "aicc"
scalar ARIMA_OPTS.min_p = 0					# autoregressive (AR) order
scalar ARIMA_OPTS.max_p = 4                 # autoregressive (AR) order
scalar ARIMA_OPTS.min_d = 0                 # differencing order
scalar ARIMA_OPTS.max_d = 2                 # differencing order
scalar ARIMA_OPTS.min_q = 0                 # moving average (MA) order
scalar ARIMA_OPTS.max_q = 1                 # moving average (MA) order

scalar ARIMA_OPTS.min_P = 0                 # seasonal autoregressive (AR) order
scalar ARIMA_OPTS.max_P = 1                 # seasonal autoregressive (AR) order
scalar ARIMA_OPTS.min_D = 0                 # seasonal differencing order
scalar ARIMA_OPTS.max_D = 0                 # seasonal differencing order
scalar ARIMA_OPTS.min_Q = 0                 # seasonal MA order
scalar ARIMA_OPTS.max_Q = 1                 # seasonal MA order

This script also loads the functions that do the main work in the background, which are stored in ./src/helper.inp, as well as the third-party library auto_arima.

The gretl script can be executed in the following ways:

Option A:

1) Clone the repo by means of ```git clone```
2) Open the script "./script/run.inp"
3) Set your project path by setting the variable "DIR_WORK" accordingly.
4) Execute and enjoy.

Option B (works on Linux):

1) Clone the repo by means of ```git clone```
2) Execute the shell-script run.sh

What does the script do?

The script downloads the latest available CSSE data and processes the raw data to obtain a clean panel data set. Next, for each country-province combination, two exercises are conducted:

1) If the parameter ```RUN_EXPOST_ANALYSIS``` is set to '1', an **ex-post** forecasting analysis is done. For this, the training set ends at <CURRENT_DATE - MAX_HORIZON>, where "CURRENT_DATE" refers to the latest date for which data is available and ```MAX_HORIZON``` is the chosen multi-step forecast horizon (default: 7 days).
2) Out-of-sample interval forecasts for the forthcoming ```MAX_HORIZON``` days are computed.
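
The ex-post exercise in step 1 amounts to holding back the last `MAX_HORIZON` observations and comparing them with forecasts made from the remaining data. The hansl sketch below shows only that split, on a made-up series of 20 values; the names `y`, `y_train` and `y_test` are hypothetical and do not appear in ./src/helper.inp.

```hansl
scalar MAX_HORIZON = 7

# Hypothetical series with 20 observations (a column vector 1, 2, ..., 20)
matrix y = seq(1, 20)'
scalar n_obs = rows(y)

# The training sample ends MAX_HORIZON observations before the latest one
scalar n_train = n_obs - MAX_HORIZON
scalar first_test = n_train + 1

matrix y_train = y[1:n_train]
matrix y_test = y[first_test:n_obs]    # realizations to compare forecasts against

printf "training observations: %d, held-out observations: %d\n", rows(y_train), rows(y_test)
```

With 20 observations and a horizon of 7, this leaves 13 observations for estimation and 7 for evaluation.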

Ex-Post and up-to-date out-of-sample 7-days ahead forecasts

The left panel shows the forecasts made with the information available 7 days ago, together with the realized 'confirmed cases' during this period. This may give you an idea of how well the dynamics were forecast.

The middle panel shows forecasts based on the latest data for the forthcoming 7 days. The right panel depicts both the historic and the predicted day-to-day changes in confirmed cases.

afghanistan -

albania -

algeria -

andorra -

angola -

antigua_and_barbuda -

argentina -

armenia -

australia - australian_capital_territory

australia - new_south_wales

australia - northern_territory

australia - queensland

australia - south_australia

australia - tasmania

australia - victoria

australia - western_australia

austria -

azerbaijan -

bahamas -

bahrain -

bangladesh -

barbados -

belarus -

belgium -

belize -

benin -

bhutan -

bolivia -

bosnia_and_herzegovina -

botswana -

brazil -

brunei -

bulgaria -

burkina_faso -

burma -

burundi -

cabo_verde -

cambodia -

cameroon -

canada - alberta

canada - british_columbia

canada - manitoba

canada - new_brunswick

canada - newfoundland_and_labrador

canada - nova_scotia

canada - ontario

canada - prince_edward_island

canada - quebec

canada - saskatchewan

central_african_republic -

chad -

chile -

china - anhui

china - beijing

china - chongqing

china - fujian

china - gansu

china - guangdong

china - guangxi

china - guizhou

china - hainan

china - hebei

china - heilongjiang

china - henan

china - hong_kong

china - hubei

china - hunan

china - inner_mongolia

china - jiangsu

china - jiangxi

china - jilin

china - liaoning

china - macau

china - ningxia

china - shaanxi

china - shandong

china - shanghai

china - shanxi

china - sichuan

china - tianjin

china - xinjiang

china - yunnan

china - zhejiang

colombia -

comoros -

congo_(brazzaville) -

congo_(kinshasa) -

costa_rica -

cote_d'ivoire -

croatia -

cuba -

cyprus -

czechia -

denmark - faroe_islands

denmark -

diamond_princess -

djibouti -

dominica -

dominican_republic -

ecuador -

egypt -

el_salvador -

equatorial_guinea -

eritrea -

estonia -

eswatini -

ethiopia -

fiji -

finland -

france - french_guiana

france - french_polynesia

france - guadeloupe

france - martinique

france - mayotte

france - reunion

france - saint_barthelemy

france - st_martin

france -

gabon -

gambia -

georgia -

germany -

ghana -

greece -

grenada -

guatemala -

guinea -

guinea-bissau -

guyana -

haiti -

honduras -

hungary -

iceland -

india -

indonesia -

iran -

iraq -

ireland -

israel -

italy -

jamaica -

japan -

jordan -

kazakhstan -

kenya -

korea_south -

kosovo -

kuwait -

kyrgyzstan -

latvia -

lebanon -

lesotho -

liberia -

libya -

liechtenstein -

lithuania -

luxembourg -

madagascar -

malawi -

malaysia -

maldives -

mali -

malta -

mauritania -

mauritius -

mexico -

moldova -

monaco -

mongolia -

montenegro -

morocco -

mozambique -

namibia -

nepal -

netherlands - aruba

netherlands - bonaire_sint_eustatius_and_saba

netherlands - curacao

netherlands - sint_maarten

netherlands -

new_zealand -

nicaragua -

niger -

nigeria -

north_macedonia -

norway -

oman -

pakistan -

panama -

papua_new_guinea -

paraguay -

peru -

philippines -

poland -

portugal -

qatar -

romania -

russia -

rwanda -

saint_lucia -

saint_vincent_and_the_grenadines -

san_marino -

sao_tome_and_principe -

saudi_arabia -

senegal -

serbia -

seychelles -

sierra_leone -

singapore -

slovakia -

slovenia -

somalia -

south_africa -

south_sudan -

spain -

sri_lanka -

sudan -

suriname -

sweden -

switzerland -

syria -

taiwan* -

tajikistan -

tanzania -

thailand -

timor-leste -

togo -

trinidad_and_tobago -

tunisia -

turkey -

us -

uganda -

ukraine -

united_arab_emirates -

united_kingdom - bermuda

united_kingdom - british_virgin_islands

united_kingdom - cayman_islands

united_kingdom - channel_islands

united_kingdom - gibraltar

united_kingdom - isle_of_man

united_kingdom - turks_and_caicos_islands

united_kingdom -

uruguay -

uzbekistan -

venezuela -

vietnam -

west_bank_and_gaza -

yemen -

zambia -

zimbabwe -
