---
output: github_document
---
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/",
  out.width = "100%"
)
```
# eppverification: Verification Tools for the Statistical Postprocessing of Ensemble Forecasts <img src='man/figures/sticker.png' align="right" height="150" />
<!-- badges: start -->
[CRAN status](https://CRAN.R-project.org/package=eppverification)
[R-CMD-check](https://github.com/jobstdavid/eppverification/actions/workflows/R-CMD-check.yaml)
[Codecov test coverage](https://codecov.io/gh/jobstdavid/eppverification)
[GitHub](https://github.com/jobstdavid/eppverification)
[DOI](https://doi.org/10.5281/zenodo.5574572)
<!-- badges: end -->
An R package providing user-friendly **univariate and multivariate verification tools** for the statistical post-processing of ensemble forecasts. It allows users to score and assess the **calibration (reliability)** and **sharpness** of ensemble forecasts and predictive distributions. In addition, this package can be used to create **useful contemporary visualizations** for verification.
## Installation
You can install the latest development version from [GitHub](https://github.com/jobstdavid) with:
```r
# install.packages("remotes")
remotes::install_github("jobstdavid/eppverification")
```
## Package overview
The goal of probabilistic forecasting is to maximize the **sharpness** of the probabilistic forecast *F* (CDF) subject to **calibration**. Therefore, this package contains tools for assessing:
* **calibration**. Calibration refers to the statistical consistency between the predictive probabilistic forecast *F* and the associated observation *y*; consequently, it is a joint property of predictions and verifications. The predictive probabilistic forecast *F* is calibrated if the observation *y* cannot be distinguished from a random draw from *F*.
* **sharpness**. Sharpness refers to the concentration of the predictive probabilistic forecast *F* and is a property of *F* alone. The more concentrated (narrower) the predictive probabilistic forecast *F*, the sharper the forecast.
* **calibration and sharpness simultaneously**. Proper scoring rules assess calibration and sharpness of the predictive probabilistic forecast *F* simultaneously. They are functions of the predictive probabilistic forecast *F* and the associated observation *y*, and a smaller score indicates a "better" forecast.
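As a rough illustration of calibration (a minimal sketch; the simulated forecasts are purely illustrative and `pit.hist` is the PIT histogram function shown in the examples below): if the observations actually come from the predictive distribution *F*, the PIT values *F(y)* are uniformly distributed, whereas, for example, an overdispersive forecast produces a hump-shaped PIT histogram.
```r
# illustrative sketch (not part of the package examples below)
library(eppverification)
set.seed(1)
n <- 10000
y <- rnorm(n)                           # observations from N(0, 1)
u.calibrated <- pnorm(y)                # PIT under the correct forecast N(0, 1)
u.overdispersive <- pnorm(y, sd = 2)    # PIT under an overdispersive forecast N(0, 4)
# calibrated forecast: (nearly) flat PIT histogram
pit.hist(u = u.calibrated, bins = 10)
# overdispersive forecast: hump-shaped PIT histogram
pit.hist(u = u.overdispersive, bins = 10)
```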
In the following, you find univariate and multivariate verification tools for calibration, sharpness and proper scoring rules, where the corresponding function provided by this package is given in parentheses.
### Univariate Verification Tools
* **Calibration**: Verification Rank Histogram (`vr.hist`), Reliability Index (`ri`), Entropy (`ent`), PIT Histogram (`pit.hist`), Central Prediction Interval Coverage (`cpi`).
* **Sharpness**: Root Mean Variance (`rmv`), Central Prediction Interval Width (`cpi`).
* **Dispersion**: Variance of PIT-Values (`var.pit`).
* **Proper Scoring Rules**: Continuous Ranked Probability Score (`crps`), Logarithmic Score (`logs`), Interval Score (`is`), Quantile Score (`qs`), Brier Score (`bs`), Dawid-Sebastiani Score (`dss`), Absolute Error (`ae`), Squared Error (`se`).
### Multivariate Verification Tools
* **Calibration**: Multivariate Verification Rank Histogram (`mvr.hist`), Multivariate Reliability Index (`mri`), Multivariate Entropy (`ment`).
* **Sharpness**: Determinant Sharpness (`ds`).
* **Proper Scoring Rules**: Logarithmic Score (`logs`), Energy Score (`es`), Euclidean Error (`ee`), Variogram Score (`vs`).
Further functions for **model comparison and visualizations** in this package are:
* `dm.test`: This function performs a Diebold-Mariano test for two competing forecasts.
* `bh.test`: This function performs a Benjamini-Hochberg correction for different p-values.
* `ss`: This function calculates the skill score of a forecast relative to a reference forecast (see the sketch in the model comparison examples below).
* `cpi.plot`: This function plots the central prediction intervals for a certain interval range.
* `cov.plot`: This function plots the central prediction interval coverage for certain interval ranges of different models.
* `score.plot`: This function plots the scores of different models rated by selected measures as a heatmap.
## Examples
```{r example1}
# load R package
library(eppverification)
# set.seed for reproducibility
set.seed(2023)
```
### Univariate Verification Tools
#### Calibration
```{r example2}
# simulated data
n <- 30
m <- 50
y <- rnorm(n)
x <- matrix(rnorm(n*m), ncol = m)
# Verification Rank Histogram
vr.hist(y = y, x = x, bins = 3, reliability = TRUE, entropy = TRUE)
```
```{r example3}
# simulated data
n <- 10000
u <- runif(n)
# PIT Histogram
pit.hist(u = u, bins = 5, dispersion = TRUE, bias = TRUE)
```
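For a calibrated forecast the PIT values are uniform on [0, 1] and therefore have variance 1/12; deviations from this value indicate over- or underdispersion, which is what `var.pit` quantifies. A minimal manual check on the simulated PIT values from above:
```r
# sketch: compare the empirical PIT variance with the uniform variance 1/12
var(u)
1/12
```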
#### Sharpness
```{r example4}
# Root Mean Variance
rmv(x = x)
```
#### Proper Scoring Rules
```{r example5}
# simulated data
n <- 30
m <- 50
y <- rnorm(n)
x <- matrix(rnorm(n*m), ncol = m)
# Continuous Ranked Probability Score
crps(y = y, x = x, method = "ens", aggregate = mean)
```
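For intuition, the CRPS of an ensemble forecast is commonly estimated as the mean absolute difference between the ensemble members and the observation minus half the mean absolute difference between all pairs of members (Gneiting and Raftery, 2007). The manual computation below for the first forecast case is only a sketch and may deviate from `crps` if a different (e.g. fair) estimator is used:
```r
# sketch: manual ensemble CRPS estimate for the first forecast case,
# i.e. mean(|x_i - y|) - 0.5 * mean(|x_i - x_j|)
mean(abs(x[1, ] - y[1])) - 0.5 * mean(abs(outer(x[1, ], x[1, ], "-")))
```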
```{r example6}
# simulated data
n <- 30
y <- rnorm(n, mean = 1:n)
nominal.coverage <- 90
alpha <- (100-nominal.coverage)/100
lower <- qnorm(alpha/2, rnorm(n, mean = 1:n))
upper <- qnorm((1-alpha/2), rnorm(n, mean = 1:n))
# Central Prediction Interval Values
cpi(y = y, lower = lower, upper = upper, nominal.coverage = nominal.coverage,
    separate = c("is", "overprediction", "underprediction", "width", "coverage"), aggregate = mean)
```
### Multivariate Verification Tools
#### Calibration
```{r example7}
# simulated data
n <- 30
m <- 50
y <- cbind(rnorm(n), rgamma(n, shape = 1))
x <- array(NA, dim = c(m, 2, n))
x[, 1, ] <- rnorm(n*m)
x[, 2, ] <- rgamma(n*m, shape = 1)
# Multivariate Verification Rank Histogram
mvr.hist(y = y, x = x, method = "mv", type = "absolute", bins = 17)
```
#### Sharpness
```{r example8}
# simulated data
n <- 30
m <- 50
x <- array(NA, dim = c(2, 2, n))
for (i in 1:n) {
  x[, , i] <- cov(cbind(rnorm(m), rgamma(m, shape = 1)))
}
# Determinant Sharpness
ds(x = x, covmat = TRUE, aggregate = mean)
```
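The determinant sharpness is commonly defined as DS = det(Σ)^(1/(2d)) for a d-dimensional forecast covariance matrix Σ, so smaller values correspond to sharper forecasts. As a sketch of this definition, the value for the first covariance matrix can be computed by hand:
```r
# sketch: determinant sharpness of the first covariance matrix,
# assuming DS = det(Sigma)^(1 / (2 * d)) with d = 2 dimensions
det(x[, , 1])^(1 / (2 * 2))
```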
#### Proper Scoring Rules
```{r example9}
# simulated data
n <- 30
m <- 50
y <- cbind(rnorm(n), rgamma(n, shape = 1))
x <- array(NA, dim = c(m, 2, n))
x[, 1, ] <- rnorm(n*m)
x[, 2, ] <- rgamma(n*m, shape = 1)
# Energy Score
es(y = y, x = x, method = "ens", aggregate = mean)
# Euclidean Error
ee(y = y, x = x, method = "median", aggregate = mean)
```
### Model Comparison and Visualizations
```{r example10}
# simulated data
n <- 365
s1 <- arima.sim(list(ar = 0.7), n, sd = 0.5)
s2 <- arima.sim(list(ar = 0.7), n, sd = 0.5) - 0.2
p <- runif(n, min = 0, max = 0.05)
# Diebold-Mariano-Test
dm.test(s1, s2, alternative = "two.sided", h = 1)
# Benjamini-Hochberg-Procedure
bh.test(p, alpha = 0.05)
```
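The skill score provided by `ss` is commonly defined as 1 - S_model / S_reference for a negatively oriented score S, so positive values indicate an improvement over the reference. The CRPS-based computation below is only a sketch of that definition with illustrative model names (`x.model`, `x.reference`); the interface of `ss` may differ:
```r
# sketch: CRPS-based skill score of a model against a (worse) reference model,
# assuming the usual definition 1 - S_model / S_reference
n <- 30
m <- 50
y <- rnorm(n)
x.model <- matrix(rnorm(n * m, sd = 1), ncol = m)      # forecast model
x.reference <- matrix(rnorm(n * m, sd = 2), ncol = m)  # reference model
1 - crps(y = y, x = x.model, method = "ens", aggregate = mean) /
  crps(y = y, x = x.reference, method = "ens", aggregate = mean)
```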
```{r example11}
# simulated data
n <- 30
x <- seq(Sys.Date(), by = "day", length.out = n)
y <- rnorm(n, mean = 1:n)
nominal.coverage <- 90
alpha <- (100-nominal.coverage)/100
lower <- qnorm(alpha/2, rnorm(n, mean = 1:n))
upper <- qnorm((1-alpha/2), rnorm(n, mean = 1:n))
# Central Prediction Intervals Plot
cpi.plot(x = x, y = y, lower = lower, upper = upper, nominal.coverage = nominal.coverage, x.lab = "Date", y.lab = "Value", info = TRUE)
```
```{r example12}
# simulated data
n <- 30
x <- matrix(runif(n)*100, ncol = 3)
x <- apply(x, 2, sort)
nominal.coverage <- seq(5, 95, length.out = 10)
models <- c("A", "B", "C")
# Central Prediction Interval Coverage Model Comparison
cov.plot(x = x, models = models, nominal.coverage = nominal.coverage)
```
```{r example13}
# simulated data
x <- matrix(c(0.5, 0.3, 0.8, 0.21, 1.5, 0.7, 2, 1), byrow = TRUE, ncol = 4)
models <- c("A", "B", "C", "D")
measures <- c("CRPS", "LogS")
# Score Plot
score.plot(x = x, models = models, measures = measures)
```
## Contact
Feel free to contact [jobstd@uni-hildesheim.de](mailto:jobstd@uni-hildesheim.de) if you have any questions or suggestions.
## References
Gneiting, T. and Raftery, A. E. (2007). Strictly Proper Scoring Rules, Prediction, and Estimation. Journal of the American Statistical Association, 102(477), 359-378.