---
output: github_document
---
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/",
  out.width = "100%"
)
```
# eppverification: Verification Tools for the Statistical Postprocessing of Ensemble Forecasts <img src='man/figures/sticker.png' align="right" height="150" />
<!-- badges: start -->
[CRAN status](https://CRAN.R-project.org/package=eppverification)
[R-CMD-check](https://github.com/jobstdavid/eppverification/actions/workflows/R-CMD-check.yaml)
[Codecov test coverage](https://codecov.io/gh/jobstdavid/eppverification)
[GitHub](https://github.com/jobstdavid/eppverification)
[DOI](https://doi.org/10.5281/zenodo.5574572)
<!-- badges: end -->
An R package providing user-friendly **univariate and multivariate verification tools** for the statistical post-processing of ensemble forecasts. It allows users to score and assess the **calibration (reliability)** and **sharpness** of ensemble forecasts and predictive distributions. In addition, this package can be used to create **useful contemporary visualizations** for verification.
## Installation
You can install the latest development version from [GitHub](https://github.com/jobstdavid) with:
```r
# install.packages("remotes")
remotes::install_github("jobstdavid/eppverification")
```
## Package overview
The goal of probabilistic forecasting is to maximize the **sharpness** of the probabilistic forecast *F* (CDF) subject to **calibration**. Therefore, this package contains tools for assessing:
* **calibration**. Calibration refers to the statistical consistency between the predictive probabilistic forecast *F* and the associated observation *y*; consequently, it is a joint property of predictions and verifications. The predictive probabilistic forecast *F* is calibrated if the observation *y* cannot be distinguished from a random draw from *F*.
* **sharpness**. Sharpness refers to the concentration of the predictive probabilistic forecast *F* and is a property of *F* alone. The more concentrated (narrower) the predictive probabilistic forecast *F*, the sharper the forecast.
* **calibration and sharpness simultaneously**. Proper scoring rules assess calibration and sharpness of the predictive probabilistic forecast *F* simultaneously. They are functions of the predictive probabilistic forecast *F* and the associated observation *y*, and a smaller score indicates a "better" forecast.
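As a rough illustration of calibration (a minimal sketch; the simulated forecasts are purely illustrative and `pit.hist` is the PIT histogram function shown in the examples below): if the observations actually come from the predictive distribution *F*, the PIT values *F(y)* are uniformly distributed, whereas, for example, an overdispersive forecast produces a hump-shaped PIT histogram.
```r
# illustrative sketch (not part of the package examples below)
library(eppverification)
set.seed(1)
n <- 10000
y <- rnorm(n)                           # observations from N(0, 1)
u.calibrated <- pnorm(y)                # PIT under the correct forecast N(0, 1)
u.overdispersive <- pnorm(y, sd = 2)    # PIT under an overdispersive forecast N(0, 4)
# calibrated forecast: (nearly) flat PIT histogram
pit.hist(u = u.calibrated, bins = 10)
# overdispersive forecast: hump-shaped PIT histogram
pit.hist(u = u.overdispersive, bins = 10)
```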
In the following, you find univariate and multivariate verification tools for calibration, sharpness and proper scoring rules, where the corresponding function provided by this package is given in parentheses.
### Univariate Verification Tools
* **Calibration**: Verification Rank Histogram (`vr.hist`), Reliability Index (`ri`), Entropy (`ent`), PIT Histogram (`pit.hist`), Central Prediction Interval Coverage (`cpi`).
* **Sharpness**: Root Mean Variance (`rmv`), Central Prediction Interval Width (`cpi`).
* **Dispersion**: Variance of PIT-Values (`var.pit`).
* **Proper Scoring Rules**: Continuous Ranked Probability Score (`crps`), Logarithmic Score (`logs`), Interval Score (`is`), Quantile Score (`qs`), Brier Score (`bs`), Dawid-Sebastiani Score (`dss`), Absolute Error (`ae`), Squared Error (`se`).
### Multivariate Verification Tools
* **Calibration**: Multivariate Verification Rank Histogram (`mvr.hist`), Multivariate Reliability Index (`mri`), Multivariate Entropy (`ment`).
* **Sharpness**: Determinant Sharpness (`ds`).
* **Proper Scoring Rules**: Logarithmic Score (`logs`), Energy Score (`es`), Euclidean Error (`ee`), Variogram Score (`vs`).
Further functions for **model comparison and visualizations** in this package are:
* `dm.test`: This function performs a Diebold-Mariano test for two competing forecasts.
* `bh.test`: This function performs a Benjamini-Hochberg correction for different p-values.
* `ss`: This function calculates the skill score of a forecast relative to a reference forecast (see the sketch in the model comparison examples below).
* `cpi.plot`: This function plots the central prediction intervals for a certain interval range.
* `cov.plot`: This function plots the central prediction interval coverage for certain interval ranges of different models.
* `score.plot`: This function plots the scores of different models rated by selected measures as a heatmap.
## Examples
```{r example1}
# load R package
library(eppverification)
# set.seed for reproducibility
set.seed(2023)
```
### Univariate Verification Tools
#### Calibration
```{r example2}
# simulated data
n <- 30
m <- 50
y <- rnorm(n)
x <- matrix(rnorm(n*m), ncol = m)
# Verification Rank Histogram
vr.hist(y = y, x = x, bins = 3, reliability = TRUE, entropy = TRUE)
```
```{r example3}
# simulated data
n <- 10000
u <- runif(n)
# PIT Histogram
pit.hist(u = u, bins = 5, dispersion = TRUE, bias = TRUE)
```
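For a calibrated forecast the PIT values are uniform on [0, 1] and therefore have variance 1/12; deviations from this value indicate over- or underdispersion, which is what `var.pit` quantifies. A minimal manual check on the simulated PIT values from above:
```r
# sketch: compare the empirical PIT variance with the uniform variance 1/12
var(u)
1/12
```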
#### Sharpness
```{r example4}
# Root Mean Variance
rmv(x = x)
```
#### Proper Scoring Rules
```{r example5}
# simulated data
n <- 30
m <- 50
y <- rnorm(n)
x <- matrix(rnorm(n*m), ncol = m)
# Continuous Ranked Probability Score
crps(y = y, x = x, method = "ens", aggregate = mean)
```
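For intuition, the CRPS of an ensemble forecast is commonly estimated as the mean absolute difference between the ensemble members and the observation minus half the mean absolute difference between all pairs of members (Gneiting and Raftery, 2007). The manual computation below for the first forecast case is only a sketch and may deviate from `crps` if a different (e.g. fair) estimator is used:
```r
# sketch: manual ensemble CRPS estimate for the first forecast case,
# i.e. mean(|x_i - y|) - 0.5 * mean(|x_i - x_j|)
mean(abs(x[1, ] - y[1])) - 0.5 * mean(abs(outer(x[1, ], x[1, ], "-")))
```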
```{r example6}
# simulated data
n <- 30
y <- rnorm(n, mean = 1:n)
nominal.coverage <- 90
alpha <- (100-nominal.coverage)/100
lower <- qnorm(alpha/2, rnorm(n, mean = 1:n))
upper <- qnorm((1-alpha/2), rnorm(n, mean = 1:n))
# Central Prediction Interval Values
cpi(y = y, lower = lower, upper = upper, nominal.coverage = nominal.coverage,
    separate = c("is", "overprediction", "underprediction", "width", "coverage"), aggregate = mean)
```
### Multivariate Verification Tools
#### Calibration
```{r example7}
# simulated data
n <- 30
m <- 50
y <- cbind(rnorm(n), rgamma(n, shape = 1))
x <- array(NA, dim = c(m, 2, n))
x[, 1, ] <- rnorm(n*m)
x[, 2, ] <- rgamma(n*m, shape = 1)
# Multivariate Verification Rank Histogram
mvr.hist(y = y, x = x, method = "mv", type = "absolute", bins = 17)
```
#### Sharpness
```{r example8}
# simulated data
n <- 30
m <- 50
x <- array(NA, dim = c(2, 2, n))
for (i in 1:n) {
  x[, , i] <- cov(cbind(rnorm(m), rgamma(m, shape = 1)))
}
# Determinant Sharpness
ds(x = x, covmat = TRUE, aggregate = mean)
```
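The determinant sharpness is commonly defined as DS = det(Σ)^(1/(2d)) for a d-dimensional forecast covariance matrix Σ, so smaller values correspond to sharper forecasts. As a sketch of this definition, the value for the first covariance matrix can be computed by hand:
```r
# sketch: determinant sharpness of the first covariance matrix,
# assuming DS = det(Sigma)^(1 / (2 * d)) with d = 2 dimensions
det(x[, , 1])^(1 / (2 * 2))
```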
#### Proper Scoring Rules
```{r example9}
# simulated data
n <- 30
m <- 50
y <- cbind(rnorm(n), rgamma(n, shape = 1))
x <- array(NA, dim = c(m, 2, n))
x[, 1, ] <- rnorm(n*m)
x[, 2, ] <- rgamma(n*m, shape = 1)
# Energy Score
es(y = y, x = x, method = "ens", aggregate = mean)
# Euclidean Error
ee(y = y, x = x, method = "median", aggregate = mean)
```
### Model Comparison and Visualizations
```{r example10}
# simulated data
n <- 365
s1 <- arima.sim(list(ar = 0.7), n, sd = 0.5)
s2 <- arima.sim(list(ar = 0.7), n, sd = 0.5) - 0.2
p <- runif(n, min = 0, max = 0.05)
# Diebold-Mariano-Test
dm.test(s1, s2, alternative = "two.sided", h = 1)
# Benjamini-Hochberg-Procedure
bh.test(p, alpha = 0.05)
```
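The skill score provided by `ss` is commonly defined as 1 - S_model / S_reference for a negatively oriented score S, so positive values indicate an improvement over the reference. The CRPS-based computation below is only a sketch of that definition with illustrative model names (`x.model`, `x.reference`); the interface of `ss` may differ:
```r
# sketch: CRPS-based skill score of a model against a (worse) reference model,
# assuming the usual definition 1 - S_model / S_reference
n <- 30
m <- 50
y <- rnorm(n)
x.model <- matrix(rnorm(n * m, sd = 1), ncol = m)      # forecast model
x.reference <- matrix(rnorm(n * m, sd = 2), ncol = m)  # reference model
1 - crps(y = y, x = x.model, method = "ens", aggregate = mean) /
  crps(y = y, x = x.reference, method = "ens", aggregate = mean)
```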
```{r example11}
# simulated data
n <- 30
x <- seq(Sys.Date(), by = "day", length.out = n)
y <- rnorm(n, mean = 1:n)
nominal.coverage <- 90
alpha <- (100-nominal.coverage)/100
lower <- qnorm(alpha/2, rnorm(n, mean = 1:n))
upper <- qnorm((1-alpha/2), rnorm(n, mean = 1:n))
# Central Prediction Intervals Plot
cpi.plot(x = x, y = y, lower = lower, upper = upper, nominal.coverage = nominal.coverage, x.lab = "Date", y.lab = "Value", info = TRUE)
```
```{r example12}
# simulated data
n <- 30
x <- matrix(runif(n)*100, ncol = 3)
x <- apply(x, 2, sort)
nominal.coverage <- seq(5, 95, length.out = 10)
models <- c("A", "B", "C")
# Central Prediction Interval Coverage Model Comparison
cov.plot(x = x, models = models, nominal.coverage = nominal.coverage)
```
```{r example13}
# simulated data
x <- matrix(c(0.5, 0.3, 0.8, 0.21, 1.5, 0.7, 2, 1), byrow = TRUE, ncol = 4)
models <- c("A", "B", "C", "D")
measures <- c("CRPS", "LogS")
# Score Plot
score.plot(x = x, models = models, measures = measures)
```
## Contact
Feel free to contact [jobstd@uni-hildesheim.de](mailto:jobstd@uni-hildesheim.de) if you have any questions or suggestions.
## References
Gneiting, T. and Raftery, A. E. (2007). Strictly Proper Scoring Rules, Prediction, and Estimation. Journal of the American Statistical Association, 102(477), 359-378.