---
title: "README"
bibliography: inst/REFERENCES.bib
csl: inst/apa.csl
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r include = FALSE, eval = FALSE}
library(RoBSA)
data(cancer, package = "survival")
priors <- calibrate_quartiles(median_t = 5, iq_range_t = 10, prior_sd = 0.5)
df <- data.frame(
time = veteran$time / 12,
status = veteran$status,
treatment = factor(ifelse(veteran$trt == 1, "standard", "new"), levels = c("standard", "new")),
karno_scaled = veteran$karno / 100
)
RoBSA.options(check_scaling = FALSE)
fit.test <- RoBSA(
Surv(time, status) ~ treatment + karno_scaled,
data = df,
priors = list(
treatment = prior_factor("normal", parameters = list(mean = 0.30, sd = 0.15),
truncation = list(0, Inf), contrast = "treatment"),
karno_scaled = prior("normal", parameters = list(mean = 0, sd = 1))
),
test_predictors = "treatment",
prior_intercept = priors[["intercept"]],
prior_aux = priors[["aux"]],
parallel = TRUE, seed = 1
)
fit.est <- RoBSA(
Surv(time, status) ~ treatment + karno_scaled,
data = df,
priors = list(
treatment = prior_factor("normal", parameters = list(mean = 0, sd = 1),
contrast = "treatment"),
karno_scaled = prior("normal", parameters = list(mean = 0, sd = 1))
),
test_predictors = "",
prior_intercept = priors[["intercept"]],
prior_aux = priors[["aux"]],
parallel = TRUE, seed = 1
)
saveRDS(fit.test, file = "models/README/fit.test.RDS")
saveRDS(fit.est, file = "models/README/fit.est.RDS")
```
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
dev = "png"
)
if(.Platform$OS.type == "windows"){
knitr::opts_chunk$set(dev.args = list(type = "cairo"))
}
# we pre-load the fitted models; fitting takes around 5 minutes
fit.test <- readRDS(file = "models/README/fit.test.RDS")
fit.est <- readRDS(file = "models/README/fit.est.RDS")
predicted_standard <- readRDS(file = "models/README/predicted_standard.RDS")
predicted_experimental <- readRDS(file = "models/README/predicted_experimental.test.RDS")
```
<!-- badges: start -->
[![R-CRAN-check](https://github.com/FBartos/RoBSA/workflows/R-CMD-check/badge.svg)](https://github.com/FBartos/RoBSA/actions)
[![R-tests](https://github.com/FBartos/RoBSA/workflows/R-CMD-tests/badge.svg)](https://github.com/FBartos/RoBSA/actions)
[![Codecov test coverage](https://codecov.io/gh/FBartos/RoBSA/branch/master/graph/badge.svg)](https://app.codecov.io/gh/FBartos/RoBSA?branch=master)
[![CRAN status](https://www.r-pkg.org/badges/version/RoBSA)](https://CRAN.R-project.org/package=RoBSA)
<!-- badges: end -->
# Robust Bayesian Survival Analysis (RoBSA)
This package estimates an ensemble of parametric survival models with different parametric families and uses Bayesian model averaging to combine them. The RoBSA ensemble uses Bayes factors to test for the presence or absence of effects of the individual predictors and evaluates the support for each parametric family. The resulting model-averaged parameter estimates are based on posterior model probabilities. The user can define a wide range of prior distributions for the effect size, intercepts, and auxiliary parameters. The package provides convenient functions for summary, visualizations, and fit diagnostics.
See @bartos2021informed (https://doi.org/10.1186/s12874-022-01676-9) for an introduction to the methodology.
## Installation
The package requires [JAGS 4.3.1](https://mcmc-jags.sourceforge.io/) to be installed. The development version of the package can be installed from GitHub:
``` r
devtools::install_github("FBartos/RoBSA")
```
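Alternatively, the latest release can be installed from CRAN (a standard install call, assuming the CRAN release indicated by the badge above):
``` r
# install the latest CRAN release of the package
install.packages("RoBSA")
```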
## Example
To illustrate the package's functionality, we use the `veteran` data set from the `survival` package containing 137 survival times of patients from a randomized trial of two treatment regimens for lung cancer. We start by loading the package and data set.
```{r}
library("RoBSA")
data(cancer, package = "survival")
head(veteran)
```
Before we fit the RoBSA ensemble with five accelerated failure time parametric families (exponential, Weibull, log-normal, log-logistic, and gamma), we must specify prior distributions for the intercepts and auxiliary parameters (governing the scales and shapes) of the competing parametric families. Here, we expect the median survival time in the standard treatment group to be 5 years with an interquartile range of 10 years. We set the standard deviation of the prior distributions to 0.5, which provides a satisfactory degree of uncertainty about the parameter values. We obtain the corresponding prior distributions with the `calibrate_quartiles()` function and then print the resulting list of priors for the intercepts and auxiliary parameters.
```{r}
priors <- calibrate_quartiles(median_t = 5, iq_range_t = 10, prior_sd = 0.5)
priors
```
We create a new data.frame containing a fit-ready data set. We (1) transform the survival times to years, (2) code the treatment variable as a factor with `"standard"` as the reference level (so the intercept of the model corresponds to the standard treatment and the treatment coefficient to the improvement under the `"new"` treatment), and (3) scale the Karnofsky performance score (karno) to range from 0 to 1 (i.e., the coefficient estimate corresponds to the largest possible difference).
```{r}
df <- data.frame(
time = veteran$time / 12,
status = veteran$status,
treatment = factor(ifelse(veteran$trt == 1, "standard", "new"), levels = c("standard", "new")),
karno_scaled = veteran$karno / 100
)
head(df)
```
### Hypothesis Testing
We proceed by specifying a RoBSA model intended to test an informed hypothesis of a positive treatment effect centered around a log(AF) of 0.30 with a standard deviation of 0.15, via the informed prior distribution $\text{Normal}_+(0.30, 0.15)$ on the treatment effect. To set a prior distribution on a factor, we use the `prior_factor()` function, which allows us to specify the type of contrast. Here, we use the `"treatment"` contrast to estimate differences from the reference level (we could also use `"orthonormal"` to test for differences of the individual levels from the grand mean). Furthermore, we adjust for the Karnofsky performance score by setting a standard normal prior distribution $\text{Normal}(0, 1)$ on its coefficient. To test only for the presence of the treatment effect and include the covariate in all models, we set `test_predictors = "treatment"`. Finally, we pass the appropriate prior distributions to the `priors`, `prior_intercept`, and `prior_aux` arguments, set `seed = 1` for reproducibility, and use `parallel = TRUE` to speed up the computation.
```r
fit.test <- RoBSA(
Surv(time, status) ~ treatment + karno_scaled,
data = df,
priors = list(
treatment = prior_factor("normal", parameters = list(mean = 0.30, sd = 0.15),
truncation = list(0, Inf), contrast = "treatment"),
karno_scaled = prior("normal", parameters = list(mean = 0, sd = 1))
),
test_predictors = "treatment",
prior_intercept = priors[["intercept"]],
prior_aux = priors[["aux"]],
parallel = TRUE, seed = 1
)
```
The `summary()` function provides the main summary of the fitted model.
```{r}
summary(fit.test)
```
In the first table, we see that most of the posterior model probability is retained by the log-logistic (0.499), exponential (0.321), and log-normal (0.149) families, with the inclusion Bayes factors quantifying the change from prior to posterior model probabilities.
The second table then summarizes the hypothesis tests of the model components. Here, in the RoBSA ensemble intended for testing, we are interested in the inclusion Bayes factor for the treatment effect. We find an inclusion Bayes factor of 0.148, indicating moderate evidence in favor of no treatment effect (1/0.148 = 6.76) in comparison to our informed hypothesis of a positive treatment effect.
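If we want to look at the individual models behind these ensemble results, a per-model summary can be requested along the following lines (a sketch; the `type = "models"` option of `summary()` is an assumption, analogous to the `type = "diagnostics"` option used later):
```r
# per-model overview: parametric family, prior and posterior model probabilities,
# and inclusion Bayes factors
# (assumes summary() accepts type = "models"; see the package documentation for details)
summary(fit.test, type = "models")
```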
### Parameter Estimation
Now we estimate the model-averaged difference between the two treatments. We change the prior distribution from the informed positive treatment effect to a neutral standard normal prior distribution that allows for both positive and negative treatment effects. We further set `test_predictors = ""` to omit models assuming the absence of the treatment effect.
```r
fit.est <- RoBSA(
Surv(time, status) ~ treatment + karno_scaled,
data = df,
priors = list(
treatment = prior_factor("normal", parameters = list(mean = 0, sd = 1),
contrast = "treatment"),
karno_scaled = prior("normal", parameters = list(mean = 0, sd = 1))
),
test_predictors = "",
prior_intercept = priors[["intercept"]],
prior_aux = priors[["aux"]],
parallel = TRUE, seed = 1
)
```
```{r}
summary(fit.est)
```
We again use the `summary()` function to obtain information about the fitted model. We find that the experimental treatment led to notably shorter survival times, with a mean model-averaged log(AF) = -0.19, 95% CI [-0.56, -0.19]. Furthermore, we observe an enormous effect of the scaled Karnofsky performance score, showing that moving from 0 to 1 increases the survival times, with a mean model-averaged log(AF) = 2.54, 95% CI [1.72, 3.34].
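The model-averaged posterior distribution of the treatment effect can also be inspected visually, for example along these lines (a sketch; the `parameter` argument name of the `plot()` method is an assumption):
```r
# plot the model-averaged posterior distribution of the treatment effect
# (the `parameter` argument name is assumed; see the package documentation for the exact interface)
plot(fit.est, parameter = "treatment")
```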
### MCMC Diagnostics
To assess the convergence of the MCMC chains, we can use the `type = "diagnostics"` argument of the `summary()` function. The resulting table shows the maximum MCMC error, the minimum effective sample size (ESS), and the maximum R-hat for each of the models.
```{r}
summary(fit.est, type = "diagnostics")
```
We find that while the R-hat and the maximum MCMC error are satisfactory for all models in the RoBSA estimation ensemble, we might wish for a larger ESS. To achieve that, we would simply increase the number of sampling MCMC iterations via the `iter` argument of the `RoBSA()` function. Furthermore, more detailed diagnostics, such as trace plots, autocorrelation plots, and density plots for each parameter/model, are provided by the `diagnostics_trace()`, `diagnostics_autocorrelation()`, and `diagnostics_density()` functions.
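For example, the chain-level diagnostics for the treatment coefficient could be requested along these lines (a sketch; the `parameter` argument name is an assumption):
```r
# trace and autocorrelation plots for the treatment coefficient across the individual models
# (the `parameter` argument name is assumed; see the package documentation for details)
diagnostics_trace(fit.est, parameter = "treatment")
diagnostics_autocorrelation(fit.est, parameter = "treatment")
```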
### Visualizing Predicted Survival
We can also visualize the model ensemble predictions for survival or hazard. Here, we visualize the model-averaged survival for each treatment group via the `plot_survival()` function with the `predictor = "treatment"` argument. By default, the remaining predictors are set to their mean/default level, but predictions for other values can be obtained by specifying the `covariates_data` argument.
```{r predicted_survival, out.width = "80%", fig.align = "center"}
plot_survival(fit.est, predictor = "treatment")
```
In line with the previous summary, we see that the model-averaged survival of the new treatment group lies below the estimated survival of the standard treatment group. Similarly, we could also visualize the model-averaged hazard via `plot_hazard()` or obtain numerical estimates of the survival, hazard, density, mean survival, and standard deviation of the survival at different levels of detail via the `predict()` function.
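For instance, the model-averaged hazard functions for the two treatment groups could be plotted as follows (a sketch assuming `plot_hazard()` shares the `predictor` interface of `plot_survival()`):
```r
# model-averaged hazard functions by treatment group
# (assumes plot_hazard() takes the same `predictor` argument as plot_survival())
plot_hazard(fit.est, predictor = "treatment")
```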
### References