-
Notifications
You must be signed in to change notification settings - Fork 0
/
Supplemental_Implementation_Example.Rmd
191 lines (139 loc) · 11 KB
/
Supplemental_Implementation_Example.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
---
title: "Implementation Examples of multi-level meta-analytic models with robust variance estimators and using Bayesian approaches to deal with non-independent effect sizes"
author: Shinichi Nakagawa, Alistair M. Senior, Wolfgang Viechtbauer and Daniel W. A. Noble
date: "`r Sys.Date()`"
bibliography: refs.bib
csl: ecology.csl
output:
bookdown::html_document2:
code_folding: hide
number_sections: no
toc: yes
toc_depth: 6
toc_float: yes
bookdown::word_document2:
toc: no
toc_depth: 6
number_sections: true
reference_docx: template.docx
editor_options:
chunk_output_type: console
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, cache = FALSE, message = FALSE, warning = FALSE, tidy = TRUE, fig.width = 10)
## numbers >= 10^5 will be denoted in scientific notation, ## numbers >= 10^5 will be denoted in scientific notation,
## and rounded to 2 digits ## and rounded to 2 digits
options(digits = 2)
```
```{r klippy, echo=FALSE, include=TRUE}
#install.packages("devtools")
remotes::install_github("rlesur/klippy")
klippy::klippy(tooltip_message = 'Click to Copy Code', tooltip_success = 'Done', position = 'right', color = "red")
```
# Introduction
In this tutorial, we demonstrate how meta-analyst's can implement approaches for correcting inflated Type I error rates when using multilevel models with non-independent effect size data. For ease of presentation, we use simulated data to demonstrate the implementation of a few easy solutions.
# R Packages Required
First, we'll load some of the packages that we'll need.
```{r packages_data,results='hide'}
# Clean workspace
rm(list=ls())
# Loading packages & Functions
#install.packages("pacman")
# Note. To get the latest development versuon of metafor use the follow code
#remotes::install_github("wviechtb/metafor"); library(metafor)
# Load packages
pacman::p_load(tidyverse, MASS, kableExtra, gridExtra, metafor, MCMCglmm, brms, robumeta, clubSandwich, pander, tidyverse)
```
# Simulating Non-independent Effect Size Data
Here, we will simulate some meta-analytic data. We will keep this very simple, just for demonstration purposes. Hence, we will assume that we have collected data from a total of 30 studies, and we'll assume that we were able to extract n = 5 effect sizes from each of these 30 studies. In total, we have a data set that contains n = 150 effect sizes.
```{r simulated data, class.source='klippy'}
# Simulate a dataset composed of 30 papers, each having 5 effects from the same study.
set.seed(123)
# Parameters
no.paper = 30 # Numbers of unique papers
n.effects = 5 # Number of effects per paper. We will keep simple and balanced
rho.e = 0.8 # Correlation in sampling variances
study_id = rep(1:no.paper, each = n.effects) # Study ID's
var_paper = 0.5 # Between-study variance (i.e., tau2)
var_effect = 0.2 # Effect size (or within study) variance
mu = 0.4 # Average, or overall, effect size
rho = c(0.2, 0.4, 0.6, 0.8)
# Add sampling variance
# First, sample
n.sample <- rnbinom(no.paper, mu = 40, size = 0.5) + 4
# Assume logRR. So, sampling variance approximated by the CV.
cv <- sample(seq(0.3, 0.35,by = 0.001), no.paper, replace = TRUE)
# Assuming sample size is the same for all effects within study (pretty sensible)
rep_sample <- rep(n.sample, each = n.effects)
# Sampling variances
vi <- (2*cv^2)/rep_sample
# Create the sampling variance matrix, which accounts for the correlation between effect sizes within study
sigma_vi <- as.matrix(Matrix::bdiag(impute_covariance_matrix(vi = vi, cluster = study_id, r = rho.e)))
# Now create the effect size level covariance matrix. Thsi would simulate a situation where, for example, you have effect sizes on reading and maths that are clearly correlated, but to varying extents across studies
# Create list of matrices, 1 for each study
matrices <- list()
for(i in 1:no.paper){
r <- sample(rho, size = 1, replace = TRUE)
tmp <- matrix(r, nrow = n.effects, ncol = n.effects)
diag(tmp) <- 1
matrices[[i]] <- tmp
}
cor_yi <- as.matrix(Matrix::bdiag(matrices))
cov_yi <- cor_yi * sqrt(var_effect) * sqrt(var_effect)
# Derive data from simulation
yi <- mu + rep(rnorm(no.paper, 0, sqrt(var_paper)), each = n.effects) + mvrnorm(n = 1, mu = rep(0,n.effects*no.paper), cov_yi) + mvrnorm(n = 1, mu = rep(0,n.effects*no.paper), sigma_vi)
# Create the data. We'll just assume that meta-analysts have already derived their effect size and sampling variance
data <- data.frame(study_id = study_id,
yi = yi,
vi = vi,
obs = c(1:(n.effects*no.paper)))
```
Now that we have our simulated (i.e., fake or toy) data, we can demonstrate a few corrections that can be applied to multilevel meta-analytic models that will offset any possible inflated Type I error rates.
# Fit the Multi-level Meta-analytic (MLMA) Model
First, lets just fit our multilevel meta-analytic (MLMA) model. We can do that using our simulated data as follows:
```{r fitmodel, class.source='klippy'}
mod_multilevel = metafor::rma.mv(yi = yi, V = vi, mods = ~1,
random=list(~1|study_id,~1|obs),
data=data, test="t")
summary(mod_multilevel)
```
We have fit a simple MLMA model that estimates the overall meta-analytic mean. We can see that our model estimates this correctly. Remember that the true mean is `r mu`, and we are pretty close to this value (i.e., `r mod_multilevel$beta`). This is also true of our random effect variance estimates. In this case, we know that the MLMA model is ignoring the shared sampling variance dependence in effect sizes. We expect that this 'should' (at least on average) inflate Type I error rates. Assuming we did not know any better we would want to account for this dependence. Below, we describe a few corrections that meta-analysts can apply that should overcome the problems associated with not accounting for effect size dependence.
Before moving on to some useful corrections, users should be aware that the most up-to-date version of `metafor` (version 2.5-101) does now provide users with some protection against Type I errors. Instead of using the number of effect sizes in the calculation of the degrees of freedom we can instead make use of the total numbers of papers instead. We show in our simulations that a "papers-1" degrees of freedom can be fairly good. This can be implemented as follows after installing the development version of `metafor` (see "R Packages Required" above):
```{r fitmodel2, class.source='klippy'}
mod_multilevel_pdf = rma.mv(yi = yi, V = vi, mods = ~1,
random=list(~1|study_id,~1|obs),
data=data, test="t", dfs = "contain")
summary(mod_multilevel_pdf)
```
We can see here that the df for the model has changes from 149 to 29, and the p-value has been adjusted accordingly.
# Correction using Robust Variance Estimator (RVE) with Saitterwaite Degrees of Freedom Correction
A very simple solution is to make use of robust variance estimators (RVE). This can be done in a few packages, but very easily using the `clubSandwich` package [@Pustejovsky2020; @Hedges2010; @Tipon2015], which also works well with `metafor` [@Wolfgang2010] models. This approach also makes use of a Saitterwaite degrees of freedom correction [@SW; @Tipon2015]. This works with `metafor` objects quite elegantly. The benefit of such an approach is simply that we need not make any assumptions about what the correlation between effect sizes actually is (assuming we didn't know the true correlation) [@Hedges2010; @Tipon2015]. In addition, it also will account for possible heteroscedascity. This solution can be implemented as follows using our MLMA model we fit in the above section.
```{r fitrobust, class.source='klippy'}
mod_RVE <- coef_test(mod_multilevel, vcov="CR2", cluster = data$study_id)
print(mod_RVE)
```
With this simple (and well balanced) data, our RVE approaches doesn't change the results much.
# Correction by Applying Bayesian Multi-level Meta-analytic Model
As we describe in our comment, Bayesian approaches, assuming one has a good sample size, do a very good job correcting for inflated Type I errors across a variety of situations. Bayesian MLMA models can be fit in various packages. Probably the most flexible for meta-analyst's are `MCMCglmm` [@Hadfield2010] and `brms` [@Brkner2017; @Brkner2018]. Here, we demonstrate how to fit the same MLMA model using `MCMCglmm` which has a syntax that is different from the typical one used in packages such as `metafor` and `lme4` [@Bates2015], which meta-analysts might be more accustomed too.
```{r bayes, class.source='klippy'}
prior <- list(R = list(V = 1, nu = 0.002),
G = list(G1 = list(V = 1 , nu = 1, alpha.mu=0, alpha.V=25^2)))
bayes_multilevel <- MCMCglmm(yi ~ 1, mev = data$vi, random = ~ study_id, data = data, prior = prior, verbose = FALSE)
summary(bayes_multilevel)
```
As expected, our credible intervals get a little bit wider.
# Correction by Modeling the Entire Sampling Covariance Matrix
Of course, we can also take an approach proposed by @Noble2017, where we fit the covariance matrix directly by simply assuming that effects that come from the same study are correlated by r = 0.5. Ultimately, one could change this correlation, depending on the situation and context, but r = 0.5 will probably suffice in many situations. This assumes, however, that the degree of correlation among effect sizes within a study is the same across studies. This assumption is relaxed in the RVE approaches described above. We can also test whether this is a safe assumption by combining it with a ClubSandwich estimator. We can build the matrix an implement this approach as follows:
```{r VCVmatrix, class.source='klippy'}
vcv <- impute_covariance_matrix(vi = data$vi, cluster = data$study_id, r = 0.5)
mod_multilevel_vcv <- metafor::rma.mv(yi=yi, V = vcv, mods=~1, random=list(~1|study_id,~1|obs), data=data, test="t")
mod_multilevel_vcv <- coef_test(mod_multilevel_vcv, vcov="CR2", cluster = data$study_id)
print(mod_multilevel_vcv)
```
# Conclusions
The goal of our short tutorial was to dispel the idea that overcoming, and implementing, solutions to deal with non-independent effect sizes when working with multi-level meta-analytic models is challenging. Our simulations show [@Nakagawa2021] that there are a number of very easily implemented solutions. As such, meta-analyst's can harness the power of MLMA models without the need to average effect sizes, as suggested by @Song2020.
# Session Info
```{r session, echo=FALSE}
sessionInfo() %>% pander()
```
# References