-
Notifications
You must be signed in to change notification settings - Fork 0
/
12-anova.qmd
435 lines (292 loc) · 24.3 KB
/
12-anova.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
# One-way ANOVA
This chapter marks the beginning of the content you cover in Research Methods 2. You should be familiar with all the content from Research Methods 1 we covered in Chapters 1 to 11, so please revisit previous chapters if you need a refresher.
In the course materials, you have worked through manually calculating an ANOVA to gain a conceptual understanding. However, when you run an ANOVA, typically the computer does all of these calculations for you. In this chapter, we will show you how to run a one-way ANOVA using the <pkg>afex</pkg> package and post-hoc tests using the <pkg>emmeans</pkg> package.
**Chapter Intended Learning Outcomes (ILOs)**
By the end of this chapter, you will be able to:
- Apply and interpret a one-way ANOVA.
- Break down the results of a one-way ANOVA using post-hocs tests and apply a correction for multiple comparisons.
- Conduct a power analysis for a one-way ANOVA.
## Chapter preparation
### Introduction to the data set
For this chapter, we are using open data from experiment 2 in @james_computer_2015. The abstract of their article is:
> Memory of a traumatic event becomes consolidated within hours. Intrusive memories can then flash back repeatedly into the mind's eye and cause distress. We investigated whether reconsolidation - the process during which memories become malleable when recalled - can be blocked using a cognitive task and whether such an approach can reduce these unbidden intrusions. We predicted that reconsolidation of a reactivated visual memory of experimental trauma could be disrupted by engaging in a visuospatial task that would compete for visual working memory resources. We showed that intrusive memories were virtually abolished by playing the computer game Tetris following a memory-reactivation task 24 hr after initial exposure to experimental trauma. Furthermore, both memory reactivation and playing Tetris were required to reduce subsequent intrusions (Experiment 2), consistent with reconsolidation-update mechanisms. A simple, non-invasive cognitive-task procedure administered after emotional memory has already consolidated (i.e., > 24 hours after exposure to experimental trauma) may prevent the recurrence of intrusive memories of those emotional events.
In summary, they were interested in whether you can reduce intrusive memories associated with a traumatic event. Participants were randomly allocated to one of four groups and watched a video designed to be traumatic:
1. Control
2. Reactivation + Tetris
3. Tetris
4. Reactivation
They measured the number of intrusive memories prior to the start of the study, then participants kept a diary to record intrusive memories about the film in the 7 days after watching it. The authors were interested in whether the combination of reactivation and playing Tetris would lead to the largest reduction in intrusive memories. You will recreate their analyses using a one-way ANOVA.
### Organising your files and project for the chapter
Before we can get started, you need to organise your files and project for the chapter, so your working directory is in order.
1. In your folder for research methods and the book `ResearchMethods1_2/Quant_Fundamentals`, create a new folder called `Chapter_12_ANOVA`. Within `Chapter_12_ANOVA`, create two new folders called `data` and `figures`.
2. Create an R Project for `Chapter_12_ANOVA` as an existing directory for your chapter folder. This should now be your working directory.
3. Create a new R Markdown document and give it a sensible title describing the chapter, such as `12 ANOVA`. Delete everything below line 10 so you have a blank file to work with and save the file in your `Chapter_12_ANOVA` folder.
4. We are working with a new data set, so please save the following data file: [James_2015.csv](data/James_2015.csv). Right click the link and select "save link as", or clicking the link will save the files to your Downloads. Make sure that you save the file as ".csv". Save or copy the file to your `data/` folder within `Chapter_12_ANOVA`.
You are now ready to start working on the chapter!
### Activity 1 - Read and wrangle the data
```{r lib_data, echo = FALSE, message=FALSE, warning=FALSE}
# Load packages
library(pwr)
library(effectsize)
library(broom)
library(afex)
library(emmeans)
library(performance)
library(tidyverse)
# Read data and add new column
james_data <- read_csv("data/James_2015.csv") %>%
mutate(PID = row_number(),
Condition = as.factor(Condition)) %>%
select(PID,
Condition,
intrusions = Days_One_to_Seven_Image_Based_Intrusions_in_Intrusion_Diary)
```
As the first activity, try and test yourself by completing the following task list to practice your data wrangling skills. Create a final object called `james_data` to be consistent with the tasks below.
::: {.callout-tip}
#### Try this
To wrangle the data, complete the following tasks:
1. Load the following packages (several of these are new, so revisit [Chapter 1](#install-tidy) if you need a refresher of installing R packages, but remember not to install packages on the university computers / online server):
- <pkg>pwr</pkg>
- <pkg>effectsize</pkg>
- <pkg>broom</pkg>
- <pkg>afex</pkg>
- <pkg>emmeans</pkg>
- <pkg>performance</pkg>
- <pkg>tidyverse</pkg>
2. Read the data file `data/James_2015.csv` to the object name `james_data`.
3. Create a new variable called `PID` that equals `row_number()` to act as a participant ID which is currently missing from the data set.
4. Convert `Condition` to a factor.
5. Select and rename the following three variables as we do not need the others:
- `PID`
- `Condition`
- Rename `Days_One_to_Seven_Image_Based_Intrusions_in_Intrusion_Diary` to `intrusions`.
:::
::: {.callout-caution collapse="true"}
#### Show me the solution
You should have the following in a code chunk:
```{r eval=FALSE}
# Load packages
library(pwr)
library(effectsize)
library(broom)
library(afex)
library(emmeans)
library(performance)
library(tidyverse)
# Read data and add new column
james_data <- read_csv("data/James_2015.csv") %>%
mutate(PID = row_number(),
Condition = as.factor(Condition)) %>%
select(PID,
Condition,
intrusions = Days_One_to_Seven_Image_Based_Intrusions_in_Intrusion_Diary)
```
:::
### Activity 2 - Create summary statistics {#anova-a4}
Next, we want to calculate some descriptive statistics to see some overall trends in the data. We are really interested in the scores from each experimental group rather than overall.
::: {.callout-tip}
#### Try this
- Summarise the data to show the mean, standard deviation, and standard error for the number of intrusive memories (`intrusions`) grouped by `Condition`.
- Your table should have four columns, `Condition`, `mean`, `sd`, and `se`.
Hint: You can calculate the standard error through: `sd/sqrt(n)` or `sd/sqrt(length(some_variable_name)`.
:::
::: {.callout-caution collapse="true"}
#### Show me the solution
You should have the following in a code chunk:
```{r}
james_data %>%
group_by(Condition) %>%
summarise(mean = round(mean(intrusions), 2),
sd = round(sd(intrusions), 2),
se = round(sd/sqrt(length(intrusions)), 2))
```
:::
### Activity 3 - Visualisation {#anova-a5}
Now we can visualise the data. In the original paper they use a bar plot, but let's use a better plot that gives us more information about the data.
::: {.callout-tip}
#### Try this
- Create a violin-boxplot with the number of intrusive memories on the y-axis and condition on the x-axis (See Chapter 7 if you need a reminder).
- Change the labels on the x-axis to something more informative for the condition names.
Your plot should look like this:
```{r echo=FALSE}
james_data %>%
ggplot(aes(x = Condition, y = intrusions))+
geom_violin()+
geom_boxplot(width = .2) +
scale_y_continuous(name = "Number of Intrusions") +
scale_x_discrete(labels = c("Control", "Reactivation + Tetris", "Tetris", "Reactivation")) +
theme_classic()
```
:::
::: {.callout-caution collapse="true"}
#### Show me the solution
You should have the following in a code chunk:
```{r eval=FALSE}
james_data %>%
ggplot(aes(x = Condition, y = intrusions))+
geom_violin()+
geom_boxplot(width = .2) +
scale_y_continuous(name = "Number of Intrusions") +
scale_x_discrete(labels = c("Control", "Reactivation + Tetris", "Tetris", "Reactivation")) +
theme_classic()
```
We can see from this plot that there are a few potential outliers in each of the groups. This information is not present in the bar plot, which is why it's not a good idea to use them for this kind of data.
:::
## One-way ANOVA {#anova-a6}
Now we can run the one-way ANOVA using `aov_ez()` from the <pkg>afex</pkg> package and save it to the object `mod`. As well as running the ANOVA, the `aov_ez()` function also conducts a Levene's test for homogeneity of variance so that we can test our final assumption.
### Activity 4 - Running a one-way ANOVA using afex
`aov_ez()` will likely produce some messages that look like errors, do not worry about these, they are just letting you know what it's done. Run the code below to view the results of the ANOVA.
```{r one-way-anova, message=FALSE, warning=FALSE}
mod <- aov_ez(id = "PID", # the column containing the participant IDs
dv = "intrusions", # the DV
between = "Condition", # the between-subject variable
es = "pes", # sets effect size to partial eta-squared
type = 3, # this affects how the sum of squares is calculated, set this to 3
include_aov = TRUE,
data = james_data)
mod
```
Just like with the t-tests and correlations, we can use `tidy()` to make the output easier to work with. Run the below code to transform the output. Do not worry about the warning message, it is just telling you it does not know how to automatically rename the columns so it will keep the original names.
```{r tidy_aov}
mod_output <- mod$anova_table %>%
tidy()
mod_output
```
* `term` = the IV
* `num.Df` = degrees of freedom effect
* `den.Df` = degrees of freedom residuals
* `MSE` = Mean-squared errors
* `statistic` = F-statistic
* `ges` = effect size
* `p.value` = p.value
You should refer to the lecture for more information on what each variable means and how it is calculated.
- Is the overall effect of Condition significant? `r mcq(c(answer = "Yes", "No"))`
- What is the F-statistics to 2 decimal places? `r fitb("3.79")`
- According to the rules of thumb, the effect size is `r mcq(c("Small", "Medium", answer = "Large"))`
### Activity 5 - Checking assumptions for ANOVA {#anova-a7}
To test the assumptions, we must use the model we created with `aov_ez()`. For a one-way independent ANOVA, the assumptions are the same as for a Student t-test / regression model with a categorical predictor:
1. The DV is interval or ratio data.
2. The observations should be independent.
3. The residuals should be normally distributed.
4. There should be homogeneity of variance between the groups.
We know that 1 and 2 are met because of the design of our study. To test 3, we can look at the qq-plot of the residuals. Instead of saving just the model object, we must specifically select the `aov` component and run our diagnostic plots.
```{r fig-qq}
#| label: fig-qq
#| fig.cap: "qq-plot for model residuals."
plot(mod$aov,
which = 2)
```
The qq-plot shows the assumption of normality might not be ideal. Is this a problem? If the sample sizes for each group are equal, then ANOVA is robust to violations of both normality and of homogeneity of variance. If you are interested, there is a good discussion of these issues in @blanca_effect_2018 and @knief_violating_2021. We can check how many participants are in each condition using `count()`:
```{r count-condition}
james_data %>%
count(Condition)
```
Thankfully, the sample sizes are equal, so we should be OK to proceed with the ANOVA. It is not clear whether normality was checked in the original paper.
For the last assumption, we can test homogeneity of variance by checking the third diagnostic plot for the scale against location.
```{r homogeneity}
#| label: fig-homogeneity
#| fig.cap: "Diagnostic plot for homogeneity of variance."
plot(mod$aov,
which = 3)
```
Compared to normality, this assumption looks closer to being supported as the variance of each group is approximately equal. @james_computer_2015 suspect there might be issues with this assumption as they mention that the ANOVAs do not assume equal variance, however, the results of the ANOVA that are reported are identical to our results above where no correction has been made although the post-hoc tests are Welch t-tests (you can tell this because the degrees of freedom have been adjusted and are not whole numbers).
While all of this might seem very confusing - we imagine you might be wondering what the point of assumption testing is given that it seems to be ignored - we are showing you this for three reasons:
1. To reassure you that sometimes the data can fail to meet the assumptions and it is still ok to use the test. To put this in statistical terms, many tests are **robust** to mild deviations of normality and unequal variance, particularly with equal sample sizes.
2. As a critical thinking point, to remind you that just because a piece of research has been published does not mean it is perfect and you should always evaluate whether the methods used are appropriate.
3. To reinforce the importance of pre-registration where these decisions could be made in advance, and/or open data and code so that analyses can be reproduced exactly to avoid any ambiguity about exactly what was done. In this example, given the equal sample sizes and the difference in variance between the groups isn't too extreme, it looks like it is still appropriate to use an ANOVA but the decisions and justification for those decisions could have been more transparent.
::: {.callout-note}
The `check_model()` function from <pkg>performance</pkg> also works here, so you can see what it looks like if you run:
```{r eval=FALSE}
check_model(mod$aov)
```
:::
### Activity 6 - Post-hoc tests {#anova-a8}
For post-hoc comparisons, the paper appears to have computed Welch t-tests but there is no mention of any multiple comparison correction. We could reproduce these results by using `t.test()` for each of the contrasts.
For example, to compare condition 1 (the control group) with condition 2 (the reactivation plus tetris group) we could run:
```{r t-test}
james_data %>%
filter(Condition %in% c("1", "2")) %>%
droplevels() %>% # ignore unused factor levels
t.test(intrusions ~ Condition,
data = .)
```
::: {.callout-important}
Because `Condition` has four levels, we cannot just specify `intrustion ~ Condition` because a t-test compares two groups and it would not know which of the four to compare so first we have to filter the data and use a new function `droplevels()`. It's important to remember that when it comes to R there are two things to consider, the data you can see and the underlying structure of that data. In the above code we use `filter()` to select only conditions 1 and 2 so that we can compare them. However, that does not change the fact that R "knows" that `Condition` has four levels - it does not matter if two of those levels do not have any observations any more, the underlying structure still says there are four groups. `droplevels()` tells R to remove any unused levels from a factor. Try running the above code but without `droplevels()` and see what happens.
:::
However, a quicker and better way of doing this that allows you apply a correction for multiple comparisons easily is to use `emmeans()` which computes all possible pairwise comparison t-tests and applies a correction to the p-value.
First, we use `emmeans()` to run the comparisons and then we can pull out the contrasts and use `tidy()` to make it easier to work with.
Run the code below. Which conditions are significantly different from each other? Are any of the comparisons different from the ones reported in the paper now that a correction for multiple comparisons has been applied?
```{r post-hoc tests}
mod_pairwise <-emmeans(mod,
pairwise ~ Condition,
adjust = "bonferroni")
mod_contrasts <- mod_pairwise$contrasts %>%
tidy()
mod_contrasts
```
::: {.callout-warning}
The inquisitive among you may have noticed that `mod` is a list of 5 and seemingly contains the same thing three times: `anova_table`, `aov` and `Anova`. The reasons behind the differences are too complex to go into detail on this course (see The R Companion [website here](https://rcompanion.org/rcompanion/d_04.html) for more information) but the simple version is that `anova_table` and `Anova` use one method of calculating the results (type 3 sum of squares) and `aov` uses a different method (type 1 sum of squares). What's important for your purposes is that you need to use `anova_table` to view the overall results (and replicate the results from papers) and `aov`to run the follow-up tests and to get access to the residuals (or `lm()` for factorial ANOVA). As always, precision and attention to detail is key.
:::
### Activity 7 - Power and effect sizes {#anova-a9}
Finally, we can replicate their power analysis using `pwr.anova.test`from the <pkg>pwr</pkg> package.
> On the basis of the effect size of d = 1.14 from Experiment 1, we assumed a large effect size of f = 0.4. A sample size of 18 per condition was required in order to ensure an 80% power to detect this difference at the 5% significance level.
```{r pwr_anova}
pwr.anova.test(k = 4,
f = .4,
sig.level = .05,
power = .8)
```
We have already got the effect size for the overall ANOVA, however, we should also really calculate Cohen's d using `cohens_d()` from <pkg>effectsize</pkg> for each of the pairwise comparisons. This code is a little long because you need to do it separately for each comparison, bind them all together and then add them to `mod_contrasts` - just make sure your understand which bits of the code you would need to change to run this on different data. As we are binding rows and columns rather than joining, it is critical the comparisons are already in the correct order.
```{r}
# Calculate Cohen's d for all comparisons
d_1_2 <- cohens_d(intrusions ~ Condition,
data = filter(james_data,
Condition %in% c(1, 2)) %>%
droplevels())
d_1_3 <- cohens_d(intrusions ~ Condition,
data = filter(james_data,
Condition %in% c(1, 3)) %>%
droplevels())
d_1_4 <- cohens_d(intrusions ~ Condition,
data = filter(james_data,
Condition %in% c(1, 4)) %>%
droplevels())
d_2_3 <- cohens_d(intrusions ~ Condition,
data = filter(james_data,
Condition %in% c(2, 3)) %>%
droplevels())
d_2_4 <- cohens_d(intrusions ~ Condition,
data = filter(james_data,
Condition %in% c(2, 4)) %>%
droplevels())
d_3_4 <- cohens_d(intrusions ~ Condition,
data = filter(james_data,
Condition %in% c(3, 4)) %>%
droplevels())
# Bind all the comparisons in the order of mod_contrasts
pairwise_ds <- bind_rows(d_1_2,
d_1_3,
d_1_4,
d_2_3,
d_2_4,
d_3_4)
# Bind this object to the mod_contrasts object
mod_contrasts <- mod_contrasts %>%
bind_cols(pairwise_ds)
```
::: {.callout-warning}
What are your options if the data do not meet the assumptions and it's really not appropriate to continue with a regular one-way ANOVA? As always, there are multiple options and it is a judgement call.
1. You could run a non-parametric test, the Kruskal-Wallis for between-subject designs and the Friedman test for within-subject designs.
2. If normality is the problem, you could try transforming the data.
3. You could use bootstrapping, which is not something we will cover in this course at all.
:::
## Reporting the results of your ANOVA {#anova-a10}
The below code replicates the write-up in the paper, although has changed the Welch t-test to the pairwise comparisons from `emmeans()`.
```{r writeup1, eval = FALSE}
Second, and critically, for the 7-day diary postintervention, there was a significant difference between groups in overall intrusion frequency in daily life, F(`r mod_output$num.Df`, `r mod_output$den.Df`) = `r mod_output$statistic %>% round(2)`, p = `r mod_output$p.value %>% round(3)`, ηp2 = .`r mod_output$ges %>% round(2)`. Planned comparisons demonstrated that relative to the no-task control group, only those in the reactivation-plus-Tetris group, t(`r mod_contrasts$df[1]`) = `r mod_contrasts$statistic[1] %>% round(2)`, p = `r mod_contrasts$adj.p.value[1] %>% round(2)`, d = `r mod_contrasts$Cohens_d[1] %>% round(2)`, experienced significantly fewer intrusive memories; this finding replicated Experiment 1. The reactivation-plus-Tetris group had significantly fewer intrusive thoughts than the reactivation-only group, t(`r mod_contrasts$df[5]`) = `r mod_contrasts$statistic[5] %>% round(2)`, p = `r mod_contrasts$adj.p.value[5] %>% round(2)`, d = `r mod_contrasts$Cohens_d[5] %>% round(2)`. Further, there were no significant differences between the reactivation-plus-Tetris group and the Tetris-only group, t(`r mod_contrasts$df[4]`) = `r mod_contrasts$statistic[4] %>% round(2)`, p = `r mod_contrasts$adj.p.value[4] %>% round(2)`, d = `r mod_contrasts$Cohens_d[4] %>% round(2)`, the no-task control group and the reactivation-only group, t(`r mod_contrasts$df[3]`) = `r mod_contrasts$statistic[3] %>% round(2)`, p = `r mod_contrasts$adj.p.value[3] %>% round(2)`, or between the no-task control group and the Tetris-only group, t(`r mod_contrasts$df[2]`) = `r mod_contrasts$statistic[2] %>% round(2)`, p = `r mod_contrasts$adj.p.value[2] %>% round(2)`
```
If you add that code to your R Markdown document, knitting it should create the following:
> Second, and critically, for the 7-day diary postintervention, there was a significant difference between groups in overall intrusion frequency in daily life, F(`r mod_output$num.Df`, `r mod_output$den.Df`) = `r mod_output$statistic %>% round(2)`, p = `r mod_output$p.value %>% round(3)`, ηp2 = .`r mod_output$ges %>% round(2)`. Planned comparisons demonstrated that relative to the no-task control group, only those in the reactivation-plus-Tetris group, t(`r mod_contrasts$df[1]`) = `r mod_contrasts$statistic[1] %>% round(2)`, p = `r mod_contrasts$adj.p.value[1] %>% round(2)`, d = `r mod_contrasts$Cohens_d[1] %>% round(2)`, experienced significantly fewer intrusive memories; this finding replicated Experiment 1. Critically, as predicted by reconsolidation theory, the reactivation-plus-Tetris group had significantly fewer intrusive memories than the Tetris-only group, t(`r mod_contrasts$df[4]`) = `r mod_contrasts$statistic[4] %>% round(2)`, p = `r mod_contrasts$adj.p.value[4] %>% round(2)`, d = `r mod_contrasts$Cohens_d[4] %>% round(2)`, as well as the reactivation-only group, t(`r mod_contrasts$df[5]`) = `r mod_contrasts$statistic[5] %>% round(2)`, p = `r mod_contrasts$adj.p.value[5] %>% round(2)`, d = `r mod_contrasts$Cohens_d[5] %>% round(2)`. Further, there were no significant differences between the no-task control group and the reactivation-only group, t(`r mod_contrasts$df[3]`) = `r mod_contrasts$statistic[3] %>% round(2)`, p = `r mod_contrasts$adj.p.value[3] %>% round(2)`, or between the no-task control group and the Tetris-only group, t(`r mod_contrasts$df[2]`) = `r mod_contrasts$statistic[2] %>% round(2)`, p = `r mod_contrasts$adj.p.value[2] %>% round(2)`
## End of chapter
Well done! You have now covered how to run a one-way ANOVA using the <pkg>afex</pkg> package. Linear regression models are great for their flexibility, but it is not always simple for expressing a design with several levels. Combining <pkg>afex</pkg> and <pkg>emmeans</pkg> is a powerful combination when you have categorical independent variables / predictors with several levels.
In the next chapter, we will extend this to when you have multiple independent variables and you want to investigate the interaction between them for how they affect your dependent variable / outcome.