-
Notifications
You must be signed in to change notification settings - Fork 0
/
piglet-cpb.qmd
507 lines (377 loc) · 30.9 KB
/
piglet-cpb.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
---
title: "Piglet cardiopulmonary bypass induces intestinal dysbiosis and barrier dysfunction associated with systemic inflammation"
author: "Jeffrey D. Salomon, **Haowen Qiu**, Dan Feng, Jacob Owens, Ludmila Khailova, Suzanne Osorio Lujan, John Iguidbashian, Yashpal S. Chhonker, Daryl J. Murry, Jean Jack Riethoven, Merry L. Lindsey, Amar B. Singh, Jesse A. Davidson"
date: "2022-11-25"
abstract: "The intestinal microbiome is essential to human health and homeostasis and is implicated in the pathophysiology of disease, including congenital heart disease and cardiac surgery. Improving the microbiome and reducing inflammatory metabolites may reduce systemic inflammation following cardiac surgery with cardiopulmonary bypass (CPB) to expedite recovery post-operatively. Limited research exists in this area and identifying animal models that can replicate changes in the human intestinal microbiome after CPB are necessary. We used a piglet model of CPB with 2 groups, CPB (n=5) and a control group with mechanical ventilation (n=7) to evaluate changes to the microbiome, intestinal barrier dysfunction, and intestinal metabolites with inflammation after CPB. We identified significant changes to the microbiome, barrier dysfunction, intestinal short chain fatty acids and eicosanoids, and elevate cytokines in the CPB/DHCA group compared to the control group at just four hours after intervention. This piglet model of CPB replicates known human changes to the intestinal flora and metabolite profile and can be used to evaluate gut interventions aimed at reducing downstream inflammation after cardiac surgery with CPB."
doi: "https://doi.org/10.1242/dmm.049742"
execute:
echo: true
#cache: true
format:
html:
toc: true
toc-location: left
reference-location: document
cold-fold: true
theme: flatly
self-contained: true
#cache: true
comments:
hypothesis: true
---
```{r message=FALSE, warning=FALSE, results='hide'}
suppressPackageStartupMessages(c(
library("tidyverse"),
library("knitr"),
library("phyloseq"),
library("vegan"),
library("microbiome"),
library("ANCOMBC"),
library("corncob"),
library("egg"),
library("ggpubr"),
library("cowplot"),
library("mixOmics"),
library("picante"),
library("ade4"),
library("ggsci"),
library("energy"),
library("grid")
))
# get start time
start_time <- Sys.time()
load("Salomon_piglet_ps.rda")
```
```{r message = FALSE, warning = FALSE, echo=FALSE}
script_folder = "../scripts/"
#source(paste(script_folder, "visual_functions.R", sep = ""))
source(paste(script_folder, "qidong_r_functions.R", sep = ""))
source(paste(script_folder, "mediation_util_functions.R", sep = ""))
```
## Project and data background
Many animal models have been useful in evaluating cardiopulmonary bypass (CPB) and a variety of cardiovascular, kidney, respiratory and neurologic outcomes (Davidson et al., 2019; Grocott et al., 1999; Hubert et al., 2003; Jungwirth and De Lange, 2010; Madrahimov et al., 2018). No animal models of CPB, however, have been utilized to evaluate changes to the microbiome, intestinal barrier dysfunction or intestinal eicosanoids. As the importance of the microbiome grows, animal models are crucial to evaluate the microbiome and factors contributing to post-surgical inflammation to identify potential therapeutic interventions. In this article, we used a model of piglet CPB/DHCA to evaluate the microbiome, intestinal EBD, SCFAs and eicosanoids. We hypothesized that the CPB/DHCA group would experience microbial and metabolite derangements along with EBD and systemic inflammation compared to controls.
A total of 12 piglets were used in this study, five in the CPB/DHCA group and seven in the control group receiving mechanical ventilation only. Sequences of 16S rRNA amplicon libraries generated using fecal microbial DNA resulted in a total of 12.6 million reads. Sequences were demultiplexed using Illumina software (MiSeq Control Software version 2.6) according to the manufacturer's guidelines. After the demultiplexing step, the bioinformatics analyses were performed following the Bioconductor workflow for microbiome data analysis (Callahan et al., 2016b) using R software (version 4.0).
## Abundance and diversity
For denoising, the R package DADA2 (version 1.18.0) (Callahan et al., 2016a) was used with the following conditions: the forward reads were truncated at position 280 and their first 17 nucleotides were trimmed, whereas the reverse ones were truncated at the position 250 and their first 21 nucleotides were trimmed, to discard positions for which nucleotide median quality was Q25 or below. High-quality sequencing reads were clustered to infer amplicon sequence variants (ASVs), and a final table of ASV counts per sample was generated after removing chimeras.In addition, a naïve Bayes taxonomy classifier (Wang et al., 2007) was used to classify each ASV against the SILVA 138.1 reference database to construct the taxonomy table, and MAFFT (version 7.407) (Katoh et al., 2002) and FASTTREE (version 2.1.11) (Price et al., 2009) programs were used to construct a phylogenetic tree.
The ASV table, taxonomy table, sample metadata, and phylogenetic tree are used to construct a `phyloseq` object for further analysis and statistical inference. A few screening and filtering steps before final phyloseq object is ready for diversity and statistical analyses.
```{r message=FALSE, warning=FALSE}
piglet_meta = piglet_meta %>%
mutate(combo = case_when(
group == "CNT" & time == "pre" ~ "Control-Pre",
group == "CNT" & time == "post" ~ "Control-Post",
group == "CPB" & time == "pre" ~ "CPB-Pre",
group == "CPB" & time == "post" ~ "CPB-Post"
)) %>%
mutate(combo = factor(combo, levels = c("Control-Pre", "Control-Post", "CPB-Pre", "CPB-Post")))
sample_data(ps) = piglet_meta
piglet_meta = piglet_meta %>%
mutate(new_ID = case_when(ID == 6 & group == "CNT" ~ "CNT1",
ID == 20 & group == "CNT" ~ "CNT2",
ID == 21 & group == "CNT" ~ "CNT3",
ID == 44 & group == "CNT" ~ "CNT4",
ID == 50 & group == "CNT" ~ "CNT5",
ID == 54 & group == "CNT" ~ "CNT6",
ID == 59 & group == "CNT" ~ "CNT7",
ID == 9 & group == "CPB" ~ "CPB1",
ID == 12 & group == "CPB" ~ "CPB2",
ID == 14 & group == "CPB" ~ "CPB3",
ID == 23 & group == "CPB" ~ "CPB4",
ID == 48 & group == "CPB" ~ "CPB5"))
piglet_meta = piglet_meta %>%
mutate(new_name = paste(new_ID, time, sep = ""))
sample_data(ps) = piglet_meta
```
* Step 1, remove Archaea, unknowns, chloroplasts;
```{r warning=FALSE, message=FALSE}
#sample_data(ps) = piglet_meta
# Remove Archaea, unknowns, chloroplasts
ps.clean <- subset_taxa(ps, Kingdom == "Bacteria") %>%
subset_taxa(!is.na(Phylum)) %>%
subset_taxa(!Class %in% c("Chloroplast")) %>%
subset_taxa(!Family %in% c("mitochondria"))
ps.clean
```
* Step 2, filter taxa (ASV) with low abundance (< 2);
```{r warning=FALSE, message=FALSE}
# Remove very low abundance ASV
ps.clean.p0 <- filter_taxa(ps.clean, function (x) {sum(x > 0) >= 2}, prune=TRUE)
ps.clean.p0
```
The relative abundance plot shows the taxonomic distribution in each group at the phylum and genus levels (Fig. 1). Although the Firmicutes and Bacteroidota phyla dominated both groups, the CPB/DHCA group trended towards a reduced number of different species in the post-operative samples compared to those in the pre-operative samples. The bacterial richness, indicated by the number of distinct operational taxonomic units (OTUs) identified in each sample, was not significantly different between the two groups (Fig. 2A). Phylogenic diversity showed a significant difference between the control and CPB/DHCA group in the post-operative time point (P=0.018, Fig. 2B). There was a trend towards significant reduction in α-diversity between the pre- and post-operative samples in the CPB/DHCA group (P=0.095).
```{r message=FALSE, warning=FALSE, fig.dim=c(12, 10)}
#| fig-cap: Fig.1 Relative bacterial abundance between CPB/DHCA group and controls.(A) Bacteria at the phylum level in each sample pre- and post-surgery for the control group and the CPB/DHCA group. There is a slightly larger increase in the amounts of Proteobacteria in the CPB/DHCA group pre-operative to post-operative samples compared to the control group. (B) Bacteria at the genus level. The legend identifies SCFA-producing organisms, which are reduced in the CPB/DHCA group post-operative samples compared to the control group post-operative samples. CPB, cardiopulmonary bypass; SCFA, short-chain fatty acid.
#relative abundance for all sample
ps.clean.re <- transform_sample_counts(ps.clean.p0, function(x) x / sum(x))
ps.clean.re.df <- psmelt(ps.clean.re)
ps.clean.re.df = ps.clean.re.df %>%
mutate(combo = case_when(
group == "CNT" & time == "pre" ~ "Control-Pre",
group == "CNT" & time == "post" ~ "Control-Post",
group == "CPB" & time == "pre" ~ "CPB-Pre",
group == "CPB" & time == "post" ~ "CPB-Post"
)) %>%
mutate(combo = factor(combo, levels = c("Control-Pre", "Control-Post", "CPB-Pre", "CPB-Post")))
legend_list = c("Lachnospiraceae NK3A20 group", "Campylobacter", "Holdemanella", "Candidatus Saccharimonas", "Selenomonas", "Fibrobacter", "Eisenbergiella", "[Eubacterium] ventriosum group")
# Using relative abundance data
p = plotBars2(ps.clean.re.df, x = "new_name", fill = "Phylum", title = NULL, xlab = NULL, ylab = "Phylum Relative Abundance", legend = TRUE) +
facet_wrap(~combo, scales = "free_x", nrow = 1)
q = plotBars3(ps.clean.re.df, x = "new_name", fill = "Genus", title = NULL, xlab = NULL, ylab = "Genus Relative Abundance", legend = TRUE, legend_list = legend_list) +
facet_wrap(~combo, scales = "free_x", nrow = 1)
egg::ggarrange(p,q, ncol = 1, labels = c("A", "B"), label.args = list(gp = grid::gpar(face = "plain")))
```
Overall β-diversity (i.e. inter-subject differences in community composition) was visualized using principal coordinates analysis (PCoA) and evaluated statistically with permutational multivariate ANOVA (PERMANOVA). There were significant differences in β-diversity between groups using all four distance matrices (Bray–Curtis, P=0.017; Jaccard, P=0.003; UniFrac, P=0.018; and weighted UniFrac, P=0.017). β-diversity using UniFrac distance matrix also showed a trend toward significance after first blocking by time point (Fig. 2C), suggesting that the group undergoing CPB/DHCA is the dominant factor driving microbiome community dissimilarities.
```{r warning=FALSE, message=FALSE}
# plug-in alpha diversity
samp = data.frame(data.frame(phyloseq::otu_table(ps.clean.p0))) %>%
rename_all(~stringr::str_replace(.,"^X",""))
tree = phyloseq::phy_tree(ps.clean.p0)
adiv <- data.frame(
phyloseq::estimate_richness(ps.clean.p0, measures = c("Observed", "Shannon", "Chao1", "Simpson", "InvSimpson", "Fisher")),
"PD" = picante::pd(samp, tree, include.root=FALSE)[,1],
dplyr::select(as.tibble(phyloseq::sample_data(ps.clean.p0)), group, time, ID ) ) %>%
dplyr::select(-se.chao1)
adiv = adiv %>%
mutate(combo = case_when(
group == "CNT" & time == "pre" ~ "pre_cnt",
group == "CNT" & time == "post" ~ "post_cnt",
group == "CPB" & time == "pre" ~ "pre_cpb",
group == "CPB" & time == "post" ~ "post_cpb"
)) %>%
mutate(combo = factor(combo, levels = c("pre_cnt", "post_cnt", "pre_cpb", "post_cpb")))
my_comparisons = list( c("pre_cnt", "post_cnt"), c("pre_cpb", "post_cpb"), c("pre_cnt", "pre_cpb"), c("post_cnt", "post_cpb") )
alpha_1 = ggplot(adiv, aes(x = combo, y = Observed, color = group, shape = time)) + geom_point(na.rm = TRUE, size = 2) +
labs( x = NULL, y = "Observed OTUs") +
scale_color_manual(values = c("CPB" = "#0000FF", "CNT" = "#800080")) +
scale_x_discrete(labels = c("Control-Pre", "Control-Post", "CPB-Pre", "CPB-Post")) +
stat_compare_means(comparisons = my_comparisons) +
theme_classic()+ theme(legend.position="none")
#alpha_1
alpha_2 = ggplot(adiv, aes(x = combo, y = PD, color = group, shape = time)) + geom_point(na.rm = TRUE, size = 2) +
labs( x = NULL, y = "Phylogenetic diversity (PD)") +
scale_color_manual(values = c("CPB" = "#0000FF", "CNT" = "#800080")) +
scale_x_discrete(labels = c("Control-Pre", "Control-Post", "CPB-Pre", "CPB-Post")) +
stat_compare_means(comparisons = my_comparisons) +
theme_classic()+ theme(legend.position="none")
#alpha_2
```
```{r warning=FALSE, message=FALSE}
# beta_diversity
#ordBC <- ordinate(ps.beta, "PCoA", "bray")
#ordJC <- ordinate(ps.beta, "PCoA", "jaccard")
ordUF <- ordinate(ps.clean.p0, "PCoA", "unifrac")
#ordwUF <- ordinate(ps.beta, "PCoA", "wunifrac")
#beta = plot_ordination(ps.clean.p0, ordUF, color = sample_data(ps.clean.p0)$group) # to get PCoA1% and PCoA2%
smpID <- sample_data(ps.clean.p0)$sample
df <- rbind(data.frame(ordUF$vectors[,1:4], sample = smpID,method = 'unifrac'))
#df <- rbind(data.frame(ordBC$vectors[,1:4], sample = smpID, method = 'BC'),
# data.frame(ordJC$vectors[,1:4], sample = smpID,method = 'Jaccard'),
# data.frame(ordUF$vectors[,1:4], sample = smpID,method = 'unifrac'),
# data.frame(ordwUF$vectors[,1:4], sample = smpID,method = 'wunifrac'))
# add sample_data info
df <- merge(df, data.frame(sample_data(ps.clean.p0)), by = 'sample') %>%
mutate(combo = case_when(
group == "CNT" & time == "pre" ~ "pre_cnt",
group == "CNT" & time == "post" ~ "post_cnt",
group == "CPB" & time == "pre" ~ "pre_cpb",
group == "CPB" & time == "post" ~ "post_cpb"
)) %>%
mutate(combo = factor(combo, levels = c("pre_cnt", "post_cnt", "pre_cpb", "post_cpb")))
beta = ggplot(data = df, aes(Axis.1,Axis.2, color = group, shape = time) ) +
geom_point(size = 2) +
scale_color_manual(values = c("CPB" = "#0000FF", "CNT" = "#800080")) +
stat_ellipse(aes_(group = df$combo)) +
labs( x = "PCoA 1 [21.2%]", y = "PCoA 2 [15.7%]") +
theme_classic()
#beta
```
```{r message=FALSE, warning=FALSE,fig.dim=c(12,5)}
#| fig-cap: Fig.2 α- and β-diversity plots in CPB/DHCA group and controls. (A) Observed operational taxonomic units (OTUs) in the CPB/DHCA group compared to the control group. There were no statistically significant differences in the total number of bacteria present between the two groups. (B) Phylogenetic diversity between the CPB/DHCA group and the control group. There was a significant decrease in phylogenetic diversity in the CPB post-operative samples compared to the control post-operative samples. (C) β-diversity via UniFrac distance matrix. There was a statistically significant difference in the β-diversity in the CPB group compared to the control group. The numbers indicate P-values using unpaired Wilcoxon rank sum test. PCoA, principal coordinates analysis.
cowplot::plot_grid(alpha_1, alpha_2, beta, labels = c("A", "B", "C"), nrow = 1)
```
## Differential abundance analysis using ANCOMBC and corncob
To identify specific taxonomic variations associated with group (CPB/DHCA versus control), differential abundance analyses were performed that identified multiple groups of organisms at the genus, family and phylum level with significant abundance differences between the CPB/DHCA group and the controls. At the genus level, SCFA-producing organisms, such as Fibrobacter, Eisenbergiella, Campylobacter, Lachnispiraceae NK3A20 group and the Eubacterium genera, among others, were reduced in the CPB/DHCA group compared to the controls (Fig. S1A) (Sun et al., 2021; Li et al., 2020). At the family level, similar groups of SCFA-producing organisms, such as Spirochaetaceae, Selenomonadaceae, Christensenellaceae and Fibrobacteraceae, were reduced in the CPB/DHCA group compared to the controls (Fig. S1B) (Peterson et al., 2022; Li et al., 2020; Liu et al., 2019; Mukherjee et al., 2020; Walker et al., 2005; Van den Abbeele et al., 2022).
```{r warning=FALSE, message=FALSE}
# corncob and ancombc plots: following da_template.rmd
model = "time+group"
# Agglomerate to family-level and rename
ps.clean.p0.family = phyloseq::tax_glom(ps.clean.p0, taxrank = rank_names(ps.clean.p0)[5])
phyloseq::taxa_names(ps.clean.p0.family) = phyloseq::tax_table(ps.clean.p0.family)[,"Family"]
# Agglomerate to genus-level and rename
ps.clean.p0.genus = phyloseq::tax_glom(ps.clean.p0, taxrank = rank_names(ps.clean.p0)[6])
phyloseq::taxa_names(ps.clean.p0.genus) = phyloseq::tax_table(ps.clean.p0.genus)[,"Genus"]
```
```{r warning=FALSE, message=FALSE}
taxa = "genus" # cov_combo_list is a vector not a list
covariate = unlist(str_split(model, pattern = "\\+"))
var1 = covariate[1]
var2 = covariate[2]
if (taxa == "genus"){
ps.da = ps.clean.p0.genus
} else if (taxa == "family"){
ps.da = ps.clean.p0.family
}
# ancombc
out = ancombc(phyloseq = ps.da, formula = model, p_adj_method = "fdr", group = "time", global = TRUE )
# formula: how the microbial absolute abundances for each taxon depend on the variables in metadata.
res_ancombc = out$res
df_fig1 = data.frame(res_ancombc$lfc * res_ancombc$diff_abn, check.names = FALSE) %>% rownames_to_column("taxon_id")
df_fig2 = data.frame(res_ancombc$se * res_ancombc$diff_abn, check.names = FALSE) %>%
rownames_to_column("taxon_id")
colnames(df_fig2)[-1] = paste0(colnames(df_fig2)[-1], "SD")
df_fig_var2 = df_fig1 %>% left_join(df_fig2, by = "taxon_id") %>%
column_to_rownames(., var = "taxon_id") %>%
dplyr::select(., starts_with(var2) ) %>%
dplyr::filter(., (!!as.name(colnames(df_fig1)[4])) != 0 ) %>%
dplyr::arrange(desc( (!!as.name( colnames(df_fig1)[4]) ))) %>%
rownames_to_column(., var = "taxon_id") %>%
dplyr::mutate(group = ifelse( (!!as.name( colnames(df_fig1)[4] )) > 0, "g1", "g2"))
df_fig_var2$taxon_id = factor(df_fig_var2$taxon_id, levels = df_fig_var2$taxon_id)
# genus level
group_genus_ancombc_df = df_fig_var2
group_genus_ancombc = ggplot(data = group_genus_ancombc_df, aes(x = taxon_id, y = groupCPB, fill = group, color = group)) +
geom_bar(stat = "identity", width = 0.7, position = position_dodge(width = 0.4)) +
geom_errorbar(aes(ymin = groupCPB - groupCPBSD, ymax = groupCPB + groupCPBSD ), width = 0.2, position = position_dodge(0.05), color = "black") +
labs(x = NULL, y = "Log fold change", title = paste0("Waterfall Plot: Genus") ) +
theme_bw() +
theme(legend.position = "none", plot.title = element_text(hjust = 0.5),
panel.grid.minor.y = element_blank(), axis.text.x = element_text(angle = 60, hjust = 1))
#group_genus_ancombc
#corncob
corncob_formula = as.formula(paste("", paste(covariate, collapse = " + "), sep = " ~ " ))
set.seed(42)
da_analysis <- differentialTest(formula = corncob_formula,
phi.formula = corncob_formula,
formula_null = ~ 1,
phi.formula_null = corncob_formula,
test = "Wald", boot = FALSE,
data = ps.da,
fdr_cutoff = 0.05)
# genus level
group_genus_corncob = plot(da_analysis)
group_genus_corncob = group_genus_corncob + labs(title = "Differential abundance: Genus") + theme(legend.position = "none")
#group_genus_corncob
```
```{r warning=FALSE, message=FALSE}
taxa = "family" # cov_combo_list is a vector not a list
covariate = unlist(str_split(model, pattern = "\\+"))
var1 = covariate[1]
var2 = covariate[2]
if (taxa == "genus"){
ps.da = ps.clean.p0.genus
} else if (taxa == "family"){
ps.da = ps.clean.p0.family
}
# ancombc
out = ancombc(phyloseq = ps.da, formula = model, p_adj_method = "fdr", group = "time", global = TRUE )
# formula: how the microbial absolute abundances for each taxon depend on the variables in metadata.
res_ancombc = out$res
df_fig1 = data.frame(res_ancombc$lfc * res_ancombc$diff_abn, check.names = FALSE) %>% rownames_to_column("taxon_id")
df_fig2 = data.frame(res_ancombc$se * res_ancombc$diff_abn, check.names = FALSE) %>%
rownames_to_column("taxon_id")
colnames(df_fig2)[-1] = paste0(colnames(df_fig2)[-1], "SD")
df_fig_var2 = df_fig1 %>% left_join(df_fig2, by = "taxon_id") %>%
column_to_rownames(., var = "taxon_id") %>%
dplyr::select(., starts_with(var2) ) %>%
dplyr::filter(., (!!as.name(colnames(df_fig1)[4])) != 0 ) %>%
dplyr::arrange(desc( (!!as.name( colnames(df_fig1)[4]) ))) %>%
rownames_to_column(., var = "taxon_id") %>%
dplyr::mutate(group = ifelse( (!!as.name( colnames(df_fig1)[4] )) > 0, "g1", "g2"))
df_fig_var2$taxon_id = factor(df_fig_var2$taxon_id, levels = df_fig_var2$taxon_id)
# family level
group_family_ancombc_df = df_fig_var2
group_family_ancombc = ggplot(data = group_family_ancombc_df, aes(x = taxon_id, y = groupCPB, fill = group, color = group)) +
geom_bar(stat = "identity", width = 0.7, position = position_dodge(width = 0.4)) +
geom_errorbar(aes(ymin = groupCPB - groupCPBSD, ymax = groupCPB + groupCPBSD ), width = 0.2, position = position_dodge(0.05), color = "black") +
labs(x = NULL, y = "Log fold change", title = paste0("Waterfall Plot: Family") ) +
theme_bw() +
theme(legend.position = "none", plot.title = element_text(hjust = 0.5),
panel.grid.minor.y = element_blank(), axis.text.x = element_text(angle = 60, hjust = 1))
#group_family_ancombc
#corncob
corncob_formula = as.formula(paste("", paste(covariate, collapse = " + "), sep = " ~ " ))
set.seed(42)
da_analysis <- differentialTest(formula = corncob_formula,
phi.formula = corncob_formula,
formula_null = ~ 1,
phi.formula_null = corncob_formula,
test = "Wald", boot = FALSE,
data = ps.da,
fdr_cutoff = 0.05)
# family level
group_family_corncob = plot(da_analysis)
group_family_corncob = group_family_corncob + labs(title = "Differential abundance: Family") + theme(legend.position = "none")
#group_family_corncob
```
```{r message=FALSE, warning=FALSE, fig.dim=c(12, 10)}
#| fig-cap: Fig.S1 Differential Abundance in the CPB/DHCA Group at the Family and Genus levels. Panel A depicts taxonomic family rank changes in organism abundance of the CPB/DHCA group showing waterfall plot (top) and the corncob plot (bottom). Panel B depicts taxonomic genus rank changes in organism abundance of the CPB/DHCA group with waterfall plot (top) and corncob plot (bottom). Both waterfall plot and corncob plot show LOG fold changes of organisms that are different between the two groups.
egg::ggarrange(group_family_corncob, group_family_ancombc, group_genus_corncob, group_genus_ancombc, nrow = 2, labels = c("A", "", "B", ""), label.args = list(gp = grid::gpar(face = "plain")) ) #+ theme(plot.margin = margin(0.1,0.1,2,0.1, "cm"))
```
## Linear discriminant analysis (LDA) effect size (LEfSE)
Linear discriminant analysis (LDA) effect size (LEfSE) was performed to identify microbial biomarkers at different classification levels between the two groups (LDA score>2.0). This is used to determine the features most likely to explain differences between groups by coupling standard tests for statistical significance with additional tests encoding biological consistency and effect (Segata et al., 2011). LEfSE revealed multiple genera predominantly associated with either the CPB/DHCA group or the control group (Fig. 3A). A cladogram was developed for taxonomic representation of biologically consistent differences in the CPB/DHCA group and the control group (Fig. 3B). Broadly, several families of organisms were noted to be associated with the CPB/DHCA group in both the LEfSE plot and cladogram, including Lachnospiraceae, Christensenellaceae, Monoglobaceae and Peptococcaceae.
![Fig.3 LEfSE plot and cladogram of bacterial associations in CPB/DHCA group and controls. (A) LEfSE plot providing organisms associated with either the CPB/DHCA group (green) or the control group (CNT, red). The logarithmic score details the strength of the association of each organism to a specific group. (B) Cladogram of the LEfSE analysis with organisms in the shaded green area associating more strongly with the CPB/DHCA group and organisms in the shaded red area associating with the control group. The microbial compositions were compared at different taxonomic levels. LDA, linear discriminate analysis.](Figure-3.png)
## Canonical correlation analysis
Canonical correlation analysis was performed to evaluate the correlation between the microbiome and other sets of measured biomarkers.
There are two methods included in the `mixOmics` package to allow the CCA method to be regularized, such that the $\lambda1$ and $\lambda2$ are optimized, ridge approach ($L2$ norm) and shrinkage approach ($L1$ norm).
To read more about CCA methodology in `mixOmics`, please visit their [website](http://mixomics.org/methods/rcca/). Case study can be found [here](http://mixomics.org/case-studies/rcca-nutrimouse-case-study/).
```{r message=FALSE, warning=FALSE}
ps.clean.cca = ps.clean.p0.genus
omics_list = c("EBD", "cytokine", "SCFA")
```
::: {.panel-tabset}
```{r cca, echo = FALSE, results="asis", include=FALSE, warning=FALSE, message=FALSE}
options(knitr.duplicate.label = 'allow')
src_text <-
lapply(1:length( omics_list ),
function(i) {
knitr::knit_expand('cca_template.qmd',
pair_name = omics_list[[i]])
}
)
src_text <- paste(src_text, collapse = "\n")
```
```{r, echo = FALSE, results="asis", warning=FALSE, message=FALSE, fig.show='hide'}
knit_res <- knitr::knit_child(text = unlist(src_text), quiet = TRUE)
cat(knit_res, sep="\n")
```
:::
Network and heatmap analysis showed the strength of association between the microbiome and these biomarkers. The markers for EBD (Fig. 6A), specifically FABP2, were positively associated with pro-inflammatory organisms, such as Klebsiella, Escherichia and Enterococcus, and negatively associated with the SCFA-producing organisms Roseburia, Lachnospiraceae UCG.008, and Eubacterium. Claudin-2 was noted to have negative association with Holdemania, an SCFA-producing organism, as well as increases in Klebsiella and Peptostreptococcus, known to induce intestinal inflammation (Atarashi et al., 2017). The cytokine network and heatmap (Fig. 6B) demonstrated that many organisms were negatively associated with TNF-α, but some had a positive association with IL-1β and IL-6, such as Hungatella, Howardella and Romboutsia. Conversely, Roseburia, Fournierella and Angelakisella were noted to have a strongly negative association with TNF-α and a mildly positive or neutral association with IL-1β and IL-6. Specifically looking at SCFAs (Fig. 6C), Victivallis had significant negative association with all SCFAs in both the network and heatmap.
![Fig.6 Canonical correlation analysis of the microbiome with EBD, cytokines and SCFA. (A) Network map (top) and heatmap (bottom) of the markers of EBD and associated organisms. (B) Network map (top) and heatmap (bottom) of inflammatory cytokines and associated organisms. (C) Network map (top) and heatmap (bottom) of SCFAs and associated organisms.](Figure-6.png)
## Mediation analysis
Mediation analysis was performed to further understand the role the microbiome played as a mediator for outcomes such as other measured biomarkers, using CPB/DHCA as the exposure. In statistics, mediation analysis tests a hypothetical causal chain where one variable X (**Exposure**) affects a second variable M (**Mediator**) and, in turn, that variable affects a third variable Y (**Outcome**). Mediators describe the how or why of a (typically well-established) relationship between two other variables (**Exposure and Outcome**) and are sometimes called intermediary variables, as they often describe the process through which an effect occurs. This is also sometimes called an indirect effect.
The application of mediation analysis in the analysis of microbiome data, where **the effect of a treatment on an outcome is transmitted through perturbing the microbial communities or compositional mediators**, is not straightforward. The **compositional** (i.e., each relative abundance is a non-negative value [0,1), which adds up to 1) and **high-dimensional** nature of microbiome data makes the standard mediation analysis not directly applicable to our setting.
Recent years, there has been many implementation of multivariate mediation analysis for microbiome data analysis, following mediation analyses are performed using MODIMA, LDM and ccmm, testing hypotheses where **group as exposure, and one of EBD markers, or cytokines or SCFA as outcome**.
```{r}
# automation
outcome = colnames(piglet_meta)[c(7,9,11,13:65)]
exposure = colnames(piglet_meta)[6]
combo = expand.grid(exposure, outcome) %>%
mutate(Var3 = paste(Var1, Var2, sep = "-") )
```
::: {.panel-tabset}
```{r modima, echo = FALSE, results="asis", include=FALSE, warning=FALSE, message=FALSE}
options(knitr.duplicate.label = 'allow')
src_text <-
lapply( c(1:3, 13,17:18) ,
function(i) {
knitr::knit_expand('modima_template.qmd',
pair_name = combo$Var3[[i]])
}
)
src_text <- paste(src_text, collapse = "\n")
```
```{r, echo = FALSE, results="asis", warning=FALSE, message=FALSE, fig.show='hide'}
knit_res <- knitr::knit_child(text = unlist(src_text), quiet = TRUE)
cat(knit_res, sep="\n")
```
:::
We identified two eicosanoids, PGD2 and PGE2, along with valeric acid to be significantly mediated by the microbiome, given CPB as exposure (Fig. 7A). As expected, the microbiome was not a significant mediator for intestinal EBD (Fig. 7B), which corroborates the theory that CPB directly induces intestinal barrier dysfunction, thereby creating the intestinal permeability for the microbiome and intestinal metabolites to leak out of the gut and signal systemic inflammation.
![Fig.7 Mediation analysis of the microbiome on changes to EBD, cytokines, SCFAs and eicosanoids. (A) The three outcomes, PGD2, PGE2 and valeric acid, to be mediated by changes in the microbiome. Exposure is CPB, the mediator is the microbiome, and the outcomes are listed above. (B) The three markers of EBD, FABP2, claudin-2 and claudin-3, depicting no statistically significant mediation effect of the microbiome on the changes in EBD. CNT, control; FABP2, fatty acid-binding protein 2. Blue lines indicate the association between the microbiome and individual biomarkers using principal component axis 1, and shaded gray areas represent the 95% confidence intervals.](Figure-7.png)
## Reproducibility
The amount of time took to generate the report:
```{r time_spend}
Sys.time() - start_time
```
*R* session information:
```{r R_session}
sessionInfo()
```