-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy path2_analysis.Rmd
671 lines (501 loc) · 27.9 KB
/
2_analysis.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
---
title: "LOCATE: main analysis"
author: "Adrian Barnett"
date: "`r format(Sys.time(), '%d %B, %Y')`"
output: word_document
bibliography: references.bib
---
```{r setup, include=FALSE}
# follows protocol: https://bmchealthservres.biomedcentral.com/articles/10.1186/s12913-020-05233-2
knitr::opts_chunk$set(echo = FALSE, warning=FALSE, message=FALSE, error=FALSE, comment='', dpi=400)
options(width=1000) # Wide pages
options(scipen=999) # avoid scientific presentation
source('99_functions.R')
source('99_glm_bayes.R') # for Bayesian glm model
library(knitr)
library(dplyr)
library(tidyr)
library(flextable)
library(janitor)
library(stringr)
library(survival)
library(survminer) # for nice plots of survival models
library(broom)
library(nimble) # for Bayesian models
library(mitools) # for multiple imputation
# for graphics
library(ggplot2)
library(gridExtra)
g.theme = theme_bw() + theme(panel.grid.minor = element_blank())
cbPalette <- c("#999999", "#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2", "#D55E00", "#CC79A7")
# colours for treatment groups
group_colours = c('skyblue','darkorange')
# data from 1_read_data_redcap.R
load('data/1_redcap_data.RData')
data = filter(data, !is.na(randomised)) # only those randomised
# data from 0_read_data_scans.R
load('data/0_scans.RData')
# scramble treatment group
scramble = FALSE
if(scramble == TRUE){
TeachingDemos::char2seed('tranmere')
rand = data$randomised
rand = sample(rand, size = length(rand), replace=FALSE)
data = select(data, -randomised) %>%
mutate(randomised = rand)
}
```
The methods used here are based on the published protocol [@Brain2020]. Any changes to the protocol will be noted in the paper. The trial registration on ANZCTR is available [here](https://www.anzctr.org.au/Trial/Registration/TrialReview.aspx?id=378779&isReview=true).
Changes to protocol:
* The protocol used 8.0 kpa whereas the trial registration used 8.0 and 8.2 kpa to define high-risk. We used 8.0 kpa for the analysis.
* The protocol specified that outcomes would be examined using time since GP referral. However, we instead used time since randomisation. This is because the intervention can only have an impact after randomisation, and using time from GP referral would include a period of 'immortal time'.
* Reduced time to additional screening/testing for liver disease was a planned outcome that was not included. We included time to first appointment at the hepatology clinic, which was not in the protocol.
* Our protocol stated we would use data from the MBS and PBS. However, getting the necessary approvals for this data took much longer than anticipated. Rather than delay the start of the trial, we decided not to use the MBS/PBS data. We also removed the following planned inclusion criteria: "Consent to access of their data from Queensland Health and MBS and PBS".
* Multiple outcomes in the protocol used a time point of "12 months post-baseline". However, because of delays in recruitment, we delayed the final follow-up until all patients had completed at least 12 months since randomisation. Hence the time-point used is "at least 12 months post-baseline." This decision was made on logistical grounds, as it was easier to do all follow-ups at the same time, and it had the added benefit of increasing the overall follow-up time and hence the statistical power.
All these changes were made before the final analyses were run.
# Statistical methods
We used descriptive statistics to compare the randomised groups at baseline.
Our primary outcome and some secondary outcomes were time to event outcomes. We therefore used a survival analysis approach, modelling the time to the outcome or right-censoring when data collection ended [@Dobson2018]. We used a Kaplan--Meier plot to graphically examine the difference between randomised groups. To estimate the difference between groups we used a Weibull regression model with randomised group as a predictor variable. We included other predictors that were likely to explain variance in the outcome, which were: age, gender and centre. The estimates from the Weibull model are hazard ratios and we include estimates of the median times for the usual care and new model of care groups, and the difference in these medians.
We used a generalised linear model for the outcomes that were continuous (quality of life), counts (number of GP visits), and binary (using Statins). When available, we included the participants' baseline as a predictor, for example, including baseline EQ-5D in the quality of life model.
We used Bayesian methods for the regression models, and hence the results are presented as the mean differences with 95% credible intervals. We checked the convergence of the Bayesian models by plotting the chains for the intercept.
We used multiple imputation when predictors were missing, using 10 imputations and combining the imputed estimates to give an overall result [@Sterne2009].
We did not impute results when the outcome was missing and instead excluded participants with missing outcomes.
We used the CONSORT guidelines to report the results [@Schulz2010].
# Time to diagnosis of high-risk NAFLD (primary clinical outcome)
```{r, include = FALSE}
result = run_survival(indata = data,
outcome_date = 'primary_date',
start_date = 'date_returned', # date randomised
censor_date = 'review_date')
#
plot_1 = result$chain_plot + ggtitle('High-risk NAFLD')
```
This outcome was listed in the published protocol as the primary clinical outcome.
The start date is the date of GP referral, the end date is the date of high-risk diagnosis, and the censoring date is the date the patients' notes were examined by the study nurse.
### Kaplan-Meier plot (primary clinical outcome)
```{r, fig.width=7, fig.height=6}
# change strata labels
ggsurvplot(result$km,
fun = 'event', # reverse curve to plot events
data = result$survdata,
legend.title = '', # remove 'strata' from legend
xlab = 'Years since randomisation',
ylab = 'Cumulative diagnoses',
ylim = c(0, 0.3),
palette = group_colours,
conf.int = FALSE,
ggtheme = theme_bw(),
risk.table = TRUE)
#
```
There were `r result$negative` participants excluded from this analysis due to negative or missing times. The total follow-up time is `r round(result$total_fu)` years. The total number of events is `r result$events` with `r result$censored` censored.
The plot shows the estimated accumulation over time of participants diagnosed as high-risk. The vertical notches on the lines show when participants were right-censored as the follow-up time ended. The table below the graph shows the participants "at risk", meaning those who were yet to be diagnosed as high-risk.
The plot shows an interesting pattern. There were some participants rapidly diagnosed in the new model care in the first few months, followed by no change for the next 2 years. In the usual care arm, the high-risk diagnoses accumulated steadily in the first year.
### Weibull survival model (primary clinical outcome)
```{r}
table = select(result$table, -'var') %>%
filter(!str_detect(variable, 'Intercept')) %>%
mutate(variable = nice_variable(variable))
flextable(table) %>%
theme_box() %>%
autofit() %>%
colformat_double(j=2:4, digits=2)
```
Despite the strong separation in the Kaplan-Meier plot, the hazard ratio for the new model of care was wide and included an increased time to diagnosis. The estimated median times reflect the small number of events, with estimates of many years before half the sample will be diagnosed.
Age was associated with an increased hazard of the primary outcome. Every 10 year increase in age increased the hazard by 1.38, 95% confidence interval 1.05 to 1.85. This means that older patients had a shorter time to high-risk NAFLD.
The Weibull shape controls the parametric hazard function over time. It is reported here for completeness and will not be included in the journal article.
# Time to first successful fibrosis assessment with Fibroscan
```{r, include = FALSE}
result = run_survival(indata = data,
outcome_date = 'scan_date', # was fibroscan_date_2 in error
start_date = 'date_returned', # date randomised
censor_date = 'review_date')
#
plot_2 = result$chain_plot + ggtitle('Time to Fibroscan')
# get stats on all but one participant for intervention group
time_stat = filter(data,randomised == 'New model of care') %>%
mutate(time = scan_date - date_returned) %>%
summarise(est = round(quantile(time, 0.99, na.rm=TRUE)),
est = as.numeric(est))
```
This outcome was listed in the published protocol as a secondary clinical outcome.
There were `r result$negative` participants excluded from this analysis due to negative or missing times. The total follow-up time is `r round(result$total_fu)` years.
### Kaplan-Meier plot (time to successful Fibroscan)
```{r, fig.width=7, fig.height=6}
# change strata labels
ggsurvplot(result$km,
fun = 'event', # reverse curve to plot events
data = result$survdata,
legend.title = '', # remove 'strata' from legend
xlab = 'Years since randomisation',
ylab = 'Cumulative scans',
ylim = c(0, 1),
palette = group_colours,
conf.int = FALSE,
ggtheme = theme_bw(),
risk.table = TRUE)
#
```
The plot shows the estimated accumulation over time of successful scans. The vertical notches on the lines show when participants were right-censored as the follow-up time ended. The table below the graph shows the participants "at risk", meaning those who are yet to have a Fibroscan. The total number of participants with a scan is `r result$events`, with `r result$censored` censored.
Almost everyone in the new model of care had their scan within `r time_est` days. In contrast, the usual care group slowly accumulated scans over time.
### Weibull survival model (time to successful Fibroscan)
```{r}
table = select(result$table, -'var') %>%
filter(!str_detect(variable, 'Intercept')) %>%
mutate(variable = nice_variable(variable))
flextable(table) %>%
theme_box() %>%
autofit() %>%
colformat_double(j=2:4, digits=2)
```
We used a Weibull regression model to estimate the time to Fibroscan. We included the randomised group and the potential predictors of gender, age and centre.
The mean hazard ratio for the new model of care was well above one and the median reduction in time to scan was close to one year.
# Time to first appointment at the hepatology clinic
```{r, include = FALSE}
result = run_survival(indata = data,
outcome_date = 'date_first_appointment',
start_date = 'date_returned', # date randomised
censor_date = 'review_date')
#
plot_3 = result$chain_plot + ggtitle('First appointment')
```
This outcome **was not** in the published protocol.
There were `r result$negative` participants excluded from this analysis due to negative or missing times. The total follow-up time is `r round(result$total_fu)` years.
### Kaplan-Meier plot (time to first appointment)
```{r, fig.width=7, fig.height=6}
# change strata labels
ggsurvplot(result$km,
fun = 'event', # reverse curve to plot events
data = result$survdata,
legend.title = '', # remove 'strata' from legend
xlab = 'Years since randomisation',
ylab = 'Cumulative appointments',
ylim = c(0, 1),
palette = group_colours,
conf.int = FALSE,
ggtheme = theme_bw(),
risk.table = TRUE)
#
```
The total number of appointments is `r result$events`, with `r result$censored` censored.
### Weibull survival model (time to first appointment)
```{r}
table = select(result$table, -'var') %>%
filter(!str_detect(variable, 'Intercept')) %>%
mutate(variable = nice_variable(variable))
flextable(table) %>%
theme_box() %>%
autofit() %>%
colformat_double(j=2:4, digits=2)
```
Similarly to the previous primary outcome, the new model of care decreased the average time, but the 95% credible intervals were wide and included the potential for longer times with the new model of care.
# Reduction in hospital admissions
There were no admissions to hospital for any LOCATE participants, so we have not analysed this outcome.
This outcome was listed in the published protocol as a secondary clinical outcome.
# Reduction in emergency department presentations
```{r, include = FALSE}
result = run_survival(indata = data,
outcome_date = 'ed_pres_1',
start_date = 'date_returned', # randomisation date
censor_date = 'review_date')
#
plot_4 = result$chain_plot + ggtitle('ED presentations')
```
In this section we examine the time from randomisation to first emergency department presentation.
This outcome was listed in the published protocol as a secondary clinical outcome.
There were `r result$negative` participants excluded from this analysis due to negative or missing times. The total follow-up time is `r round(result$total_fu)` years.
### Kaplan-Meier plot (time to ED presentation)
```{r, fig.width=7, fig.height=6}
# change strata labels
ggsurvplot(result$km,
fun = 'event', # reverse curve to plot events
data = result$survdata,
legend.title = '', # remove 'strata' from legend
xlab = 'Years since randomisation',
ylab = 'Cumulative presentations',
ylim = c(0, 0.4),
palette = group_colours,
conf.int = FALSE,
ggtheme = theme_bw(),
risk.table = TRUE)
#
```
We use a survival model to examine the time to ED presentation. Participants with no ED presentations were censored at their follow-up date.
The two cumulative incidence curves are very similar.
### Weibull survival model (time to ED presentation)
```{r}
table = select(result$table, -'var') %>%
filter(!str_detect(variable, 'Intercept')) %>%
mutate(variable = nice_variable(variable))
flextable(table) %>%
theme_box() %>%
autofit() %>%
colformat_double(j=2:4, digits=2)
```
The results show hazard ratios and 95% credible intervals (the model was fitted using a Bayesian method). A hazard ratio over 1 indicates an increased hazard, meaning a faster time to ED presentation.
The model adjusted for gender, age and centre.
The Sunshine Coast had a greatly reduced hazard ratio, meaning longer times to ED presentation.
There was no apparent difference in median time between the new model of care and usual care.
# Improved health-related quality of life
This outcome was listed in the published protocol as a secondary clinical outcome.
### Boxplot (quality of life)
```{r, fig.width=5, fig.height=3.5}
bplot = ggplot(data= data, aes(x=randomised, y=eq5d_12m, col=randomised))+
geom_boxplot()+
scale_color_manual(NULL, values = group_colours)+
g.theme+
xlab('')+
ylab('EQ-5D')+
theme(legend.position = 'none')+
coord_flip()
bplot
```
The higher the score, the better the quality of life. The highest possible score is 100, and the lowest 0. A few participants were close to zero.
### Statistical model (quality of life)
```{r, include=FALSE}
# Bayesian model, control for baseline, phone time centred at 12 months, where most results were done
result = glm_bayes(indata = filter(data, !is.na(eq5d_12m)), # cannot use missing outcome
equation = "eq5d_12m ~ I(eq5d_b-70) + I(randomised == 'New model of care') + I((age-50)/10) + sex + I(phone_time-12) + centrename")
#
plot_5 = result$chain_plot + ggtitle('Quality of life')
```
```{r}
#
flextable(result$table) %>%
theme_box() %>%
autofit() %>%
colformat_double(j=2:4, digits=1)
```
The results are on the scale of EQ-5D from 0 to 100. The model adjusted for baseline EQ-5D, age, gender, time since randomisation, and centre. Time since randomisation was added because there was some variance in when the participants were called, although most were close to the 12-month time.
Because of missing follow-up data, there were `r result$N_participants` participants available for this analysis.
There was little difference in quality of life between the usual care and new model of care groups.
Baseline quality of life was strongly predictive of follow-up quality of life, which is as expected.
### Residual check (quality of life)
```{r, fig.width=5}
hplot = ggplot(data= result$resid, aes(x = resid) )+
geom_histogram(col='grey66', fill='darkseagreen3')+
xlab('Residual (observed minus fitted)')+
ylab('Count')+
g.theme
hplot
```
There are no clear concerns in the residual plot.
# Reduced hepatocellular carcinoma (HCC) detected outside specific surveillance
There was only one diagnosis of HCC for any LOCATE participants, so we have not analysed this outcome.
This outcome was listed in the published protocol as a secondary clinical outcome.
# Reduced variceal bleeding occurring without variceal surveillance
There were only two diagnoses of variceal bleeding for any LOCATE participants, so we have not analysed this outcome.
This outcome was listed in the published protocol as a secondary clinical outcome.
# Increased statin use
This outcome was listed in the published protocol as a secondary clinical outcome.
From the protocol: "Statin use will be compared between groups by examining the number of participants given a new statin script during the follow-up period. The denominator will be the number of patients not on statins at the time of their referral letter to the hepatology clinic."
### Table (statin use)
```{r}
tab = filter(data, !is.na(statin_12m)) %>%
mutate(statin_12m = as.character(statin_12m)) %>%
tabyl(var2 = randomised, var1 = statin_12m) %>%
adorn_percentages(denominator = "col") %>%
adorn_pct_formatting() %>%
adorn_ns(position='front')
names(tab)[1] =' '
flextable(tab) %>%
theme_box() %>%
autofit()
```
The table shows the number of participants and column percentages. This table is for all available participants, regardless of their statin use at baseline.
### Statistical model (statin use)
```{r, include=FALSE}
# model
for_model = filter(data, statin_12m != 'Already on cholesterol lowering drugs 12 months ago') %>% # exclude this group
mutate(dep = as.numeric(statin_12m == 'Yes, started statin')) # make dependent variable
result = glm_bayes(indata = for_model,
equation = "dep ~ I(randomised == 'New model of care') + I((age-50)/10) + sex + centrename",
prior_sd = c(1000,1000,1000,2,1000), # tightened prior for men; first number is intercept
family = 'binomial')
# also tighten prior for centre?
#
plot_6 = result$chain_plot + ggtitle('Statin use')
```
```{r}
tab = filter(result$table, !str_detect(variable, 'Intercept'))
flextable(tab) %>%
theme_box() %>%
autofit() %>%
colformat_double(j=2:4, digits=2)
```
The dependent variable is starting a statin.
We excluded those patients who were already on a statin at baseline. The number of participants included was `r result$N_participants`.
We used a logistic regression model so the table above shows the odds ratios and 95% credible intervals.
The model adjusted for age, gender and centre.
There were no men who started a statin, hence the estimate for men was extremely variable. To account for this, we used a relatively tight Bayesian prior for gender, which helped give a more realistic estimate for this variable.
The new model of care was associated with a much higher odds of new statin use, but the 95% credible intervals were wide and included a reduced use of new statins relative to the usual care group.
# Increased referrals to a specialist, other than a hepatologist: dietician/nutritionist or psychologist
This outcome was listed in the published protocol as a secondary clinical outcome.
### Table (dietician/nutritionist or psychologist)
```{r}
#
tab1 = filter(data, !is.na(dietician_12m)) %>%
mutate(dietician_12m = as.character(dietician_12m)) %>%
tabyl(var2 = randomised, var1 = dietician_12m) %>%
adorn_percentages(denominator = "col") %>%
adorn_pct_formatting() %>%
adorn_ns(position='front')
names(tab1)[1] ='Dietician/Nutritionist'
#
tab2 = filter(data, !is.na(psychologist_12m)) %>%
mutate(psychologist_12m = as.character(psychologist_12m)) %>%
tabyl(var2 = randomised, var1 = psychologist_12m) %>%
adorn_percentages(denominator = "col") %>%
adorn_pct_formatting() %>%
adorn_ns(position='front')
names(tab2)[1] ='Psychologist'
```
#### Dietician or Nutritionist
```{r}
flextable(tab1) %>%
theme_box() %>%
autofit()
```
#### Psychologist
```{r}
flextable(tab2) %>%
theme_box() %>%
autofit()
```
### Statistical model (Dietician or nutritionist)
```{r, include=FALSE}
# model
for_model = filter(data, !is.na(dietician_12m)) %>%
mutate(dep = as.numeric(dietician_12m != 'No')) # make dependent variable
result = glm_bayes(indata = for_model,
missing = 'dietician_baseline',
equation = "dep ~ I(randomised == 'New model of care') + I(dietician_baseline != 'No') + I((age-50)/10) + sex + centrename",
family = 'binomial')
#
plot_7 = result$chain_plot + ggtitle('Dietician')
#
n_missing_outcome = sum(is.na(data$dietician_12m))
n_missing_predictor = sum(is.na(for_model$dietician_baseline))
```
There were `r n_missing_outcome` participants who were missing this outcome and so were excluded from this analysis.
There were `r n_missing_predictor` participants with missing data for whether they had seen a dietician or nutritionist at baseline. We used multiple imputation to impute their missing values before running the model.
We used a logistic regression model so the table above shows the odds ratios and 95% credible intervals. The mean odds was much higher for the new model of care group, but the 95% credible interval included a higher odds for usual care. Participants who had seen a dietician or nutritionist at baseline were far more likely to see one at follow-up, which is as expected. Older participants were less likely to see a dietician or nutritionist.
```{r}
tab = filter(result$table, !str_detect(variable, 'Intercept'))
flextable(tab) %>%
theme_box() %>%
autofit() %>%
colformat_double(j=2:4, digits=2)
```
### Statistical model (Psychologist)
```{r, include=FALSE}
# model, removed centre, not converging?
# model
for_model = filter(data, !is.na(psychologist_12m)) %>%
mutate(dep = as.numeric(psychologist_12m != 'No')) # make dependent variable
result = glm_bayes(indata = for_model,
equation = "dep ~ I(randomised == 'New model of care') + I((age-50)/10) + sex + centrename",
prior_sd = c(1000,1000,1000,1000,2), # tightened prior for centre; first number is intercept
family = 'binomial')
#
plot_8 = result$chain_plot + ggtitle('Psychologist')
#
n_missing_outcome = sum(is.na(data$psychologist_12m))
```
There were `r n_missing_outcome` participants who were missing this outcome and so were excluded from this analysis.
```{r}
tab = filter(result$table, !str_detect(variable, 'Intercept'))
flextable(tab) %>%
theme_box() %>%
autofit() %>%
colformat_double(j=2:4, digits=2)
```
We used a logistic regression model so the table above shows the odds ratios and 95% credible intervals. The mean odds ratio was under 1 for the new model of care, but the credible intervals were wide.
None of the participants on the Sunshine Coast saw a psychologist in the last 12 months. We used a tighter Bayesian prior for centre to account for this.
# Increased GP use
This outcome **was not** listed in the published protocol or trial registration.
### Boxplot (GP use)
```{r, fig.width=5, fig.height=3.5}
bplot = ggplot(data= data, aes(x=randomised, y=gp_12m, col=randomised))+
geom_boxplot()+
scale_color_manual(NULL, values = group_colours)+
g.theme+
xlab('')+
ylab('Number of GP visits in last year')+
theme(legend.position = 'none')+
coord_flip()
bplot
```
The number of GP visits in the last year has a strong positive skew and four outliers of over 40 visits.
### Statistical model (GP use)
```{r, include=FALSE}
#
result = glm_bayes(indata = data,
equation = "gp_12m ~ I(randomised == 'New model of care') + log(gp_baseline + 1) + I((age-50)/10) + sex + centrename",
missing = 'gp_baseline', #
family = 'poisson')
#
plot_9 = result$chain_plot + ggtitle('GP use')
#
n_missing_outcome = sum(is.na(data$gp_12m))
n_missing_predictor = sum(is.na(for_model$gp_baseline))
```
There were `r n_missing_outcome` participants who were missing this outcome and so were excluded from this analysis.
There were `r n_missing_predictor` participants with missing data for their number of GP visits at baseline. We used multiple imputation to impute their missing values before running the model.
```{r}
to_table = filter(result$table, !str_detect(variable, "Intercept")) # remove intercept
flextable(to_table) %>%
theme_box() %>%
autofit() %>%
colformat_double(j=2:4, digits=2)
```
The model used a Poisson distribution, so the results are presented as rate ratios. We adjust for GP use at baseline and, not surprisingly, there was a strong positive association between baseline GP usage and follow-up usage. We also adjust for age and sex.
The new model of care was associated with a decrease in GP usage.
Older participants were more likely to visit the GP. Participants at the Sunshine Coast had a reduced rate of GP use.
### Residual check (GP use)
```{r, fig.width=5}
hplot = ggplot(data= result$resid, aes(x = resid) )+
geom_histogram(col='grey66', fill='darkseagreen3')+
xlab('Residual (observed minus fitted)')+
ylab('Count')+
g.theme
hplot
```
There are no clear concerns in the residual plot.
# Reduced mortality
There were no deaths for any LOCATE participants during the follow-up, so we have not analysed this outcome.
This outcome was listed in the published protocol as a secondary clinical outcome.
# Per protocol analysis
```{r}
tab = tabyl(data, pp, randomised)
# adorn_percentages('col') %>%
# adorn_ns(position='front') # add back numbers
names(tab)[1] ='Per protocol'
flextable(tab) %>%
theme_box() %>%
autofit()
```
The table above shows the number of participants by randomised group that followed the per protocol definition. Only 2 patients did not follow the protocol. Given this small difference we will not run a separate per protocol analysis.
# Appendices
The appendices contain technical details. Some may be included as appendices in the paper.
### Bayesian models
```{r, include=FALSE}
source('99_mcmc.R')
```
The models were fitted using a Bayesian paradigm, so the results are given as means and 95% credible intervals, which have a more intuitive interpretation that standard 95% confidence intervals.
We used the nimble package to fit the Bayesian models. We used `r n.chains` chains, with a burn-in of `r format(MCMC,big.mark=',')` followed by a sample of `r format(MCMC,big.mark=',')` thinned by `r thin`.
### Bayesian model convergence
In this section we show the graphical estimates of model convergence.
```{r, fig.width=9, fig.height=9}
#
grid.arrange(plot_1, plot_2, plot_3,
plot_4, plot_5, plot_6,
plot_7, plot_8, plot_9, ncol = 3)
```
All the chains appear to show good mixing and convergence, so there are no concerns for any of the models.
### R and package versions
We used the following version of R and packages.
```{r}
sessionInfo()
```
# References