---
title: "Applied Survival Analysis"
subtitle: "Chapter 1 - Introduction"
css: style_slides.css
author:
  name: Lu Mao
  affiliation:
    - name: Department of Biostatistics & Medical Informatics
    - University of Wisconsin-Madison
  email: lmao@biostat.wisc.edu
format:
  revealjs:
    auto-stretch: false
  # beamer: default
editor: visual
include-in-header:
  - text: |
      <style type="text/css">
      ul li ul li {
        font-size: 0.80em;
      }
      </style>
---
## Outline
1. Time-to-event data and examples
2. Censoring mechanisms and implications
3. Summarizing the raw data
$$\newcommand{\indep}{\perp \!\!\! \perp}$$
# Data and Examples
## What are time-to-event data?
- Common outcome type in medical studies
- **Starting point**: Randomization, study entry, birth, etc.
- **Endpoint**: Death, hospitalization, disease onset, etc.
- In engineering: Machine failure times (reliability analysis)
- **Right censoring**
- Event does not occur by study end or dropout
- Only know event time $>$ censoring time
- **Survival analysis**: Statistical methods for censored data
## Example: Univariate event (I)
- **German Breast Cancer (GBC) Study**
- **Population**: 686 patients with node-positive breast cancer
- **Objective**: Assess whether tamoxifen + chemotherapy reduces mortality
- **Baseline info**: Age, tumor size, hormone levels, menopausal status, etc.
- **Follow-up**: Median 44 months
- 171 deaths $\to$ exact times known
- 515 censored $\to$ survival time $>$ censoring time
## Example: Univariate event (II)
- **German Breast Cancer (GBC) Study**
![](images/intro_gbc.png){fig-align="center" width="75%"}
## Example: Recurrent events (I)
- **Chronic Granulomatous Disease (CGD) Study**
- **Population**: 128 patients in a randomized placebo-controlled trial
- **Objective**: Assess gamma interferon effect on recurrent infections
- **Follow-up**: Median 293 days
- Infections: Min = 0, Max = 7
- **Challenge**: Correlated events within individuals
- **Data** in "long" format (multiple records per patient)
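A minimal sketch of what "long" format looks like and how to count events per patient from it. The records, the column order `(id, time, status)`, and the coding `status = 1` for an infection and `0` for end of follow-up are made up for illustration; they are not taken from the CGD data.

```python
from collections import Counter

# Hypothetical long-format records: (patient id, time in days, status),
# with status 1 = infection and 0 = end of follow-up (censoring).
records = [
    (1, 100, 1),
    (1, 250, 1),
    (1, 400, 0),
    (2, 20, 1),
    (2, 60, 1),
    (2, 150, 1),
    (2, 420, 0),
    (3, 380, 0),
]

# Count infections per patient; patients without any infection get a 0.
patients = sorted({pid for pid, _, _ in records})
events = Counter(pid for pid, _, status in records if status == 1)
infections_per_patient = {pid: events.get(pid, 0) for pid in patients}

print(infections_per_patient)  # {1: 2, 2: 3, 3: 0}
```

Records from the same patient share an `id`, which is what lets an analysis account for within-patient correlation.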
## Example: Recurrent events (II)
- **Chronic Granulomatous Disease (CGD) Study**
![](images/intro_cgd.png){fig-align="center" width="75%"}
## Example: Multivariate/Clustered Events (I)
- **Diabetic Retinopathy Study**
- **Population**: 197 high-risk diabetic patients in a randomized controlled trial
- **Objective**: Determine if photocoagulation (a laser treatment) delays blindness onset
- **Design**: One eye treated (by either xenon or argon), the other untreated (control)
- **Challenge**: Correlation between eyes
## Example: Multivariate/Clustered Events (II)
- **Diabetic Retinopathy Study**
![](images/intro_drs.png){fig-align="center" width="70%"}
## Example: Competing Risks (I)
- **Definition**: Multiple types of events where one prevents the occurrence of the others
- Natural example: different causes of death
- **Competing risk vs censoring**:
- Both terminate follow-up
- Competing risk: part of the outcome; inference based on its presence
- Censoring: irrelevant to outcome; inference based on its absence
- **Example**: death from prostate cancer as main outcome
- Death from other (metastasized) cancers $\to$ competing risk
- Death from traffic accidents $\to$ censoring
## Example: Competing Risks (II)
- **Bone Marrow Transplant Study**
- **Population**: 864 multiple-myeloma leukemia patients undergoing allogeneic haematopoietic cell transplantation (HCT)
- **Objective**: Evaluate risk factors for treatment-related mortality (TRM) and relapse of leukemia
- **Competing risks**: TRM is defined as death in remission (i.e., before relapse) and is thus precluded by relapse
- **Risk factors**: cohort indicator (years 1995–2000 or 2001–2005), type of donor (unrelated or identical sibling), history of a prior transplant, time from diagnosis to transplantation (\<24 months, or ≥ 24 months)
## Example: Competing Risks (III)
- **Bone Marrow Transplant Study**
- Why only one record per patient?
![](images/intro_bmt.png){fig-align="center" width="8in" height="3.3in"}
## Example: More Complex Outcomes (Semi-competing risks)
- **German Breast Cancer (GBC) Study**
- Nonfatal event + terminal event (death)
![](images/intro_gbc1.png){fig-align="center" width="70%"}
## Example: More Complex Outcomes (with Longitudinal data)
- **Anti-Retroviral Drug Trial**
- Repeated measures of CD4 cell count + death
![](images/intro_ard.png){fig-align="center" width="75%"}
## Example: More Complex Outcomes (Multistate process)
- **Breast Cancer Life History Study**
- Remission $\to$ relapse $\to$ metastasis $\to$ death (can skip states)
![](images/intro_bc.png){fig-align="center" width="70%"}
## Example: Composite Endpoints (I)
- **Composite endpoint**: one with multiple components
- Recurrent/multivariate events
- (Semi-)Competing risks
- Longitudinal measurements
- Multistate processes
- Analysis of complex outcomes
- **Marginal approach**: models components separately
- **Conditional approach**: models components jointly
- **Composite approach**: combines components
- Progression/relapse-free survival (time to the earlier of progression/relapse or death)
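As a sketch of the composite approach, progression/relapse-free survival can be derived from the component times. The function name and the use of `math.inf` for an event that was not observed are illustrative choices, and the example times are hypothetical.

```python
import math

def progression_free_survival(relapse_time, death_time, censor_time):
    """Return (time, status) for the composite endpoint.

    Pass math.inf for a component event that was not observed.
    status is 1 if relapse or death occurred by censor_time, else 0.
    """
    first_event = min(relapse_time, death_time)
    if first_event <= censor_time:
        return first_event, 1
    return censor_time, 0

# Relapse at month 12, death at month 30: the composite event is the relapse.
print(progression_free_survival(12, 30, 60))              # (12, 1)
# No relapse, death at month 25: the composite event is the death.
print(progression_free_survival(math.inf, 25, 60))        # (25, 1)
# Neither event by month 48: censored.
print(progression_free_survival(math.inf, math.inf, 48))  # (48, 0)
```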
## Example: Composite Endpoints (II)
- Advantages:
- Concentrates information $\to$ Statistical efficiency
- No need for multiple testing adjustment
- A single measure of overall effect size
- Preferred for primary analysis of Phase-III clinical trials by
- US Food and Drug Administration (FDA)
- ICH (International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use)
- Challenges:
- Statistical efficiency (e.g., events beyond the first are ignored)
- Scientific relevance (e.g., relative importance of components)
# Censoring mechanisms and implications
## Censoring Mechanisms
- **Two mechanisms**
- Study termination (administrative censoring)
- Loss to follow-up (LTFU, e.g., withdrawal, death from other causes)
::: callout-caution
## Caution about censoring
- Event/censoring time $=$ time *from starting point* (e.g., randomization) to event/censoring (as opposed to time on the calendar)
- LTFU may not be independent of outcome (e.g., sicker patients withdraw early)
- Collect withdrawal reasons if possible
- Censoring or competing risk? $\leftarrow$ Domain knowledge
:::
## Censoring Mechanisms: Illustration
- Calendar time vs time synchronized by starting point
![](images/intro_censor.png){fig-align="center" width="90%"}
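The conversion from calendar time to time since the starting point can be sketched as follows. The dates, the cutoff `STUDY_END`, and the helper `to_study_time` are hypothetical, chosen only to illustrate administrative censoring and loss to follow-up.

```python
from datetime import date

STUDY_END = date(2020, 12, 31)  # hypothetical administrative cutoff

def to_study_time(entry, event=None, dropout=None, study_end=STUDY_END):
    """Return (days since entry, status); status 1 = event, 0 = censored."""
    # Censoring occurs at dropout if there was one, otherwise at study end.
    censor = dropout if dropout is not None else study_end
    if event is not None and event <= censor:
        return (event - entry).days, 1
    return (censor - entry).days, 0

# Event observed during follow-up.
a = to_study_time(date(2019, 1, 1), event=date(2019, 7, 1))
# Late entrant with no event: administratively censored at study end.
b = to_study_time(date(2020, 6, 1))
# Withdrawal before any event: censored at dropout (loss to follow-up).
c = to_study_time(date(2019, 3, 1), dropout=date(2019, 3, 31))
print(a, b, c)  # (181, 1) (213, 0) (30, 0)
```

Patients who enter later have a shorter maximum follow-up even though they share the same calendar cutoff.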
## Statistical Implications (I)
- Censored observation
- Not completely missing!
- Partial information: event time $>$ censoring time
- Ignoring partial information $\to$ Bias in inference
- Naive approaches
- Treat censoring as event $\to$ Underestimates time to event
- Exclude censored observations $\to$ Underestimates time to event (longer event times more likely censored)
## Statistical Implications (II)
- **Notation**
- $T$: Outcome event time
- $C$: Censoring time
- Observed data: $X=\min(T, C)$, $\delta = I(T\leq C)$
- $(X, \delta)$ = (`time`, `status`) in previous data examples
- **Estimation**
- **Independent censoring assumption**
$$ C \indep T$$
- **Estimand**: $S(t)={\rm pr}(T > t)$, i.e., the probability of a subject “surviving” beyond time $t$, to be estimated from a random sample of $(X_i, \delta_i)$ $(i=1,\ldots, n)$
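A small simulation sketch of the notation. The exponential event times and uniform censoring times are assumptions made for illustration, drawn independently so that $C \indep T$ holds by construction.

```python
import random

random.seed(1)
n = 5

# Latent times (never both observed in practice); distributions are assumed.
T = [random.expovariate(0.5) for _ in range(n)]  # event times
C = [random.uniform(0, 4) for _ in range(n)]     # censoring times

# Observed data: X = min(T, C) and delta = I(T <= C).
X = [min(t, c) for t, c in zip(T, C)]
delta = [int(t <= c) for t, c in zip(T, C)]

for x, d in zip(X, delta):
    print(f"time = {x:.2f}, status = {d}")
```

When `status` is 1, `time` equals the event time; when it is 0, all we learn is that the event time exceeds `time`.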
## Statistical Implications (III)
- Naive methods
- Event-imputation empirical survival function: $$\hat S_{\rm imp}(t)=n^{-1}\sum_{i=1}^n I(X_i > t) \to {\rm pr}(X > t)\leq S(t)$$
- Complete-case empirical survival function: $$\hat S_{\rm cc}(t)=\frac{\sum_{i=1}^n I(X_i > t, \delta_i = 1)}{\sum_{i=1}^n\delta_i}
\to {\rm pr}(T > t\mid T\leq C)\leq S(t)$$
- Both naive methods underestimate the true survival function
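The bias of both naive estimators can be checked by simulation. This is a sketch under assumed distributions (unit-rate exponential event times, uniform censoring on $[0, 3]$, evaluated at $t = 1$), not a result from the GBC data.

```python
import math
import random

random.seed(2024)
n = 10_000
t0 = 1.0  # time point at which S(t) is evaluated

# Independent event and censoring times (assumed distributions).
T = [random.expovariate(1.0) for _ in range(n)]
C = [random.uniform(0.0, 3.0) for _ in range(n)]
X = [min(t, c) for t, c in zip(T, C)]
delta = [int(t <= c) for t, c in zip(T, C)]

# True survival probability for a unit-rate exponential.
S_true = math.exp(-t0)

# Event imputation: treat every observed time as an event time.
S_imp = sum(x > t0 for x in X) / n

# Complete case: keep only subjects whose event was observed.
S_cc = sum(x > t0 and d == 1 for x, d in zip(X, delta)) / sum(delta)

print(f"true = {S_true:.3f}, imputation = {S_imp:.3f}, complete-case = {S_cc:.3f}")
```

Both estimates fall well below the true value, in line with the two inequalities above.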
## Statistical Implications: Example
- **German Breast Cancer (GBC) Study**
![](images/intro_gbc_naive.png){fig-align="center" width="70%"}
# Summarizing Raw Data
## Importance of Descriptive Analysis
- All statistical models rely on assumptions to some degree
- Good practice to summarize data descriptively as first step
- Get to know the data
- Informs subsequent analysis
- Check balance of baseline characteristics between randomized arms
- “Table 1” in medical research papers
- Two types of summary statistics
- Subject-level characteristics (baseline variables, number of events per subject)
- Event rates (over aggregate length of follow-up)
## How to Calculate Event Rate (I)
- Length of follow-up is **event-specific**
- If an event is "non-recurrent", its occurrence means patient is no longer at risk for it
![](images/intro_erate_form.png){fig-align="center" width="75%"}
::: callout-note
The denominator is called person-years (or person-time) of follow-up.
:::
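A minimal sketch of an event rate for a non-recurrent event, assuming the usual definition of number of events divided by total person-time at risk. The follow-up times and the helper `event_rate` are hypothetical.

```python
def event_rate(times, statuses):
    """Events per unit of person-time: total events / total follow-up."""
    return sum(statuses) / sum(times)

# Hypothetical follow-up in years; status 1 = event, 0 = censored.
times = [2.0, 3.5, 1.0, 4.0, 2.5]  # 13.0 person-years in total
statuses = [1, 0, 1, 0, 0]         # 2 events

rate = event_rate(times, statuses)
print(f"{100 * rate:.1f} events per 100 person-years")  # 15.4 events per 100 person-years
```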
## How to Calculate Event Rate (II)
- Semi-competing risks
![](images/intro_erate1.png){fig-align="center" width="50%"}
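With a nonfatal event and death, each event type gets its own denominator. The sketch below assumes a patient stops being at risk for relapse at the relapse itself (or at death or censoring, if earlier) but stays at risk for death until death or censoring. The patient tuples are made up.

```python
# Hypothetical patients: (relapse time or None, end of follow-up, died),
# where end of follow-up is the time of death or censoring, in years.
patients = [
    (1.0, 3.0, 1),   # relapsed at 1.0, died at 3.0
    (None, 2.0, 1),  # died at 2.0 without relapse
    (2.5, 4.0, 0),   # relapsed at 2.5, censored at 4.0
    (None, 5.0, 0),  # censored at 5.0 with no events
]

# Relapse: at risk until relapse, or until death/censoring if no relapse.
relapse_events = sum(r is not None for r, _, _ in patients)
relapse_person_years = sum(r if r is not None else end for r, end, _ in patients)

# Death: at risk until death or censoring, regardless of relapse.
death_events = sum(died for _, _, died in patients)
death_person_years = sum(end for _, end, _ in patients)

relapse_rate = relapse_events / relapse_person_years  # 2 / 10.5
death_rate = death_events / death_person_years        # 2 / 14.0
print(f"relapse: {relapse_rate:.3f}, death: {death_rate:.3f} per person-year")
```

The two rates share the same number of events here but differ because their denominators differ.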
## How to Calculate Event Rate (III)
- Recurrent events
![](images/intro_erate2.png){fig-align="center" width="70%"}
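For recurrent events, a patient remains at risk after each event, so the sketch below counts every event in the numerator and uses each patient's full follow-up in the denominator. The long-format records are hypothetical.

```python
# Hypothetical long-format records: (patient id, time in years, status),
# with status 1 = event and 0 = end of follow-up.
records = [
    (1, 0.5, 1),
    (1, 1.2, 1),
    (1, 2.0, 0),
    (2, 3.0, 0),
    (3, 0.8, 1),
    (3, 1.5, 0),
]

# Numerator: all events, including repeats within a patient.
n_events = sum(status for _, _, status in records)

# Denominator: each patient's total follow-up is their latest recorded time.
follow_up = {}
for pid, time, _ in records:
    follow_up[pid] = max(follow_up.get(pid, 0.0), time)
person_years = sum(follow_up.values())

rate = n_events / person_years
print(f"{n_events} events over {person_years} person-years")  # 3 events over 6.5 person-years
```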
## Table One: Example
![](images/intro_tab1.png){fig-align="center" width="75%"}
# Conclusion
## Chapter Summary
- Types of time-to-event outcomes
- Univariate, recurrent, multivariate/clustered, (semi-)competing risks, repeated measures, multistate processes, and everything in between…
- Common feature: censoring
- Arises if study ends or patient drops out prior to event
- Must be handled with care to avoid false conclusions
- Importance of descriptive analysis
- Event rate $\to$ attention to denominator
## HW1 (Due Feb 5)
- Choose one
- Problem 1.1 (Recommended for PhD in Stats/BDS)
- Problem 1.2
- Problem 1.8 (Attach your annotated code)
- (Extra credit) Problem 1.3
## Guidelines for HW
- Present a readable and coherent text to report your methods and results
- Include numerical/graphical results only if they contribute to your narrative
- All tables and figures should be properly titled/captioned, with informative labels/legends
- Use full names instead of abbreviations/acronyms
- E.g., “meno” $\to$ “Menopause (yes vs. no)”; “est” $\to$ “Estrogen (fmol/mg)”
- Specify the unit of variable, e.g., "Age (years)"
- See Table 1.11 and Fig. 1.2 for examples
- Append the full code for diagnostic purposes