-
Notifications
You must be signed in to change notification settings - Fork 11
/
proportional-hazards-models.qmd
720 lines (535 loc) · 18.7 KB
/
proportional-hazards-models.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
# Proportional Hazards Models
---
{{< include shared-config.qmd >}}
## Introduction
---
::: notes
Recall: the exponential distribution has constant hazard:
:::
$$
\begin{aligned}
f(t) &= \lambda e^{-\lambda t}\\
S(t) &= e^{-\lambda t}\\
h(t) &= \lambda
\end{aligned}
$$
---
::: notes
Let's make two generalizations.
First, we let the hazard depend on some
covariates $x_1,x_2, \dots, x_p$;
we will indicate this dependence by extending our notation for hazard:
:::
$$\haz(t|\vx) \eqdef \p(T=t|T\ge t, \vX = \vx)$$
---
:::{#def-base-hazard}
#### baseline hazard
:::: notes
The **baseline hazard**, **base hazard**, or **reference hazard**,
denoted $h_0(t)$ or $\lambda_0(t)$, is the [hazard function](intro-to-survival-analysis.qmd#def-hazard) for the
subpopulation of individuals whose covariates
are all equal to their reference levels:
::::
$$\haz_0(t) \eqdef \haz(t | \vX = \v0)$$
:::
---
Similarly:
:::{#def-base-cuhaz}
#### baseline cumulative hazard
:::: notes
The **baseline cumulative hazard**,
**base cumulative hazard**,
or **reference cumulative hazard**,
denoted $H_0(t)$ or $\Lambda_0(t)$,
is the [cumulative hazard function](intro-to-survival-analysis.qmd#def-cuhaz)
for the subpopulation of individuals whose covariates
are all equal to their reference levels:
::::
$$\cuhaz_0(t) \eqdef \cuhaz(t | \vX = \v0)$$
:::
---
Also:
:::{#def-baseline-surv}
#### Baseline survival function
The **baseline survival function**
is the survival function
for an individual whose covariates
are all equal to their default values.
$$S_0(t) \eqdef S(t | \vX = \v0)$$
:::
---
::: notes
As the second generalization,
we let the
base hazard,
cumulative hazard, and
survival functions
depend on $t$,
but not on the covariates (for now).
We can do this using either parametric
or semi-parametric approaches.
:::
## Cox's Proportional Hazards Model
---
::: notes
The generalization is that the hazard function is:
:::
$$
\begin{aligned}
h(t|x)&= h_0(t)\theta(x)\\
\theta(x) &= \exp{\eta(x)}\\
\eta(x) &= \vx\'\vb \\
&\eqdef \beta_1x_1+\cdots+\beta_px_p
\end{aligned}
$$
::: notes
The relationship between $h(t|x)$ and $\eta(x)$ is typically modeled using a log link,
as in a generalized linear model; that is:
:::
$$\log{h(t|x)} = \log{h_0(t)} + \eta(x)$$
---
::: notes
This model is **semi-parametric**,
because the linear predictor depends on estimated
parameters but the base hazard function is unspecified.
There is no constant term in $\eta(x)$,
because it is absorbed in the base hazard.
:::
---
Alternatively, we could define $\beta_0(t) = \log{h_0(t)}$, and then:
$$\eta(x,t) = \beta_0(t) + \beta_1x_1+\cdots+\beta_px_p$$
---
::: notes
For two different individuals with covariate patterns $\boldsymbol x_1$ and $\boldsymbol x_2$, the ratio of the hazard functions (a.k.a. **hazard ratio**, a.k.a. **relative hazard**) is:
:::
$$
\begin{aligned}
\frac{h(t|\boldsymbol x_1)}{h(t|\boldsymbol x_2)}
&=\frac{h_0(t)\theta(\boldsymbol x_1)}{h_0(t)\theta(\boldsymbol x_2)}\\
&=\frac{\theta(\boldsymbol x_1)}{\theta(\boldsymbol x_2)}\\
\end{aligned}
$$
::: notes
Under the proportional hazards model,
this ratio (a.k.a. proportion) does not depend on $t$.
This property is a structural limitation of the model;
it is called the **proportional hazards assumption**.
:::
---
::: {#def-pha}
### proportional hazards
A conditional probability distribution $p(T|X)$
has **proportional hazards** if the hazard ratio $h(t|\boldsymbol x_1)/h(t|\boldsymbol x_2)$ does not depend on $t$. Mathematically, it can be written as:
$$
\frac{h(t|\boldsymbol x_1)}{h(t|\boldsymbol x_2)}
= \theta(\boldsymbol x_1,\boldsymbol x_2)
$$
:::
::: notes
As we saw above, Cox's proportional hazards model has this property, with $\theta(\boldsymbol x_1,\boldsymbol x_2) = \frac{\theta(\boldsymbol x_1)}{\theta(\boldsymbol x_2)}$.
:::
---
:::{.callout-note}
We are using two similar notations, $\theta(\boldsymbol x_1,\boldsymbol x_2)$ and $\theta(\boldsymbol x)$. We can link these notations if we define $\theta(\boldsymbol x) \eqdef \theta(\boldsymbol x, \boldsymbol 0)$ and $\theta(\boldsymbol 0) = 1$.
:::
---
The proportional hazards model also has additional notable properties:
$$
\begin{aligned}
\frac{h(t|\boldsymbol x_1)}{h(t|\boldsymbol x_2)}
&=\frac{\theta(\boldsymbol x_1)}{\theta(\boldsymbol x_2)}\\
&=\frac{\exp{\eta(\boldsymbol x_1)}}{\exp{\eta(\boldsymbol x_2)}}\\
&=\exp{\eta(\boldsymbol x_1)-\eta(\boldsymbol x_2)}\\
&=\exp{\boldsymbol x_1'\beta-\boldsymbol x_2'\beta}\\
&=\exp{(\boldsymbol x_1 - \boldsymbol x_2)'\beta}\\
\end{aligned}
$$
---
Hence on the log scale, we have:
$$
\begin{aligned}
\log{\frac{h(t|\boldsymbol x_1)}{h(t|\boldsymbol x_2)}}
&=\eta(\boldsymbol x_1)-\eta(\boldsymbol x_2)\\
&= \boldsymbol x_1'\beta-\boldsymbol x_2'\beta\\
&= (\boldsymbol x_1 - \boldsymbol x_2)'\beta
\end{aligned}
$$
---
If only one covariate $x_j$ is changing, then:
$$
\begin{aligned}
\log{\frac{h(t|\boldsymbol x_1)}{h(t|\boldsymbol x_2)}}
&= (x_{1j} - x_{2j}) \cdot \beta_j\\
&\propto (x_{1j} - x_{2j})
\end{aligned}
$$
::: notes
That is, under Cox's model $h(t|\boldsymbol x) = h_0(t)\exp{\boldsymbol x'\beta}$, the log of the hazard ratio is proportional to the difference in $x_j$, with the proportionality coefficient equal to $\beta_j$.
:::
---
Further,
$$
\begin{aligned}
\log{h(t|\boldsymbol x)}
&=\log{h_0(t)} + x'\beta
\end{aligned}
$$
::: notes
That is, the covariate effects are additive on the log-hazard scale; hazard functions for different covariate patterns should be vertical shifts of each other.
See also:
<https://en.wikipedia.org/wiki/Proportional_hazards_model#Why_it_is_called_%22proportional%22>
:::
### Additional properties of the proportional hazards model
If $h(t|x)= h_0(t)\theta(x)$, then:
:::{#thm-ph-cuhaz}
#### Cumulative hazards are also proportional to $H_0(t)$
$$
\begin{aligned}
H(t|x)
&\eqdef \int_{u=0}^t h(u)du\\
&= \int_{u=0}^t h_0(u)\theta(x)du\\
&= \theta(x)\int_{u=0}^t h_0(u)du\\
&= \theta(x)H_0(t)
\end{aligned}
$$
where $H_0(t) \eqdef H(t|0) = \int_{u=0}^t h_0(u)du$.
:::
---
:::{#thm-log-cuhaz-parallel}
#### The logarithms of cumulative hazard should be parallel
$$
\log{H(t|\vx)} =\log{H_0(t)} + \vx'\vb
$$
:::
---
:::{#thm-ph-surv}
#### Survival functions are exponential multiples of $S_0(t)$
$$
\begin{aligned}
S(t|x)
&= \exp{-H(t|x)}\\
&= \exp{-\theta(x)\cdot H_0(t)}\\
&= \left(\exp{- H_0(t)}\right)^{\theta(x)}\\
&= \sb{S_0(t)}^{\theta(x)}\\
\end{aligned}
$$
:::
### Testing the proportional hazards assumption
::: notes
The Nelson-Aalen estimate of the cumulative hazard is usually used for estimates of the hazard and often the cumulative hazard.
If the hazards of the three groups are proportional, that means that the ratio of the hazards is constant over $t$. We can test this using the ratios of the estimated cumulative hazards, which also would be
proportional, as shown above.
:::
```{r}
library(KMsurv)
library(survival)
library(dplyr)
data(bmt)
bmt =
bmt |>
as_tibble() |>
mutate(
group =
group |>
factor(
labels = c("ALL","Low Risk AML","High Risk AML")))
nafit = survfit(
formula = Surv(t2,d3) ~ group,
type = "fleming-harrington",
data = bmt)
bmt_curves = tibble(timevec = 1:1000)
sf1 <- with(nafit[1], stepfun(time,c(1,surv)))
sf2 <- with(nafit[2], stepfun(time,c(1,surv)))
sf3 <- with(nafit[3], stepfun(time,c(1,surv)))
bmt_curves =
bmt_curves |>
mutate(
cumhaz1 = -log(sf1(timevec)),
cumhaz2 = -log(sf2(timevec)),
cumhaz3 = -log(sf3(timevec)))
```
```{r}
#| fig-cap: "Hazard Ratios by Disease Group for `bmt` data"
#| label: fig-HR-bmt
library(ggplot2)
bmt_rel_hazard_plot =
bmt_curves |>
ggplot(
aes(
x = timevec,
y = cumhaz1/cumhaz2)
) +
geom_line(aes(col = "ALL/Low Risk AML")) +
ylab("Hazard Ratio") +
xlab("Time") +
ylim(0,6) +
geom_line(aes(y = cumhaz3/cumhaz1, col = "High Risk AML/ALL")) +
geom_line(aes(y = cumhaz3/cumhaz2, col = "High Risk AML/Low Risk AML")) +
theme_bw() +
labs(colour = "Comparison") +
theme(legend.position="bottom")
print(bmt_rel_hazard_plot)
```
---
We can zoom in on 30-300 days to take a closer look:
```{r}
#| fig-cap: "Hazard Ratios by Disease Group (30-300 Days)"
#| label: fig-HR-bmt-inset
bmt_rel_hazard_plot + xlim(c(30,300))
```
---
The cumulative hazard curves should also be proportional
```{r}
#| fig-cap: "Disease-Free Cumulative Hazard by Disease Group"
#| label: fig-cuhaz-bmt
library(ggfortify)
plot_cuhaz_bmt =
bmt |>
survfit(formula = Surv(t2, d3) ~ group) |>
autoplot(fun = "cumhaz",
mark.time = TRUE) +
ylab("Cumulative hazard")
plot_cuhaz_bmt |> print()
```
---
```{r}
#| fig-cap: "Disease-Free Cumulative Hazard by Disease Group (log-scale)"
#| label: fig-cuhaz-bmt-loglog
plot_cuhaz_bmt +
scale_y_log10() +
scale_x_log10()
```
### Smoothed hazard functions
::: notes
The Nelson-Aalen estimate of the cumulative hazard is usually used for
estimates of the hazard. Since the hazard is the derivative of the
cumulative hazard, we need a smooth estimate of the cumulative hazard,
which is provided by smoothing the step-function cumulative hazard.
The R package `muhaz` handles this for us. What we are looking for is
whether the hazard function is more or less the same shape, increasing,
decreasing, constant, etc. Are the hazards "proportional"?
:::
```{r}
#| fig-cap: "Smoothed Hazard Rate Estimates by Disease Group"
library(muhaz)
muhaz(bmt$t2,bmt$d3,bmt$group=="High Risk AML") |> plot(lwd=2,col=3)
muhaz(bmt$t2,bmt$d3,bmt$group=="ALL") |> lines(lwd=2,col=1)
muhaz(bmt$t2,bmt$d3,bmt$group=="Low Risk AML") |> lines(lwd=2,col=2)
legend("topright",c("ALL","Low Risk AML","High Risk AML"),col=1:3,lwd=2)
```
::: notes
Group 3 was plotted first because it has the highest hazard.
Except for an initial blip in the high risk AML group,
the hazards look roughly proportional.
They are all strongly decreasing.
:::
### Fitting the Proportional Hazards Model
::: notes
How do we fit a proportional hazards regression model? We need to
estimate the coefficients of the covariates, and we need to estimate the
base hazard $h_0(t)$. For the covariates, supposing for simplicity that
there are no tied event times, let the event times for the whole data
set be $t_1, t_2,\ldots,t_D$. Let the risk set at time $t_i$ be $R(t_i)$
and
:::
$$
\begin{aligned}
\eta(\boldsymbol{x}) &= \beta_1x_{1}+\cdots+\beta_p x_{p}\\
\theta(\boldsymbol{x}) &= e^{\eta(\boldsymbol{x})}\\
h(t|X=x)&= h_0(t)e^{\eta(\boldsymbol{x})}=\theta(\boldsymbol{x}) h_0(t)
\end{aligned}
$$
---
Conditional on a single failure at time $t$, the probability that the
event is due to subject $f\in R(t)$ is approximately
$$
\begin{aligned}
\Pr(f \text{ fails}|\text{1 failure at } t)
&= \frac{h_0(t)e^{\eta(\boldsymbol{x}_f)}}{\sum_{k \in R(t)}h_0(t)e^{\eta(\boldsymbol{x}_f)}}\\
&=\frac{\theta(\boldsymbol{x}_f)}{\sum_{k \in R(t)} \theta(\boldsymbol{x}_k)}
\end{aligned}
$$
The logic behind this has several steps. We first fix (ex post) the
failure times and note that in this discrete context, the probability
$p_j$ that a subject $j$ in the risk set fails at time $t$ is just the
hazard of that subject at that time.
If all of the $p_j$ are small, the chance that exactly one subject fails
is
$$
\sum_{k\in R(t)}p_k\left[\prod_{m\in R(t), m\ne k} (1-p_m)\right]\approx\sum_{k\in R(t)}p_k
$$
If subject $i$ is the one who experiences the event of interest at time
$t_i$, then the **partial likelihood** is
$$
\mathcal L^*(\beta|T)=
\prod_i \frac{\theta(x_i)}{\sum_{k \in R(t_i)} \theta(\boldsymbol{x}_k)}
$$
and we can numerically maximize this with respect to the coefficients
$\boldsymbol{\beta}$ that specify
$\eta(\boldsymbol{x}) = \boldsymbol{x}'\boldsymbol{\beta}$. When there
are tied event times adjustments need to be made, but the likelihood is
still similar. Note that we don't need to know the base hazard to solve
for the coefficients.
Once we have coefficient estimates
$\hat{\boldsymbol{\beta}} =(\hat \beta_1,\ldots,\hat\beta_p)$, this also
defines $\hat\eta(x)$ and $\hat\theta(x)$ and then the estimated base
cumulative hazard function is $$\hat H(t)=
\sum_{t_i < t} \frac{d_i}{\sum_{k\in R(t_i)} \theta(x_k)}$$ which
reduces to the Nelson-Aalen estimate when there are no covariates. There
are numerous other estimates that have been proposed as well.
## Cox Model for the `bmt` data
### Fit the model
```{r}
bmt.cox <- coxph(Surv(t2, d3) ~ group, data = bmt)
summary(bmt.cox)
```
The table provides hypothesis tests comparing groups 2 and 3 to group 1.
Group 3 has the highest hazard, so the most significant comparison is
not directly shown.
The coefficient 0.3834 is on the log-hazard-ratio scale, as in
log-risk-ratio. The next column gives the hazard ratio 1.4673, and a
hypothesis (Wald) test.
The (not shown) group 3 vs. group 2 log hazard ratio is 0.3834 + 0.5742
= 0.9576. The hazard ratio is then exp(0.9576) or 2.605.
Inference on all coefficients and combinations can be constructed using
`coef(bmt.cox)` and `vcov(bmt.cox)` as with logistic and poisson
regression.
**Concordance** is agreement of first failure between pairs of subjects
and higher predicted risk between those subjects, omitting
non-informative pairs.
The Rsquare value is Cox and Snell's pseudo R-squared and is not very
useful.
### Tests for nested models
`summary()` prints three tests for whether the model with the group
covariate is better than the one without
- `Likelihood ratio test` (chi-squared)
- `Wald test` (also chi-squared), obtained by adding the squares of the z-scores
- `Score` = log-rank test, as with comparison of survival functions.
The likelihood ratio test is probably best in smaller samples, followed
by the Wald test.
### Survival Curves from the Cox Model
We can take a look at the resulting group-specific curves:
```{r}
#| fig-cap: "Survival Functions for Three Groups by KM and Cox Model"
km_fit = survfit(Surv(t2, d3) ~ group, data = as.data.frame(bmt))
cox_fit = survfit(
bmt.cox,
newdata =
data.frame(
group = unique(bmt$group),
row.names = unique(bmt$group)))
library(survminer)
list(KM = km_fit, Cox = cox_fit) |>
survminer::ggsurvplot(
# facet.by = "group",
legend = "bottom",
legend.title = "",
combine = TRUE,
fun = 'pct',
size = .5,
ggtheme = theme_bw(),
conf.int = FALSE,
censor = FALSE) |>
suppressWarnings() # ggsurvplot() throws some warnings that aren't too worrying
```
When we use `survfit()` with a Cox model, we have to specify the covariate levels we are interested in; the argument `newdata` should include a `data.frame` with the same named columns as the predictors in the Cox model and one or more levels of each.
---
From `?survfit.coxph`:
> If the `newdata` argument is missing, a curve is produced for a
> single "pseudo" subject with covariate values equal to the means component of the fit.
> The resulting curve(s) almost never make sense,
> but the default remains due to an unwarranted attachment to the option shown by some users and by other packages.
> Two particularly egregious examples are factor variables and interactions.
> Suppose one were studying interspecies transmission of a virus, and the data set has a factor variable with levels ("pig", "chicken") and about equal numbers of observations for each.
> The "mean" covariate level will be 0.5 -- is this a flying pig?
### Examining `survfit`
```{r}
survfit(Surv(t2, d3) ~ group, data = bmt)
```
```{r}
survfit(Surv(t2, d3) ~ group, data = bmt) |> summary()
```
```{r}
survfit(bmt.cox)
survfit(bmt.cox, newdata = tibble(group = unique(bmt$group)))
```
```{r}
bmt.cox |>
survfit(newdata = tibble(group = unique(bmt$group))) |>
summary()
```
## Adjustment for Ties (optional)
---
At each time $t_i$ at which more than one of the subjects has an event,
let $d_i$ be the number of events at that time, $D_i$ the set of
subjects with events at that time, and let $s_i$ be a covariate vector
for an artificial subject obtained by adding up the covariate values for
the subjects with an event at time $t_i$. Let
$$\bar\eta_i = \beta_1s_{i1}+\cdots+\beta_ps_{ip}$$ and
$\bar\theta_i = \exp{\bar\eta_i}$.
Let $s_i$ be a covariate vector for an artificial subject obtained by
adding up the covariate values for the subjects with an event at time
$t_i$. Note that
$$
\begin{aligned}
\bar\eta_i &=\sum_{j \in D_i}\beta_1x_{j1}+\cdots+\beta_px_{jp}\\
&= \beta_1s_{i1}+\cdots+\beta_ps_{ip}\\
\bar\theta_i &= \exp{\bar\eta_i}\\
&= \prod_{j \in D_i}\theta_i
\end{aligned}
$$
#### Breslow's method for ties
Breslow's method estimates the partial likelihood as
$$
\begin{aligned}
L(\beta|T) &=
\prod_i \frac{\bar\theta_i}{[\sum_{k \in R(t_i)} \theta_k]^{d_i}}\\
&= \prod_i \prod_{j \in D_i}\frac{\theta_j}{\sum_{k \in R(t_i)} \theta_k}
\end{aligned}
$$
This method is equivalent to treating each event as distinct and using the non-ties formula.
It works best when the number of ties is small.
It is the default in many statistical packages, including PROC PHREG in SAS.
#### Efron's method for ties
The other common method is Efron's, which is the default in R.
$$L(\beta|T)=
\prod_i \frac{\bar\theta_i}{\prod_{j=1}^{d_i}[\sum_{k \in R(t_i)} \theta_k-\frac{j-1}{d_i}\sum_{k \in D_i} \theta_k]}$$
This is closer to the exact discrete partial likelihood when there are
many ties.
The third option in R (and an option also in SAS as `discrete`) is the
"exact" method, which is the same one used for matched logistic
regression.
#### Example: Breslow's method
Suppose as an example we have a time $t$ where there are 20 individuals
at risk and three failures. Let the three individuals have risk
parameters $\theta_1, \theta_2, \theta_3$ and let the sum of the risk
parameters of the remaining 17 individuals be $\theta_R$. Then the
factor in the partial likelihood at time $t$ using Breslow's method is
::: smaller
$$
\left(\frac{\theta_1}{\theta_R+\theta_1+\theta_2+\theta_3}\right)
\left(\frac{\theta_2}{\theta_R+\theta_1+\theta_2+\theta_3}\right)
\left(\frac{\theta_3}{\theta_R+\theta_1+\theta_2+\theta_3}\right)
$$
:::
If on the other hand, they had died in the order 1,2, 3, then the
contribution to the partial likelihood would be:
::: smaller
$$
\left(\frac{\theta_1}{\theta_R+\theta_1+\theta_2+\theta_3}\right)
\left(\frac{\theta_2}{\theta_R+\theta_2+\theta_3}\right)
\left(\frac{\theta_3}{\theta_R+\theta_3}\right)
$$
:::
as the risk set got smaller with each failure. The exact method roughly
averages the results for the six possible orderings of the failures.
#### Example: Efron's method
But we don't know the order they failed in, so instead of reducing the
denominator by one risk coefficient each time, we reduce it by the same
fraction. This is Efron's method.
::: smaller
$$\left(\frac{\theta_1}{\theta_R+\theta_1+\theta_2+\theta_3}\right)
\left(\frac{\theta_2}{\theta_R+2(\theta_1+\theta_2+\theta_3)/3}\right)
\left(\frac{\theta_3}{\theta_R+(\theta_1+\theta_2+\theta_3)/3}\right)$$
:::
{{< include coxph-model-building.qmd >}}