diff --git a/.nojekyll b/.nojekyll index c95b8ac77..f58bbc9da 100644 --- a/.nojekyll +++ b/.nojekyll @@ -1 +1 @@ -f37b9049 \ No newline at end of file +5de40da9 \ No newline at end of file diff --git a/Linear-models-overview.html b/Linear-models-overview.html index 00c0e6ab5..487afc050 100644 --- a/Linear-models-overview.html +++ b/Linear-models-overview.html @@ -3926,7 +3926,7 @@

Note

full -6.818 +6.995 0.4805 0.3831 5.956 @@ -3934,7 +3934,7 @@

Note

reduced -6.703 +6.595 0.4454 0.3802 5.971 diff --git a/Linear-models-overview_files/figure-html/unnamed-chunk-104-1.png b/Linear-models-overview_files/figure-html/unnamed-chunk-104-1.png index 45acb305a..df3d0b4d4 100644 Binary files a/Linear-models-overview_files/figure-html/unnamed-chunk-104-1.png and b/Linear-models-overview_files/figure-html/unnamed-chunk-104-1.png differ diff --git a/Linear-models-overview_files/figure-html/unnamed-chunk-98-1.png b/Linear-models-overview_files/figure-html/unnamed-chunk-98-1.png index 2c16ca282..895d1257f 100644 Binary files a/Linear-models-overview_files/figure-html/unnamed-chunk-98-1.png and b/Linear-models-overview_files/figure-html/unnamed-chunk-98-1.png differ diff --git a/Regression-Models-for-Epidemiology.pdf b/Regression-Models-for-Epidemiology.pdf index 28ddef35c..8ce4d73eb 100644 Binary files a/Regression-Models-for-Epidemiology.pdf and b/Regression-Models-for-Epidemiology.pdf differ diff --git a/index.html b/index.html index 337d2a23a..ef538a922 100644 --- a/index.html +++ b/index.html @@ -395,7 +395,7 @@

Table of contents

Published
-

Last modified: 2024-05-27: 18:34:22 (PM)

+

Last modified: 2024-05-28: 14:03:32 (PM)

diff --git a/intro-MLEs.html b/intro-MLEs.html index 378ee3410..b81d08d75 100644 --- a/intro-MLEs.html +++ b/intro-MLEs.html @@ -2082,7 +2082,7 @@
diff --git a/logistic-regression.html b/logistic-regression.html index 6f0f55aab..b249b301b 100644 --- a/logistic-regression.html +++ b/logistic-regression.html @@ -3703,8 +3703,8 @@

<
ggplotly(HL_plot)
-
- +
+
@@ -3855,8 +3855,8 @@

wcgs_response_resid_plot |> ggplotly()
-
- +
+

We can see a slight fan-shape here: observations on the right have larger variance (as expected since \(var(\bar y) = \pi(1-\pi)/n\) is maximized when \(\pi = 0.5\)).
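As a quick check of the parenthetical claim, here is a minimal sketch (the grid and group size are arbitrary illustration values) evaluating \(\pi(1-\pi)/n\):
Show R code
# sketch: the binomial proportion variance pi*(1-pi)/n is largest at pi = 0.5
pi_grid <- seq(0, 1, by = 0.01)
n <- 50                                # assumed group size, for illustration only
v <- pi_grid * (1 - pi_grid) / n
pi_grid[which.max(v)]                  # 0.5
max(v)                                 # 0.25 / n = 0.005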

@@ -3992,8 +3992,8 @@

Re
wcgs_resid_plot1 |> ggplotly()
-
- +
+
diff --git a/proportional-hazards-models.html b/proportional-hazards-models.html index 1eb1ee4c9..26654b583 100644 --- a/proportional-hazards-models.html +++ b/proportional-hazards-models.html @@ -336,29 +336,28 @@ @@ -378,7 +377,7 @@
Published
-

Last modified: 2024-05-02: 18:10:45 (PM)

+

Last modified: 2024-05-28: 14:03:32 (PM)

@@ -448,10 +447,9 @@ .quarto-figure-center > figure { text-align: center; } -

-6.1 The proportional hazards model

-

-6.1.1 Background on the Proportional Hazards Model

+

+6.1 Introduction

+

The exponential distribution has constant hazard:

\[ \begin{aligned} @@ -460,12 +458,19 @@ h(t) &= \lambda \end{aligned} \]

+

Let’s make two generalizations. First, we let the hazard depend on some covariates \(x_1,x_2, \dots, x_p\); we will indicate this dependence by extending our notation for hazard:

+

\[h(t|\boldsymbol x) \stackrel{\text{def}}{=}p(T=t|T\ge t, \boldsymbol X = \boldsymbol x)\]

+

Second, we let the base hazard depend on \(t\), but not on the covariates (for now). We can do this using either parametric or semi-parametric approaches.

-

-6.1.2 Cox’s Proportional Hazards Model

-

The generalization is that the hazard function is

+ +

+6.2 Cox’s Proportional Hazards Model

+
+
+

The generalization is that the hazard function is:

+

\[ \begin{aligned} h(t|x)&= h_0(t)\theta(x)\\ @@ -474,18 +479,25 @@ &\stackrel{\text{def}}{=}\beta_1x_1+\cdots+\beta_px_p \end{aligned} \]

+

The relationship between \(h(t|x)\) and \(\eta(x)\) has a log link (that is, \(\text{log}\left\{h(t|x)\right\} = \text{log}\left\{h_0(t)\right\} + \eta(x)\)), as in a generalized linear model.

This model is semi-parametric, because the linear predictor depends on estimated parameters but the base hazard function is unspecified. There is no constant term in \(\eta(x)\), because it is absorbed in the base hazard.

-

Alternatively, we could define \(\beta_0(t) = \text{log}\left\{h_0(t)\right\}\), and then \(\eta(x,t) = \beta_0(t) + \beta_1x_1+\cdots+\beta_px_p\).

+
+
+

Alternatively, we could define \(\beta_0(t) = \text{log}\left\{h_0(t)\right\}\), and then:

+

\[\eta(x,t) = \beta_0(t) + \beta_1x_1+\cdots+\beta_px_p\]

+
+

For two different individuals with covariate patterns \(\boldsymbol x_1\) and \(\boldsymbol x_2\), the ratio of the hazard functions (a.k.a. hazard ratio, a.k.a. relative hazard) is:

+

\[ \begin{aligned} \frac{h(t|\boldsymbol x_1)}{h(t|\boldsymbol x_2)} &=\frac{h_0(t)\theta(\boldsymbol x_1)}{h_0(t)\theta(\boldsymbol x_2)}\\ &=\frac{\theta(\boldsymbol x_1)}{\theta(\boldsymbol x_2)}\\ \end{aligned} -\]

-

Under the proportional hazards model, this ratio (a.k.a. proportion) does not depend on \(t\). This property is a structural limitation of the model; it is called the proportional hazards assumption.

+\] Under the proportional hazards model, this ratio (a.k.a. proportion) does not depend on \(t\). This property is a structural limitation of the model; it is called the proportional hazards assumption.

+

Definition 6.1 (proportional hazards) A conditional probability distribution \(p(T|X)\) has proportional hazards if the hazard ratio \(h(t|\boldsymbol x_1)/h(t|\boldsymbol x_2)\) does not depend on \(t\). Mathematically, it can be written as:

\[ @@ -493,7 +505,10 @@ = \theta(\boldsymbol x_1,\boldsymbol x_2) \]

+

As we saw above, Cox’s proportional hazards model has this property, with \(\theta(\boldsymbol x_1,\boldsymbol x_2) = \frac{\theta(\boldsymbol x_1)}{\theta(\boldsymbol x_2)}\).

+
+
@@ -507,7 +522,8 @@

We are using two similar notations, \(\theta(\boldsymbol x_1,\boldsymbol x_2)\) and \(\theta(\boldsymbol x)\). We can link these notations if we define \(\theta(\boldsymbol x) \stackrel{\text{def}}{=}\theta(\boldsymbol x, \boldsymbol 0)\) and \(\theta(\boldsymbol 0) = 1\).

-

It also has additional notable properties:

+
+

The proportional hazards model also has additional notable properties:

\[ \begin{aligned} \frac{h(t|\boldsymbol x_1)}{h(t|\boldsymbol x_2)} @@ -518,6 +534,7 @@ &=\text{exp}\left\{(\boldsymbol x_1 - \boldsymbol x_2)'\beta\right\}\\ \end{aligned} \]

+

Hence on the log scale, we have:

\[ \begin{aligned} @@ -527,6 +544,7 @@ &= (\boldsymbol x_1 - \boldsymbol x_2)'\beta \end{aligned} \]

+

If only one covariate \(x_j\) is changing, then:

\[ \begin{aligned} @@ -535,7 +553,10 @@ &\propto (x_{1j} - x_{2j}) \end{aligned} \]

+

That is, under Cox’s model \(h(t|\boldsymbol x) = h_0(t)\text{exp}\left\{\boldsymbol x'\beta\right\}\), the log of the hazard ratio is proportional to the difference in \(x_j\), with the proportionality coefficient equal to \(\beta_j\).
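A minimal numeric sketch of this statement (the coefficient and covariate values below are made up for illustration):
Show R code
# log hazard ratio for a change in a single covariate x_j under Cox's model
beta_j <- 0.4                    # assumed coefficient: log hazard ratio per unit of x_j
x1j <- 3; x2j <- 1               # two covariate values being compared
log_HR <- beta_j * (x1j - x2j)   # proportional to the difference, with slope beta_j
exp(log_HR)                      # hazard ratio: exp(0.8) ~= 2.23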

+
+

Further,

\[ \begin{aligned} @@ -543,15 +564,16 @@ &=\text{log}\left\{h_0(t)\right\} + x'\beta \end{aligned} \]

-

That is, the covariate effects are additive on the log-hazard scale.

+
+

That is, the covariate effects are additive on the log-hazard scale; the log-hazard functions for different covariate patterns should be vertical shifts of each other.

See also:

https://en.wikipedia.org/wiki/Proportional_hazards_model#Why_it_is_called_%22proportional%22

-

-6.1.3 Additional properties of the proportional hazards model

+ +

+6.2.1 Additional properties of the proportional hazards model

If \(h(t|x)= h_0(t)\theta(x)\), then:

-

Cumulative hazards are also proportional to \(H_0(t)\) -

-

\[ +

+

Theorem 6.1 (Cumulative hazards are also proportional to \(H_0(t)\)) \[ \begin{aligned} H(t|x) &\stackrel{\text{def}}{=}\int_{u=0}^t h(u)du\\ @@ -561,26 +583,37 @@ \end{aligned} \]

where \(H_0(t) \stackrel{\text{def}}{=}H(t|0) = \int_{u=0}^t h_0(u)du\).

-

Survival functions are exponential multiples of \(S_0(t)\) -

-

\[ + +


+
+

Theorem 6.2 (The logarithms of cumulative hazard should be parallel) \[ +\text{log}\left\{H(t|\tilde{x})\right\} =\text{log}\left\{H_0(t)\right\} + \tilde{x}'\tilde{\beta} +\]

+
+
+
+

Theorem 6.3 (Survival functions are exponential multiples of \(S_0(t)\)) \[ \begin{aligned} S(t|x) &= \text{exp}\left\{-H(t|x)\right\}\\ &= \text{exp}\left\{-\theta(x)\cdot H_0(t)\right\}\\ &= \left(\text{exp}\left\{- H_0(t)\right\}\right)^{\theta(x)}\\ -&= \left(S_0(t)\right)^{\theta(x)}\\ +&= \left[S_0(t)\right]^{\theta(x)}\\ \end{aligned} \]

where \(S_0(t) \stackrel{\text{def}}{=}P(T\ge t | \boldsymbol X = 0)\) is the survival function for an individual whose covariates are all equal to their default values.
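A quick numerical check of Theorem 6.3, assuming an exponential baseline hazard (the values of \(\lambda_0\), \(\beta\), and \(x\) below are arbitrary illustration choices):
Show R code
lambda0 <- 0.1                        # assumed baseline hazard
beta    <- 0.7                        # assumed log-hazard-ratio coefficient
x       <- 1                          # assumed covariate value
theta   <- exp(beta * x)              # hazard multiplier theta(x)
t       <- seq(0, 10, by = 0.5)
S0      <- exp(-lambda0 * t)          # baseline survival S0(t)
S_x     <- exp(-lambda0 * theta * t)  # survival under hazard lambda0 * theta(x)
all.equal(S_x, S0^theta)              # TRUE: S(t|x) = S0(t)^theta(x)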

-

-6.1.4 Testing the proportional hazards assumption

+ +

+6.2.2 Testing the proportional hazards assumption

+

The Nelson-Aalen estimator of the cumulative hazard is usually the basis for estimates of the hazard, and is often used directly for the cumulative hazard.

If the hazards of the three groups are proportional, that means that the ratio of the hazards is constant over \(t\). We can test this using the ratios of the estimated cumulative hazards, which also would be proportional, as shown above.

+
Show R code

 library(KMsurv)
 library(survival)
+library(dplyr)
 data(bmt)
 
 bmt = 
@@ -631,43 +664,78 @@
 
 print(bmt_rel_hazard_plot)
-
-
Hazard Ratios by Disease Group

+
+
+Figure 6.1: Hazard Ratios by Disease Group for bmt data +
+ +
+

We can zoom in on 30-300 days to take a closer look:

Show R code
bmt_rel_hazard_plot + xlim(c(30,300))
-
-
Hazard Ratios by Disease Group (30-300 Days)

+
+
+Figure 6.2: Hazard Ratios by Disease Group (30-300 Days) +
+ +
-

-6.1.5 Smoothed hazard functions

-

The Nelson-Aalen estimate of the cumulative hazard is usually used for estimates of the hazard. Since the hazard is the derivative of the cumulative hazard, we need a smooth estimate of the cumulative hazard, which is provided by smoothing the step-function cumulative hazard.

-

The R package muhaz handles this for us. What we are looking for is whether the hazard function is more or less the same shape, increasing, decreasing, constant, etc. Are the hazards “proportional”?

+
+

The cumulative hazard curves should also be proportional

-
Show R code
plot(
-  survfit(Surv(t2,d3)~group,data=bmt),
-  col=1:3,
-  lwd=2,
-  fun="cumhaz",
-  mark.time = TRUE)
-legend("bottomright",c("ALL","Low Risk AML","High Risk AML"),col=1:3,lwd=2)
+
Show R code
library(ggfortify)
+plot_cuhaz_bmt =
+  bmt |>
+  survfit(formula = Surv(t2, d3) ~ group) |>
+  autoplot(fun = "cumhaz",
+           mark.time = TRUE) + 
+  ylab("Cumulative hazard")
+  
+plot_cuhaz_bmt |> print()
-
-
Disease-Free Cumulative Hazard by Disease Group

+
+
+Figure 6.3: Disease-Free Cumulative Hazard by Disease Group +
+ +
+
-
Show R code
library(muhaz)
+
Show R code
plot_cuhaz_bmt + 
+  scale_y_log10() +
+  scale_x_log10()
+
+
+
+Figure 6.4: Disease-Free Cumulative Hazard by Disease Group (log-scale) +
+ +
+
+
+
+
+

+6.2.3 Smoothed hazard functions

+
+

The Nelson-Aalen estimate of the cumulative hazard is usually used for estimates of the hazard. Since the hazard is the derivative of the cumulative hazard, we need a smooth estimate of the cumulative hazard, which is provided by smoothing the step-function cumulative hazard.

+

The R package muhaz handles this for us. What we are looking for is whether the hazard function is more or less the same shape, increasing, decreasing, constant, etc. Are the hazards “proportional”?

+
+
+
Show R code
library(muhaz)
 
 muhaz(bmt$t2,bmt$d3,bmt$group=="High Risk AML") |> plot(lwd=2,col=3)
 muhaz(bmt$t2,bmt$d3,bmt$group=="ALL") |> lines(lwd=2,col=1)
@@ -675,16 +743,20 @@
 legend("topright",c("ALL","Low Risk AML","High Risk AML"),col=1:3,lwd=2)
-
Smoothed Hazard Rate Estimates by Disease Group

+
Smoothed Hazard Rate Estimates by Disease Group

+

Group 3 was plotted first because it has the highest hazard.

-

We will see that except for an initial blip in the high risk AML group, the hazards look roughly proportional . They are all strongly decreasing.

-

-6.1.6 Fitting the Proportional Hazards Model

+

Except for an initial blip in the high risk AML group, the hazards look roughly proportional. They are all strongly decreasing.
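A formal complement to these graphical checks, shown here only as a sketch and not used in the text above, is the Schoenfeld-residual test provided by survival::cox.zph():
Show R code
library(survival)
fit_ph <- coxph(Surv(t2, d3) ~ group, data = bmt)  # assumes bmt prepared as above; fit_ph is a hypothetical name
cox.zph(fit_ph)  # small p-values would suggest the proportional hazards assumption fails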

+ +

+6.2.4 Fitting the Proportional Hazards Model

+

How do we fit a proportional hazards regression model? We need to estimate the coefficients of the covariates, and we need to estimate the base hazard \(h_0(t)\). For the covariates, supposing for simplicity that there are no tied event times, let the event times for the whole data set be \(t_1, t_2,\ldots,t_D\). Let the risk set at time \(t_i\) be \(R(t_i)\) and

+

\[ \begin{aligned} \eta(\boldsymbol{x}) &= \beta_1x_{1}+\cdots+\beta_p x_{p}\\ @@ -692,6 +764,7 @@ h(t|X=x)&= h_0(t)e^{\eta(\boldsymbol{x})}=\theta(\boldsymbol{x}) h_0(t) \end{aligned} \]

+

Conditional on a single failure at time \(t\), the probability that the event is due to subject \(f\in R(t)\) is approximately

\[ \begin{aligned} @@ -713,12 +786,12 @@

and we can numerically maximize this with respect to the coefficients \(\boldsymbol{\beta}\) that specify \(\eta(\boldsymbol{x}) = \boldsymbol{x}'\boldsymbol{\beta}\). When there are tied event times, adjustments need to be made, but the likelihood is still similar. Note that we don’t need to know the base hazard to solve for the coefficients.
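To make the partial likelihood concrete, here is a minimal sketch on a tiny made-up dataset with no ties: it evaluates the partial log-likelihood by hand at the fitted coefficient and compares it to the value reported by coxph().
Show R code
library(survival)
d <- data.frame(time = c(2, 4, 6, 9), status = c(1, 1, 0, 1), x = c(1, 0, 1, 0))
fit <- coxph(Surv(time, status) ~ x, data = d)
b <- coef(fit)
# sum over event times of: eta(x_i) - log( sum over the risk set of exp(eta(x_k)) )
event_times <- d$time[d$status == 1]
logL <- sum(sapply(event_times, function(t) {
  at_risk <- d$time >= t                      # risk set R(t)
  i <- which(d$time == t & d$status == 1)     # the subject who fails at t
  b * d$x[i] - log(sum(exp(b * d$x[at_risk])))
}))
c(by_hand = logL, coxph = fit$loglik[2])      # the two values should agree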

Once we have coefficient estimates \(\hat{\boldsymbol{\beta}} =(\hat \beta_1,\ldots,\hat\beta_p)\), this also defines \(\hat\eta(x)\) and \(\hat\theta(x)\) and then the estimated base cumulative hazard function is \[\hat H(t)= \sum_{t_i < t} \frac{d_i}{\sum_{k\in R(t_i)} \theta(x_k)}\] which reduces to the Nelson-Aalen estimate when there are no covariates. There are numerous other estimates that have been proposed as well.
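As a sketch of this last step, an estimated baseline cumulative hazard of this form (up to the handling of ties) can be extracted from a fitted Cox model with survival::basehaz(); fit_bmt below is a hypothetical name for a fit to the bmt data prepared above.
Show R code
library(survival)
fit_bmt <- coxph(Surv(t2, d3) ~ group, data = bmt)   # assumes bmt prepared as above
H0_hat  <- basehaz(fit_bmt, centered = FALSE)        # data frame: cumulative hazard vs. time
head(H0_hat)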

-

-6.2 Cox Model for the bmt data

-

-6.2.1 Fit the model

+

+6.3 Cox Model for the bmt data

+

+6.3.1 Fit the model

-
Show R code

+
Show R code

 bmt.cox <- coxph(Surv(t2, d3) ~ group, data = bmt)
 summary(bmt.cox)
 #> Call:
@@ -758,11 +831,11 @@
 Score = log-rank test, as with comparison of survival functions.
 
 

The likelihood ratio test is probably best in smaller samples, followed by the Wald test.
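A sketch of how the likelihood ratio statistic reported above can be computed directly from the fitted model object (assumes the bmt.cox fit above):
Show R code
# twice the gain in partial log-likelihood over the null model
lrt <- 2 * diff(bmt.cox$loglik)                     # loglik = c(null model, fitted model)
c(statistic = lrt, p = pchisq(lrt, df = 2, lower.tail = FALSE))  # ~13.4 on 2 df, as in the summary above
anova(bmt.cox)   # sequential analysis-of-deviance table gives the same test for `group`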

-

-6.2.2 Survival Curves from the Cox Model

+

+6.3.2 Survival Curves from the Cox Model

We can take a look at the resulting group-specific curves:

-
Show R code

+
Show R code

 #| fig-cap: "Survival Functions for Three Groups by KM and Cox Model"
 
 km_fit = survfit(Surv(t2, d3) ~ group, data = as.data.frame(bmt))
@@ -790,7 +863,7 @@
   suppressWarnings() # ggsurvplot() throws some warnings that aren't too worrying
-

+

@@ -799,11 +872,11 @@

Otherwise (that is, if the newdata argument is missing), a curve is produced for a single “pseudo” subject with covariate values equal to the means component of the fit.

The resulting curve(s) almost never make sense, but the default remains due to an unwarranted attachment to the option shown by some users and by other packages.

Two particularly egregious examples are factor variables and interactions. Suppose one were studying interspecies transmission of a virus, and the data set has a factor variable with levels (“pig”, “chicken”) and about equal numbers of observations for each. The “mean” covariate level will be 0.5 – is this a flying pig?

-

-6.2.3 Examining survfit +

+6.3.3 Examining survfit

-
Show R code
survfit(Surv(t2, d3)~group,data=bmt)
+
Show R code
survfit(Surv(t2, d3)~group,data=bmt)
 #> Call: survfit(formula = Surv(t2, d3) ~ group, data = bmt)
 #> 
 #>                      n events median 0.95LCL 0.95UCL
@@ -813,7 +886,7 @@
 
-
Show R code
survfit(Surv(t2, d3)~group,data=bmt) |> summary()
+
Show R code
survfit(Surv(t2, d3)~group,data=bmt) |> summary()
 #> Call: survfit(formula = Surv(t2, d3) ~ group, data = bmt)
 #> 
 #>                 group=ALL 
@@ -908,7 +981,7 @@
 
-
Show R code
survfit(bmt.cox)
+
Show R code
survfit(bmt.cox)
 #> Call: survfit(formula = bmt.cox)
 #> 
 #>        n events median 0.95LCL 0.95UCL
@@ -923,7 +996,7 @@
 
-
Show R code
bmt.cox |> 
+
Show R code
bmt.cox |> 
   survfit(newdata = tibble(group = unique(bmt$group))) |> 
   summary()
 #> Call: survfit(formula = bmt.cox, newdata = tibble(group = unique(bmt$group)))
@@ -1007,10 +1080,10 @@
 #>  2204      9       1     0.313     0.520     0.182
-

-6.3 Adjustment for Ties (optional)

-

-6.3.1

+

+6.4 Adjustment for Ties (optional)

+

+6.4.1

At each time \(t_i\) at which more than one of the subjects has an event, let \(d_i\) be the number of events at that time, \(D_i\) the set of subjects with events at that time, and let \(s_i\) be a covariate vector for an artificial subject obtained by adding up the covariate values for the subjects with an event at time \(t_i\). Let \[\bar\eta_i = \beta_1s_{i1}+\cdots+\beta_ps_{ip}\] and \(\bar\theta_i = \text{exp}\left\{\bar\eta_i\right\}\).
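In practice, coxph() implements such adjustments through its ties argument (Efron's approximation is the default); a minimal sketch on the bmt data:
Show R code
library(survival)
coxph(Surv(t2, d3) ~ group, data = bmt, ties = "efron")    # default approximation
coxph(Surv(t2, d3) ~ group, data = bmt, ties = "breslow")  # simpler alternative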

Let \(s_i\) be a covariate vector for an artificial subject obtained by adding up the covariate values for the subjects with an event at time \(t_i\). Note that

\[ @@ -1681,539 +1754,638 @@

diff --git a/proportional-hazards-models_files/figure-html/fig-HR-bmt-1.png b/proportional-hazards-models_files/figure-html/fig-HR-bmt-1.png new file mode 100644 index 000000000..82124c496 Binary files /dev/null and b/proportional-hazards-models_files/figure-html/fig-HR-bmt-1.png differ diff --git a/proportional-hazards-models_files/figure-html/fig-HR-bmt-inset-1.png b/proportional-hazards-models_files/figure-html/fig-HR-bmt-inset-1.png new file mode 100644 index 000000000..2561b4b03 Binary files /dev/null and b/proportional-hazards-models_files/figure-html/fig-HR-bmt-inset-1.png differ diff --git a/proportional-hazards-models_files/figure-html/fig-cuhaz-bmt-1.png b/proportional-hazards-models_files/figure-html/fig-cuhaz-bmt-1.png new file mode 100644 index 000000000..ec3033609 Binary files /dev/null and b/proportional-hazards-models_files/figure-html/fig-cuhaz-bmt-1.png differ diff --git a/proportional-hazards-models_files/figure-html/fig-cuhaz-bmt-loglog-1.png b/proportional-hazards-models_files/figure-html/fig-cuhaz-bmt-loglog-1.png new file mode 100644 index 000000000..19879f72c Binary files /dev/null and b/proportional-hazards-models_files/figure-html/fig-cuhaz-bmt-loglog-1.png differ diff --git a/proportional-hazards-models_files/figure-html/unnamed-chunk-10-1.png b/proportional-hazards-models_files/figure-html/unnamed-chunk-10-1.png new file mode 100644 index 000000000..b57649186 Binary files /dev/null and b/proportional-hazards-models_files/figure-html/unnamed-chunk-10-1.png differ diff --git a/proportional-hazards-models_files/figure-html/unnamed-chunk-8-1.png b/proportional-hazards-models_files/figure-html/unnamed-chunk-8-1.png new file mode 100644 index 000000000..27823dc9d Binary files /dev/null and b/proportional-hazards-models_files/figure-html/unnamed-chunk-8-1.png differ diff --git a/search.json b/search.json index 430f8c435..6f90da5d4 100644 --- a/search.json +++ b/search.json @@ -193,7 +193,7 @@ "href": "Linear-models-overview.html#model-selection-1", "title": "\n2  Linear (Gaussian) Models\n", "section": "\n2.9 Model selection", - "text": "2.9 Model selection\n(adapted from Dobson and Barnett (2018) §6.3.3; for more information on prediction, see James et al. (2013) and Harrell (2015)).\n\nIf we have a lot of covariates in our dataset, we might want to choose a small subset to use in our model.\nThere are a few possible metrics to consider for choosing a “best” model.\n\n\n2.9.1 Mean squared error\nWe might want to minimize the mean squared error, \\(\\text E[(y-\\hat y)^2]\\), for new observations that weren’t in our data set when we fit the model.\nUnfortunately, \\[\\frac{1}{n}\\sum_{i=1}^n (y_i-\\hat y_i)^2\\] gives a biased estimate of \\(\\text E[(y-\\hat y)^2]\\) for new data. If we want an unbiased estimate, we will have to be clever.\n\nCross-validation\n\nShow R codedata(\"carbohydrate\", package = \"dobson\")\nlibrary(cvTools)\nfull_model <- lm(carbohydrate ~ ., data = carbohydrate)\ncv_full = \n full_model |> cvFit(\n data = carbohydrate, K = 5, R = 10,\n y = carbohydrate$carbohydrate)\n\nreduced_model = update(full_model, \n formula = ~ . 
- age)\n\ncv_reduced = \n reduced_model |> cvFit(\n data = carbohydrate, K = 5, R = 10,\n y = carbohydrate$carbohydrate)\n\n\n\n\nShow R coderesults_reduced = \n tibble(\n model = \"wgt+protein\",\n errs = cv_reduced$reps[])\nresults_full = \n tibble(model = \"wgt+age+protein\",\n errs = cv_full$reps[])\n\ncv_results = \n bind_rows(results_reduced, results_full)\n\ncv_results |> \n ggplot(aes(y = model, x = errs)) +\n geom_boxplot()\n\n\n\n\n\n\n\n\ncomparing metrics\n\nShow R code\ncompare_results = tribble(\n ~ model, ~ cvRMSE, ~ r.squared, ~adj.r.squared, ~ trainRMSE, ~loglik,\n \"full\", cv_full$cv, summary(full_model)$r.squared, summary(full_model)$adj.r.squared, sigma(full_model), logLik(full_model) |> as.numeric(),\n \"reduced\", cv_reduced$cv, summary(reduced_model)$r.squared, summary(reduced_model)$adj.r.squared, sigma(reduced_model), logLik(reduced_model) |> as.numeric())\n\ncompare_results\n\n\n\nmodel\ncvRMSE\nr.squared\nadj.r.squared\ntrainRMSE\nloglik\n\n\n\nfull\n6.818\n0.4805\n0.3831\n5.956\n-61.84\n\n\nreduced\n6.703\n0.4454\n0.3802\n5.971\n-62.49\n\n\n\n\n\n\n\nShow R codeanova(full_model, reduced_model)\n\n\n\nRes.Df\nRSS\nDf\nSum of Sq\nF\nPr(>F)\n\n\n\n16\n567.7\nNA\nNA\nNA\nNA\n\n\n17\n606.0\n-1\n-38.36\n1.081\n0.3139\n\n\n\n\n\nstepwise regression\n\nShow R codelibrary(olsrr)\nolsrr:::ols_step_both_aic(full_model)\n#> \n#> \n#> Stepwise Summary \n#> -------------------------------------------------------------------------\n#> Step Variable AIC SBC SBIC R2 Adj. R2 \n#> -------------------------------------------------------------------------\n#> 0 Base Model 140.773 142.764 83.068 0.00000 0.00000 \n#> 1 protein (+) 137.950 140.937 80.438 0.21427 0.17061 \n#> 2 weight (+) 132.981 136.964 77.191 0.44544 0.38020 \n#> -------------------------------------------------------------------------\n#> \n#> Final Model Output \n#> ------------------\n#> \n#> Model Summary \n#> ---------------------------------------------------------------\n#> R 0.667 RMSE 5.505 \n#> R-Squared 0.445 MSE 35.648 \n#> Adj. R-Squared 0.380 Coef. Var 15.879 \n#> Pred R-Squared 0.236 AIC 132.981 \n#> MAE 4.593 SBC 136.964 \n#> ---------------------------------------------------------------\n#> RMSE: Root Mean Square Error \n#> MSE: Mean Square Error \n#> MAE: Mean Absolute Error \n#> AIC: Akaike Information Criteria \n#> SBC: Schwarz Bayesian Criteria \n#> \n#> ANOVA \n#> -------------------------------------------------------------------\n#> Sum of \n#> Squares DF Mean Square F Sig. \n#> -------------------------------------------------------------------\n#> Regression 486.778 2 243.389 6.827 0.0067 \n#> Residual 606.022 17 35.648 \n#> Total 1092.800 19 \n#> -------------------------------------------------------------------\n#> \n#> Parameter Estimates \n#> ----------------------------------------------------------------------------------------\n#> model Beta Std. Error Std. 
Beta t Sig lower upper \n#> ----------------------------------------------------------------------------------------\n#> (Intercept) 33.130 12.572 2.635 0.017 6.607 59.654 \n#> protein 1.824 0.623 0.534 2.927 0.009 0.509 3.139 \n#> weight -0.222 0.083 -0.486 -2.662 0.016 -0.397 -0.046 \n#> ----------------------------------------------------------------------------------------\n\n\nLasso\n\\[\\arg min_{\\theta} \\ell(\\theta) + \\lambda \\sum_{j=1}^p|\\beta_j|\\]\n\nShow R codelibrary(glmnet)\ny = carbohydrate$carbohydrate\nx = carbohydrate |> \n select(age, weight, protein) |> \n as.matrix()\nfit = glmnet(x,y)\n\n\n\n\nShow R codeautoplot(fit, xvar = 'lambda')\n\n\n\nFigure 2.19: Lasso selection\n\n\n\n\n\n\n\n\n\nShow R codecvfit = cv.glmnet(x,y)\nplot(cvfit)\n\n\n\n\n\n\n\n\n\nShow R codecoef(cvfit, s = \"lambda.1se\")\n#> 4 x 1 sparse Matrix of class \"dgCMatrix\"\n#> s1\n#> (Intercept) 34.1090\n#> age . \n#> weight -0.1041\n#> protein 0.9441", + "text": "2.9 Model selection\n(adapted from Dobson and Barnett (2018) §6.3.3; for more information on prediction, see James et al. (2013) and Harrell (2015)).\n\nIf we have a lot of covariates in our dataset, we might want to choose a small subset to use in our model.\nThere are a few possible metrics to consider for choosing a “best” model.\n\n\n2.9.1 Mean squared error\nWe might want to minimize the mean squared error, \\(\\text E[(y-\\hat y)^2]\\), for new observations that weren’t in our data set when we fit the model.\nUnfortunately, \\[\\frac{1}{n}\\sum_{i=1}^n (y_i-\\hat y_i)^2\\] gives a biased estimate of \\(\\text E[(y-\\hat y)^2]\\) for new data. If we want an unbiased estimate, we will have to be clever.\n\nCross-validation\n\nShow R codedata(\"carbohydrate\", package = \"dobson\")\nlibrary(cvTools)\nfull_model <- lm(carbohydrate ~ ., data = carbohydrate)\ncv_full = \n full_model |> cvFit(\n data = carbohydrate, K = 5, R = 10,\n y = carbohydrate$carbohydrate)\n\nreduced_model = update(full_model, \n formula = ~ . - age)\n\ncv_reduced = \n reduced_model |> cvFit(\n data = carbohydrate, K = 5, R = 10,\n y = carbohydrate$carbohydrate)\n\n\n\n\nShow R coderesults_reduced = \n tibble(\n model = \"wgt+protein\",\n errs = cv_reduced$reps[])\nresults_full = \n tibble(model = \"wgt+age+protein\",\n errs = cv_full$reps[])\n\ncv_results = \n bind_rows(results_reduced, results_full)\n\ncv_results |> \n ggplot(aes(y = model, x = errs)) +\n geom_boxplot()\n\n\n\n\n\n\n\n\ncomparing metrics\n\nShow R code\ncompare_results = tribble(\n ~ model, ~ cvRMSE, ~ r.squared, ~adj.r.squared, ~ trainRMSE, ~loglik,\n \"full\", cv_full$cv, summary(full_model)$r.squared, summary(full_model)$adj.r.squared, sigma(full_model), logLik(full_model) |> as.numeric(),\n \"reduced\", cv_reduced$cv, summary(reduced_model)$r.squared, summary(reduced_model)$adj.r.squared, sigma(reduced_model), logLik(reduced_model) |> as.numeric())\n\ncompare_results\n\n\n\nmodel\ncvRMSE\nr.squared\nadj.r.squared\ntrainRMSE\nloglik\n\n\n\nfull\n6.995\n0.4805\n0.3831\n5.956\n-61.84\n\n\nreduced\n6.595\n0.4454\n0.3802\n5.971\n-62.49\n\n\n\n\n\n\n\nShow R codeanova(full_model, reduced_model)\n\n\n\nRes.Df\nRSS\nDf\nSum of Sq\nF\nPr(>F)\n\n\n\n16\n567.7\nNA\nNA\nNA\nNA\n\n\n17\n606.0\n-1\n-38.36\n1.081\n0.3139\n\n\n\n\n\nstepwise regression\n\nShow R codelibrary(olsrr)\nolsrr:::ols_step_both_aic(full_model)\n#> \n#> \n#> Stepwise Summary \n#> -------------------------------------------------------------------------\n#> Step Variable AIC SBC SBIC R2 Adj. 
R2 \n#> -------------------------------------------------------------------------\n#> 0 Base Model 140.773 142.764 83.068 0.00000 0.00000 \n#> 1 protein (+) 137.950 140.937 80.438 0.21427 0.17061 \n#> 2 weight (+) 132.981 136.964 77.191 0.44544 0.38020 \n#> -------------------------------------------------------------------------\n#> \n#> Final Model Output \n#> ------------------\n#> \n#> Model Summary \n#> ---------------------------------------------------------------\n#> R 0.667 RMSE 5.505 \n#> R-Squared 0.445 MSE 35.648 \n#> Adj. R-Squared 0.380 Coef. Var 15.879 \n#> Pred R-Squared 0.236 AIC 132.981 \n#> MAE 4.593 SBC 136.964 \n#> ---------------------------------------------------------------\n#> RMSE: Root Mean Square Error \n#> MSE: Mean Square Error \n#> MAE: Mean Absolute Error \n#> AIC: Akaike Information Criteria \n#> SBC: Schwarz Bayesian Criteria \n#> \n#> ANOVA \n#> -------------------------------------------------------------------\n#> Sum of \n#> Squares DF Mean Square F Sig. \n#> -------------------------------------------------------------------\n#> Regression 486.778 2 243.389 6.827 0.0067 \n#> Residual 606.022 17 35.648 \n#> Total 1092.800 19 \n#> -------------------------------------------------------------------\n#> \n#> Parameter Estimates \n#> ----------------------------------------------------------------------------------------\n#> model Beta Std. Error Std. Beta t Sig lower upper \n#> ----------------------------------------------------------------------------------------\n#> (Intercept) 33.130 12.572 2.635 0.017 6.607 59.654 \n#> protein 1.824 0.623 0.534 2.927 0.009 0.509 3.139 \n#> weight -0.222 0.083 -0.486 -2.662 0.016 -0.397 -0.046 \n#> ----------------------------------------------------------------------------------------\n\n\nLasso\n\\[\\arg min_{\\theta} \\ell(\\theta) + \\lambda \\sum_{j=1}^p|\\beta_j|\\]\n\nShow R codelibrary(glmnet)\ny = carbohydrate$carbohydrate\nx = carbohydrate |> \n select(age, weight, protein) |> \n as.matrix()\nfit = glmnet(x,y)\n\n\n\n\nShow R codeautoplot(fit, xvar = 'lambda')\n\n\n\nFigure 2.19: Lasso selection\n\n\n\n\n\n\n\n\n\nShow R codecvfit = cv.glmnet(x,y)\nplot(cvfit)\n\n\n\n\n\n\n\n\n\nShow R codecoef(cvfit, s = \"lambda.1se\")\n#> 4 x 1 sparse Matrix of class \"dgCMatrix\"\n#> s1\n#> (Intercept) 34.1090\n#> age . \n#> weight -0.1041\n#> protein 0.9441", "crumbs": [ "Generalized Linear Models", "2  Linear (Gaussian) Models" @@ -628,11 +628,22 @@ ] }, { - "objectID": "proportional-hazards-models.html#the-proportional-hazards-model", - "href": "proportional-hazards-models.html#the-proportional-hazards-model", + "objectID": "proportional-hazards-models.html#introduction", + "href": "proportional-hazards-models.html#introduction", "title": "\n6  Proportional Hazards Models\n", - "section": "\n6.1 The proportional hazards model", - "text": "6.1 The proportional hazards model\n\n6.1.1 Background on the Proportional Hazards Model\nThe exponential distribution has constant hazard:\n\\[\n\\begin{aligned}\nf(t) &= \\lambda e^{-\\lambda t}\\\\\nS(t) &= e^{-\\lambda t}\\\\\nh(t) &= \\lambda\n\\end{aligned}\n\\]\nLet’s make two generalizations. First, we let the hazard depend on some covariates \\(x_1,x_2, \\dots, x_p\\); we will indicate this dependence by extending our notation for hazard:\n\\[h(t|\\boldsymbol x) \\stackrel{\\text{def}}{=}p(T=t|T\\ge t, \\boldsymbol X = \\boldsymbol x)\\]\nSecond, we let the base hazard depend on \\(t\\), but not on the covariates (for now). 
We can do this using either parametric or semi-parametric approaches.\n\n6.1.2 Cox’s Proportional Hazards Model\nThe generalization is that the hazard function is\n\\[\n\\begin{aligned}\nh(t|x)&= h_0(t)\\theta(x)\\\\\n\\theta(x) &= \\text{exp}\\left\\{\\eta(x)\\right\\}\\\\\n\\eta(x) &= x'\\beta\\\\\n&\\stackrel{\\text{def}}{=}\\beta_1x_1+\\cdots+\\beta_px_p\n\\end{aligned}\n\\]\nThe relationship between \\(h(t|x)\\) and \\(\\eta(x)\\) has a log link (that is, \\(\\text{log}\\left\\{h(t|x)\\right\\} = \\text{log}\\left\\{h_0(t)\\right\\} + \\eta(x)\\)), as in a generalized linear model.\nThis model is semi-parametric, because the linear predictor depends on estimated parameters but the base hazard function is unspecified. There is no constant term in \\(\\eta(x)\\), because it is absorbed in the base hazard.\nAlternatively, we could define \\(\\beta_0(t) = \\text{log}\\left\\{h_0(t)\\right\\}\\), and then \\(\\eta(x,t) = \\beta_0(t) + \\beta_1x_1+\\cdots+\\beta_px_p\\).\nFor two different individuals with covariate patterns \\(\\boldsymbol x_1\\) and \\(\\boldsymbol x_2\\), the ratio of the hazard functions (a.k.a. hazard ratio, a.k.a. relative hazard) is:\n\\[\n\\begin{aligned}\n\\frac{h(t|\\boldsymbol x_1)}{h(t|\\boldsymbol x_2)}\n&=\\frac{h_0(t)\\theta(\\boldsymbol x_1)}{h_0(t)\\theta(\\boldsymbol x_2)}\\\\\n&=\\frac{\\theta(\\boldsymbol x_1)}{\\theta(\\boldsymbol x_2)}\\\\\n\\end{aligned}\n\\]\nUnder the proportional hazards model, this ratio (a.k.a. proportion) does not depend on \\(t\\). This property is a structural limitation of the model; it is called the proportional hazards assumption.\n\nDefinition 6.1 (proportional hazards) A conditional probability distribution \\(p(T|X)\\) has proportional hazards if the hazard ratio \\(h(t|\\boldsymbol x_1)/h(t|\\boldsymbol x_2)\\) does not depend on \\(t\\). Mathematically, it can be written as:\n\\[\n\\frac{h(t|\\boldsymbol x_1)}{h(t|\\boldsymbol x_2)}\n= \\theta(\\boldsymbol x_1,\\boldsymbol x_2)\n\\]\n\nAs we saw above, Cox’s proportional hazards model has this property, with \\(\\theta(\\boldsymbol x_1,\\boldsymbol x_2) = \\frac{\\theta(\\boldsymbol x_1)}{\\theta(\\boldsymbol x_2)}\\).\n\n\n\n\n\n\nNote\n\n\n\nWe are using two similar notations, \\(\\theta(\\boldsymbol x_1,\\boldsymbol x_2)\\) and \\(\\theta(\\boldsymbol x)\\). 
We can link these notations if we define \\(\\theta(\\boldsymbol x) \\stackrel{\\text{def}}{=}\\theta(\\boldsymbol x, \\boldsymbol 0)\\) and \\(\\theta(\\boldsymbol 0) = 1\\).\n\n\nIt also has additional notable properties:\n\\[\n\\begin{aligned}\n\\frac{h(t|\\boldsymbol x_1)}{h(t|\\boldsymbol x_2)}\n&=\\frac{\\theta(\\boldsymbol x_1)}{\\theta(\\boldsymbol x_2)}\\\\\n&=\\frac{\\text{exp}\\left\\{\\eta(\\boldsymbol x_1)\\right\\}}{\\text{exp}\\left\\{\\eta(\\boldsymbol x_2)\\right\\}}\\\\\n&=\\text{exp}\\left\\{\\eta(\\boldsymbol x_1)-\\eta(\\boldsymbol x_2)\\right\\}\\\\\n&=\\text{exp}\\left\\{\\boldsymbol x_1'\\beta-\\boldsymbol x_2'\\beta\\right\\}\\\\\n&=\\text{exp}\\left\\{(\\boldsymbol x_1 - \\boldsymbol x_2)'\\beta\\right\\}\\\\\n\\end{aligned}\n\\]\nHence on the log scale, we have:\n\\[\n\\begin{aligned}\n\\text{log}\\left\\{\\frac{h(t|\\boldsymbol x_1)}{h(t|\\boldsymbol x_2)}\\right\\}\n&=\\eta(\\boldsymbol x_1)-\\eta(\\boldsymbol x_2)\\\\\n&= \\boldsymbol x_1'\\beta-\\boldsymbol x_2'\\beta\\\\\n&= (\\boldsymbol x_1 - \\boldsymbol x_2)'\\beta\n\\end{aligned}\n\\]\nIf only one covariate \\(x_j\\) is changing, then:\n\\[\n\\begin{aligned}\n\\text{log}\\left\\{\\frac{h(t|\\boldsymbol x_1)}{h(t|\\boldsymbol x_2)}\\right\\}\n&= (x_{1j} - x_{2j}) \\cdot \\beta_j\\\\\n&\\propto (x_{1j} - x_{2j})\n\\end{aligned}\n\\]\nThat is, under Cox’s model \\(h(t|\\boldsymbol x) = h_0(t)\\text{exp}\\left\\{\\boldsymbol x'\\beta\\right\\}\\), the log of the hazard ratio is proportional to the difference in \\(x_j\\), with the proportionality coefficient equal to \\(\\beta_j\\).\nFurther,\n\\[\n\\begin{aligned}\n\\text{log}\\left\\{h(t|\\boldsymbol x)\\right\\}\n&=\\text{log}\\left\\{h_0(t)\\right\\} + x'\\beta\n\\end{aligned}\n\\]\nThat is, the covariate effects are additive on the log-hazard scale.\nSee also:\nhttps://en.wikipedia.org/wiki/Proportional_hazards_model#Why_it_is_called_%22proportional%22\n\n6.1.3 Additional properties of the proportional hazards model\nIf \\(h(t|x)= h_0(t)\\theta(x)\\), then:\nCumulative hazards are also proportional to \\(H_0(t)\\)\n\n\\[\n\\begin{aligned}\nH(t|x)\n&\\stackrel{\\text{def}}{=}\\int_{u=0}^t h(u)du\\\\\n&= \\int_{u=0}^t h_0(u)\\theta(x)du\\\\\n&= \\theta(x)\\int_{u=0}^t h_0(u)du\\\\\n&= \\theta(x)H_0(t)\n\\end{aligned}\n\\]\nwhere \\(H_0(t) \\stackrel{\\text{def}}{=}H(t|0) = \\int_{u=0}^t h_0(u)du\\).\nSurvival functions are exponential multiples of \\(S_0(t)\\)\n\n\\[\n\\begin{aligned}\nS(t|x)\n&= \\text{exp}\\left\\{-H(t|x)\\right\\}\\\\\n&= \\text{exp}\\left\\{-\\theta(x)\\cdot H_0(t)\\right\\}\\\\\n&= \\left(\\text{exp}\\left\\{- H_0(t)\\right\\}\\right)^{\\theta(x)}\\\\\n&= \\left(S_0(t)\\right)^{\\theta(x)}\\\\\n\\end{aligned}\n\\]\nwhere \\(S_0(t) \\stackrel{\\text{def}}{=}P(T\\ge t | \\boldsymbol X = 0)\\) is the survival function for an individual whose covariates are all equal to their default values.\n\n6.1.4 Testing the proportional hazards assumption\nThe Nelson-Aalen estimate of the cumulative hazard is usually used for estimates of the hazard and often the cumulative hazard.\nIf the hazards of the three groups are proportional, that means that the ratio of the hazards is constant over \\(t\\). 
We can test this using the ratios of the estimated cumulative hazards, which also would be proportional, as shown above.\n\nShow R code\nlibrary(KMsurv)\nlibrary(survival)\ndata(bmt)\n\nbmt = \n bmt |> \n as_tibble() |> \n mutate(\n group = \n group |> \n factor(\n labels = c(\"ALL\",\"Low Risk AML\",\"High Risk AML\")))\n\nnafit = survfit(\n formula = Surv(t2,d3) ~ group,\n type = \"fleming-harrington\",\n data = bmt)\n\nbmt_curves = tibble(timevec = 1:1000)\nsf1 <- with(nafit[1], stepfun(time,c(1,surv)))\nsf2 <- with(nafit[2], stepfun(time,c(1,surv)))\nsf3 <- with(nafit[3], stepfun(time,c(1,surv)))\n\nbmt_curves = \n bmt_curves |> \n mutate(\n cumhaz1 = -log(sf1(timevec)),\n cumhaz2 = -log(sf2(timevec)),\n cumhaz3 = -log(sf3(timevec)))\n\n\n\nShow R codelibrary(ggplot2)\nbmt_rel_hazard_plot = \n bmt_curves |> \n ggplot(\n aes(\n x = timevec,\n y = cumhaz1/cumhaz2)\n ) +\n geom_line(aes(col = \"ALL/Low Risk AML\")) + \n ylab(\"Hazard Ratio\") +\n xlab(\"Time\") + \n ylim(0,6) +\n geom_line(aes(y = cumhaz3/cumhaz1, col = \"High Risk AML/ALL\")) +\n geom_line(aes(y = cumhaz3/cumhaz2, col = \"High Risk AML/Low Risk AML\")) +\n theme_bw() +\n labs(colour = \"Comparison\") +\n theme(legend.position=\"bottom\")\n\nprint(bmt_rel_hazard_plot)\n\n\nHazard Ratios by Disease Group\n\n\n\n\nWe can zoom in on 30-300 days to take a closer look:\n\nShow R codebmt_rel_hazard_plot + xlim(c(30,300))\n\n\nHazard Ratios by Disease Group (30-300 Days)\n\n\n\n\n\n6.1.5 Smoothed hazard functions\nThe Nelson-Aalen estimate of the cumulative hazard is usually used for estimates of the hazard. Since the hazard is the derivative of the cumulative hazard, we need a smooth estimate of the cumulative hazard, which is provided by smoothing the step-function cumulative hazard.\nThe R package muhaz handles this for us. What we are looking for is whether the hazard function is more or less the same shape, increasing, decreasing, constant, etc. Are the hazards “proportional”?\n\nShow R codeplot(\n survfit(Surv(t2,d3)~group,data=bmt),\n col=1:3,\n lwd=2,\n fun=\"cumhaz\",\n mark.time = TRUE)\nlegend(\"bottomright\",c(\"ALL\",\"Low Risk AML\",\"High Risk AML\"),col=1:3,lwd=2)\n\n\nDisease-Free Cumulative Hazard by Disease Group\n\n\n\n\n\nShow R codelibrary(muhaz)\n\nmuhaz(bmt$t2,bmt$d3,bmt$group==\"High Risk AML\") |> plot(lwd=2,col=3)\nmuhaz(bmt$t2,bmt$d3,bmt$group==\"ALL\") |> lines(lwd=2,col=1)\nmuhaz(bmt$t2,bmt$d3,bmt$group==\"Low Risk AML\") |> lines(lwd=2,col=2)\nlegend(\"topright\",c(\"ALL\",\"Low Risk AML\",\"High Risk AML\"),col=1:3,lwd=2)\n\n\nSmoothed Hazard Rate Estimates by Disease Group\n\n\n\n\nGroup 3 was plotted first because it has the highest hazard.\nWe will see that except for an initial blip in the high risk AML group, the hazards look roughly proportional . They are all strongly decreasing.\n\n6.1.6 Fitting the Proportional Hazards Model\nHow do we fit a proportional hazards regression model? We need to estimate the coefficients of the covariates, and we need to estimate the base hazard \\(h_0(t)\\). For the covariates, supposing for simplicity that there are no tied event times, let the event times for the whole data set be \\(t_1, t_2,\\ldots,t_D\\). 
Let the risk set at time \\(t_i\\) be \\(R(t_i)\\) and\n\\[\n\\begin{aligned}\n\\eta(\\boldsymbol{x}) &= \\beta_1x_{1}+\\cdots+\\beta_p x_{p}\\\\\n\\theta(\\boldsymbol{x}) &= e^{\\eta(\\boldsymbol{x})}\\\\\nh(t|X=x)&= h_0(t)e^{\\eta(\\boldsymbol{x})}=\\theta(\\boldsymbol{x}) h_0(t)\n\\end{aligned}\n\\]\nConditional on a single failure at time \\(t\\), the probability that the event is due to subject \\(f\\in R(t)\\) is approximately\n\\[\n\\begin{aligned}\n\\Pr(f \\text{ fails}|\\text{1 failure at } t)\n&= \\frac{h_0(t)e^{\\eta(\\boldsymbol{x}_f)}}{\\sum_{k \\in R(t)}h_0(t)e^{\\eta(\\boldsymbol{x}_f)}}\\\\\n&=\\frac{\\theta(\\boldsymbol{x}_f)}{\\sum_{k \\in R(t)} \\theta(\\boldsymbol{x}_k)}\n\\end{aligned}\n\\]\nThe logic behind this has several steps. We first fix (ex post) the failure times and note that in this discrete context, the probability \\(p_j\\) that a subject \\(j\\) in the risk set fails at time \\(t\\) is just the hazard of that subject at that time.\nIf all of the \\(p_j\\) are small, the chance that exactly one subject fails is\n\\[\n\\sum_{k\\in R(t)}p_k\\left[\\prod_{m\\in R(t), m\\ne k} (1-p_m)\\right]\\approx\\sum_{k\\in R(t)}p_k\n\\]\nIf subject \\(i\\) is the one who experiences the event of interest at time \\(t_i\\), then the partial likelihood is\n\\[\n\\mathcal L^*(\\beta|T)=\n\\prod_i \\frac{\\theta(x_i)}{\\sum_{k \\in R(t_i)} \\theta(\\boldsymbol{x}_k)}\n\\]\nand we can numerically maximize this with respect to the coefficients \\(\\boldsymbol{\\beta}\\) that specify \\(\\eta(\\boldsymbol{x}) = \\boldsymbol{x}'\\boldsymbol{\\beta}\\). When there are tied event times adjustments need to be made, but the likelihood is still similar. Note that we don’t need to know the base hazard to solve for the coefficients.\nOnce we have coefficient estimates \\(\\hat{\\boldsymbol{\\beta}} =(\\hat \\beta_1,\\ldots,\\hat\\beta_p)\\), this also defines \\(\\hat\\eta(x)\\) and \\(\\hat\\theta(x)\\) and then the estimated base cumulative hazard function is \\[\\hat H(t)=\n\\sum_{t_i < t} \\frac{d_i}{\\sum_{k\\in R(t_i)} \\theta(x_k)}\\] which reduces to the Nelson-Aalen estimate when there are no covariates. There are numerous other estimates that have been proposed as well.", + "section": "\n6.1 Introduction", + "text": "6.1 Introduction\n\nThe exponential distribution has constant hazard:\n\\[\n\\begin{aligned}\nf(t) &= \\lambda e^{-\\lambda t}\\\\\nS(t) &= e^{-\\lambda t}\\\\\nh(t) &= \\lambda\n\\end{aligned}\n\\]\n\nLet’s make two generalizations. First, we let the hazard depend on some covariates \\(x_1,x_2, \\dots, x_p\\); we will indicate this dependence by extending our notation for hazard:\n\n\\[h(t|\\boldsymbol x) \\stackrel{\\text{def}}{=}p(T=t|T\\ge t, \\boldsymbol X = \\boldsymbol x)\\]\n\nSecond, we let the base hazard depend on \\(t\\), but not on the covariates (for now). 
We can do this using either parametric or semi-parametric approaches.", + "crumbs": [ + "Time to Event Models", + "6  Proportional Hazards Models" + ] + }, + { + "objectID": "proportional-hazards-models.html#coxs-proportional-hazards-model", + "href": "proportional-hazards-models.html#coxs-proportional-hazards-model", + "title": "\n6  Proportional Hazards Models\n", + "section": "\n6.2 Cox’s Proportional Hazards Model", + "text": "6.2 Cox’s Proportional Hazards Model\n\n\nThe generalization is that the hazard function is:\n\n\\[\n\\begin{aligned}\nh(t|x)&= h_0(t)\\theta(x)\\\\\n\\theta(x) &= \\text{exp}\\left\\{\\eta(x)\\right\\}\\\\\n\\eta(x) &= x'\\beta\\\\\n&\\stackrel{\\text{def}}{=}\\beta_1x_1+\\cdots+\\beta_px_p\n\\end{aligned}\n\\]\n\nThe relationship between \\(h(t|x)\\) and \\(\\eta(x)\\) has a log link (that is, \\(\\text{log}\\left\\{h(t|x)\\right\\} = \\text{log}\\left\\{h_0(t)\\right\\} + \\eta(x)\\)), as in a generalized linear model.\nThis model is semi-parametric, because the linear predictor depends on estimated parameters but the base hazard function is unspecified. There is no constant term in \\(\\eta(x)\\), because it is absorbed in the base hazard.\n\n\nAlternatively, we could define \\(\\beta_0(t) = \\text{log}\\left\\{h_0(t)\\right\\}\\), and then:\n\\[\\eta(x,t) = \\beta_0(t) + \\beta_1x_1+\\cdots+\\beta_px_p\\]\n\n\nFor two different individuals with covariate patterns \\(\\boldsymbol x_1\\) and \\(\\boldsymbol x_2\\), the ratio of the hazard functions (a.k.a. hazard ratio, a.k.a. relative hazard) is:\n\n\\[\n\\begin{aligned}\n\\frac{h(t|\\boldsymbol x_1)}{h(t|\\boldsymbol x_2)}\n&=\\frac{h_0(t)\\theta(\\boldsymbol x_1)}{h_0(t)\\theta(\\boldsymbol x_2)}\\\\\n&=\\frac{\\theta(\\boldsymbol x_1)}{\\theta(\\boldsymbol x_2)}\\\\\n\\end{aligned}\n\\] ::: notes Under the proportional hazards model, this ratio (a.k.a. proportion) does not depend on \\(t\\). This property is a structural limitation of the model; it is called the proportional hazards assumption. :::\n\n\nDefinition 6.1 (proportional hazards) A conditional probability distribution \\(p(T|X)\\) has proportional hazards if the hazard ratio \\(h(t|\\boldsymbol x_1)/h(t|\\boldsymbol x_2)\\) does not depend on \\(t\\). Mathematically, it can be written as:\n\\[\n\\frac{h(t|\\boldsymbol x_1)}{h(t|\\boldsymbol x_2)}\n= \\theta(\\boldsymbol x_1,\\boldsymbol x_2)\n\\]\n\n\nAs we saw above, Cox’s proportional hazards model has this property, with \\(\\theta(\\boldsymbol x_1,\\boldsymbol x_2) = \\frac{\\theta(\\boldsymbol x_1)}{\\theta(\\boldsymbol x_2)}\\).\n\n\n\n\n\n\n\n\nNote\n\n\n\nWe are using two similar notations, \\(\\theta(\\boldsymbol x_1,\\boldsymbol x_2)\\) and \\(\\theta(\\boldsymbol x)\\). 
We can link these notations if we define \\(\\theta(\\boldsymbol x) \\stackrel{\\text{def}}{=}\\theta(\\boldsymbol x, \\boldsymbol 0)\\) and \\(\\theta(\\boldsymbol 0) = 1\\).\n\n\n\nThe proportional hazards model also has additional notable properties:\n\\[\n\\begin{aligned}\n\\frac{h(t|\\boldsymbol x_1)}{h(t|\\boldsymbol x_2)}\n&=\\frac{\\theta(\\boldsymbol x_1)}{\\theta(\\boldsymbol x_2)}\\\\\n&=\\frac{\\text{exp}\\left\\{\\eta(\\boldsymbol x_1)\\right\\}}{\\text{exp}\\left\\{\\eta(\\boldsymbol x_2)\\right\\}}\\\\\n&=\\text{exp}\\left\\{\\eta(\\boldsymbol x_1)-\\eta(\\boldsymbol x_2)\\right\\}\\\\\n&=\\text{exp}\\left\\{\\boldsymbol x_1'\\beta-\\boldsymbol x_2'\\beta\\right\\}\\\\\n&=\\text{exp}\\left\\{(\\boldsymbol x_1 - \\boldsymbol x_2)'\\beta\\right\\}\\\\\n\\end{aligned}\n\\]\n\nHence on the log scale, we have:\n\\[\n\\begin{aligned}\n\\text{log}\\left\\{\\frac{h(t|\\boldsymbol x_1)}{h(t|\\boldsymbol x_2)}\\right\\}\n&=\\eta(\\boldsymbol x_1)-\\eta(\\boldsymbol x_2)\\\\\n&= \\boldsymbol x_1'\\beta-\\boldsymbol x_2'\\beta\\\\\n&= (\\boldsymbol x_1 - \\boldsymbol x_2)'\\beta\n\\end{aligned}\n\\]\n\nIf only one covariate \\(x_j\\) is changing, then:\n\\[\n\\begin{aligned}\n\\text{log}\\left\\{\\frac{h(t|\\boldsymbol x_1)}{h(t|\\boldsymbol x_2)}\\right\\}\n&= (x_{1j} - x_{2j}) \\cdot \\beta_j\\\\\n&\\propto (x_{1j} - x_{2j})\n\\end{aligned}\n\\]\n\nThat is, under Cox’s model \\(h(t|\\boldsymbol x) = h_0(t)\\text{exp}\\left\\{\\boldsymbol x'\\beta\\right\\}\\), the log of the hazard ratio is proportional to the difference in \\(x_j\\), with the proportionality coefficient equal to \\(\\beta_j\\).\n\n\nFurther,\n\\[\n\\begin{aligned}\n\\text{log}\\left\\{h(t|\\boldsymbol x)\\right\\}\n&=\\text{log}\\left\\{h_0(t)\\right\\} + x'\\beta\n\\end{aligned}\n\\]\n\nThat is, the covariate effects are additive on the log-hazard scale; hazard functions for different covariate patterns should be vertical shifts of each other.\nSee also:\nhttps://en.wikipedia.org/wiki/Proportional_hazards_model#Why_it_is_called_%22proportional%22\n\n\n6.2.1 Additional properties of the proportional hazards model\nIf \\(h(t|x)= h_0(t)\\theta(x)\\), then:\n\nTheorem 6.1 (Cumulative hazards are also proportional to \\(H_0(t)\\)) \\[\n\\begin{aligned}\nH(t|x)\n&\\stackrel{\\text{def}}{=}\\int_{u=0}^t h(u)du\\\\\n&= \\int_{u=0}^t h_0(u)\\theta(x)du\\\\\n&= \\theta(x)\\int_{u=0}^t h_0(u)du\\\\\n&= \\theta(x)H_0(t)\n\\end{aligned}\n\\]\nwhere \\(H_0(t) \\stackrel{\\text{def}}{=}H(t|0) = \\int_{u=0}^t h_0(u)du\\).\n\n\n\nTheorem 6.2 (The logarithms of cumulative hazard should be parallel) \\[\n\\text{log}\\left\\{H(t|\\tilde{x})\\right\\} =\\text{log}\\left\\{H_0(t)\\right\\} + \\tilde{x}'\\tilde{\\beta}\n\\]\n\n\n\nTheorem 6.3 (Survival functions are exponential multiples of \\(S_0(t)\\)) \\[\n\\begin{aligned}\nS(t|x)\n&= \\text{exp}\\left\\{-H(t|x)\\right\\}\\\\\n&= \\text{exp}\\left\\{-\\theta(x)\\cdot H_0(t)\\right\\}\\\\\n&= \\left(\\text{exp}\\left\\{- H_0(t)\\right\\}\\right)^{\\theta(x)}\\\\\n&= \\left[S_0(t)\\right]^{\\theta(x)}\\\\\n\\end{aligned}\n\\]\nwhere \\(S_0(t) \\stackrel{\\text{def}}{=}P(T\\ge t | \\boldsymbol X = 0)\\) is the survival function for an individual whose covariates are all equal to their default values.\n\n\n6.2.2 Testing the proportional hazards assumption\n\nThe Nelson-Aalen estimate of the cumulative hazard is usually used for estimates of the hazard and often the cumulative hazard.\nIf the hazards of the three groups are proportional, that means that the ratio of the hazards is constant 
over \\(t\\). We can test this using the ratios of the estimated cumulative hazards, which also would be proportional, as shown above.\n\n\nShow R code\nlibrary(KMsurv)\nlibrary(survival)\nlibrary(dplyr)\ndata(bmt)\n\nbmt = \n bmt |> \n as_tibble() |> \n mutate(\n group = \n group |> \n factor(\n labels = c(\"ALL\",\"Low Risk AML\",\"High Risk AML\")))\n\nnafit = survfit(\n formula = Surv(t2,d3) ~ group,\n type = \"fleming-harrington\",\n data = bmt)\n\nbmt_curves = tibble(timevec = 1:1000)\nsf1 <- with(nafit[1], stepfun(time,c(1,surv)))\nsf2 <- with(nafit[2], stepfun(time,c(1,surv)))\nsf3 <- with(nafit[3], stepfun(time,c(1,surv)))\n\nbmt_curves = \n bmt_curves |> \n mutate(\n cumhaz1 = -log(sf1(timevec)),\n cumhaz2 = -log(sf2(timevec)),\n cumhaz3 = -log(sf3(timevec)))\n\n\n\nShow R codelibrary(ggplot2)\nbmt_rel_hazard_plot = \n bmt_curves |> \n ggplot(\n aes(\n x = timevec,\n y = cumhaz1/cumhaz2)\n ) +\n geom_line(aes(col = \"ALL/Low Risk AML\")) + \n ylab(\"Hazard Ratio\") +\n xlab(\"Time\") + \n ylim(0,6) +\n geom_line(aes(y = cumhaz3/cumhaz1, col = \"High Risk AML/ALL\")) +\n geom_line(aes(y = cumhaz3/cumhaz2, col = \"High Risk AML/Low Risk AML\")) +\n theme_bw() +\n labs(colour = \"Comparison\") +\n theme(legend.position=\"bottom\")\n\nprint(bmt_rel_hazard_plot)\n\n\n\nFigure 6.1: Hazard Ratios by Disease Group for bmt data\n\n\n\n\n\n\n\n\nWe can zoom in on 30-300 days to take a closer look:\n\nShow R codebmt_rel_hazard_plot + xlim(c(30,300))\n\n\n\nFigure 6.2: Hazard Ratios by Disease Group (30-300 Days)\n\n\n\n\n\n\n\n\nThe cumulative hazard curves should also be proportional\n\nShow R codelibrary(ggfortify)\nplot_cuhaz_bmt =\n bmt |>\n survfit(formula = Surv(t2, d3) ~ group) |>\n autoplot(fun = \"cumhaz\",\n mark.time = TRUE) + \n ylab(\"Cumulative hazard\")\n \nplot_cuhaz_bmt |> print()\n\n\n\nFigure 6.3: Disease-Free Cumulative Hazard by Disease Group\n\n\n\n\n\n\n\n\n\nShow R codeplot_cuhaz_bmt + \n scale_y_log10() +\n scale_x_log10()\n\n\n\nFigure 6.4: Disease-Free Cumulative Hazard by Disease Group (log-scale)\n\n\n\n\n\n\n\n\n6.2.3 Smoothed hazard functions\n\nThe Nelson-Aalen estimate of the cumulative hazard is usually used for estimates of the hazard. Since the hazard is the derivative of the cumulative hazard, we need a smooth estimate of the cumulative hazard, which is provided by smoothing the step-function cumulative hazard.\nThe R package muhaz handles this for us. What we are looking for is whether the hazard function is more or less the same shape, increasing, decreasing, constant, etc. Are the hazards “proportional”?\n\n\nShow R codelibrary(muhaz)\n\nmuhaz(bmt$t2,bmt$d3,bmt$group==\"High Risk AML\") |> plot(lwd=2,col=3)\nmuhaz(bmt$t2,bmt$d3,bmt$group==\"ALL\") |> lines(lwd=2,col=1)\nmuhaz(bmt$t2,bmt$d3,bmt$group==\"Low Risk AML\") |> lines(lwd=2,col=2)\nlegend(\"topright\",c(\"ALL\",\"Low Risk AML\",\"High Risk AML\"),col=1:3,lwd=2)\n\n\nSmoothed Hazard Rate Estimates by Disease Group\n\n\n\n\n\nGroup 3 was plotted first because it has the highest hazard.\nExcept for an initial blip in the high risk AML group, the hazards look roughly proportional. They are all strongly decreasing.\n\n\n6.2.4 Fitting the Proportional Hazards Model\n\nHow do we fit a proportional hazards regression model? We need to estimate the coefficients of the covariates, and we need to estimate the base hazard \\(h_0(t)\\). For the covariates, supposing for simplicity that there are no tied event times, let the event times for the whole data set be \\(t_1, t_2,\\ldots,t_D\\). 
Let the risk set at time \\(t_i\\) be \\(R(t_i)\\) and\n\n\\[\n\\begin{aligned}\n\\eta(\\boldsymbol{x}) &= \\beta_1x_{1}+\\cdots+\\beta_p x_{p}\\\\\n\\theta(\\boldsymbol{x}) &= e^{\\eta(\\boldsymbol{x})}\\\\\nh(t|X=x)&= h_0(t)e^{\\eta(\\boldsymbol{x})}=\\theta(\\boldsymbol{x}) h_0(t)\n\\end{aligned}\n\\]\n\nConditional on a single failure at time \\(t\\), the probability that the event is due to subject \\(f\\in R(t)\\) is approximately\n\\[\n\\begin{aligned}\n\\Pr(f \\text{ fails}|\\text{1 failure at } t)\n&= \\frac{h_0(t)e^{\\eta(\\boldsymbol{x}_f)}}{\\sum_{k \\in R(t)}h_0(t)e^{\\eta(\\boldsymbol{x}_f)}}\\\\\n&=\\frac{\\theta(\\boldsymbol{x}_f)}{\\sum_{k \\in R(t)} \\theta(\\boldsymbol{x}_k)}\n\\end{aligned}\n\\]\nThe logic behind this has several steps. We first fix (ex post) the failure times and note that in this discrete context, the probability \\(p_j\\) that a subject \\(j\\) in the risk set fails at time \\(t\\) is just the hazard of that subject at that time.\nIf all of the \\(p_j\\) are small, the chance that exactly one subject fails is\n\\[\n\\sum_{k\\in R(t)}p_k\\left[\\prod_{m\\in R(t), m\\ne k} (1-p_m)\\right]\\approx\\sum_{k\\in R(t)}p_k\n\\]\nIf subject \\(i\\) is the one who experiences the event of interest at time \\(t_i\\), then the partial likelihood is\n\\[\n\\mathcal L^*(\\beta|T)=\n\\prod_i \\frac{\\theta(x_i)}{\\sum_{k \\in R(t_i)} \\theta(\\boldsymbol{x}_k)}\n\\]\nand we can numerically maximize this with respect to the coefficients \\(\\boldsymbol{\\beta}\\) that specify \\(\\eta(\\boldsymbol{x}) = \\boldsymbol{x}'\\boldsymbol{\\beta}\\). When there are tied event times adjustments need to be made, but the likelihood is still similar. Note that we don’t need to know the base hazard to solve for the coefficients.\nOnce we have coefficient estimates \\(\\hat{\\boldsymbol{\\beta}} =(\\hat \\beta_1,\\ldots,\\hat\\beta_p)\\), this also defines \\(\\hat\\eta(x)\\) and \\(\\hat\\theta(x)\\) and then the estimated base cumulative hazard function is \\[\\hat H(t)=\n\\sum_{t_i < t} \\frac{d_i}{\\sum_{k\\in R(t_i)} \\theta(x_k)}\\] which reduces to the Nelson-Aalen estimate when there are no covariates. There are numerous other estimates that have been proposed as well.", "crumbs": [ "Time to Event Models", "6  Proportional Hazards Models" @@ -642,8 +653,8 @@ "objectID": "proportional-hazards-models.html#cox-model-for-the-bmt-data", "href": "proportional-hazards-models.html#cox-model-for-the-bmt-data", "title": "\n6  Proportional Hazards Models\n", - "section": "\n6.2 Cox Model for the bmt data", - "text": "6.2 Cox Model for the bmt data\n\n6.2.1 Fit the model\n\nShow R code\nbmt.cox <- coxph(Surv(t2, d3) ~ group, data = bmt)\nsummary(bmt.cox)\n#> Call:\n#> coxph(formula = Surv(t2, d3) ~ group, data = bmt)\n#> \n#> n= 137, number of events= 83 \n#> \n#> coef exp(coef) se(coef) z Pr(>|z|) \n#> groupLow Risk AML -0.574 0.563 0.287 -2.00 0.046 *\n#> groupHigh Risk AML 0.383 1.467 0.267 1.43 0.152 \n#> ---\n#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1\n#> \n#> exp(coef) exp(-coef) lower .95 upper .95\n#> groupLow Risk AML 0.563 1.776 0.321 0.989\n#> groupHigh Risk AML 1.467 0.682 0.869 2.478\n#> \n#> Concordance= 0.625 (se = 0.03 )\n#> Likelihood ratio test= 13.4 on 2 df, p=0.001\n#> Wald test = 13 on 2 df, p=0.001\n#> Score (logrank) test = 13.8 on 2 df, p=0.001\n\n\nThe table provides hypothesis tests comparing groups 2 and 3 to group 1. 
Group 3 has the highest hazard, so the most significant comparison is not directly shown.\nThe coefficient 0.3834 is on the log-hazard-ratio scale, as in log-risk-ratio. The next column gives the hazard ratio 1.4673, and a hypothesis (Wald) test.\nThe (not shown) group 3 vs. group 2 log hazard ratio is 0.3834 + 0.5742 = 0.9576. The hazard ratio is then exp(0.9576) or 2.605.\nInference on all coefficients and combinations can be constructed using coef(bmt.cox) and vcov(bmt.cox) as with logistic and poisson regression.\nConcordance is agreement of first failure between pairs of subjects and higher predicted risk between those subjects, omitting non-informative pairs.\nThe Rsquare value is Cox and Snell’s pseudo R-squared and is not very useful.\nsummary() prints three tests for whether the model with the group covariate is better than the one without\n\n\nLikelihood ratio test (chi-squared)\n\nWald test (also chi-squared), obtained by adding the squares of the z-scores\n\nScore = log-rank test, as with comparison of survival functions.\n\nThe likelihood ratio test is probably best in smaller samples, followed by the Wald test.\n\n6.2.2 Survival Curves from the Cox Model\nWe can take a look at the resulting group-specific curves:\n\nShow R code\n#| fig-cap: \"Survival Functions for Three Groups by KM and Cox Model\"\n\nkm_fit = survfit(Surv(t2, d3) ~ group, data = as.data.frame(bmt))\n\ncox_fit = survfit(\n bmt.cox, \n newdata = \n data.frame(\n group = unique(bmt$group), \n row.names = unique(bmt$group)))\n\nlibrary(survminer)\n\nlist(KM = km_fit, Cox = cox_fit) |> \n survminer::ggsurvplot(\n # facet.by = \"group\",\n legend = \"bottom\", \n legend.title = \"\",\n combine = TRUE, \n fun = 'pct', \n size = .5,\n ggtheme = theme_bw(), \n conf.int = FALSE, \n censor = FALSE) |> \n suppressWarnings() # ggsurvplot() throws some warnings that aren't too worrying\n\n\n\n\n\n\n\nWhen we use survfit() with a Cox model, we have to specify the covariate levels we are interested in; the argument newdata should include a data.frame with the same named columns as the predictors in the Cox model and one or more levels of each.\nOtherwise (that is, if the newdata argument is missing), a curve is produced for a single “pseudo” subject with covariate values equal to the means component of the fit.\nThe resulting curve(s) almost never make sense, but the default remains due to an unwarranted attachment to the option shown by some users and by other packages.\nTwo particularly egregious examples are factor variables and interactions. Suppose one were studying interspecies transmission of a virus, and the data set has a factor variable with levels (“pig”, “chicken”) and about equal numbers of observations for each. 
The “mean” covariate level will be 0.5 – is this a flying pig?\n\n6.2.3 Examining survfit\n\n\nShow R codesurvfit(Surv(t2, d3)~group,data=bmt)\n#> Call: survfit(formula = Surv(t2, d3) ~ group, data = bmt)\n#> \n#> n events median 0.95LCL 0.95UCL\n#> group=ALL 38 24 418 194 NA\n#> group=Low Risk AML 54 25 2204 704 NA\n#> group=High Risk AML 45 34 183 115 456\n\n\n\nShow R codesurvfit(Surv(t2, d3)~group,data=bmt) |> summary()\n#> Call: survfit(formula = Surv(t2, d3) ~ group, data = bmt)\n#> \n#> group=ALL \n#> time n.risk n.event survival std.err lower 95% CI upper 95% CI\n#> 1 38 1 0.974 0.0260 0.924 1.000\n#> 55 37 1 0.947 0.0362 0.879 1.000\n#> 74 36 1 0.921 0.0437 0.839 1.000\n#> 86 35 1 0.895 0.0498 0.802 0.998\n#> 104 34 1 0.868 0.0548 0.767 0.983\n#> 107 33 1 0.842 0.0592 0.734 0.966\n#> 109 32 1 0.816 0.0629 0.701 0.949\n#> 110 31 1 0.789 0.0661 0.670 0.930\n#> 122 30 2 0.737 0.0714 0.609 0.891\n#> 129 28 1 0.711 0.0736 0.580 0.870\n#> 172 27 1 0.684 0.0754 0.551 0.849\n#> 192 26 1 0.658 0.0770 0.523 0.827\n#> 194 25 1 0.632 0.0783 0.495 0.805\n#> 230 23 1 0.604 0.0795 0.467 0.782\n#> 276 22 1 0.577 0.0805 0.439 0.758\n#> 332 21 1 0.549 0.0812 0.411 0.734\n#> 383 20 1 0.522 0.0817 0.384 0.709\n#> 418 19 1 0.494 0.0819 0.357 0.684\n#> 466 18 1 0.467 0.0818 0.331 0.658\n#> 487 17 1 0.439 0.0815 0.305 0.632\n#> 526 16 1 0.412 0.0809 0.280 0.605\n#> 609 14 1 0.382 0.0803 0.254 0.577\n#> 662 13 1 0.353 0.0793 0.227 0.548\n#> \n#> group=Low Risk AML \n#> time n.risk n.event survival std.err lower 95% CI upper 95% CI\n#> 10 54 1 0.981 0.0183 0.946 1.000\n#> 35 53 1 0.963 0.0257 0.914 1.000\n#> 48 52 1 0.944 0.0312 0.885 1.000\n#> 53 51 1 0.926 0.0356 0.859 0.998\n#> 79 50 1 0.907 0.0394 0.833 0.988\n#> 80 49 1 0.889 0.0428 0.809 0.977\n#> 105 48 1 0.870 0.0457 0.785 0.965\n#> 211 47 1 0.852 0.0483 0.762 0.952\n#> 219 46 1 0.833 0.0507 0.740 0.939\n#> 248 45 1 0.815 0.0529 0.718 0.925\n#> 272 44 1 0.796 0.0548 0.696 0.911\n#> 288 43 1 0.778 0.0566 0.674 0.897\n#> 381 42 1 0.759 0.0582 0.653 0.882\n#> 390 41 1 0.741 0.0596 0.633 0.867\n#> 414 40 1 0.722 0.0610 0.612 0.852\n#> 421 39 1 0.704 0.0621 0.592 0.837\n#> 481 38 1 0.685 0.0632 0.572 0.821\n#> 486 37 1 0.667 0.0642 0.552 0.805\n#> 606 36 1 0.648 0.0650 0.533 0.789\n#> 641 35 1 0.630 0.0657 0.513 0.773\n#> 704 34 1 0.611 0.0663 0.494 0.756\n#> 748 33 1 0.593 0.0669 0.475 0.739\n#> 1063 26 1 0.570 0.0681 0.451 0.720\n#> 1074 25 1 0.547 0.0691 0.427 0.701\n#> 2204 6 1 0.456 0.1012 0.295 0.704\n#> \n#> group=High Risk AML \n#> time n.risk n.event survival std.err lower 95% CI upper 95% CI\n#> 2 45 1 0.978 0.0220 0.936 1.000\n#> 16 44 1 0.956 0.0307 0.897 1.000\n#> 32 43 1 0.933 0.0372 0.863 1.000\n#> 47 42 2 0.889 0.0468 0.802 0.986\n#> 48 40 1 0.867 0.0507 0.773 0.972\n#> 63 39 1 0.844 0.0540 0.745 0.957\n#> 64 38 1 0.822 0.0570 0.718 0.942\n#> 74 37 1 0.800 0.0596 0.691 0.926\n#> 76 36 1 0.778 0.0620 0.665 0.909\n#> 80 35 1 0.756 0.0641 0.640 0.892\n#> 84 34 1 0.733 0.0659 0.615 0.875\n#> 93 33 1 0.711 0.0676 0.590 0.857\n#> 100 32 1 0.689 0.0690 0.566 0.838\n#> 105 31 1 0.667 0.0703 0.542 0.820\n#> 113 30 1 0.644 0.0714 0.519 0.801\n#> 115 29 1 0.622 0.0723 0.496 0.781\n#> 120 28 1 0.600 0.0730 0.473 0.762\n#> 157 27 1 0.578 0.0736 0.450 0.742\n#> 162 26 1 0.556 0.0741 0.428 0.721\n#> 164 25 1 0.533 0.0744 0.406 0.701\n#> 168 24 1 0.511 0.0745 0.384 0.680\n#> 183 23 1 0.489 0.0745 0.363 0.659\n#> 242 22 1 0.467 0.0744 0.341 0.638\n#> 268 21 1 0.444 0.0741 0.321 0.616\n#> 273 20 1 0.422 0.0736 0.300 0.594\n#> 318 19 1 0.400 0.0730 0.280 
0.572\n#> 363 18 1 0.378 0.0723 0.260 0.550\n#> 390 17 1 0.356 0.0714 0.240 0.527\n#> 422 16 1 0.333 0.0703 0.221 0.504\n#> 456 15 1 0.311 0.0690 0.201 0.481\n#> 467 14 1 0.289 0.0676 0.183 0.457\n#> 625 13 1 0.267 0.0659 0.164 0.433\n#> 677 12 1 0.244 0.0641 0.146 0.409\n\n\n\nShow R codesurvfit(bmt.cox)\n#> Call: survfit(formula = bmt.cox)\n#> \n#> n events median 0.95LCL 0.95UCL\n#> [1,] 137 83 422 268 NA\nsurvfit(bmt.cox, newdata = tibble(group = unique(bmt$group)))\n#> Call: survfit(formula = bmt.cox, newdata = tibble(group = unique(bmt$group)))\n#> \n#> n events median 0.95LCL 0.95UCL\n#> 1 137 83 422 268 NA\n#> 2 137 83 NA 625 NA\n#> 3 137 83 268 162 467\n\n\n\nShow R codebmt.cox |> \n survfit(newdata = tibble(group = unique(bmt$group))) |> \n summary()\n#> Call: survfit(formula = bmt.cox, newdata = tibble(group = unique(bmt$group)))\n#> \n#> time n.risk n.event survival1 survival2 survival3\n#> 1 137 1 0.993 0.996 0.989\n#> 2 136 1 0.985 0.992 0.978\n#> 10 135 1 0.978 0.987 0.968\n#> 16 134 1 0.970 0.983 0.957\n#> 32 133 1 0.963 0.979 0.946\n#> 35 132 1 0.955 0.975 0.935\n#> 47 131 2 0.941 0.966 0.914\n#> 48 129 2 0.926 0.957 0.893\n#> 53 127 1 0.918 0.953 0.882\n#> 55 126 1 0.911 0.949 0.872\n#> 63 125 1 0.903 0.944 0.861\n#> 64 124 1 0.896 0.940 0.851\n#> 74 123 2 0.881 0.931 0.830\n#> 76 121 1 0.873 0.926 0.819\n#> 79 120 1 0.865 0.922 0.809\n#> 80 119 2 0.850 0.913 0.788\n#> 84 117 1 0.843 0.908 0.778\n#> 86 116 1 0.835 0.903 0.768\n#> 93 115 1 0.827 0.899 0.757\n#> 100 114 1 0.820 0.894 0.747\n#> 104 113 1 0.812 0.889 0.737\n#> 105 112 2 0.797 0.880 0.717\n#> 107 110 1 0.789 0.875 0.707\n#> 109 109 1 0.782 0.870 0.697\n#> 110 108 1 0.774 0.866 0.687\n#> 113 107 1 0.766 0.861 0.677\n#> 115 106 1 0.759 0.856 0.667\n#> 120 105 1 0.751 0.851 0.657\n#> 122 104 2 0.735 0.841 0.637\n#> 129 102 1 0.727 0.836 0.627\n#> 157 101 1 0.720 0.831 0.617\n#> 162 100 1 0.712 0.826 0.607\n#> 164 99 1 0.704 0.821 0.598\n#> 168 98 1 0.696 0.815 0.588\n#> 172 97 1 0.688 0.810 0.578\n#> 183 96 1 0.680 0.805 0.568\n#> 192 95 1 0.672 0.800 0.558\n#> 194 94 1 0.664 0.794 0.549\n#> 211 93 1 0.656 0.789 0.539\n#> 219 92 1 0.648 0.783 0.530\n#> 230 90 1 0.640 0.778 0.520\n#> 242 89 1 0.632 0.773 0.511\n#> 248 88 1 0.624 0.767 0.501\n#> 268 87 1 0.616 0.761 0.492\n#> 272 86 1 0.608 0.756 0.482\n#> 273 85 1 0.600 0.750 0.473\n#> 276 84 1 0.592 0.745 0.464\n#> 288 83 1 0.584 0.739 0.454\n#> 318 82 1 0.576 0.733 0.445\n#> 332 81 1 0.568 0.727 0.436\n#> 363 80 1 0.560 0.722 0.427\n#> 381 79 1 0.552 0.716 0.418\n#> 383 78 1 0.544 0.710 0.409\n#> 390 77 2 0.528 0.698 0.392\n#> 414 75 1 0.520 0.692 0.383\n#> 418 74 1 0.512 0.686 0.374\n#> 421 73 1 0.504 0.680 0.366\n#> 422 72 1 0.496 0.674 0.357\n#> 456 71 1 0.488 0.667 0.349\n#> 466 70 1 0.480 0.661 0.340\n#> 467 69 1 0.472 0.655 0.332\n#> 481 68 1 0.464 0.649 0.324\n#> 486 67 1 0.455 0.642 0.315\n#> 487 66 1 0.447 0.636 0.307\n#> 526 65 1 0.439 0.629 0.299\n#> 606 63 1 0.431 0.623 0.291\n#> 609 62 1 0.423 0.616 0.283\n#> 625 61 1 0.415 0.609 0.275\n#> 641 60 1 0.407 0.603 0.267\n#> 662 59 1 0.399 0.596 0.260\n#> 677 58 1 0.391 0.589 0.252\n#> 704 57 1 0.383 0.582 0.244\n#> 748 56 1 0.374 0.575 0.237\n#> 1063 47 1 0.365 0.567 0.228\n#> 1074 46 1 0.356 0.559 0.220\n#> 2204 9 1 0.313 0.520 0.182", + "section": "\n6.3 Cox Model for the bmt data", + "text": "6.3 Cox Model for the bmt data\n\n6.3.1 Fit the model\n\nShow R code\nbmt.cox <- coxph(Surv(t2, d3) ~ group, data = bmt)\nsummary(bmt.cox)\n#> Call:\n#> coxph(formula = Surv(t2, d3) ~ group, data = bmt)\n#> \n#> 
n= 137, number of events= 83 \n#> \n#> coef exp(coef) se(coef) z Pr(>|z|) \n#> groupLow Risk AML -0.574 0.563 0.287 -2.00 0.046 *\n#> groupHigh Risk AML 0.383 1.467 0.267 1.43 0.152 \n#> ---\n#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1\n#> \n#> exp(coef) exp(-coef) lower .95 upper .95\n#> groupLow Risk AML 0.563 1.776 0.321 0.989\n#> groupHigh Risk AML 1.467 0.682 0.869 2.478\n#> \n#> Concordance= 0.625 (se = 0.03 )\n#> Likelihood ratio test= 13.4 on 2 df, p=0.001\n#> Wald test = 13 on 2 df, p=0.001\n#> Score (logrank) test = 13.8 on 2 df, p=0.001\n\n\nThe table provides hypothesis tests comparing groups 2 and 3 to group 1. Group 3 has the highest hazard, so the most significant comparison is not directly shown.\nThe coefficient 0.3834 is on the log-hazard-ratio scale, as in log-risk-ratio. The next column gives the hazard ratio 1.4673, and a hypothesis (Wald) test.\nThe (not shown) group 3 vs. group 2 log hazard ratio is 0.3834 + 0.5742 = 0.9576. The hazard ratio is then exp(0.9576) or 2.605.\nInference on all coefficients and combinations can be constructed using coef(bmt.cox) and vcov(bmt.cox) as with logistic and poisson regression.\nConcordance is agreement of first failure between pairs of subjects and higher predicted risk between those subjects, omitting non-informative pairs.\nThe Rsquare value is Cox and Snell’s pseudo R-squared and is not very useful.\nsummary() prints three tests for whether the model with the group covariate is better than the one without\n\n\nLikelihood ratio test (chi-squared)\n\nWald test (also chi-squared), obtained by adding the squares of the z-scores\n\nScore = log-rank test, as with comparison of survival functions.\n\nThe likelihood ratio test is probably best in smaller samples, followed by the Wald test.\n\n6.3.2 Survival Curves from the Cox Model\nWe can take a look at the resulting group-specific curves:\n\nShow R code\n#| fig-cap: \"Survival Functions for Three Groups by KM and Cox Model\"\n\nkm_fit = survfit(Surv(t2, d3) ~ group, data = as.data.frame(bmt))\n\ncox_fit = survfit(\n bmt.cox, \n newdata = \n data.frame(\n group = unique(bmt$group), \n row.names = unique(bmt$group)))\n\nlibrary(survminer)\n\nlist(KM = km_fit, Cox = cox_fit) |> \n survminer::ggsurvplot(\n # facet.by = \"group\",\n legend = \"bottom\", \n legend.title = \"\",\n combine = TRUE, \n fun = 'pct', \n size = .5,\n ggtheme = theme_bw(), \n conf.int = FALSE, \n censor = FALSE) |> \n suppressWarnings() # ggsurvplot() throws some warnings that aren't too worrying\n\n\n\n\n\n\n\nWhen we use survfit() with a Cox model, we have to specify the covariate levels we are interested in; the argument newdata should include a data.frame with the same named columns as the predictors in the Cox model and one or more levels of each.\nOtherwise (that is, if the newdata argument is missing), a curve is produced for a single “pseudo” subject with covariate values equal to the means component of the fit.\nThe resulting curve(s) almost never make sense, but the default remains due to an unwarranted attachment to the option shown by some users and by other packages.\nTwo particularly egregious examples are factor variables and interactions. Suppose one were studying interspecies transmission of a virus, and the data set has a factor variable with levels (“pig”, “chicken”) and about equal numbers of observations for each. 
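As an aside (not part of the original text), the “not shown” group 3 vs. group 2 comparison discussed above can be computed directly; a minimal sketch, assuming the bmt.cox fit above:

Show R code
# Wald inference for the High Risk AML vs Low Risk AML hazard ratio,
# built from coef() and vcov() (sketch, assuming bmt.cox from above).
contrast = c("groupLow Risk AML" = -1, "groupHigh Risk AML" = 1)
log_hr = sum(contrast * coef(bmt.cox))   # about 0.958, as quoted above
se     = sqrt(as.numeric(t(contrast) %*% vcov(bmt.cox) %*% contrast))
c(HR    = exp(log_hr),
  lower = exp(log_hr - qnorm(0.975) * se),
  upper = exp(log_hr + qnorm(0.975) * se))

To return to the pig/chicken example: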
The “mean” covariate level will be 0.5 – is this a flying pig?\n\n6.3.3 Examining survfit\n\n\nShow R codesurvfit(Surv(t2, d3)~group,data=bmt)\n#> Call: survfit(formula = Surv(t2, d3) ~ group, data = bmt)\n#> \n#> n events median 0.95LCL 0.95UCL\n#> group=ALL 38 24 418 194 NA\n#> group=Low Risk AML 54 25 2204 704 NA\n#> group=High Risk AML 45 34 183 115 456\n\n\n\nShow R codesurvfit(Surv(t2, d3)~group,data=bmt) |> summary()\n#> Call: survfit(formula = Surv(t2, d3) ~ group, data = bmt)\n#> \n#> group=ALL \n#> time n.risk n.event survival std.err lower 95% CI upper 95% CI\n#> 1 38 1 0.974 0.0260 0.924 1.000\n#> 55 37 1 0.947 0.0362 0.879 1.000\n#> 74 36 1 0.921 0.0437 0.839 1.000\n#> 86 35 1 0.895 0.0498 0.802 0.998\n#> 104 34 1 0.868 0.0548 0.767 0.983\n#> 107 33 1 0.842 0.0592 0.734 0.966\n#> 109 32 1 0.816 0.0629 0.701 0.949\n#> 110 31 1 0.789 0.0661 0.670 0.930\n#> 122 30 2 0.737 0.0714 0.609 0.891\n#> 129 28 1 0.711 0.0736 0.580 0.870\n#> 172 27 1 0.684 0.0754 0.551 0.849\n#> 192 26 1 0.658 0.0770 0.523 0.827\n#> 194 25 1 0.632 0.0783 0.495 0.805\n#> 230 23 1 0.604 0.0795 0.467 0.782\n#> 276 22 1 0.577 0.0805 0.439 0.758\n#> 332 21 1 0.549 0.0812 0.411 0.734\n#> 383 20 1 0.522 0.0817 0.384 0.709\n#> 418 19 1 0.494 0.0819 0.357 0.684\n#> 466 18 1 0.467 0.0818 0.331 0.658\n#> 487 17 1 0.439 0.0815 0.305 0.632\n#> 526 16 1 0.412 0.0809 0.280 0.605\n#> 609 14 1 0.382 0.0803 0.254 0.577\n#> 662 13 1 0.353 0.0793 0.227 0.548\n#> \n#> group=Low Risk AML \n#> time n.risk n.event survival std.err lower 95% CI upper 95% CI\n#> 10 54 1 0.981 0.0183 0.946 1.000\n#> 35 53 1 0.963 0.0257 0.914 1.000\n#> 48 52 1 0.944 0.0312 0.885 1.000\n#> 53 51 1 0.926 0.0356 0.859 0.998\n#> 79 50 1 0.907 0.0394 0.833 0.988\n#> 80 49 1 0.889 0.0428 0.809 0.977\n#> 105 48 1 0.870 0.0457 0.785 0.965\n#> 211 47 1 0.852 0.0483 0.762 0.952\n#> 219 46 1 0.833 0.0507 0.740 0.939\n#> 248 45 1 0.815 0.0529 0.718 0.925\n#> 272 44 1 0.796 0.0548 0.696 0.911\n#> 288 43 1 0.778 0.0566 0.674 0.897\n#> 381 42 1 0.759 0.0582 0.653 0.882\n#> 390 41 1 0.741 0.0596 0.633 0.867\n#> 414 40 1 0.722 0.0610 0.612 0.852\n#> 421 39 1 0.704 0.0621 0.592 0.837\n#> 481 38 1 0.685 0.0632 0.572 0.821\n#> 486 37 1 0.667 0.0642 0.552 0.805\n#> 606 36 1 0.648 0.0650 0.533 0.789\n#> 641 35 1 0.630 0.0657 0.513 0.773\n#> 704 34 1 0.611 0.0663 0.494 0.756\n#> 748 33 1 0.593 0.0669 0.475 0.739\n#> 1063 26 1 0.570 0.0681 0.451 0.720\n#> 1074 25 1 0.547 0.0691 0.427 0.701\n#> 2204 6 1 0.456 0.1012 0.295 0.704\n#> \n#> group=High Risk AML \n#> time n.risk n.event survival std.err lower 95% CI upper 95% CI\n#> 2 45 1 0.978 0.0220 0.936 1.000\n#> 16 44 1 0.956 0.0307 0.897 1.000\n#> 32 43 1 0.933 0.0372 0.863 1.000\n#> 47 42 2 0.889 0.0468 0.802 0.986\n#> 48 40 1 0.867 0.0507 0.773 0.972\n#> 63 39 1 0.844 0.0540 0.745 0.957\n#> 64 38 1 0.822 0.0570 0.718 0.942\n#> 74 37 1 0.800 0.0596 0.691 0.926\n#> 76 36 1 0.778 0.0620 0.665 0.909\n#> 80 35 1 0.756 0.0641 0.640 0.892\n#> 84 34 1 0.733 0.0659 0.615 0.875\n#> 93 33 1 0.711 0.0676 0.590 0.857\n#> 100 32 1 0.689 0.0690 0.566 0.838\n#> 105 31 1 0.667 0.0703 0.542 0.820\n#> 113 30 1 0.644 0.0714 0.519 0.801\n#> 115 29 1 0.622 0.0723 0.496 0.781\n#> 120 28 1 0.600 0.0730 0.473 0.762\n#> 157 27 1 0.578 0.0736 0.450 0.742\n#> 162 26 1 0.556 0.0741 0.428 0.721\n#> 164 25 1 0.533 0.0744 0.406 0.701\n#> 168 24 1 0.511 0.0745 0.384 0.680\n#> 183 23 1 0.489 0.0745 0.363 0.659\n#> 242 22 1 0.467 0.0744 0.341 0.638\n#> 268 21 1 0.444 0.0741 0.321 0.616\n#> 273 20 1 0.422 0.0736 0.300 0.594\n#> 318 19 1 0.400 0.0730 0.280 
0.572\n#> 363 18 1 0.378 0.0723 0.260 0.550\n#> 390 17 1 0.356 0.0714 0.240 0.527\n#> 422 16 1 0.333 0.0703 0.221 0.504\n#> 456 15 1 0.311 0.0690 0.201 0.481\n#> 467 14 1 0.289 0.0676 0.183 0.457\n#> 625 13 1 0.267 0.0659 0.164 0.433\n#> 677 12 1 0.244 0.0641 0.146 0.409\n\n\n\nShow R codesurvfit(bmt.cox)\n#> Call: survfit(formula = bmt.cox)\n#> \n#> n events median 0.95LCL 0.95UCL\n#> [1,] 137 83 422 268 NA\nsurvfit(bmt.cox, newdata = tibble(group = unique(bmt$group)))\n#> Call: survfit(formula = bmt.cox, newdata = tibble(group = unique(bmt$group)))\n#> \n#> n events median 0.95LCL 0.95UCL\n#> 1 137 83 422 268 NA\n#> 2 137 83 NA 625 NA\n#> 3 137 83 268 162 467\n\n\n\nShow R codebmt.cox |> \n survfit(newdata = tibble(group = unique(bmt$group))) |> \n summary()\n#> Call: survfit(formula = bmt.cox, newdata = tibble(group = unique(bmt$group)))\n#> \n#> time n.risk n.event survival1 survival2 survival3\n#> 1 137 1 0.993 0.996 0.989\n#> 2 136 1 0.985 0.992 0.978\n#> 10 135 1 0.978 0.987 0.968\n#> 16 134 1 0.970 0.983 0.957\n#> 32 133 1 0.963 0.979 0.946\n#> 35 132 1 0.955 0.975 0.935\n#> 47 131 2 0.941 0.966 0.914\n#> 48 129 2 0.926 0.957 0.893\n#> 53 127 1 0.918 0.953 0.882\n#> 55 126 1 0.911 0.949 0.872\n#> 63 125 1 0.903 0.944 0.861\n#> 64 124 1 0.896 0.940 0.851\n#> 74 123 2 0.881 0.931 0.830\n#> 76 121 1 0.873 0.926 0.819\n#> 79 120 1 0.865 0.922 0.809\n#> 80 119 2 0.850 0.913 0.788\n#> 84 117 1 0.843 0.908 0.778\n#> 86 116 1 0.835 0.903 0.768\n#> 93 115 1 0.827 0.899 0.757\n#> 100 114 1 0.820 0.894 0.747\n#> 104 113 1 0.812 0.889 0.737\n#> 105 112 2 0.797 0.880 0.717\n#> 107 110 1 0.789 0.875 0.707\n#> 109 109 1 0.782 0.870 0.697\n#> 110 108 1 0.774 0.866 0.687\n#> 113 107 1 0.766 0.861 0.677\n#> 115 106 1 0.759 0.856 0.667\n#> 120 105 1 0.751 0.851 0.657\n#> 122 104 2 0.735 0.841 0.637\n#> 129 102 1 0.727 0.836 0.627\n#> 157 101 1 0.720 0.831 0.617\n#> 162 100 1 0.712 0.826 0.607\n#> 164 99 1 0.704 0.821 0.598\n#> 168 98 1 0.696 0.815 0.588\n#> 172 97 1 0.688 0.810 0.578\n#> 183 96 1 0.680 0.805 0.568\n#> 192 95 1 0.672 0.800 0.558\n#> 194 94 1 0.664 0.794 0.549\n#> 211 93 1 0.656 0.789 0.539\n#> 219 92 1 0.648 0.783 0.530\n#> 230 90 1 0.640 0.778 0.520\n#> 242 89 1 0.632 0.773 0.511\n#> 248 88 1 0.624 0.767 0.501\n#> 268 87 1 0.616 0.761 0.492\n#> 272 86 1 0.608 0.756 0.482\n#> 273 85 1 0.600 0.750 0.473\n#> 276 84 1 0.592 0.745 0.464\n#> 288 83 1 0.584 0.739 0.454\n#> 318 82 1 0.576 0.733 0.445\n#> 332 81 1 0.568 0.727 0.436\n#> 363 80 1 0.560 0.722 0.427\n#> 381 79 1 0.552 0.716 0.418\n#> 383 78 1 0.544 0.710 0.409\n#> 390 77 2 0.528 0.698 0.392\n#> 414 75 1 0.520 0.692 0.383\n#> 418 74 1 0.512 0.686 0.374\n#> 421 73 1 0.504 0.680 0.366\n#> 422 72 1 0.496 0.674 0.357\n#> 456 71 1 0.488 0.667 0.349\n#> 466 70 1 0.480 0.661 0.340\n#> 467 69 1 0.472 0.655 0.332\n#> 481 68 1 0.464 0.649 0.324\n#> 486 67 1 0.455 0.642 0.315\n#> 487 66 1 0.447 0.636 0.307\n#> 526 65 1 0.439 0.629 0.299\n#> 606 63 1 0.431 0.623 0.291\n#> 609 62 1 0.423 0.616 0.283\n#> 625 61 1 0.415 0.609 0.275\n#> 641 60 1 0.407 0.603 0.267\n#> 662 59 1 0.399 0.596 0.260\n#> 677 58 1 0.391 0.589 0.252\n#> 704 57 1 0.383 0.582 0.244\n#> 748 56 1 0.374 0.575 0.237\n#> 1063 47 1 0.365 0.567 0.228\n#> 1074 46 1 0.356 0.559 0.220\n#> 2204 9 1 0.313 0.520 0.182", "crumbs": [ "Time to Event Models", "6  Proportional Hazards Models" @@ -653,8 +664,8 @@ "objectID": "proportional-hazards-models.html#adjustment-for-ties-optional", "href": "proportional-hazards-models.html#adjustment-for-ties-optional", "title": "\n6  Proportional 
Hazards Models\n", - "section": "\n6.3 Adjustment for Ties (optional)", - "text": "6.3 Adjustment for Ties (optional)\n\n6.3.1 \nAt each time \\(t_i\\) at which more than one of the subjects has an event, let \\(d_i\\) be the number of events at that time, \\(D_i\\) the set of subjects with events at that time, and let \\(s_i\\) be a covariate vector for an artificial subject obtained by adding up the covariate values for the subjects with an event at time \\(t_i\\). Let \\[\\bar\\eta_i = \\beta_1s_{i1}+\\cdots+\\beta_ps_{ip}\\] and \\(\\bar\\theta_i = \\text{exp}\\left\\{\\bar\\eta_i\\right\\}\\).\nLet \\(s_i\\) be a covariate vector for an artificial subject obtained by adding up the covariate values for the subjects with an event at time \\(t_i\\). Note that\n\\[\n\\begin{aligned}\n\\bar\\eta_i &=\\sum_{j \\in D_i}\\beta_1x_{j1}+\\cdots+\\beta_px_{jp}\\\\\n&= \\beta_1s_{i1}+\\cdots+\\beta_ps_{ip}\\\\\n\\bar\\theta_i &= \\text{exp}\\left\\{\\bar\\eta_i\\right\\}\\\\\n&= \\prod_{j \\in D_i}\\theta_i\n\\end{aligned}\n\\]\nBreslow’s method for ties\nBreslow’s method estimates the partial likelihood as\n\\[\n\\begin{aligned}\nL(\\beta|T) &=\n\\prod_i \\frac{\\bar\\theta_i}{[\\sum_{k \\in R(t_i)} \\theta_k]^{d_i}}\\\\\n&= \\prod_i \\prod_{j \\in D_i}\\frac{\\theta_j}{\\sum_{k \\in R(t_i)} \\theta_k}\n\\end{aligned}\n\\]\nThis method is equivalent to treating each event as distinct and using the non-ties formula. It works best when the number of ties is small. It is the default in many statistical packages, including PROC PHREG in SAS.\nEfron’s method for ties\nThe other common method is Efron’s, which is the default in R.\n\\[L(\\beta|T)=\n\\prod_i \\frac{\\bar\\theta_i}{\\prod_{j=1}^{d_i}[\\sum_{k \\in R(t_i)} \\theta_k-\\frac{j-1}{d_i}\\sum_{k \\in D_i} \\theta_k]}\\] This is closer to the exact discrete partial likelihood when there are many ties.\nThe third option in R (and an option also in SAS as discrete) is the “exact” method, which is the same one used for matched logistic regression.\nExample: Breslow’s method\nSuppose as an example we have a time \\(t\\) where there are 20 individuals at risk and three failures. Let the three individuals have risk parameters \\(\\theta_1, \\theta_2, \\theta_3\\) and let the sum of the risk parameters of the remaining 17 individuals be \\(\\theta_R\\). Then the factor in the partial likelihood at time \\(t\\) using Breslow’s method is\n\n\\[\n\\left(\\frac{\\theta_1}{\\theta_R+\\theta_1+\\theta_2+\\theta_3}\\right)\n\\left(\\frac{\\theta_2}{\\theta_R+\\theta_1+\\theta_2+\\theta_3}\\right)\n\\left(\\frac{\\theta_3}{\\theta_R+\\theta_1+\\theta_2+\\theta_3}\\right)\n\\]\n\nIf on the other hand, they had died in the order 1,2, 3, then the contribution to the partial likelihood would be:\n\n\\[\n\\left(\\frac{\\theta_1}{\\theta_R+\\theta_1+\\theta_2+\\theta_3}\\right)\n\\left(\\frac{\\theta_2}{\\theta_R+\\theta_2+\\theta_3}\\right)\n\\left(\\frac{\\theta_3}{\\theta_R+\\theta_3}\\right)\n\\]\n\nas the risk set got smaller with each failure. The exact method roughly averages the results for the six possible orderings of the failures.\nExample: Efron’s method\nBut we don’t know the order they failed in, so instead of reducing the denominator by one risk coefficient each time, we reduce it by the same fraction. 
This is Efron’s method.\n\n\\[\\left(\\frac{\\theta_1}{\\theta_R+\\theta_1+\\theta_2+\\theta_3}\\right)\n\\left(\\frac{\\theta_2}{\\theta_R+2(\\theta_1+\\theta_2+\\theta_3)/3}\\right)\n\\left(\\frac{\\theta_3}{\\theta_R+(\\theta_1+\\theta_2+\\theta_3)/3}\\right)\\]",
+    "section": "\n6.4 Adjustment for Ties (optional)",
+    "text": "6.4 Adjustment for Ties (optional)\n\n6.4.1 \nAt each time \\(t_i\\) at which more than one of the subjects has an event, let \\(d_i\\) be the number of events at that time, \\(D_i\\) the set of subjects with events at that time, and let \\(s_i\\) be a covariate vector for an artificial subject obtained by adding up the covariate values for the subjects with an event at time \\(t_i\\). Let \\[\\bar\\eta_i = \\beta_1s_{i1}+\\cdots+\\beta_ps_{ip}\\] and \\(\\bar\\theta_i = \\text{exp}\\left\\{\\bar\\eta_i\\right\\}\\).\nNote that\n\\[\n\\begin{aligned}\n\\bar\\eta_i &=\\sum_{j \\in D_i}\\beta_1x_{j1}+\\cdots+\\beta_px_{jp}\\\\\n&= \\beta_1s_{i1}+\\cdots+\\beta_ps_{ip}\\\\\n\\bar\\theta_i &= \\text{exp}\\left\\{\\bar\\eta_i\\right\\}\\\\\n&= \\prod_{j \\in D_i}\\theta_j\n\\end{aligned}\n\\]\nBreslow’s method for ties\nBreslow’s method estimates the partial likelihood as\n\\[\n\\begin{aligned}\nL(\\beta|T) &=\n\\prod_i \\frac{\\bar\\theta_i}{[\\sum_{k \\in R(t_i)} \\theta_k]^{d_i}}\\\\\n&= \\prod_i \\prod_{j \\in D_i}\\frac{\\theta_j}{\\sum_{k \\in R(t_i)} \\theta_k}\n\\end{aligned}\n\\]\nThis method is equivalent to treating each event as distinct and using the non-ties formula. It works best when the number of ties is small. It is the default in many statistical packages, including PROC PHREG in SAS.\nEfron’s method for ties\nThe other common method is Efron’s, which is the default in R.\n\\[L(\\beta|T)=\n\\prod_i \\frac{\\bar\\theta_i}{\\prod_{j=1}^{d_i}[\\sum_{k \\in R(t_i)} \\theta_k-\\frac{j-1}{d_i}\\sum_{k \\in D_i} \\theta_k]}\\] This is closer to the exact discrete partial likelihood when there are many ties.\nThe third option in R (and an option also in SAS as discrete) is the “exact” method, which is the same one used for matched logistic regression.\nExample: Breslow’s method\nSuppose as an example we have a time \\(t\\) where there are 20 individuals at risk and three failures. Let the three individuals have risk parameters \\(\\theta_1, \\theta_2, \\theta_3\\) and let the sum of the risk parameters of the remaining 17 individuals be \\(\\theta_R\\). Then the factor in the partial likelihood at time \\(t\\) using Breslow’s method is\n\n\\[\n\\left(\\frac{\\theta_1}{\\theta_R+\\theta_1+\\theta_2+\\theta_3}\\right)\n\\left(\\frac{\\theta_2}{\\theta_R+\\theta_1+\\theta_2+\\theta_3}\\right)\n\\left(\\frac{\\theta_3}{\\theta_R+\\theta_1+\\theta_2+\\theta_3}\\right)\n\\]\n\nIf, on the other hand, they had died in the order 1, 2, 3, then the contribution to the partial likelihood would be:\n\n\\[\n\\left(\\frac{\\theta_1}{\\theta_R+\\theta_1+\\theta_2+\\theta_3}\\right)\n\\left(\\frac{\\theta_2}{\\theta_R+\\theta_2+\\theta_3}\\right)\n\\left(\\frac{\\theta_3}{\\theta_R+\\theta_3}\\right)\n\\]\n\nas the risk set got smaller with each failure. The exact method roughly averages the results for the six possible orderings of the failures.\nExample: Efron’s method\nBut we don’t know the order they failed in, so instead of reducing the denominator by one risk coefficient each time, we reduce it by the same fraction. 
This is Efron’s method.\n\n\\[\\left(\\frac{\\theta_1}{\\theta_R+\\theta_1+\\theta_2+\\theta_3}\\right)\n\\left(\\frac{\\theta_2}{\\theta_R+2(\\theta_1+\\theta_2+\\theta_3)/3}\\right)\n\\left(\\frac{\\theta_3}{\\theta_R+(\\theta_1+\\theta_2+\\theta_3)/3}\\right)\\]", "crumbs": [ "Time to Event Models", "6  Proportional Hazards Models"
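The practical effect of the tie-handling choice can be checked directly, since coxph() exposes it through its ties argument. A minimal sketch (assuming the bmt data used earlier in this chapter):

Show R code
# Compare coefficient estimates under the three tie-handling methods
# available in coxph() (sketch, assuming the bmt data from earlier).
library(survival)
methods = c("efron", "breslow", "exact")
fits = lapply(methods,
              \(m) coxph(Surv(t2, d3) ~ group, data = bmt, ties = m))
names(fits) = methods
sapply(fits, coef)
# with relatively few tied event times, the three columns of estimates
# should be nearly identical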