diff --git a/.nojekyll b/.nojekyll
index bfb6ce0..ab4d816 100644
--- a/.nojekyll
+++ b/.nojekyll
@@ -1 +1 @@
-ec20b45e
\ No newline at end of file
+b02e4979
\ No newline at end of file
diff --git a/Linear-models-overview.html b/Linear-models-overview.html
index cdb7114..d5cd97b 100644
--- a/Linear-models-overview.html
+++ b/Linear-models-overview.html
@@ -3926,7 +3926,7 @@
Note
full
-6.803
+7.055
0.4805
0.3831
5.956
@@ -3934,7 +3934,7 @@
Note
reduced
-6.657
+6.642
0.4454
0.3802
5.971
@@ -4071,10 +4071,10 @@
Note
Show R code
 coef(cvfit, s = "lambda.1se")
 #> 4 x 1 sparse Matrix of class "dgCMatrix"
 #> s1
-#> (Intercept) 34.2044
+#> (Intercept) 34.1090
 #> age .
-#> weight -0.0926
-#> protein 0.8582
+#> weight -0.1041
+#> protein 0.9441
diff --git a/Linear-models-overview_files/figure-html/unnamed-chunk-104-1.png b/Linear-models-overview_files/figure-html/unnamed-chunk-104-1.png
index 8b0faf5..fa18b52 100644
Binary files a/Linear-models-overview_files/figure-html/unnamed-chunk-104-1.png and b/Linear-models-overview_files/figure-html/unnamed-chunk-104-1.png differ
diff --git a/Linear-models-overview_files/figure-html/unnamed-chunk-98-1.png b/Linear-models-overview_files/figure-html/unnamed-chunk-98-1.png
index bb1e1c7..b980c7d 100644
Binary files a/Linear-models-overview_files/figure-html/unnamed-chunk-98-1.png and b/Linear-models-overview_files/figure-html/unnamed-chunk-98-1.png differ
diff --git a/Regression-Models-for-Epidemiology.pdf b/Regression-Models-for-Epidemiology.pdf
index de3f3cc..5f09c55 100644
Binary files a/Regression-Models-for-Epidemiology.pdf and b/Regression-Models-for-Epidemiology.pdf differ
diff --git a/count-regression.html b/count-regression.html
index ba0f953..60fddd6 100644
--- a/count-regression.html
+++ b/count-regression.html
@@ -395,9 +395,12 @@
 glm1 = glm(
   data = needles,
@@ -1275,7 +1277,7 @@
Vittinghoff, Eric, David V Glidden, Stephen C Shiboski, and Charles E McCulloch. 2012. Regression Methods in Biostatistics: Linear, Logistic, Survival, and Repeated Measures Models. 2nd ed. Springer. https://doi.org/10.1007/978-1-4614-1353-0.
-The term Survival analysis is a bit misleading. Survival outcomes can sometimes be analyzed using binomial models (logistic regression). Time-to-event models might be a better name.
+The term survival analysis is a bit misleading. Survival outcomes can sometimes be analyzed using binomial models (logistic regression). Time-to-event models or survival time analysis might be a better name.
@@ -2989,9 +2989,9 @@
Examples
 failure of a device or system.
 - insurance, particularly life insurance, where the event is death.
-
+:::{.callout-note}
-The term *Survival analysis* is a bit misleading. Survival outcomes can sometimes be analyzed using binomial models (logistic regression). *Time-to-event models* might be a better name.
+The term *survival analysis* is a bit misleading. Survival outcomes can sometimes be analyzed using binomial models (logistic regression). *Time-to-event models* or *survival time analysis* might be a better name.
+:::
 ## Time-to-event outcome distributions
diff --git a/logistic-regression.html b/logistic-regression.html
index be078e8..ebad0d7 100644
--- a/logistic-regression.html
+++ b/logistic-regression.html
@@ -3703,8 +3703,8 @@
<
ggplotly(HL_plot)
-
-
+
+
@@ -3855,8 +3855,8 @@
wcgs_response_resid_plot |>ggplotly()
-
-
+
+
We can see a slight fan-shape here: observations on the right have larger variance (as expected since \(var(\bar y) = \pi(1-\pi)/n\) is maximized when \(\pi = 0.5\)).
Nahhas, Ramzi W. n.d. Introduction to Regression Methods for Public
diff --git a/search.json b/search.json
index f774222..6df530a 100644
--- a/search.json
+++ b/search.json
@@ -34,7 +34,7 @@
"href": "index.html#license",
"title": "Regression Models for Epidemiology",
"section": "License",
- "text": "License\nThis book is licensed to you under Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.\nThe code samples in this book are licensed under Creative Commons CC0 1.0 Universal (CC0 1.0), i.e. public domain.\n\n\n\n\n\n\nDalgaard, Peter. 2008. Introductory Statistics with r. New York, NY: Springer New York. https://link.springer.com/book/10.1007/978-0-387-79054-1.\n\n\nDobson, Annette J, and Adrian G Barnett. 2018. An Introduction to Generalized Linear Models. 4th ed. CRC press. https://doi.org/10.1201/9781315182780.\n\n\nDunn, Peter K, Gordon K Smyth, et al. 2018. Generalized Linear Models with Examples in r. Vol. 53. Springer. https://link.springer.com/book/10.1007/978-1-4419-0118-7.\n\n\nFaraway, Julian J. 2016. Extending the Linear Model with r: Generalized Linear, Mixed Effects and Nonparametric Regression Models. 2nd ed. Chapman; Hall/CRC. https://doi.org/10.1201/9781315382722.\n\n\nFox, John. 2015. Applied Regression Analysis and Generalized Linear Models. Sage publications.\n\n\nHarrell, Frank E. 2015. Regression Modeling Strategies: With Applications to Linear Models, Logistic Regression, and Survival Analysis. 2nd ed. Springer. https://doi.org/10.1007/978-3-319-19425-7.\n\n\nMcCullagh, Peter, and J. A. Nelder. 1989. Generalized Linear Models. 2nd ed. Routledge. https://www.utstat.toronto.edu/~brunner/oldclass/2201s11/readings/glmbook.pdf.\n\n\nMoore, Dirk F. 2016. Applied Survival Analysis Using r. Vol. 473. Springer.\n\n\nNahhas, Ramzi W. n.d. Introduction to Regression Methods for Public Health Using r. CRC Press. https://www.bookdown.org/rwnahhas/RMPH/.\n\n\nSelvin, Steve. 2001. Epidemiologic Analysis: A Case-Oriented Approach. Oxford University Press.\n\n\nVittinghoff, Eric, David V Glidden, Stephen C Shiboski, and Charles E McCulloch. 2012. Regression Methods in Biostatistics: Linear, Logistic, Survival, and Repeated Measures Models. 2nd ed. Springer. https://doi.org/10.1007/978-1-4614-1353-0.",
+ "text": "License\nThis book is licensed to you under Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.\nThe code samples in this book are licensed under Creative Commons CC0 1.0 Universal (CC0 1.0), i.e. public domain.\n\n\n\n\n\n\nDalgaard, Peter. 2008. Introductory Statistics with r. New York, NY: Springer New York. https://link.springer.com/book/10.1007/978-0-387-79054-1.\n\n\nDobson, Annette J, and Adrian G Barnett. 2018. An Introduction to Generalized Linear Models. 4th ed. CRC press. https://doi.org/10.1201/9781315182780.\n\n\nDunn, Peter K, Gordon K Smyth, et al. 2018. Generalized Linear Models with Examples in r. Vol. 53. Springer. https://link.springer.com/book/10.1007/978-1-4419-0118-7.\n\n\nFaraway, Julian J. 2016. Extending the Linear Model with r: Generalized Linear, Mixed Effects and Nonparametric Regression Models. 2nd ed. Chapman; Hall/CRC. https://doi.org/10.1201/9781315382722.\n\n\nFox, John. 2015. Applied Regression Analysis and Generalized Linear Models. Sage publications.\n\n\nHarrell, Frank E. 2015. Regression Modeling Strategies: With Applications to Linear Models, Logistic Regression, and Survival Analysis. 2nd ed. Springer. https://doi.org/10.1007/978-3-319-19425-7.\n\n\nMcCullagh, Peter, and J. A. Nelder. 1989. Generalized Linear Models. 2nd ed. Routledge. https://www.utstat.toronto.edu/~brunner/oldclass/2201s11/readings/glmbook.pdf.\n\n\nMoore, Dirk F. 2016. Applied Survival Analysis Using r. Vol. 473. Springer. https://doi.org/10.1007/978-3-319-31245-3.\n\n\nNahhas, Ramzi W. n.d. Introduction to Regression Methods for Public Health Using r. CRC Press. https://www.bookdown.org/rwnahhas/RMPH/.\n\n\nSelvin, Steve. 2001. Epidemiologic Analysis: A Case-Oriented Approach. Oxford University Press.\n\n\nVittinghoff, Eric, David V Glidden, Stephen C Shiboski, and Charles E McCulloch. 2012. Regression Methods in Biostatistics: Linear, Logistic, Survival, and Repeated Measures Models. 2nd ed. Springer. https://doi.org/10.1007/978-1-4614-1353-0.",
"crumbs": [
"Preface"
]
@@ -193,7 +193,7 @@
"href": "Linear-models-overview.html#model-selection-1",
"title": "\n2 Linear (Gaussian) Models\n",
"section": "\n2.9 Model selection",
- "text": "2.9 Model selection\n(adapted from Dobson and Barnett (2018) §6.3.3; for more information on prediction, see James et al. (2013) and Harrell (2015)).\n\nIf we have a lot of covariates in our dataset, we might want to choose a small subset to use in our model.\nThere are a few possible metrics to consider for choosing a “best” model.\n\n\n2.9.1 Mean squared error\nWe might want to minimize the mean squared error, \\(\\text E[(y-\\hat y)^2]\\), for new observations that weren’t in our data set when we fit the model.\nUnfortunately, \\[\\frac{1}{n}\\sum_{i=1}^n (y_i-\\hat y_i)^2\\] gives a biased estimate of \\(\\text E[(y-\\hat y)^2]\\) for new data. If we want an unbiased estimate, we will have to be clever.\n\nCross-validation\n\nShow R codedata(\"carbohydrate\", package = \"dobson\")\nlibrary(cvTools)\nfull_model <- lm(carbohydrate ~ ., data = carbohydrate)\ncv_full = \n full_model |> cvFit(\n data = carbohydrate, K = 5, R = 10,\n y = carbohydrate$carbohydrate)\n\nreduced_model = update(full_model, \n formula = ~ . - age)\n\ncv_reduced = \n reduced_model |> cvFit(\n data = carbohydrate, K = 5, R = 10,\n y = carbohydrate$carbohydrate)\n\n\n\n\nShow R coderesults_reduced = \n tibble(\n model = \"wgt+protein\",\n errs = cv_reduced$reps[])\nresults_full = \n tibble(model = \"wgt+age+protein\",\n errs = cv_full$reps[])\n\ncv_results = \n bind_rows(results_reduced, results_full)\n\ncv_results |> \n ggplot(aes(y = model, x = errs)) +\n geom_boxplot()\n\n\n\n\n\n\n\n\ncomparing metrics\n\nShow R code\ncompare_results = tribble(\n ~ model, ~ cvRMSE, ~ r.squared, ~adj.r.squared, ~ trainRMSE, ~loglik,\n \"full\", cv_full$cv, summary(full_model)$r.squared, summary(full_model)$adj.r.squared, sigma(full_model), logLik(full_model) |> as.numeric(),\n \"reduced\", cv_reduced$cv, summary(reduced_model)$r.squared, summary(reduced_model)$adj.r.squared, sigma(reduced_model), logLik(reduced_model) |> as.numeric())\n\ncompare_results\n\n\n\nmodel\ncvRMSE\nr.squared\nadj.r.squared\ntrainRMSE\nloglik\n\n\n\nfull\n6.803\n0.4805\n0.3831\n5.956\n-61.84\n\n\nreduced\n6.657\n0.4454\n0.3802\n5.971\n-62.49\n\n\n\n\n\n\n\nShow R codeanova(full_model, reduced_model)\n\n\n\nRes.Df\nRSS\nDf\nSum of Sq\nF\nPr(>F)\n\n\n\n16\n567.7\nNA\nNA\nNA\nNA\n\n\n17\n606.0\n-1\n-38.36\n1.081\n0.3139\n\n\n\n\n\nstepwise regression\n\nShow R codelibrary(olsrr)\nolsrr:::ols_step_both_aic(full_model)\n#> \n#> \n#> Stepwise Summary \n#> -------------------------------------------------------------------------\n#> Step Variable AIC SBC SBIC R2 Adj. R2 \n#> -------------------------------------------------------------------------\n#> 0 Base Model 140.773 142.764 83.068 0.00000 0.00000 \n#> 1 protein (+) 137.950 140.937 80.438 0.21427 0.17061 \n#> 2 weight (+) 132.981 136.964 77.191 0.44544 0.38020 \n#> -------------------------------------------------------------------------\n#> \n#> Final Model Output \n#> ------------------\n#> \n#> Model Summary \n#> ---------------------------------------------------------------\n#> R 0.667 RMSE 5.505 \n#> R-Squared 0.445 MSE 35.648 \n#> Adj. R-Squared 0.380 Coef. Var 15.879 \n#> Pred R-Squared 0.236 AIC 132.981 \n#> MAE 4.593 SBC 136.964 \n#> ---------------------------------------------------------------\n#> RMSE: Root Mean Square Error \n#> MSE: Mean Square Error \n#> MAE: Mean Absolute Error \n#> AIC: Akaike Information Criteria \n#> SBC: Schwarz Bayesian Criteria \n#> \n#> ANOVA \n#> -------------------------------------------------------------------\n#> Sum of \n#> Squares DF Mean Square F Sig. \n#> -------------------------------------------------------------------\n#> Regression 486.778 2 243.389 6.827 0.0067 \n#> Residual 606.022 17 35.648 \n#> Total 1092.800 19 \n#> -------------------------------------------------------------------\n#> \n#> Parameter Estimates \n#> ----------------------------------------------------------------------------------------\n#> model Beta Std. Error Std. Beta t Sig lower upper \n#> ----------------------------------------------------------------------------------------\n#> (Intercept) 33.130 12.572 2.635 0.017 6.607 59.654 \n#> protein 1.824 0.623 0.534 2.927 0.009 0.509 3.139 \n#> weight -0.222 0.083 -0.486 -2.662 0.016 -0.397 -0.046 \n#> ----------------------------------------------------------------------------------------\n\n\nLasso\n\\[\\arg min_{\\theta} \\ell(\\theta) + \\lambda \\sum_{j=1}^p|\\beta_j|\\]\n\nShow R codelibrary(glmnet)\ny = carbohydrate$carbohydrate\nx = carbohydrate |> \n select(age, weight, protein) |> \n as.matrix()\nfit = glmnet(x,y)\n\n\n\n\nShow R codeautoplot(fit, xvar = 'lambda')\n\n\n\nFigure 2.19: Lasso selection\n\n\n\n\n\n\n\n\n\nShow R codecvfit = cv.glmnet(x,y)\nplot(cvfit)\n\n\n\n\n\n\n\n\n\nShow R codecoef(cvfit, s = \"lambda.1se\")\n#> 4 x 1 sparse Matrix of class \"dgCMatrix\"\n#> s1\n#> (Intercept) 34.2044\n#> age . \n#> weight -0.0926\n#> protein 0.8582",
+ "text": "2.9 Model selection\n(adapted from Dobson and Barnett (2018) §6.3.3; for more information on prediction, see James et al. (2013) and Harrell (2015)).\n\nIf we have a lot of covariates in our dataset, we might want to choose a small subset to use in our model.\nThere are a few possible metrics to consider for choosing a “best” model.\n\n\n2.9.1 Mean squared error\nWe might want to minimize the mean squared error, \\(\\text E[(y-\\hat y)^2]\\), for new observations that weren’t in our data set when we fit the model.\nUnfortunately, \\[\\frac{1}{n}\\sum_{i=1}^n (y_i-\\hat y_i)^2\\] gives a biased estimate of \\(\\text E[(y-\\hat y)^2]\\) for new data. If we want an unbiased estimate, we will have to be clever.\n\nCross-validation\n\nShow R codedata(\"carbohydrate\", package = \"dobson\")\nlibrary(cvTools)\nfull_model <- lm(carbohydrate ~ ., data = carbohydrate)\ncv_full = \n full_model |> cvFit(\n data = carbohydrate, K = 5, R = 10,\n y = carbohydrate$carbohydrate)\n\nreduced_model = update(full_model, \n formula = ~ . - age)\n\ncv_reduced = \n reduced_model |> cvFit(\n data = carbohydrate, K = 5, R = 10,\n y = carbohydrate$carbohydrate)\n\n\n\n\nShow R coderesults_reduced = \n tibble(\n model = \"wgt+protein\",\n errs = cv_reduced$reps[])\nresults_full = \n tibble(model = \"wgt+age+protein\",\n errs = cv_full$reps[])\n\ncv_results = \n bind_rows(results_reduced, results_full)\n\ncv_results |> \n ggplot(aes(y = model, x = errs)) +\n geom_boxplot()\n\n\n\n\n\n\n\n\ncomparing metrics\n\nShow R code\ncompare_results = tribble(\n ~ model, ~ cvRMSE, ~ r.squared, ~adj.r.squared, ~ trainRMSE, ~loglik,\n \"full\", cv_full$cv, summary(full_model)$r.squared, summary(full_model)$adj.r.squared, sigma(full_model), logLik(full_model) |> as.numeric(),\n \"reduced\", cv_reduced$cv, summary(reduced_model)$r.squared, summary(reduced_model)$adj.r.squared, sigma(reduced_model), logLik(reduced_model) |> as.numeric())\n\ncompare_results\n\n\n\nmodel\ncvRMSE\nr.squared\nadj.r.squared\ntrainRMSE\nloglik\n\n\n\nfull\n7.055\n0.4805\n0.3831\n5.956\n-61.84\n\n\nreduced\n6.642\n0.4454\n0.3802\n5.971\n-62.49\n\n\n\n\n\n\n\nShow R codeanova(full_model, reduced_model)\n\n\n\nRes.Df\nRSS\nDf\nSum of Sq\nF\nPr(>F)\n\n\n\n16\n567.7\nNA\nNA\nNA\nNA\n\n\n17\n606.0\n-1\n-38.36\n1.081\n0.3139\n\n\n\n\n\nstepwise regression\n\nShow R codelibrary(olsrr)\nolsrr:::ols_step_both_aic(full_model)\n#> \n#> \n#> Stepwise Summary \n#> -------------------------------------------------------------------------\n#> Step Variable AIC SBC SBIC R2 Adj. R2 \n#> -------------------------------------------------------------------------\n#> 0 Base Model 140.773 142.764 83.068 0.00000 0.00000 \n#> 1 protein (+) 137.950 140.937 80.438 0.21427 0.17061 \n#> 2 weight (+) 132.981 136.964 77.191 0.44544 0.38020 \n#> -------------------------------------------------------------------------\n#> \n#> Final Model Output \n#> ------------------\n#> \n#> Model Summary \n#> ---------------------------------------------------------------\n#> R 0.667 RMSE 5.505 \n#> R-Squared 0.445 MSE 35.648 \n#> Adj. R-Squared 0.380 Coef. Var 15.879 \n#> Pred R-Squared 0.236 AIC 132.981 \n#> MAE 4.593 SBC 136.964 \n#> ---------------------------------------------------------------\n#> RMSE: Root Mean Square Error \n#> MSE: Mean Square Error \n#> MAE: Mean Absolute Error \n#> AIC: Akaike Information Criteria \n#> SBC: Schwarz Bayesian Criteria \n#> \n#> ANOVA \n#> -------------------------------------------------------------------\n#> Sum of \n#> Squares DF Mean Square F Sig. \n#> -------------------------------------------------------------------\n#> Regression 486.778 2 243.389 6.827 0.0067 \n#> Residual 606.022 17 35.648 \n#> Total 1092.800 19 \n#> -------------------------------------------------------------------\n#> \n#> Parameter Estimates \n#> ----------------------------------------------------------------------------------------\n#> model Beta Std. Error Std. Beta t Sig lower upper \n#> ----------------------------------------------------------------------------------------\n#> (Intercept) 33.130 12.572 2.635 0.017 6.607 59.654 \n#> protein 1.824 0.623 0.534 2.927 0.009 0.509 3.139 \n#> weight -0.222 0.083 -0.486 -2.662 0.016 -0.397 -0.046 \n#> ----------------------------------------------------------------------------------------\n\n\nLasso\n\\[\\arg min_{\\theta} \\ell(\\theta) + \\lambda \\sum_{j=1}^p|\\beta_j|\\]\n\nShow R codelibrary(glmnet)\ny = carbohydrate$carbohydrate\nx = carbohydrate |> \n select(age, weight, protein) |> \n as.matrix()\nfit = glmnet(x,y)\n\n\n\n\nShow R codeautoplot(fit, xvar = 'lambda')\n\n\n\nFigure 2.19: Lasso selection\n\n\n\n\n\n\n\n\n\nShow R codecvfit = cv.glmnet(x,y)\nplot(cvfit)\n\n\n\n\n\n\n\n\n\nShow R codecoef(cvfit, s = \"lambda.1se\")\n#> 4 x 1 sparse Matrix of class \"dgCMatrix\"\n#> s1\n#> (Intercept) 34.1090\n#> age . \n#> weight -0.1041\n#> protein 0.9441",
"crumbs": [
"Generalized Linear Models",
"2Linear (Gaussian) Models"
@@ -479,29 +479,7 @@
"href": "count-regression.html#example-needle-sharing",
"title": "\n4 Models for Count Outcomes\n",
"section": "\n4.7 Example: needle-sharing",
- "text": "4.7 Example: needle-sharing\n(adapted from Vittinghoff et al. (2012), §8)\n\nShow R codelibrary(tidyverse)\nlibrary(haven)\nneedles = read_dta(\"inst/extdata/needle_sharing.dta\") |> \n as_tibble() |> \n mutate(polydrug = \n ifelse(polydrug, \"multiple drugs used\", \"one drug used\") |> \n factor() |> \n relevel(ref = \"one drug used\"),\n homeless = ifelse(homeless, \"homeless\", \"not homeless\") |> \n factor() |> relevel(ref = \"not homeless\")) |> \n mutate(sex = factor(sex) |> relevel(ref = \"M\"))\nneedles\n\n\nTable 4.1: Needle-sharing data\n\n\n\n \n\n\n\n\n\n\n\n\nShow R codelibrary(ggplot2)\n\nneedles |> \n ggplot(\n aes(\n x = age,\n y = shared_syr,\n shape = sex,\n col = ethn\n )\n ) + \n geom_point(\n size = 3, \n alpha = .5) +\n facet_grid(\n cols = vars(polydrug), \n rows = vars(homeless)) +\n theme(legend.position = \"bottom\")\n\n\n\nFigure 4.1: Rates of needle sharing",
- "crumbs": [
- "Generalized Linear Models",
- "4Models for Count Outcomes"
- ]
- },
- {
- "objectID": "count-regression.html#covariate-counts",
- "href": "count-regression.html#covariate-counts",
- "title": "\n4 Models for Count Outcomes\n",
- "section": "\n4.8 Covariate counts:",
- "text": "4.8 Covariate counts:\n\nShow R codeneedles |> \n dplyr::select(sex, homeless, polydrug) |> \n summary()\n#> sex homeless polydrug \n#> M :97 not homeless:63 one drug used :109 \n#> F :30 homeless :61 multiple drugs used: 19 \n#> Trans: 1 NA's : 4\n\n\nThere’s only one individual with sex = Trans, which unfortunately isn’t enough data to analyze. We will remove that individual:\n\nShow R code\nneedles = needles |> filter(sex != \"Trans\")",
- "crumbs": [
- "Generalized Linear Models",
- "4Models for Count Outcomes"
- ]
- },
- {
- "objectID": "count-regression.html#model",
- "href": "count-regression.html#model",
- "title": "\n4 Models for Count Outcomes\n",
- "section": "\n4.9 model",
- "text": "4.9 model\n\nShow R codeglm1 = glm(\n data = needles,\n family = stats::poisson,\n shared_syr ~ age + sex + homeless*polydrug\n)\n\nlibrary(parameters)\nglm1 |> parameters(exponentiate = TRUE) |> \n print_md()\n\n\nTable 4.2: Poisson model for needle-sharing data\n\n\n\n\n\n\n\n\n\n\n\n\nParameter\nIRR\nSE\n95% CI\nz\np\n\n\n\n(Intercept)\n4.52\n1.15\n(2.74, 7.45)\n5.90\n< .001\n\n\nage\n0.97\n5.58e-03\n(0.96, 0.98)\n-5.41\n< .001\n\n\nsex (F)\n1.98\n0.23\n(1.58, 2.49)\n5.88\n< .001\n\n\nhomeless (homeless)\n3.58\n0.45\n(2.79, 4.59)\n10.06\n< .001\n\n\npolydrug (multiple drugs used)\n1.45e-07\n5.82e-05\n(0.00, Inf)\n-0.04\n0.969\n\n\nhomeless (homeless) × polydrug (multiple drugs used)\n1.27e+06\n5.12e+08\n(0.00, Inf)\n0.03\n0.972\n\n\n\n\n\n\n\n\n\nShow R codelibrary(ggfortify)\nautoplot(glm1)\n\n\nTable 4.3: Diagnostics for Poisson model\n\n\n\n\n\n\n\n\n\n\n\n–\n\n\nTable 4.4: Negative binomial model for needle-sharing data\n\nShow R codelibrary(MASS) #need this for glm.nb()\nglm1.nb = glm.nb(\n data = needles,\n shared_syr ~ age + sex + homeless*polydrug\n)\nsummary(glm1.nb)\n#> \n#> Call:\n#> glm.nb(formula = shared_syr ~ age + sex + homeless * polydrug, \n#> data = needles, init.theta = 0.08436295825, link = log)\n#> \n#> Coefficients:\n#> Estimate Std. Error z value\n#> (Intercept) 9.91e-01 1.71e+00 0.58\n#> age -2.76e-02 3.82e-02 -0.72\n#> sexF 1.06e+00 8.07e-01 1.32\n#> homelesshomeless 1.65e+00 7.22e-01 2.29\n#> polydrugmultiple drugs used -2.46e+01 3.61e+04 0.00\n#> homelesshomeless:polydrugmultiple drugs used 2.32e+01 3.61e+04 0.00\n#> Pr(>|z|) \n#> (Intercept) 0.563 \n#> age 0.469 \n#> sexF 0.187 \n#> homelesshomeless 0.022 *\n#> polydrugmultiple drugs used 0.999 \n#> homelesshomeless:polydrugmultiple drugs used 0.999 \n#> ---\n#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1\n#> \n#> (Dispersion parameter for Negative Binomial(0.0844) family taken to be 1)\n#> \n#> Null deviance: 69.193 on 119 degrees of freedom\n#> Residual deviance: 57.782 on 114 degrees of freedom\n#> (7 observations deleted due to missingness)\n#> AIC: 315.5\n#> \n#> Number of Fisher Scoring iterations: 1\n#> \n#> \n#> Theta: 0.0844 \n#> Std. Err.: 0.0197 \n#> \n#> 2 x log-likelihood: -301.5060\n\n\n\n\n\nShow R codetibble(name = names(coef(glm1)), poisson = coef(glm1), nb = coef(glm1.nb))\n\n\nTable 4.5: Poisson versus Negative Binomial Regression coefficient estimates\n\n\n\n \n\n\n\n\n\n\nzero-inflation\n\nShow R codelibrary(glmmTMB)\nzinf_fit1 = glmmTMB(\n family = \"poisson\",\n data = needles,\n formula = shared_syr ~ age + sex + homeless*polydrug,\n ziformula = ~ age + sex + homeless + polydrug # fit won't converge with interaction\n)\n\nzinf_fit1 |> \n parameters(exponentiate = TRUE) |> \n print_md()\n\n\nTable 4.6: Zero-inflated poisson model\n\n\n\n# Fixed Effects\n\n\n\n\n\n\n\n\n\nParameter\nIRR\nSE\n95% CI\nz\np\n\n\n\n(Intercept)\n3.16\n0.82\n(1.90, 5.25)\n4.44\n< .001\n\n\nage\n1.01\n5.88e-03\n(1.00, 1.02)\n1.74\n0.081\n\n\nsex [F]\n3.43\n0.44\n(2.67, 4.40)\n9.68\n< .001\n\n\nhomeless [homeless]\n3.44\n0.47\n(2.63, 4.50)\n9.03\n< .001\n\n\npolydrug [multiple drugs used]\n1.85e-09\n1.21e-05\n(0.00, Inf)\n-3.08e-03\n0.998\n\n\nhomeless [homeless] × polydrug [multiple drugs used]\n1.38e+08\n9.04e+11\n(0.00, Inf)\n2.87e-03\n0.998\n\n\n\n\n# Zero-Inflation\n\n\n\n\n\n\n\n\n\nParameter\nOdds Ratio\nSE\n95% CI\nz\np\n\n\n\n(Intercept)\n0.49\n0.54\n(0.06, 4.25)\n-0.65\n0.514\n\n\nage\n1.05\n0.03\n(1.00, 1.10)\n1.95\n0.051\n\n\nsex [F]\n1.44\n0.84\n(0.46, 4.50)\n0.62\n0.533\n\n\nhomeless [homeless]\n0.68\n0.34\n(0.26, 1.80)\n-0.78\n0.436\n\n\npolydrug [multiple drugs used]\n1.15\n0.91\n(0.24, 5.43)\n0.18\n0.858\n\n\n\n\n\n\n\n\nzero-inflated negative binomial model\n\nShow R codelibrary(glmmTMB)\nzinf_fit1 = glmmTMB(\n family = nbinom2,\n data = needles,\n formula = shared_syr ~ age + sex + homeless*polydrug,\n ziformula = ~ age + sex + homeless + polydrug # fit won't converge with interaction\n)\n\nzinf_fit1 |> \n parameters(exponentiate = TRUE) |> \n print_md()\n\n\nTable 4.7: Zero-inflated negative binomial model\n\n\n\n# Fixed Effects\n\n\n\n\n\n\n\n\n\nParameter\nIRR\nSE\n95% CI\nz\np\n\n\n\n(Intercept)\n1.06\n1.48\n(0.07, 16.52)\n0.04\n0.969\n\n\nage\n1.02\n0.03\n(0.96, 1.08)\n0.53\n0.599\n\n\nsex [F]\n6.86\n6.36\n(1.12, 42.16)\n2.08\n0.038\n\n\nhomeless [homeless]\n6.44\n4.59\n(1.60, 26.01)\n2.62\n0.009\n\n\npolydrug [multiple drugs used]\n8.25e-10\n7.07e-06\n(0.00, Inf)\n-2.44e-03\n0.998\n\n\nhomeless [homeless] × polydrug [multiple drugs used]\n2.36e+08\n2.02e+12\n(0.00, Inf)\n2.25e-03\n0.998\n\n\n\n\n# Zero-Inflation\n\n\n\n\n\n\n\n\n\nParameter\nOdds Ratio\nSE\n95% CI\nz\np\n\n\n\n(Intercept)\n0.10\n0.20\n(1.47e-03, 6.14)\n-1.11\n0.269\n\n\nage\n1.07\n0.04\n(0.99, 1.15)\n1.78\n0.075\n\n\nsex [F]\n2.72\n2.40\n(0.48, 15.33)\n1.13\n0.258\n\n\nhomeless [homeless]\n1.15\n0.86\n(0.27, 4.96)\n0.19\n0.853\n\n\npolydrug [multiple drugs used]\n0.75\n0.86\n(0.08, 7.12)\n-0.25\n0.799\n\n\n\n\n# Dispersion\n\nParameter\nCoefficient\n95% CI\n\n\n(Intercept)\n0.44\n(0.11, 1.71)\n\n\n\n\n\n\n\n\n\n\n\n\n\nDobson, Annette J, and Adrian G Barnett. 2018. An Introduction to Generalized Linear Models. 4th ed. CRC press. https://doi.org/10.1201/9781315182780.\n\n\nVittinghoff, Eric, David V Glidden, Stephen C Shiboski, and Charles E McCulloch. 2012. Regression Methods in Biostatistics: Linear, Logistic, Survival, and Repeated Measures Models. 2nd ed. Springer. https://doi.org/10.1007/978-1-4614-1353-0.",
+ "text": "4.7 Example: needle-sharing\n(adapted from Vittinghoff et al. (2012), §8)\n\nShow R codelibrary(tidyverse)\nlibrary(haven)\nneedles = read_dta(\"inst/extdata/needle_sharing.dta\") |> \n as_tibble() |> \n mutate(polydrug = \n ifelse(polydrug, \"multiple drugs used\", \"one drug used\") |> \n factor() |> \n relevel(ref = \"one drug used\"),\n homeless = ifelse(homeless, \"homeless\", \"not homeless\") |> \n factor() |> relevel(ref = \"not homeless\")) |> \n mutate(sex = factor(sex) |> relevel(ref = \"M\"))\nneedles\n\n\nTable 4.1: Needle-sharing data\n\n\n\n \n\n\n\n\n\n\n\n\nShow R codelibrary(ggplot2)\n\nneedles |> \n ggplot(\n aes(\n x = age,\n y = shared_syr,\n shape = sex,\n col = ethn\n )\n ) + \n geom_point(\n size = 3, \n alpha = .5) +\n facet_grid(\n cols = vars(polydrug), \n rows = vars(homeless)) +\n theme(legend.position = \"bottom\")\n\n\n\nFigure 4.1: Rates of needle sharing\n\n\n\n\n\n\n\nCovariate counts:\n\nShow R codeneedles |> \n dplyr::select(sex, homeless, polydrug) |> \n summary()\n#> sex homeless polydrug \n#> M :97 not homeless:63 one drug used :109 \n#> F :30 homeless :61 multiple drugs used: 19 \n#> Trans: 1 NA's : 4\n\n\nThere’s only one individual with sex = Trans, which unfortunately isn’t enough data to analyze. We will remove that individual:\n\nShow R code\nneedles = needles |> filter(sex != \"Trans\")\n\n\n\n4.7.1 models\n\nShow R codeglm1 = glm(\n data = needles,\n family = stats::poisson,\n shared_syr ~ age + sex + homeless*polydrug\n)\n\nlibrary(parameters)\nglm1 |> parameters(exponentiate = TRUE) |> \n print_md()\n\n\nTable 4.2: Poisson model for needle-sharing data\n\n\n\n\n\n\n\n\n\n\n\n\nParameter\nIRR\nSE\n95% CI\nz\np\n\n\n\n(Intercept)\n4.52\n1.15\n(2.74, 7.45)\n5.90\n< .001\n\n\nage\n0.97\n5.58e-03\n(0.96, 0.98)\n-5.41\n< .001\n\n\nsex (F)\n1.98\n0.23\n(1.58, 2.49)\n5.88\n< .001\n\n\nhomeless (homeless)\n3.58\n0.45\n(2.79, 4.59)\n10.06\n< .001\n\n\npolydrug (multiple drugs used)\n1.45e-07\n5.82e-05\n(0.00, Inf)\n-0.04\n0.969\n\n\nhomeless (homeless) × polydrug (multiple drugs used)\n1.27e+06\n5.12e+08\n(0.00, Inf)\n0.03\n0.972\n\n\n\n\n\n\n\n\n\nShow R codelibrary(ggfortify)\nautoplot(glm1)\n\n\nTable 4.3: Diagnostics for Poisson model\n\n\n\n\n\n\n\n\n\n\n\n–\n\n\nTable 4.4: Negative binomial model for needle-sharing data\n\nShow R codelibrary(MASS) #need this for glm.nb()\nglm1.nb = glm.nb(\n data = needles,\n shared_syr ~ age + sex + homeless*polydrug\n)\nsummary(glm1.nb)\n#> \n#> Call:\n#> glm.nb(formula = shared_syr ~ age + sex + homeless * polydrug, \n#> data = needles, init.theta = 0.08436295825, link = log)\n#> \n#> Coefficients:\n#> Estimate Std. Error z value\n#> (Intercept) 9.91e-01 1.71e+00 0.58\n#> age -2.76e-02 3.82e-02 -0.72\n#> sexF 1.06e+00 8.07e-01 1.32\n#> homelesshomeless 1.65e+00 7.22e-01 2.29\n#> polydrugmultiple drugs used -2.46e+01 3.61e+04 0.00\n#> homelesshomeless:polydrugmultiple drugs used 2.32e+01 3.61e+04 0.00\n#> Pr(>|z|) \n#> (Intercept) 0.563 \n#> age 0.469 \n#> sexF 0.187 \n#> homelesshomeless 0.022 *\n#> polydrugmultiple drugs used 0.999 \n#> homelesshomeless:polydrugmultiple drugs used 0.999 \n#> ---\n#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1\n#> \n#> (Dispersion parameter for Negative Binomial(0.0844) family taken to be 1)\n#> \n#> Null deviance: 69.193 on 119 degrees of freedom\n#> Residual deviance: 57.782 on 114 degrees of freedom\n#> (7 observations deleted due to missingness)\n#> AIC: 315.5\n#> \n#> Number of Fisher Scoring iterations: 1\n#> \n#> \n#> Theta: 0.0844 \n#> Std. Err.: 0.0197 \n#> \n#> 2 x log-likelihood: -301.5060\n\n\n\n\n\nShow R codetibble(name = names(coef(glm1)), poisson = coef(glm1), nb = coef(glm1.nb))\n\n\nTable 4.5: Poisson versus Negative Binomial Regression coefficient estimates\n\n\n\n \n\n\n\n\n\n\nzero-inflation\n\nShow R codelibrary(glmmTMB)\nzinf_fit1 = glmmTMB(\n family = \"poisson\",\n data = needles,\n formula = shared_syr ~ age + sex + homeless*polydrug,\n ziformula = ~ age + sex + homeless + polydrug # fit won't converge with interaction\n)\n\nzinf_fit1 |> \n parameters(exponentiate = TRUE) |> \n print_md()\n\n\nTable 4.6: Zero-inflated poisson model\n\n\n\n# Fixed Effects\n\n\n\n\n\n\n\n\n\nParameter\nIRR\nSE\n95% CI\nz\np\n\n\n\n(Intercept)\n3.16\n0.82\n(1.90, 5.25)\n4.44\n< .001\n\n\nage\n1.01\n5.88e-03\n(1.00, 1.02)\n1.74\n0.081\n\n\nsex [F]\n3.43\n0.44\n(2.67, 4.40)\n9.68\n< .001\n\n\nhomeless [homeless]\n3.44\n0.47\n(2.63, 4.50)\n9.03\n< .001\n\n\npolydrug [multiple drugs used]\n1.85e-09\n1.21e-05\n(0.00, Inf)\n-3.08e-03\n0.998\n\n\nhomeless [homeless] × polydrug [multiple drugs used]\n1.38e+08\n9.04e+11\n(0.00, Inf)\n2.87e-03\n0.998\n\n\n\n\n# Zero-Inflation\n\n\n\n\n\n\n\n\n\nParameter\nOdds Ratio\nSE\n95% CI\nz\np\n\n\n\n(Intercept)\n0.49\n0.54\n(0.06, 4.25)\n-0.65\n0.514\n\n\nage\n1.05\n0.03\n(1.00, 1.10)\n1.95\n0.051\n\n\nsex [F]\n1.44\n0.84\n(0.46, 4.50)\n0.62\n0.533\n\n\nhomeless [homeless]\n0.68\n0.34\n(0.26, 1.80)\n-0.78\n0.436\n\n\npolydrug [multiple drugs used]\n1.15\n0.91\n(0.24, 5.43)\n0.18\n0.858\n\n\n\n\n\n\n\n\nzero-inflated negative binomial model\n\nShow R codelibrary(glmmTMB)\nzinf_fit1 = glmmTMB(\n family = nbinom2,\n data = needles,\n formula = shared_syr ~ age + sex + homeless*polydrug,\n ziformula = ~ age + sex + homeless + polydrug # fit won't converge with interaction\n)\n\nzinf_fit1 |> \n parameters(exponentiate = TRUE) |> \n print_md()\n\n\nTable 4.7: Zero-inflated negative binomial model\n\n\n\n# Fixed Effects\n\n\n\n\n\n\n\n\n\nParameter\nIRR\nSE\n95% CI\nz\np\n\n\n\n(Intercept)\n1.06\n1.48\n(0.07, 16.52)\n0.04\n0.969\n\n\nage\n1.02\n0.03\n(0.96, 1.08)\n0.53\n0.599\n\n\nsex [F]\n6.86\n6.36\n(1.12, 42.16)\n2.08\n0.038\n\n\nhomeless [homeless]\n6.44\n4.59\n(1.60, 26.01)\n2.62\n0.009\n\n\npolydrug [multiple drugs used]\n8.25e-10\n7.07e-06\n(0.00, Inf)\n-2.44e-03\n0.998\n\n\nhomeless [homeless] × polydrug [multiple drugs used]\n2.36e+08\n2.02e+12\n(0.00, Inf)\n2.25e-03\n0.998\n\n\n\n\n# Zero-Inflation\n\n\n\n\n\n\n\n\n\nParameter\nOdds Ratio\nSE\n95% CI\nz\np\n\n\n\n(Intercept)\n0.10\n0.20\n(1.47e-03, 6.14)\n-1.11\n0.269\n\n\nage\n1.07\n0.04\n(0.99, 1.15)\n1.78\n0.075\n\n\nsex [F]\n2.72\n2.40\n(0.48, 15.33)\n1.13\n0.258\n\n\nhomeless [homeless]\n1.15\n0.86\n(0.27, 4.96)\n0.19\n0.853\n\n\npolydrug [multiple drugs used]\n0.75\n0.86\n(0.08, 7.12)\n-0.25\n0.799\n\n\n\n\n# Dispersion\n\nParameter\nCoefficient\n95% CI\n\n\n(Intercept)\n0.44\n(0.11, 1.71)\n\n\n\n\n\n\n\n\n\n\n\n\n\nDobson, Annette J, and Adrian G Barnett. 2018. An Introduction to Generalized Linear Models. 4th ed. CRC press. https://doi.org/10.1201/9781315182780.\n\n\nVittinghoff, Eric, David V Glidden, Stephen C Shiboski, and Charles E McCulloch. 2012. Regression Methods in Biostatistics: Linear, Logistic, Survival, and Repeated Measures Models. 2nd ed. Springer. https://doi.org/10.1007/978-1-4614-1353-0.",
"crumbs": [
"Generalized Linear Models",
"4Models for Count Outcomes"
@@ -533,7 +511,7 @@
"href": "intro-to-survival-analysis.html#overview",
"title": "\n5 Introduction to Survival Analysis\n",
"section": "\n5.1 Overview",
- "text": "5.1 Overview\n\n5.1.1 Time-to-event outcomes\nSurvival analysis is a framework for modeling time-to-event outcomes. It is used in:\n\nclinical trials, where the event is often death or recurrence of disease.\nengineering reliability analysis, where the event is failure of a device or system.\ninsurance, particularly life insurance, where the event is death.\n\n\n\n\n\n\n\nNote\n\n\n\nThe term Survival analysis is a bit misleading. Survival outcomes can sometimes be analyzed using binomial models (logistic regression). Time-to-event models might be a better name.",
+ "text": "5.1 Overview\n\n5.1.1 Time-to-event outcomes\nSurvival analysis is a framework for modeling time-to-event outcomes. It is used in:\n\nclinical trials, where the event is often death or recurrence of disease.\nengineering reliability analysis, where the event is failure of a device or system.\ninsurance, particularly life insurance, where the event is death.\n\n\n\n\n\n\n\nNote\n\n\n\nThe term survival analysis is a bit misleading. Survival outcomes can sometimes be analyzed using binomial models (logistic regression). Time-to-event models or survival time analysis might be a better name.",
"crumbs": [
"Time to Event Models",
"5Introduction to Survival Analysis"
@@ -852,7 +830,7 @@
"href": "references.html",
"title": "References",
"section": "",
- "text": "Anderson, Edgar. 1935. “The Irises of the Gaspe Peninsula.”\nBulletin of American Iris Society 59: 2–5.\n\n\nBache, Stefan Milton, and Hadley Wickham. 2022. Magrittr: A\nForward-Pipe Operator for r. https://CRAN.R-project.org/package=magrittr.\n\n\nBanerjee, Sudipto, and Anindya Roy. 2014. Linear Algebra and Matrix\nAnalysis for Statistics. Vol. 181. Crc Press Boca Raton.\n\n\nCanchola, Alison J, Susan L Stewart, Leslie Bernstein, Dee W West,\nRonald K Ross, Dennis Deapen, Richard Pinder, et al. 2003. “Cox\nRegression Using Different Time-Scales.” Western Users of SAS\nSoftware. https://www.lexjansen.com/wuss/2003/DataAnalysis/i-cox_time_scales.pdf.\n\n\nCasella, George, and Roger Berger. 2002. Statistical Inference.\n2nd ed. Cengage Learning. https://www.cengage.com/c/statistical-inference-2e-casella-berger/9780534243128/.\n\n\nChang, Winston. 2024. R Graphics Cookbook: Practical Recipes for\nVisualizing Data. O’Reilly Media. https://r-graphics.org/.\n\n\nChatterjee, Samprit, and Ali S Hadi. 2015. Regression Analysis by\nExample. John Wiley & Sons. https://www.wiley.com/en-us/Regression+Analysis+by+Example%2C+4th+Edition-p-9780470055458.\n\n\nDalgaard, Peter. 2008. Introductory Statistics with r. New\nYork, NY: Springer New York. https://link.springer.com/book/10.1007/978-0-387-79054-1.\n\n\nDobson, Annette J, and Adrian G Barnett. 2018. An Introduction to\nGeneralized Linear Models. 4th ed. CRC press. https://doi.org/10.1201/9781315182780.\n\n\nDunn, Peter K, Gordon K Smyth, et al. 2018. Generalized Linear\nModels with Examples in r. Vol. 53. Springer. https://link.springer.com/book/10.1007/978-1-4419-0118-7.\n\n\nEfron, Bradley, and David V Hinkley. 1978. “Assessing the Accuracy\nof the Maximum Likelihood Estimator: Observed Versus Expected Fisher\nInformation.” Biometrika 65 (3): 457–83.\n\n\nFaraway, Julian J. 2016. Extending the Linear Model with r:\nGeneralized Linear, Mixed Effects and Nonparametric Regression\nModels. 2nd ed. Chapman; Hall/CRC. 
https://doi.org/10.1201/9781315382722.\n\n\nFay, Colin, Sébastien Rochette, Vincent Guyader, and Cervan Girard.\n2021. Engineering Production-Grade Shiny Apps. Chapman;\nHall/CRC. https://engineering-shiny.org/.\n\n\nFieller, Nick. 2016. Basics of Matrix Algebra for Statistics with\nr. Chapman; Hall/CRC. https://doi.org/10.1201/9781315370200.\n\n\nFox, John. 2015. Applied Regression Analysis and Generalized Linear\nModels. Sage publications.\n\n\nGrambsch, Patricia M, and Terry M Therneau. 1994. “Proportional\nHazards Tests and Diagnostics Based on Weighted Residuals.”\nBiometrika 81 (3): 515–26. https://doi.org/10.1093/biomet/81.3.515.\n\n\nHarrell, Frank E. 2015. Regression Modeling Strategies: With\nApplications to Linear Models, Logistic Regression, and Survival\nAnalysis. 2nd ed. Springer. https://doi.org/10.1007/978-3-319-19425-7.\n\n\nHosmer Jr, David W, Stanley Lemeshow, and Rodney X Sturdivant. 2013.\nApplied Logistic Regression. John Wiley & Sons.\n\n\nJames, Gareth, Daniela Witten, Trevor Hastie, Robert Tibshirani, et al.\n2013. An Introduction to Statistical Learning. Vol. 112.\nSpringer. https://www.statlearning.com/.\n\n\nKhuri, André I. 2003. Advanced Calculus with Applications in\nStatistics. John Wiley & Sons. https://www.wiley.com/en-us/Advanced+Calculus+with+Applications+in+Statistics%2C+2nd+Edition-p-9780471391043.\n\n\nKlein, John P, Melvin L Moeschberger, et al. 2003. Survival\nAnalysis: Techniques for Censored and Truncated Data. Vol. 1230.\nSpringer. https://link.springer.com/book/10.1007/b97377.\n\n\nKleinbaum, David G, and Mitchel Klein. 2010. Logistic\nRegression. 3rd ed. Springer. https://link.springer.com/book/10.1007/978-1-4419-1742-3.\n\n\n———. 2012. Survival Analysis a Self-Learning Text. 3rd ed.\nSpringer. https://link.springer.com/book/10.1007/978-1-4419-6646-9.\n\n\nKleinbaum, David G, Lawrence L Kupper, Azhar Nizam, K Muller, and ES\nRosenberg. 2014. Applied Regression Analysis and Other Multivariable\nMethods. 5th ed. 
Cengage Learning. https://www.cengage.com/c/applied-regression-analysis-and-other-multivariable-methods-5e-kleinbaum/9781285051086/.\n\n\nKleinman, Ken, and Nicholas J Horton. 2009. SAS and r: Data\nManagement, Statistical Analysis, and Graphics. Chapman; Hall/CRC.\nhttps://www.routledge.com/SAS-and-R-Data-Management-Statistical-Analysis-and-Graphics-Second-Edition/Kleinman-Horton/p/book/9781466584495.\n\n\nKuhn, Max, and Julia Silge. 2022. Tidy Modeling with r. \"\nO’Reilly Media, Inc.\". https://www.tmwr.org/.\n\n\nKutner, Michael H, Christopher J Nachtsheim, John Neter, and William Li.\n2005. Applied Linear Statistical Models. McGraw-Hill.\n\n\nLawrance, Rachael, Evgeny Degtyarev, Philip Griffiths, Peter Trask,\nHelen Lau, Denise D’Alessio, Ingolf Griebsch, Gudrun Wallenstein, Kim\nCocks, and Kaspar Rufibach. 2020. “What Is an Estimand, and How\nDoes It Relate to Quantifying the Effect of Treatment on\nPatient-Reported Quality of Life Outcomes in Clinical Trials?”\nJournal of Patient-Reported Outcomes 4 (1): 1–8. https://link.springer.com/article/10.1186/s41687-020-00218-5.\n\n\nMcCullagh, Peter, and J. A. Nelder. 1989. Generalized Linear\nModels. 2nd ed. Routledge. https://www.utstat.toronto.edu/~brunner/oldclass/2201s11/readings/glmbook.pdf.\n\n\nMcLachlan, Geoffrey J, and Thriyambakam Krishnan. 2007. The EM\nAlgorithm and Extensions. 2nd ed. John Wiley & Sons. https://doi.org/10.1002/9780470191613.\n\n\nMoore, Dirk F. 2016. Applied Survival Analysis Using r. Vol.\n473. Springer.\n\n\nNahhas, Ramzi W. n.d. Introduction to Regression Methods for Public\nHealth Using r. CRC Press. https://www.bookdown.org/rwnahhas/RMPH/.\n\n\nPebesma, Edzer, and Roger Bivand. 2023. Spatial Data Science: With\nApplications in R. Boca Raton: Chapman; Hall/CRC. https://doi.org/10.1201/9780429459016.\n\n\nPohl, Moritz, Lukas Baumann, Rouven Behnisch, Marietta Kirchner,\nJohannes Krisam, and Anja Sander. 2021. 
“Estimands—A Basic Element for Clinical\nTrials.” Deutsches Ärzteblatt\nInternational 118 (51-52): 883–88. https://doi.org/10.3238/arztebl.m2021.0373.\n\n\nPolin, Richard A, William W Fox, and Steven H Abman. 2011. Fetal and\nNeonatal Physiology. 4th ed. Elsevier health sciences.\n\n\nRosenman, Ray H, Richard J Brand, C David Jenkins, Meyer Friedman,\nReuben Straus, and Moses Wurm. 1975. “Coronary Heart Disease in\nthe Western Collaborative Group Study: Final Follow-up Experience of 8\n1/2 Years.” JAMA 233 (8): 872–77. https://doi.org/10.1001/jama.1975.03260080034016.\n\n\nSearle, Shayle R, and Andre I Khuri. 2017. Matrix Algebra Useful for\nStatistics. John Wiley & Sons.\n\n\nSeber, George AF, and Alan J Lee. 2012. Linear Regression\nAnalysis. 2nd ed. John Wiley & Sons. https://www.wiley.com/en-us/Linear+Regression+Analysis%2C+2nd+Edition-p-9781118274422.\n\n\nSelvin, Steve. 2001. Epidemiologic Analysis: A Case-Oriented\nApproach. Oxford University Press.\n\n\nVan Buuren, Stef. 2018. Flexible Imputation of Missing Data.\nCRC press. https://stefvanbuuren.name/fimd/.\n\n\nVenables, Bill. 2023. codingMatrices: Alternative Factor Coding\nMatrices for Linear Model Formulae. https://CRAN.R-project.org/package=codingMatrices.\n\n\nVittinghoff, Eric, David V Glidden, Stephen C Shiboski, and Charles E\nMcCulloch. 2012. Regression Methods in Biostatistics: Linear,\nLogistic, Survival, and Repeated Measures Models. 2nd ed. Springer.\nhttps://doi.org/10.1007/978-1-4614-1353-0.\n\n\nWickham, Hadley. 2019. Advanced r. Chapman; Hall/CRC. https://adv-r.hadley.nz/index.html.\n\n\n———. 2021. Mastering Shiny. \" O’Reilly Media, Inc.\". https://mastering-shiny.org/.\n\n\nWickham, Hadley, Mara Averick, Jennifer Bryan, Winston Chang, Lucy\nD’Agostino McGowan, Romain François, Garrett Grolemund, et al. 2019.\n“Welcome to the tidyverse.”\nJournal of Open Source Software 4 (43): 1686. https://doi.org/10.21105/joss.01686.\n\n\nWickham, Hadley, and Jennifer Bryan. 2023. R Packages. 
\"\nO’Reilly Media, Inc.\". https://r-pkgs.org/.\n\n\nWickham, Hadley, Mine Çetinkaya-Rundel, and Garrett Grolemund. 2023.\nR for Data Science. \" O’Reilly Media, Inc.\". https://r4ds.hadley.nz/.",
+ "text": "Anderson, Edgar. 1935. “The Irises of the Gaspe Peninsula.”\nBulletin of American Iris Society 59: 2–5.\n\n\nBache, Stefan Milton, and Hadley Wickham. 2022. Magrittr: A\nForward-Pipe Operator for r. https://CRAN.R-project.org/package=magrittr.\n\n\nBanerjee, Sudipto, and Anindya Roy. 2014. Linear Algebra and Matrix\nAnalysis for Statistics. Vol. 181. Crc Press Boca Raton.\n\n\nCanchola, Alison J, Susan L Stewart, Leslie Bernstein, Dee W West,\nRonald K Ross, Dennis Deapen, Richard Pinder, et al. 2003. “Cox\nRegression Using Different Time-Scales.” Western Users of SAS\nSoftware. https://www.lexjansen.com/wuss/2003/DataAnalysis/i-cox_time_scales.pdf.\n\n\nCasella, George, and Roger Berger. 2002. Statistical Inference.\n2nd ed. Cengage Learning. https://www.cengage.com/c/statistical-inference-2e-casella-berger/9780534243128/.\n\n\nChang, Winston. 2024. R Graphics Cookbook: Practical Recipes for\nVisualizing Data. O’Reilly Media. https://r-graphics.org/.\n\n\nChatterjee, Samprit, and Ali S Hadi. 2015. Regression Analysis by\nExample. John Wiley & Sons. https://www.wiley.com/en-us/Regression+Analysis+by+Example%2C+4th+Edition-p-9780470055458.\n\n\nDalgaard, Peter. 2008. Introductory Statistics with r. New\nYork, NY: Springer New York. https://link.springer.com/book/10.1007/978-0-387-79054-1.\n\n\nDobson, Annette J, and Adrian G Barnett. 2018. An Introduction to\nGeneralized Linear Models. 4th ed. CRC press. https://doi.org/10.1201/9781315182780.\n\n\nDunn, Peter K, Gordon K Smyth, et al. 2018. Generalized Linear\nModels with Examples in r. Vol. 53. Springer. https://link.springer.com/book/10.1007/978-1-4419-0118-7.\n\n\nEfron, Bradley, and David V Hinkley. 1978. “Assessing the Accuracy\nof the Maximum Likelihood Estimator: Observed Versus Expected Fisher\nInformation.” Biometrika 65 (3): 457–83.\n\n\nFaraway, Julian J. 2016. Extending the Linear Model with r:\nGeneralized Linear, Mixed Effects and Nonparametric Regression\nModels. 2nd ed. Chapman; Hall/CRC. 
https://doi.org/10.1201/9781315382722.\n\n\nFay, Colin, Sébastien Rochette, Vincent Guyader, and Cervan Girard.\n2021. Engineering Production-Grade Shiny Apps. Chapman;\nHall/CRC. https://engineering-shiny.org/.\n\n\nFieller, Nick. 2016. Basics of Matrix Algebra for Statistics with\nr. Chapman; Hall/CRC. https://doi.org/10.1201/9781315370200.\n\n\nFox, John. 2015. Applied Regression Analysis and Generalized Linear\nModels. Sage publications.\n\n\nGrambsch, Patricia M, and Terry M Therneau. 1994. “Proportional\nHazards Tests and Diagnostics Based on Weighted Residuals.”\nBiometrika 81 (3): 515–26. https://doi.org/10.1093/biomet/81.3.515.\n\n\nHarrell, Frank E. 2015. Regression Modeling Strategies: With\nApplications to Linear Models, Logistic Regression, and Survival\nAnalysis. 2nd ed. Springer. https://doi.org/10.1007/978-3-319-19425-7.\n\n\nHosmer Jr, David W, Stanley Lemeshow, and Rodney X Sturdivant. 2013.\nApplied Logistic Regression. John Wiley & Sons.\n\n\nJames, Gareth, Daniela Witten, Trevor Hastie, Robert Tibshirani, et al.\n2013. An Introduction to Statistical Learning. Vol. 112.\nSpringer. https://www.statlearning.com/.\n\n\nKhuri, André I. 2003. Advanced Calculus with Applications in\nStatistics. John Wiley & Sons. https://www.wiley.com/en-us/Advanced+Calculus+with+Applications+in+Statistics%2C+2nd+Edition-p-9780471391043.\n\n\nKlein, John P, Melvin L Moeschberger, et al. 2003. Survival\nAnalysis: Techniques for Censored and Truncated Data. Vol. 1230.\nSpringer. https://link.springer.com/book/10.1007/b97377.\n\n\nKleinbaum, David G, and Mitchel Klein. 2010. Logistic\nRegression. 3rd ed. Springer. https://link.springer.com/book/10.1007/978-1-4419-1742-3.\n\n\n———. 2012. Survival Analysis a Self-Learning Text. 3rd ed.\nSpringer. https://link.springer.com/book/10.1007/978-1-4419-6646-9.\n\n\nKleinbaum, David G, Lawrence L Kupper, Azhar Nizam, K Muller, and ES\nRosenberg. 2014. Applied Regression Analysis and Other Multivariable\nMethods. 5th ed. 
Cengage Learning. https://www.cengage.com/c/applied-regression-analysis-and-other-multivariable-methods-5e-kleinbaum/9781285051086/.\n\n\nKleinman, Ken, and Nicholas J Horton. 2009. SAS and r: Data\nManagement, Statistical Analysis, and Graphics. Chapman; Hall/CRC.\nhttps://www.routledge.com/SAS-and-R-Data-Management-Statistical-Analysis-and-Graphics-Second-Edition/Kleinman-Horton/p/book/9781466584495.\n\n\nKuhn, Max, and Julia Silge. 2022. Tidy Modeling with r. \"\nO’Reilly Media, Inc.\". https://www.tmwr.org/.\n\n\nKutner, Michael H, Christopher J Nachtsheim, John Neter, and William Li.\n2005. Applied Linear Statistical Models. McGraw-Hill.\n\n\nLawrance, Rachael, Evgeny Degtyarev, Philip Griffiths, Peter Trask,\nHelen Lau, Denise D’Alessio, Ingolf Griebsch, Gudrun Wallenstein, Kim\nCocks, and Kaspar Rufibach. 2020. “What Is an Estimand, and How\nDoes It Relate to Quantifying the Effect of Treatment on\nPatient-Reported Quality of Life Outcomes in Clinical Trials?”\nJournal of Patient-Reported Outcomes 4 (1): 1–8. https://link.springer.com/article/10.1186/s41687-020-00218-5.\n\n\nMcCullagh, Peter, and J. A. Nelder. 1989. Generalized Linear\nModels. 2nd ed. Routledge. https://www.utstat.toronto.edu/~brunner/oldclass/2201s11/readings/glmbook.pdf.\n\n\nMcLachlan, Geoffrey J, and Thriyambakam Krishnan. 2007. The EM\nAlgorithm and Extensions. 2nd ed. John Wiley & Sons. https://doi.org/10.1002/9780470191613.\n\n\nMoore, Dirk F. 2016. Applied Survival Analysis Using r. Vol.\n473. Springer. https://doi.org/10.1007/978-3-319-31245-3.\n\n\nNahhas, Ramzi W. n.d. Introduction to Regression Methods for Public\nHealth Using r. CRC Press. https://www.bookdown.org/rwnahhas/RMPH/.\n\n\nPebesma, Edzer, and Roger Bivand. 2023. Spatial Data Science: With\nApplications in R. Boca Raton: Chapman; Hall/CRC. https://doi.org/10.1201/9780429459016.\n\n\nPohl, Moritz, Lukas Baumann, Rouven Behnisch, Marietta Kirchner,\nJohannes Krisam, and Anja Sander. 2021. 
“Estimands—A Basic Element for Clinical\nTrials.” Deutsches Ärzteblatt\nInternational 118 (51-52): 883–88. https://doi.org/10.3238/arztebl.m2021.0373.\n\n\nPolin, Richard A, William W Fox, and Steven H Abman. 2011. Fetal and\nNeonatal Physiology. 4th ed. Elsevier health sciences.\n\n\nRosenman, Ray H, Richard J Brand, C David Jenkins, Meyer Friedman,\nReuben Straus, and Moses Wurm. 1975. “Coronary Heart Disease in\nthe Western Collaborative Group Study: Final Follow-up Experience of 8\n1/2 Years.” JAMA 233 (8): 872–77. https://doi.org/10.1001/jama.1975.03260080034016.\n\n\nSearle, Shayle R, and Andre I Khuri. 2017. Matrix Algebra Useful for\nStatistics. John Wiley & Sons.\n\n\nSeber, George AF, and Alan J Lee. 2012. Linear Regression\nAnalysis. 2nd ed. John Wiley & Sons. https://www.wiley.com/en-us/Linear+Regression+Analysis%2C+2nd+Edition-p-9781118274422.\n\n\nSelvin, Steve. 2001. Epidemiologic Analysis: A Case-Oriented\nApproach. Oxford University Press.\n\n\nVan Buuren, Stef. 2018. Flexible Imputation of Missing Data.\nCRC press. https://stefvanbuuren.name/fimd/.\n\n\nVenables, Bill. 2023. codingMatrices: Alternative Factor Coding\nMatrices for Linear Model Formulae. https://CRAN.R-project.org/package=codingMatrices.\n\n\nVittinghoff, Eric, David V Glidden, Stephen C Shiboski, and Charles E\nMcCulloch. 2012. Regression Methods in Biostatistics: Linear,\nLogistic, Survival, and Repeated Measures Models. 2nd ed. Springer.\nhttps://doi.org/10.1007/978-1-4614-1353-0.\n\n\nWickham, Hadley. 2019. Advanced r. Chapman; Hall/CRC. https://adv-r.hadley.nz/index.html.\n\n\n———. 2021. Mastering Shiny. \" O’Reilly Media, Inc.\". https://mastering-shiny.org/.\n\n\nWickham, Hadley, Mara Averick, Jennifer Bryan, Winston Chang, Lucy\nD’Agostino McGowan, Romain François, Garrett Grolemund, et al. 2019.\n“Welcome to the tidyverse.”\nJournal of Open Source Software 4 (43): 1686. https://doi.org/10.21105/joss.01686.\n\n\nWickham, Hadley, and Jennifer Bryan. 2023. R Packages. 
\"\nO’Reilly Media, Inc.\". https://r-pkgs.org/.\n\n\nWickham, Hadley, Mine Çetinkaya-Rundel, and Garrett Grolemund. 2023.\nR for Data Science. \" O’Reilly Media, Inc.\". https://r4ds.hadley.nz/.",
"crumbs": [
"References"
]