Extrapolation before imputation reduces bias when imputing censored covariates

This repository contains R code and simulation data to reproduce results from the manuscript by Lotspeich and Garcia (2022+).

The imputeCensRd package, which implements the conditional mean imputation approaches from the paper, can be found in its own repo here.

Each of the "Script (Run Simulations)" files is coded to run 1 replication of each setting for demonstration. Per the NOTES at the bottom of the scripts, some more time-intensive simulations were run in parallel.

Tables

Table 1. Simulation results for Weibull $X$ dependent on $Z$ from the full cohort analysis (i.e., where all $n$ observations had uncensored $X$) and conditional mean imputation (CMI) approaches.

Table 2. Simulation results for $\hat{\beta}$ under various distributions of $X$ dependent on $Z$ from the full cohort analysis (i.e., where all $n$ observations had uncensored $X$) and conditional mean imputation (CMI) approaches.

Table S1. Simulation results for Weibull $X$ independent of $Z$ from the full cohort analysis (i.e., where all $n$ observations had uncensored $X$) and conditional mean imputation (CMI) approaches.

Figures

Figure S1. Interpolating Breslow's estimator $\widehat{S}_0(t)$ between uncensored values with either of the two interpolation methods offered similar bias and efficiency for $\hat{\beta}$ in extrapolated conditional mean imputation. Between uncensored values, $\widehat{S}_0(\cdot)$ was either carried forward from the last uncensored value or taken as the mean of the uncensored values immediately before and after. The dashed line denotes the true parameter value, $\beta = 0.5$. Extrapolated CMI using either imputation method was successful in all but 44 replications out of 9000; these few replications encountered errors with numerical integration or non-convergence with the Cox model. Data were simulated following Section 3.1.
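
The two interpolation rules described in the Figure S1 caption can be sketched in base R. This is a minimal illustration only, not the code used in the simulations or in imputeCensRd: the vectors `t_uncens`, `S_uncens`, and `t_cens` are made-up values, and `approx()` is used here simply because its `f` argument switches between carrying the left value forward (`f = 0`) and averaging the left and right values (`f = 0.5`).

```r
# Hypothetical survival estimates at three uncensored values (illustrative only)
t_uncens <- c(1, 2, 4)
S_uncens <- c(0.9, 0.6, 0.2)

# Hypothetical censored values that fall between the uncensored ones
t_cens <- c(1.5, 3)

# Rule 1: carry the estimate forward from the last uncensored value
S_carry <- approx(t_uncens, S_uncens, xout = t_cens,
                  method = "constant", f = 0)$y

# Rule 2: take the mean of the estimates immediately before and after
S_mean <- approx(t_uncens, S_uncens, xout = t_cens,
                 method = "constant", f = 0.5)$y

print(data.frame(t = t_cens, carry_forward = S_carry, mean = S_mean))
```

Both rules agree at the uncensored values themselves and differ only in between them.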

Figure S2. Illustration of the four extrapolation methods for a step survival function $\widehat{S}(t)$ in simulated data. The shaded area represents values of $t > \widetilde{X}$ (the largest uncensored value), where extrapolation is needed.

Script (Make Figure)

Figure S3. Extrapolating Breslow's estimator beyond the largest uncensored value $\widetilde{X}$ to the overall largest value $X_{(n)}$ with any of the three extrapolation methods offered similar bias and efficiency for $\hat{\beta}$ in non-extrapolated conditional mean imputation using the trapezoidal rule. The dashed line denotes the true parameter value, $\beta = 0.5$. Data were simulated following Section 3.1.

Figure S4. In our simulations with Weibull $X$, we explored light ($\sim 17\%$), heavy ($\sim 49\%$), and extra heavy ($\sim 78\%$) censoring in $X$ by generating $C$ from an exponential distribution with rates $= 0.23$, $2$, and $10$, respectively.
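
As a rough sketch of how the censoring rate responds to the exponential rate for $C$, the snippet below censors a Weibull $X$ at the three rates named in the caption. The Weibull shape and scale, the sample size, and the helper `censor()` are assumptions made for illustration, and $X$ is generated without any dependence on $Z$, so the printed proportions are not expected to match the percentages above. See `generate_data.R` and the simulation scripts for the actual data-generating settings.

```r
set.seed(1)
n <- 10000

# Assumed (hypothetical) Weibull parameters for X; not taken from this README
x <- rweibull(n, shape = 0.75, scale = 0.25)

# Standard exponential draws, rescaled below so that e / r ~ Exponential(rate = r)
e <- rexp(n, rate = 1)
rates <- c(light = 0.23, heavy = 2, extra_heavy = 10)

# Hypothetical helper: observed value W = min(X, C) and indicator Delta = I(X <= C)
censor <- function(x, cens) {
  data.frame(w = pmin(x, cens), delta = as.integer(x <= cens))
}

dat <- lapply(rates, function(r) censor(x, e / r))

# Proportion of censored observations under each rate
cens_prop <- sapply(dat, function(d) mean(d$delta == 0))
print(round(cens_prop, 3))
```

Larger rates give smaller values of $C$ and therefore more censoring.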

Script (Make Figure)

Figure S5. Comparison of the probability density functions (A) and hazard functions (B) of the different distributions considered for the censored covariate $X$ in Sections 3.2 and 3.3.

Script (Make Figure)

Figure S6. Due to the Weibull distribution's skewness, higher censoring rates led to smaller values of $W_{(n)}$ (the maximum of the observed covariate used as the integral's upper bound by the trapezoidal rule). With smaller values of $W_{(n)}$, the trapezoidal rule was cutting off more of the survival function, leading to worse performance (i.e., higher bias) with non-extrapolated conditional mean imputation. A, B, and C are the empirical densities of $W_{(n)}$ when $X$ was generated from a Weibull, log-normal, and gamma distribution, respectively, under light, heavy, or extra heavy censoring.
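
The cutting-off effect described in the Figure S6 caption can be illustrated with a small sketch. Everything here is an assumption for demonstration: the Weibull parameters, the censored value `w`, the two candidate upper bounds standing in for $W_{(n)}$, and the helper `trapezoid()`. It uses a known survival function rather than an estimated one, so it shows only how truncating the integral shrinks the area under the survival curve, not the full imputation procedure.

```r
# Assumed (hypothetical) Weibull survival function; not the simulation settings
shape <- 0.75
scale <- 0.25
surv <- function(t) pweibull(t, shape = shape, scale = scale, lower.tail = FALSE)

# Hypothetical helper: trapezoidal rule on an evenly spaced grid
trapezoid <- function(f, lower, upper, n_grid = 1000) {
  t <- seq(lower, upper, length.out = n_grid)
  y <- f(t)
  sum(diff(t) * (head(y, -1) + tail(y, -1)) / 2)
}

w <- 0.1  # a censored value to be imputed (illustrative)

# Area under S(t) from w up to a small and a large stand-in for W_(n)
area_small <- trapezoid(surv, w, 0.5)
area_large <- trapezoid(surv, w, 1)

# Reference: area out to infinity via adaptive quadrature
area_full <- integrate(surv, lower = w, upper = Inf)$value

# Conditional means of the form w + area / S(w)
cmi <- w + c(small = area_small, large = area_large, full = area_full) / surv(w)
print(round(cmi, 3))
```

The smaller the upper bound, the more of the tail is dropped and the smaller the resulting conditional mean.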

Script (Make Figure)

