Simulation code for the manuscript "It’s integral: Replacing the trapezoidal rule to remove bias and correctly impute censored covariates with their conditional means"

sarahlotspeich/ExtrapolationBeforeImputation

Extrapolation before imputation reduces bias when imputing censored covariates

This repository contains R code and simulation data to reproduce results from the manuscript by Lotspeich and Garcia (2022+).

The imputeCensRd package, which implements the conditional mean imputation approaches from the paper, can be found in its own repo here.

Each of the "Script (Run Simulations)" files is coded to run 1 replication of each setting for demonstration. Per the NOTES at the bottom of the scripts, some more time-intensive simulations were run in parallel.

Tables

Table 1. Simulation results for Weibull $X$ dependent on $Z$ from the full cohort analysis (i.e., where all $n$ observations had uncensored $X$) and conditional mean imputation (CMI) approaches.

Table 2. Simulation results for $\hat{\beta}$ under various distributions of $X$ dependent on $Z$ from the full cohort analysis (i.e., where all $n$ observations had uncensored $X$) and conditional mean imputation (CMI) approaches.

Table S1. Simulation results for Weibull $X$ independent of $Z$ from the full cohort analysis (i.e., where all $n$ observations had uncensored $X$) and conditional mean imputation (CMI) approaches.

Figures

Figure S1. Interpolating Breslow's estimator $\widehat{S}_0(t)$ between uncensored values with either of the two interpolation methods offered similar bias and efficiency for $\hat{\beta}$ in extrapolated conditional mean imputation. Between uncensored values, $\widehat{S}_0(\cdot)$ was either carried forward from the last uncensored value or taken as the mean of the uncensored values immediately before and after. The dashed line denotes the true parameter value, $\beta = 0.5$. Extrapolated CMI using either interpolation method was successful in all but 44 replications out of 9000; these few replications encountered errors with numerical integration or non-convergence with the Cox model. Data were simulated following Section 3.1.
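The two interpolation rules described in Figure S1 can be sketched as follows. This is a minimal illustration in Python, not the repository's R code or the imputeCensRd implementation; the function name `interpolate_step`, the toy `times` and `surv` values, and the conventions outside the range of uncensored values (survival of 1 before the first value, no extrapolation after the last) are all assumptions made for the example.

```python
import bisect


def interpolate_step(t, times, surv, method="carry_forward"):
    """Evaluate a step survival estimate at t.

    times: sorted uncensored values; surv: the estimate at each of them.
    Hypothetical helper for illustration only.
    """
    if t < times[0]:
        return 1.0  # assumption: survival is 1 before the first uncensored value
    if t >= times[-1]:
        return surv[-1]  # assumption: no extrapolation past the last value
    i = bisect.bisect_right(times, t) - 1  # last uncensored value <= t
    if times[i] == t or method == "carry_forward":
        return surv[i]  # carry the last uncensored estimate forward
    if method == "mean":
        # mean of the estimates immediately before and after t
        return 0.5 * (surv[i] + surv[i + 1])
    raise ValueError("method must be 'carry_forward' or 'mean'")


# Toy values, not taken from the simulations
times = [1.0, 2.0, 4.0]
surv = [0.8, 0.5, 0.2]
carried = interpolate_step(3.0, times, surv, "carry_forward")
averaged = interpolate_step(3.0, times, surv, "mean")
```

At `t = 3.0`, which falls between the uncensored values 2 and 4, the carry-forward rule returns the estimate at 2, while the mean rule returns the average of the estimates at 2 and 4.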

Figure S2. Illustration of the four extrapolation methods for a step survival function $\widehat{S}(t)$ in simulated data. The shaded area represents values of $t > \widetilde{X}$ (the largest uncensored value), where extrapolation is needed.

Figure S3. Extrapolating Breslow's estimator beyond the largest uncensored value $\widetilde{X}$ to the overall largest value $X_{(n)}$ with any of the three extrapolation methods offered similar bias and efficiency for $\hat{\beta}$ in non-extrapolated conditional mean imputation using the trapezoidal rule. The dashed line denotes the true parameter value, $\beta = 0.5$. Data were simulated following Section 3.1.

Figure S4. In our simulations with Weibull $X$, we explored light ($\sim 17\%$), heavy ($\sim 49\%$), and extra heavy ($\sim 78\%$) censoring in $X$ by generating $C$ from an exponential distribution with rates $= 0.23$, $2$, and $10$, respectively.

Figure S5. Comparison of the probability density functions (A) and hazard functions (B) of the different distributions considered for the censored covariate $X$ in Sections 3.2 and 3.3.

Figure S6. Due to the Weibull distribution's skewness, higher censoring rates led to smaller values of $W_{(n)}$ (the maximum of the observed covariate used as the integral's upper bound by the trapezoidal rule). With smaller values of $W_{(n)}$, the trapezoidal rule was cutting off more of the survival function, leading to worse performance (i.e., higher bias) with non-extrapolated conditional mean imputation. A, B, and C are the empirical densities of $W_{(n)}$ when $X$ was generated from a Weibull, log-normal, and gamma distribution, respectively, under light, heavy, or extra heavy censoring.
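The cut-off effect described in Figure S6 can be illustrated numerically. The sketch below is in Python rather than the repository's R, and it simplifies the setting: it uses a known Weibull survival function on a fine grid instead of an estimated step function, and it compares a small upper bound with a very large one instead of applying the extrapolation methods from the manuscript. The parameter values and the function names `weibull_surv` and `cond_mean_trapezoid` are hypothetical. It relies on the standard identity $E(X \mid X > c) = c + \int_c^{\infty} S(x)\,dx / S(c)$ for a nonnegative $X$.

```python
import math


def weibull_surv(x, shape, scale):
    """Weibull survival function S(x) = exp(-(x / scale)^shape)."""
    return math.exp(-((x / scale) ** shape))


def cond_mean_trapezoid(c, upper, shape, scale, n=20000):
    """Approximate E(X | X > c) by c + (integral of S from c to upper) / S(c),
    using the trapezoidal rule on n equal subintervals."""
    h = (upper - c) / n
    total = 0.5 * (weibull_surv(c, shape, scale) + weibull_surv(upper, shape, scale))
    for j in range(1, n):
        total += weibull_surv(c + j * h, shape, scale)
    return c + h * total / weibull_surv(c, shape, scale)


# Hypothetical values; shape = 1 gives the closed form E(X | X > c) = c + scale
shape, scale, c = 1.0, 1.0, 0.5
truth = c + scale

# Small upper bound, standing in for a small W_(n): the tail is cut off
truncated = cond_mean_trapezoid(c, upper=1.5, shape=shape, scale=scale)

# Very large upper bound, standing in for integrating over the full tail
extended = cond_mean_trapezoid(c, upper=40.0, shape=shape, scale=scale)
```

Here `truncated` falls well below `truth` because the area under the survival function beyond the upper bound is discarded, while `extended` recovers `truth` to within numerical error. A smaller upper bound discards more of the tail, which matches the pattern the caption attributes to smaller values of $W_{(n)}$.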
