The inverse probability of censoring weights (IPCW) method is a powerful tool for adjusting survival analysis in the presence of treatment switching. In scenarios where patients switch treatments, survival data are artificially censored, leading to biased estimates if not properly addressed. This bias arises because the decision to switch treatments may depend on prognostic factors that also influence survival outcomes.
To mitigate this bias, researchers gather data on prognostic covariates at baseline and at regular intervals until one of the following occurs: a treatment switch, death, drop-out, or the end of the follow-up period. A “switching model” is then developed to estimate weights for each patient at each time point. These weights represent the inverse probability of remaining unswitched over time. Each patient retains a weight greater than 1, which accounts for both their own data and that of switchers with similar prognostic profiles.
Finally, an “outcome” model utilizing IPCW-weighted survival times is employed to estimate the treatment effect, adjusted for switching.
The switching model can be implemented using either pooled logistic regression or proportional hazards regression with time-dependent covariates.
When using pooled logistic regression for the switching model, the following specifications can be made:
Visit-Specific Intercepts: These can be modeled
using a natural cubic spline with specified degrees of freedom, denoted
as \(\nu\) (corresponding to the
ns_df
parameter of the ipcw
function). The
boundary and internal knots can be based on the range and percentiles of
observed treatment switching times, respectively. Here, \(\nu\) equals the number of internal knots
plus one; a value of \(\nu=0\)
indicates a common intercept, while \(\nu=1\) leads to a linear effect of visit.
By default, \(\nu=3\), which implies
two internal knots for the cubic spline. The parameter
relative_time
is set to TRUE
by default, which
indicates that visit time is relative to the secondary baseline defined
by the swtrt_time_lower
parameter. Setting
relative_time = FALSE
makes visit time relative to
randomization.
Stratification Factors: If more more than one
stratification factor exists, they can be included in the logistic model
either as main effects only
(strata_main_effect_only = TRUE
) or as all possible
combinations of the levels of the stratification factors
(strata_main_effect_only = FALSE
).
Handling Nonconvergence: If the pooled logistic
regression model encounters issues such as complete or quasi-complete
separation, Firth’s bias-reducing penalized likelihood method
(firth = TRUE
) can be applied alongside intercept
correction (flic = TRUE
) to yield more accurate predicted
probabilities.
When using a Cox model with time-dependent covariates as the switching model, adjacent observations within a subject may be combined as long as the values of time-dependent covariates do not change over each combined interval. This approach saves data storage but observations across subjects will not be aligned at regular intervals. In this case, unique event times across treatment arms should be replicated within each subject, enabling the availability of weights at each event time.
To address the high variability in “unstabilized” weights, researchers often opt for “stabilized” weights. The switching model for stabilized weights involves a model for the numerator of the weight in addition to the model for the denominator. Typically, the numerator model mirrors the denominator model but includes only prognostic baseline variables. Any prognostic baseline variables included in the numerator must also be part of the weighted outcome model.
You can specify weight truncation using the trunc
and
trunc_upper_only
parameters of the ipcw
function. The trunc
parameter is a number between 0 and 0.5
that represents the fraction of the weight distribution to truncate. The
trunc_upper_only
parameter is a flag that indicates whether
to truncate weights only from the upper end of the weight
distribution.
Once weights have been obtained for each event time in the data set, we fit a (potentially stratified) Cox proportional hazards model to this weighted data set to obtain an estimate of the hazard ratio. The confidence interval for the hazard ratio can be derived by bootstrapping the weight estimation and the subsequent model-fitting process.
First we prepare the data.
sim1 <- tsegestsim(
n = 500, allocation1 = 2, allocation2 = 1, pbprog = 0.5,
trtlghr = -0.5, bprogsl = 0.3, shape1 = 1.8,
scale1 = 0.000025, shape2 = 1.7, scale2 = 0.000015,
pmix = 0.5, admin = 5000, pcatnotrtbprog = 0.5,
pcattrtbprog = 0.25, pcatnotrt = 0.2, pcattrt = 0.1,
catmult = 0.5, tdxo = 1, ppoor = 0.1, pgood = 0.04,
ppoormet = 0.4, pgoodmet = 0.2, xomult = 1.4188308,
milestone = 546, swtrt_control_only = TRUE,
outputRawDataset = 1, seed = 2000)
Given that the observations are collected at regular intervals, we
can use a pooled logistic regression switching model. In this model, the
prognostic baseline variable is represented by bprog
, while
catlag
and the interaction between bprog
and
catlag
serve as time-dependent covariates. Additionally,
s1
, s2
, and s3
are the three
terms for the natural cubic spline with \(\nu=3\) degrees of freedom for
visit-specific intercepts.
fit1 <- ipcw(
sim1$paneldata, id = "id", tstart = "tstart",
tstop = "tstop", event = "died", treat = "trtrand",
swtrt = "xo", swtrt_time = "xotime",
swtrt_time_lower = "timePFSobs",
swtrt_time_upper = "xotime_upper", base_cov = "bprog",
numerator = "bprog", denominator = "bprog*catlag",
logistic_switching_model = TRUE, ns_df = 3,
relative_time = TRUE, swtrt_control_only = TRUE,
boot = FALSE)
The fits for the denominator and numerator switching models are as follows, utilizing robust sandwich variance estimates to account for correlations among observations within the same subject.
# denominator switching model fit
fit1$fit_switch[[1]]$fit_den$parest[, c("param", "beta", "sebeta", "z")]
#> param beta sebeta z
#> 1 (Intercept) -3.634890835 0.3673680 -9.8944142
#> 2 bprog 1.873087075 0.3274510 5.7202059
#> 3 catlag 0.335398224 0.5674642 0.5910474
#> 4 bprog.catlag 0.007983132 0.6767029 0.0117971
#> 5 s1 0.291678448 0.5360600 0.5441153
#> 6 s2 1.029271932 0.7417310 1.3876620
#> 7 s3 1.535968575 0.4393257 3.4961957
# numerator switching model fit
fit1$fit_switch[[1]]$fit_num$parest[, c("param", "beta", "sebeta", "z")]
#> param beta sebeta z
#> 1 (Intercept) -3.6701302 0.3544464 -10.354542
#> 2 bprog 1.9214977 0.2910121 6.602810
#> 3 s1 0.5108622 0.5036219 1.014376
#> 4 s2 1.1650715 0.7262563 1.604215
#> 5 s3 1.7329918 0.3890718 4.454170
Since treatment switching is not allowed in the experimental arm, the weights are identical to 1 for subjects in the experimental group. The unstabilized and stablized weights for the control group are plotted below.
```{r weights example 1}
# unstabilized weights
ggplot(fit1$data_outcome %>% filter(trtrand == 0),
aes(x = unstabilized_weight)) +
geom_density(fill="#77bd89", color="#1f6e34", alpha=0.8) +
scale_x_continuous("unstabilized weights")
# stabilized weights
ggplot(fit1$data_outcome %>% filter(trtrand == 0),
aes(x = stabilized_weight)) +
geom_density(fill="#77bd89", color="#1f6e34", alpha=0.8) +
scale_x_continuous("stabilized weights")
```
Now we fit a weighted outcome Cox model and compare the treatemnt hazard ratio estimate with the reported.
Now we apply the IPCW method using a Cox proportional hazards model with time-dependent covariates as the switching model.
fit2 <- ipcw(
shilong, id = "id", tstart = "tstart", tstop = "tstop",
event = "event", treat = "bras.f", swtrt = "co",
swtrt_time = "dco",
base_cov = c("agerand", "sex.f", "tt_Lnum", "rmh_alea.c",
"pathway.f"),
numerator = c("agerand", "sex.f", "tt_Lnum", "rmh_alea.c",
"pathway.f"),
denominator = c("agerand", "sex.f", "tt_Lnum", "rmh_alea.c",
"pathway.f", "ps", "ttc", "tran"),
swtrt_control_only = FALSE, boot = FALSE)
The fits for the denominator and numerator switching models for the control arm are as follows, utilizing robust sandwich variance estimates to account for correlations among observations within the same subject.
# denominator switching model for the control group
fit2$fit_switch[[1]]$fit_den$parest[, c("param", "beta", "sebeta", "z")]
#> param beta sebeta z
#> 1 agerand 0.007642073 0.01009207 0.7572351
#> 2 sex.fFemale -0.364211003 0.26324814 -1.3835273
#> 3 tt_Lnum 0.042406215 0.04639827 0.9139611
#> 4 rmh_alea.c -0.351616244 0.27378303 -1.2842880
#> 5 pathway.fHR -0.041768569 0.39817164 -0.1049009
#> 6 pathway.fPI3K.AKT.mTOR 0.267055320 0.43180962 0.6184562
#> 7 ps 0.103478553 0.17788020 0.5817317
#> 8 ttc -0.480378314 0.32124349 -1.4953713
#> 9 tran 0.295828936 0.36809457 0.8036765
# numerator switching model for the control group
fit2$fit_switch[[1]]$fit_num$parest[, c("param", "beta", "sebeta", "z")]
#> param beta sebeta z
#> 1 agerand 0.01059477 0.00957653 1.10632644
#> 2 sex.fFemale -0.32452112 0.25680716 -1.26367627
#> 3 tt_Lnum 0.05956492 0.04488927 1.32693011
#> 4 rmh_alea.c -0.29758402 0.25884651 -1.14965436
#> 5 pathway.fHR 0.02467202 0.39869959 0.06188123
#> 6 pathway.fPI3K.AKT.mTOR 0.28245603 0.42341002 0.66709812
The fits for the denominator and numerator switching models for the experimental arm are as follows:
# denominator switching model for the experimental group
fit2$fit_switch[[2]]$fit_den$parest[, c("param", "beta", "sebeta", "z")]
#> param beta sebeta z
#> 1 agerand -0.002639441 0.01482504 -0.1780394
#> 2 sex.fFemale 0.428028469 0.49814996 0.8592362
#> 3 tt_Lnum -0.152758042 0.09460056 -1.6147689
#> 4 rmh_alea.c 0.210365664 0.53818671 0.3908786
#> 5 pathway.fHR 1.774466817 1.07337446 1.6531666
#> 6 pathway.fPI3K.AKT.mTOR 0.859950234 1.06150371 0.8101246
#> 7 ps 0.261142698 0.28658566 0.9112204
#> 8 ttc -0.408175689 0.57303595 -0.7123038
#> 9 tran 1.053347294 0.50568571 2.0830078
# numerator switching model for the experimental group
fit2$fit_switch[[2]]$fit_num$parest[, c("param", "beta", "sebeta", "z")]
#> param beta sebeta z
#> 1 agerand 0.001625622 0.01543679 0.1053083
#> 2 sex.fFemale 0.476979559 0.47723504 0.9994647
#> 3 tt_Lnum -0.160020273 0.09602402 -1.6664609
#> 4 rmh_alea.c 0.154833260 0.49850922 0.3105926
#> 5 pathway.fHR 1.841535014 1.06349774 1.7315834
#> 6 pathway.fPI3K.AKT.mTOR 1.064637540 1.05309041 1.0109650
Below, the unstabilized and stabilized weights are plotted by treatment group: the left panel displays data for the control group, while the right panel shows data for the experimental group.
```{r weights example 2}
# unstabilized weights
ggplot(fit2$data_outcome, aes(x = unstabilized_weight)) +
geom_density(fill="#77bd89", color="#1f6e34", alpha=0.8) +
scale_x_continuous("unstabilized weights") +
facet_wrap(~treated)
# stabilized weights
ggplot(fit2$data_outcome, aes(x = stabilized_weight)) +
geom_density(fill="#77bd89", color="#1f6e34", alpha=0.8) +
scale_x_continuous("stabilized weights") +
facet_wrap(~treated)
```
Finally, we fit the weighted outcome Cox model and compare the treatment hazard ratio estimate with the reported.
fit2$fit_outcome$parest[, c("param", "beta", "sebeta", "z")]
#> param beta sebeta z
#> 1 treated 0.356390611 0.25526832 1.3961412
#> 2 agerand -0.006047034 0.01018596 -0.5936636
#> 3 sex.fFemale -0.487409540 0.24458876 -1.9927716
#> 4 tt_Lnum 0.011244574 0.04147245 0.2711336
#> 5 rmh_alea.c 0.941651485 0.25315061 3.7197283
#> 6 pathway.fHR -0.127273307 0.35878557 -0.3547336
#> 7 pathway.fPI3K.AKT.mTOR -0.166035907 0.34997206 -0.4744262
c(fit2$hr, fit2$hr_CI)
#> [1] 1.4281653 0.8659517 2.3553924