In this article we illustrate how to fit cause-specific hazards models to competing risks data. The standard way to estimate cause-specific hazards is to create one data set for each event type and fit a separate model to each. However, it is also possible to create one combined data set and enter the event type as a covariate (with interactions), which makes it possible to estimate shared effects (i.e., effects that contribute equally to the hazard of multiple event types).
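The two data layouts described above can be sketched in base R. This is only an illustration on a small hypothetical data frame (`toy` and its values are made up and are not part of any package); it shows the recoding idea, not the interval-based transformation used for PAMMs.

```r
# Hypothetical competing risks data (values made up for illustration)
toy <- data.frame(
  id     = 1:5,
  time   = c(5.8, 2.9, 0.9, 3.9, 1.2),
  status = c(0, 1, 1, 2, 2)  # 0 = censored, 1 and 2 = event types
)

# Event types present in the data (everything except the censoring code 0)
causes <- sort(setdiff(unique(toy$status), 0))

# Layout 1: one data set per event type.
# Events of the other type are treated as censored observations.
cause_data <- lapply(causes, function(k) {
  d <- toy
  d$event <- as.integer(d$status == k)
  d$cause <- k
  d
})
names(cause_data) <- paste0("cause", causes)

# Layout 2: one combined (stacked) data set with the event type as a covariate,
# which allows interactions with `cause` as well as shared effects
stacked <- do.call(rbind, cause_data)
stacked$cause <- factor(stacked$cause)
```

With the first layout, a separate model is fit to each element of `cause_data`. With the second, a single model is fit to `stacked`, and effects without a `cause` interaction are shared across event types.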
For illustration we use the fourD data set from the etm package. The data set contains time-constant covariates like sex as well as the time-to-event (time) and the event type indicator status (0 = censored, 1 = death from cardiovascular events, 2 = death from other causes).
    ##      id    sex age medication status      time treated
    ## 1  5002   Male  60    Placebo      0 5.8480493       0
    ## 4  5006 Female  68    Placebo      0 5.2539357       0
    ## 7  5011 Female  70    Placebo      1 2.9541410       0
    ## 9  5014   Male  69    Placebo      1 0.9856263       0
    ## 10 5017 Female  58    Placebo      1 0.2902122       0
    ## 11 5018   Male  63    Placebo      1 3.9452430       0
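The data transformation required to fit PAMMs to competing risks data is similar to the transformation in the single event case (see the data transformation vignette for details). In fact, internally the standard transformation is applied to each event type using as_ped. However, some choices have to be made: whether the interval split points are cause-specific or common to all event types, and whether the transformed data are returned as a list (one data set per event type) or as a single stacked data set.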
For cause-specific hazards without shared effects, the combination of cause-specific interval split points and list output is usually sufficient. For models with shared effects, we need to stack the individual data sets and use split points that are common to all event types.
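A minimal sketch of both variants follows. It assumes that the installed pammtools version provides a `combine` argument in as_ped for competing risks data, and that the etm package is available; the object names `ped_list` and `ped_stacked` are chosen here for illustration.

```r
library(dplyr)
library(survival)
library(pammtools)

data("fourD", package = "etm")

# Variant 1: list output, one transformed data set per event type.
# Assumption: with combine = FALSE and no `cut` argument, split points
# are derived separately for each event type.
ped_list <- fourD %>%
  as_ped(Surv(time, status) ~ sex + age, id = "id", combine = FALSE)

# Variant 2: stacked output with split points common to all event types.
# The event type is stored in the column `cause`, converted to a factor
# so that it can be used in interactions.
ped_stacked <- fourD %>%
  as_ped(Surv(time, status) ~ sex + age, id = "id", combine = TRUE) %>%
  mutate(cause = as.factor(cause))

# Number of distinct interval end points per data set
sapply(ped_list, function(p) length(unique(p$tend)))
length(unique(ped_stacked$tend))
```

The list output is convenient when a separate model is fit to each element. The stacked output is the one needed for a single model with `cause` interactions and shared effects.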
Finally, in many cases we will want to calculate and visualize the cumulative incidence functions (CIFs) for different covariate combinations. In pammtools this can again be achieved using make_newdata and the appropriate add_* function, which here adds the estimated CIF (cif) and its confidence limits (cif_lower, cif_upper) for each cause:
    ## # A tibble: 6 x 5
    ## # Groups: cause
    ##      tend cause      cif cif_lower cif_upper
    ##     <dbl> <fct>    <dbl>     <dbl>     <dbl>
    ## 1 0.00821 1     0.000860  0.000659  0.00109
    ## 2 0.192   1     0.0200    0.0157    0.0250
    ## 3 0.222   1     0.0231    0.0181    0.0289
    ## 4 0.00821 2     0.000414  0.000281  0.000570
    ## 5 0.192   2     0.00973   0.00674   0.0133
    ## 6 0.222   2     0.0113    0.00781   0.0154
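To make the cif column more concrete, the following base R sketch shows how a cumulative incidence function follows from piecewise-constant cause-specific hazards. The hazard values are hypothetical and the code illustrates the general relationship only; it is not necessarily how pammtools computes the CIF or its confidence limits.

```r
# Hypothetical piecewise-constant cause-specific hazards on 4 intervals
cut <- c(0, 1, 2, 3, 4)  # interval borders
len <- diff(cut)         # interval lengths
haz <- cbind(
  cause1 = c(0.10, 0.12, 0.15, 0.15),
  cause2 = c(0.05, 0.05, 0.06, 0.08)
)

# All-cause hazard and overall survival at interval start and end
haz_all  <- rowSums(haz)
surv_end <- exp(-cumsum(haz_all * len))
surv_beg <- c(1, head(surv_end, -1))

# Probability of an event of any type within each interval
p_event <- surv_beg - surv_end

# Each cause receives the share of that probability given by its
# contribution to the all-cause hazard; cumulating over intervals
# yields the CIF at the interval end points
cif <- apply(haz / haz_all, 2, function(share) cumsum(share * p_event))
cif_tab <- data.frame(tend = cut[-1], cif)
cif_tab
```

At every time point, the CIFs of all causes and the overall survival probability add up to one, which is a useful sanity check for any CIF estimate.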
Similar to other applications of add_* functions, we can additionally group by other covariate values:
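A sketch of such a grouped call, under several assumptions: the coarse split point grid `cut`, the model formula, and the object names are chosen here for illustration, and add_cif is assumed to be the relevant add_* function in the installed pammtools version. Because the split points and the model are assumptions, the resulting numbers will not match the output shown earlier.

```r
library(dplyr)
library(survival)
library(pammtools)

data("fourD", package = "etm")

# Assumption: a coarse grid of common split points keeps the example fast
cut <- seq(0, 5.5, by = 0.25)

# Stacked data with the event type in the factor column `cause`
ped_stacked <- fourD %>%
  as_ped(Surv(time, status) ~ sex + age, id = "id", cut = cut) %>%
  mutate(cause = as.factor(cause))

# Cause-specific baseline hazards plus cause-specific covariate effects;
# dropping an interaction term would turn that effect into a shared effect
pam <- pamm(
  ped_status ~ cause + s(tend, by = cause) + sex + sex:cause + age + age:cause,
  data = ped_stacked
)

# CIF for each combination of cause and sex (other covariates are held
# at the default values chosen by make_newdata)
ndf <- ped_stacked %>%
  make_newdata(tend = unique(tend), cause = unique(cause), sex = unique(sex)) %>%
  group_by(cause, sex) %>%
  add_cif(pam)
```

The resulting data frame `ndf` contains one CIF curve, with confidence limits, per combination of cause and sex.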
The estimated CIFs can then be compared with respect to cause for each category of sex:

    ggplot(ndf, aes(x = tend, y = cif)) +
      geom_line(aes(col = cause)) +
      geom_ribbon(
        aes(ymin = cif_lower, ymax = cif_upper, fill = cause),
        alpha = .3) +
      facet_wrap(~sex, labeller = label_both)
or with respect to sex for each cause:

    ggplot(ndf, aes(x = tend, y = cif)) +
      geom_line(aes(col = sex)) +
      geom_ribbon(
        aes(ymin = cif_lower, ymax = cif_upper, fill = sex),
        alpha = .3) +
      facet_wrap(~cause, labeller = label_both)