Add cumulative incidence function to data

add_cif(newdata, object, ...)

# Default S3 method
add_cif(
  newdata,
  object,
  ci = TRUE,
  overwrite = FALSE,
  alpha = 0.05,
  nsim = 500L,
  cause_var = "cause",
  time_var = NULL,
  interval_length = "intlen",
  ...
)

# S3 method for class 'pamm_ic'
add_cif(
  newdata,
  object,
  ci = TRUE,
  alpha = 0.05,
  nsim = 500L,
  cause_var = "cause",
  time_var = NULL,
  interval_length = "intlen",
  ...
)

Arguments

newdata

A data frame or list containing the values of the model covariates at which predictions are required. If this is not provided then predictions corresponding to the original data are returned. If newdata is provided then it should contain all the variables needed for prediction: a warning is generated if not. See details for use with linear.functional.terms.

object

A fitted gam object as produced by gam().

...

Further arguments passed to predict.gam and get_hazard.

ci

logical. Indicates if confidence intervals should be calculated. Defaults to TRUE.

overwrite

Should hazard columns be overwritten if already present in the data set? Defaults to FALSE. If TRUE, columns with names c("hazard", "se", "lower", "upper") will be overwritten.

alpha

Significance level for pooled confidence intervals.

nsim

Total number of pooled posterior draws used for the interval.

cause_var

Character. Column name of the 'cause' variable.

time_var

Name of the variable used for the baseline hazard. If NULL (the default in the usage above), "tend" is used.

interval_length

Character. Name of the column in newdata that contains the interval length. Defaults to "intlen".

Details

When computing cumulative incidence for multiple groups, the input data must be grouped via group_by() before calling this function. Omitting group_by() produces no error or warning, but the results are silently incorrect: the cumulative incidence is accumulated over the entire data set rather than within each group.
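The effect of missing grouping can be illustrated with a plain running sum. The following is a minimal sketch using made-up numbers, not model output; the column names increment and running are hypothetical, and cumsum() stands in for the accumulation step only. It does not reproduce how add_cif() computes the cumulative incidence internally.

```r
library(dplyr)

# Toy per-interval increments for two causes (made-up numbers, not model output)
toy <- tibble(
  cause     = factor(rep(c("1", "2"), each = 3)),
  tend      = rep(c(1, 2, 3), times = 2),
  increment = c(0.10, 0.05, 0.05, 0.20, 0.10, 0.10)
)

# Ungrouped: the running total carries over from cause 1 into cause 2
ungrouped <- toy |>
  mutate(running = cumsum(increment))

# Grouped: the running total restarts within each cause
grouped <- toy |>
  group_by(cause) |>
  mutate(running = cumsum(increment)) |>
  ungroup()
```

In the ungrouped version the rows for cause 2 start from the final total of cause 1, so every value for cause 2 is too large. No error is raised in either case, which is why the grouping has to be checked by the caller.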

The returned data contains one boundary row per group at time_var = 0 for plotting cumulative incidence from the time origin. On this row, cif = 0; if confidence intervals are requested, cif_lower = cif_upper = 0. If an interval-length column is present, it is set to 0 on the boundary row. add_cumu_hazard() adds an analogous boundary row (with cumu_hazard = 0) for continuous-time models (GAM/SCAM/PAMM), controllable via its boundary argument; interval-factor models (e.g. PEM via glm) keep the original prediction grid without a boundary row.
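If the boundary row is not wanted, for example when joining the result back onto the original prediction grid, it can be filtered out afterwards. The sketch below uses a hypothetical stand-in for the returned data (the object res and its values are made up) and assumes that time_var is "tend" and that the original grid contains no row at time 0.

```r
library(dplyr)

# Hypothetical stand-in for the data returned by add_cif() (made-up values)
res <- tibble(
  cause  = factor(rep(c("1", "2"), each = 3)),
  tend   = rep(c(0, 0.5, 1), times = 2),
  intlen = rep(c(0, 0.5, 0.5), times = 2),
  cif    = c(0, 0.02, 0.05, 0, 0.01, 0.03)
)

# Drop the boundary row of each group, keeping only rows with tend > 0
no_boundary <- res |>
  filter(tend > 0)
```

Filtering on the time variable rather than on cif == 0 avoids dropping genuine grid rows where the estimated cumulative incidence happens to be zero.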

Examples

# \donttest{
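if (require("etm")) {
  # fourD data from the etm package
  data("fourD", package = "etm")
  # Transform to piece-wise exponential data (PED) format and code cause as a factor
  ped_stacked <- fourD |>
    dplyr::select(-medication, -treated) |>
    as_ped(Surv(time, status) ~ ., id = "id") |>
    dplyr::mutate(cause = as.factor(cause))
  # Smooth baseline term per cause plus cause-specific covariate effects
  pam <- pamm(
    ped_status ~ s(tend, by = cause) + sex + sex:cause + age + age:cause,
    data = ped_stacked)
  # Group by cause so that the CIF is accumulated within each cause
  ped_stacked |>
    make_newdata(tend = unique(tend), cause = unique(cause)) |>
    dplyr::group_by(cause) |>
    add_cif(pam)
}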
#> Loading required package: etm
#> # A tibble: 658 × 9
#> # Groups:   cause [2]
#>       tend    id sex     age cause  intlen      cif cif_lower cif_upper
#>      <dbl> <dbl> <chr> <dbl> <fct>   <dbl>    <dbl>     <dbl>     <dbl>
#>  1 0       5889. Male   65.2 1     0       0         0          0      
#>  2 0.00821 5889. Male   65.2 1     0.00821 0.000877  0.000660   0.00117
#>  3 0.0301  5889. Male   65.2 1     0.0219  0.00322   0.00243    0.00429
#>  4 0.0520  5889. Male   65.2 1     0.0219  0.00555   0.00420    0.00740
#>  5 0.0602  5889. Male   65.2 1     0.00821 0.00643   0.00486    0.00856
#>  6 0.0684  5889. Male   65.2 1     0.00821 0.00731   0.00552    0.00972
#>  7 0.0767  5889. Male   65.2 1     0.00821 0.00818   0.00619    0.0109 
#>  8 0.110   5889. Male   65.2 1     0.0329  0.0117    0.00886    0.0155 
#>  9 0.156   5889. Male   65.2 1     0.0465  0.0167    0.0127     0.0220 
#> 10 0.164   5889. Male   65.2 1     0.00821 0.0175    0.0133     0.0232 
#> # ℹ 648 more rows
# }