Add predicted (cumulative) hazard to data set

Adds the predicted (cumulative) hazard to the provided data set, based on the fitted model. If ci = TRUE, confidence intervals (CI) are also added. Their width is controlled via the se_mult argument, and the method used to calculate them is specified by ci_type. This is a wrapper around predict.gam. When reference is specified, the (log-)hazard ratio with respect to reference is calculated.
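As a rough illustration of what a (log-)hazard ratio relative to a reference means, the following base R sketch uses a made-up age coefficient (beta_age is hypothetical and not taken from any fitted model). It assumes a log-linear hazard model, where the log-hazard ratio is the difference between the linear predictors of newdata and reference, so terms shared by both cancel. It is not the package's internal code.

```r
# Hypothetical coefficient of age on the log-hazard scale (assumption).
beta_age <- 0.02

newdata   <- data.frame(age = c(50, 60, 70))
reference <- data.frame(age = 50)  # a single row, compared against every row of newdata

# Log-hazard ratio: difference of the linear predictors (analogous to type = "link").
log_hr <- beta_age * (newdata$age - reference$age)

# Hazard ratio: exponentiated difference (analogous to type = "response").
hr <- exp(log_hr)

data.frame(age = newdata$age, log_hr = log_hr, hr = hr)
```

Rows that equal the reference have a log-hazard ratio of 0 and a hazard ratio of 1.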

add_hazard(newdata, object, ...)

# Default S3 method
add_hazard(
  newdata,
  object,
  reference = NULL,
  type = c("response", "link"),
  ci = TRUE,
  se_mult = 2,
  ci_type = c("default", "delta", "sim"),
  overwrite = FALSE,
  time_var = NULL,
  ...
)

add_cumu_hazard(
  newdata,
  object,
  ci = TRUE,
  se_mult = 2,
  overwrite = FALSE,
  time_var = NULL,
  interval_length = "intlen",
  ...
)

Arguments

newdata: A data frame or list containing the values of the model covariates at which predictions are required. If this is not provided, predictions corresponding to the original data are returned. If newdata is provided, it should contain all the variables needed for prediction; a warning is generated if it does not. See details for use with linear.functional.terms.
object: A fitted gam object as produced by gam().
...: Further arguments passed to predict.gam and get_hazard.
reference: A data frame with number of rows equal to nrow(newdata) or one, or a named list with (partial) covariate specifications. See examples.
type: Either "response" or "link". The former calculates hazard, the latter the log-hazard.
ci: logical. Indicates if confidence intervals should be calculated. Defaults to TRUE.
se_mult: Factor by which standard errors are multiplied for calculating the confidence intervals.
ci_type: The method by which standard errors/confidence intervals are calculated. "default" calculates the CI on the scale of the linear predictor at the respective intervals and then transforms the limits. "delta" calculates CIs based on the standard error obtained by the delta method. "sim" draws the quantity of interest from its posterior, based on the normal distribution of the estimated coefficients. See here for details and empirical evaluation.
overwrite: Should hazard columns be overwritten if already present in the data set? Defaults to FALSE. If TRUE, columns with names c("hazard", "se", "ci_lower", "ci_upper") will be overwritten.
time_var: Name of the variable used for the baseline hazard. If not given, defaults to "tend" for gam fits, else "interval". The latter is assumed to be a factor, the former numeric.
interval_length: The variable in newdata containing the interval lengths. Can be either a bare (unquoted) variable name or a character string. Defaults to "intlen".
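To make the roles of se_mult and ci_type more concrete, here is a minimal base R sketch. It starts from the rounded log-hazard and standard error shown in the first row of the type = "link" example output in the Examples section. The "default" branch is consistent with the values printed there; the "delta" branch uses the textbook delta-method approximation for an exponential transformation and is an assumption about the calculation, not a copy of the package code.

```r
# Rounded values taken from the first row of the type = "link" example output.
log_hazard <- -7.05
se         <- 0.391
se_mult    <- 2

# CI on the log-hazard scale: estimate +/- se_mult * se.
ci_link <- log_hazard + c(-1, 1) * se_mult * se   # -7.832 and -6.268

# "default"-style CI for the hazard: transform the limits of the linear predictor.
ci_default <- exp(ci_link)                        # approx. 0.000397 and 0.00190

# Delta-method-style CI (assumption): se of exp(x) is approximately exp(x) * se(x).
hazard    <- exp(log_hazard)
se_hazard <- hazard * se
ci_delta  <- hazard + c(-1, 1) * se_mult * se_hazard
```

The transformed interval is asymmetric around the hazard and always positive, whereas the delta-method interval is symmetric and can in principle extend below zero when the standard error is large.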

Examples

library(pammtools) # provides the tumor data, as_ped(), ped_info() and add_hazard()
ped <- tumor[1:50, ] %>% as_ped(Surv(days, status) ~ age)
pam <- mgcv::gam(ped_status ~ s(tend) + age, data = ped, family = poisson(), offset = offset)
ped_info(ped) %>% add_hazard(pam, type="link")
#> # A tibble: 22 × 10
#>    tstart  tend intlen intmid interval    age hazard    se ci_lower ci_upper
#>     <dbl> <dbl>  <dbl>  <dbl> <fct>     <dbl>  <dbl> <dbl>    <dbl>    <dbl>
#>  1      0    27     27   13.5 (0,27]     59.7  -7.05 0.391    -7.83    -6.27
#>  2     27    33      6   30   (27,33]    59.7  -7.06 0.386    -7.83    -6.28
#>  3     33    55     22   44   (33,55]    59.7  -7.09 0.367    -7.82    -6.35
#>  4     55    62      7   58.5 (55,62]    59.7  -7.10 0.361    -7.82    -6.37
#>  5     62   139     77  100.  (62,139]   59.7  -7.20 0.310    -7.82    -6.58
#>  6    139   209     70  174   (139,209]  59.7  -7.29 0.280    -7.85    -6.73
#>  7    209   214      5  212.  (209,214]  59.7  -7.30 0.279    -7.86    -6.74
#>  8    214   257     43  236.  (214,257]  59.7  -7.35 0.269    -7.89    -6.82
#>  9    257   304     47  280.  (257,304]  59.7  -7.41 0.265    -7.94    -6.88
#> 10    304   308      4  306   (304,308]  59.7  -7.42 0.265    -7.95    -6.89
#> # ℹ 12 more rows
ped_info(ped) %>% add_hazard(pam, type = "response")
#> # A tibble: 22 × 10
#>    tstart  tend intlen intmid interval    age   hazard    se ci_lower ci_upper
#>     <dbl> <dbl>  <dbl>  <dbl> <fct>     <dbl>    <dbl> <dbl>    <dbl>    <dbl>
#>  1      0    27     27   13.5 (0,27]     59.7 0.000869 0.391 0.000397  0.00190
#>  2     27    33      6   30   (27,33]    59.7 0.000862 0.386 0.000398  0.00187
#>  3     33    55     22   44   (33,55]    59.7 0.000837 0.367 0.000402  0.00174
#>  4     55    62      7   58.5 (55,62]    59.7 0.000829 0.361 0.000402  0.00171
#>  5     62   139     77  100.  (62,139]   59.7 0.000747 0.310 0.000402  0.00139
#>  6    139   209     70  174   (139,209]  59.7 0.000681 0.280 0.000389  0.00119
#>  7    209   214      5  212.  (209,214]  59.7 0.000676 0.279 0.000387  0.00118
#>  8    214   257     43  236.  (214,257]  59.7 0.000640 0.269 0.000373  0.00110
#>  9    257   304     47  280.  (257,304]  59.7 0.000602 0.265 0.000355  0.00102
#> 10    304   308      4  306   (304,308]  59.7 0.000599 0.265 0.000353  0.00102
#> # ℹ 12 more rows
ped_info(ped) %>% add_cumu_hazard(pam)
#> # A tibble: 22 × 9
#>    tstart  tend intlen intmid interval    age cumu_hazard cumu_lower cumu_upper
#>     <dbl> <dbl>  <dbl>  <dbl> <fct>     <dbl>       <dbl>      <dbl>      <dbl>
#>  1      0    27     27   13.5 (0,27]     59.7      0.0235     0.0107     0.0513
#>  2     27    33      6   30   (27,33]    59.7      0.0286     0.0131     0.0625
#>  3     33    55     22   44   (33,55]    59.7      0.0470     0.0220     0.101 
#>  4     55    62      7   58.5 (55,62]    59.7      0.0528     0.0248     0.113 
#>  5     62   139     77  100.  (62,139]   59.7      0.110      0.0557     0.220 
#>  6    139   209     70  174   (139,209]  59.7      0.158      0.0829     0.303 
#>  7    209   214      5  212.  (209,214]  59.7      0.161      0.0849     0.309 
#>  8    214   257     43  236.  (214,257]  59.7      0.189      0.101      0.356 
#>  9    257   304     47  280.  (257,304]  59.7      0.217      0.118      0.404 
#> 10    304   308      4  306   (304,308]  59.7      0.220      0.119      0.408 
#> # ℹ 12 more rows
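
The cumu_hazard column above can be related to the hazard and intlen columns with a short base R check. The sketch assumes that the hazard is constant within each interval, so that the cumulative hazard at the end of an interval is the running sum of hazard times interval length. The numbers are copied (rounded) from the first four rows of the type = "response" output, so the result agrees with the printed cumu_hazard values only up to rounding.

```r
# Rounded values copied from the first four rows of the example output.
hazard <- c(0.000869, 0.000862, 0.000837, 0.000829)
intlen <- c(27, 6, 22, 7)

# Cumulative hazard at the end of each interval: running sum of hazard * length.
cumu_hazard <- cumsum(hazard * intlen)

# Compare with the printed cumu_hazard column (0.0235, 0.0286, 0.0470, 0.0528).
printed <- c(0.0235, 0.0286, 0.0470, 0.0528)
max(abs(cumu_hazard - printed))
```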
