Adds the (cumulative) hazard to the provided data set, based on the supplied model. If ci = TRUE, confidence intervals (CI) are also added; their width is controlled via the se_mult argument, and the method used to calculate them is specified via ci_type. This is a wrapper around predict.gam. When reference is specified, the (log-)hazard ratio is calculated.

add_hazard(newdata, object, ...)

# S3 method for default
add_hazard(
  newdata,
  object,
  reference = NULL,
  type = c("response", "link"),
  ci = TRUE,
  se_mult = 2,
  ci_type = c("default", "delta", "sim"),
  overwrite = FALSE,
  time_var = NULL,
  ...
)

add_cumu_hazard(
  newdata,
  object,
  ci = TRUE,
  se_mult = 2,
  overwrite = FALSE,
  time_var = NULL,
  interval_length = "intlen",
  ...
)

Arguments

newdata

A data frame or list containing the values of the model covariates at which predictions are required. If this is not provided then predictions corresponding to the original data are returned. If newdata is provided then it should contain all the variables needed for prediction: a warning is generated if not. See details for use with linear.functional.terms.

object

A fitted gam object as produced by gam().

...

Further arguments passed to predict.gam and get_hazard.

reference

A data frame with number of rows equal to nrow(newdata) or one, or a named list with (partial) covariate specifications. See examples.
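To illustrate what a hazard ratio relative to a reference means, here is a minimal base R sketch. It does not call add_hazard. The coefficient `beta_age`, the baseline values `f0` and both age values are made-up numbers for illustration only, and the model is assumed to have the same form as in the Examples section (a baseline term in time plus a linear age effect on the log-hazard).

```r
# Hypothetical values, not taken from any fitted model.
f0       <- c(-7.0, -7.1, -7.3)  # assumed log-baseline hazard in three intervals
beta_age <- 0.02                 # assumed linear age effect on the log-hazard
age      <- 60                   # covariate value as it would appear in newdata
age_ref  <- 30                   # covariate value as it would appear in reference

# Log-hazards for the rows of interest and for the reference specification.
log_hazard     <- f0 + beta_age * age
log_hazard_ref <- f0 + beta_age * age_ref

# The log-hazard ratio is the difference of the two log-hazards; terms shared
# by both specifications (here the baseline f0) cancel.
log_hr <- log_hazard - log_hazard_ref
hr     <- exp(log_hr)
```

Under these assumptions the ratio is constant over time, because only age differs between the two specifications. Presumably type = "link" corresponds to `log_hr` and type = "response" to `hr`.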

type

Either "response" or "link". The former calculates hazard, the latter the log-hazard.

ci

logical. Indicates if confidence intervals should be calculated. Defaults to TRUE.

se_mult

Factor by which standard errors are multiplied for calculating the confidence intervals.

ci_type

The method by which standard errors/confidence intervals are calculated. "default" transforms the linear predictor at the respective intervals. "delta" calculates CIs based on the standard error obtained via the delta method. "sim" draws the quantity of interest from its posterior, based on the normal distribution of the estimated coefficients. See here for details and empirical evaluation.
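The difference between the first two options can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: it assumes that "default" builds the interval on the log-hazard scale and exponentiates its bounds, and that "delta" uses the first-order approximation for the standard error of exp(eta). The inputs are the rounded values printed in the first row of the type = "link" output in the Examples section.

```r
# Rounded values read from the first row of the type = "link" example output.
eta     <- -7.05   # log-hazard (linear predictor)
se      <- 0.391   # standard error of the linear predictor
se_mult <- 2       # multiplier, as in the se_mult argument

hazard <- exp(eta)

# Assumed "default" behaviour: interval on the log scale, then exponentiate.
ci_default <- c(lower = exp(eta - se_mult * se),
                upper = exp(eta + se_mult * se))

# Assumed "delta" behaviour: se(exp(eta)) is approximated by exp(eta) * se(eta),
# giving a symmetric interval on the hazard scale.
se_delta <- hazard * se
ci_delta <- c(lower = hazard - se_mult * se_delta,
              upper = hazard + se_mult * se_delta)
```

In this sketch the transformed interval is always positive and asymmetric around the hazard, whereas the delta-method interval is symmetric and its lower bound becomes negative whenever se_mult * se exceeds 1.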

overwrite

Should hazard columns be overwritten if already present in the data set? Defaults to FALSE. If TRUE, columns with names c("hazard", "se", "ci_lower", "ci_upper") will be overwritten.

time_var

Name of the variable used for the baseline hazard. If not given, defaults to "tend" for gam fits, else "interval". The latter is assumed to be a factor, the former numeric.

interval_length

The variable in newdata containing the interval lengths. Can be either a bare (unquoted) variable name or a character string. Defaults to "intlen".

Examples

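library(pammtools)
# Transform the first 50 rows of the tumor data to piece-wise exponential data (PED)
ped <- tumor[1:50, ] %>% as_ped(Surv(days, status) ~ age)
# Fit a Poisson GAM with a smooth baseline hazard and a linear age effect
pam <- mgcv::gam(ped_status ~ s(tend) + age, data = ped, family = poisson(), offset = offset)
# Log-hazard (type = "link") with confidence intervals
ped_info(ped) %>% add_hazard(pam, type = "link")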
#> # A tibble: 22 × 10
#>    tstart  tend intlen intmid interval    age hazard    se ci_lower ci_upper
#>     <dbl> <dbl>  <dbl>  <dbl> <fct>     <dbl>  <dbl> <dbl>    <dbl>    <dbl>
#>  1      0    27     27   13.5 (0,27]     59.7  -7.05 0.391    -7.83    -6.27
#>  2     27    33      6   30   (27,33]    59.7  -7.06 0.386    -7.83    -6.28
#>  3     33    55     22   44   (33,55]    59.7  -7.09 0.367    -7.82    -6.35
#>  4     55    62      7   58.5 (55,62]    59.7  -7.10 0.361    -7.82    -6.37
#>  5     62   139     77  100.  (62,139]   59.7  -7.20 0.310    -7.82    -6.58
#>  6    139   209     70  174   (139,209]  59.7  -7.29 0.280    -7.85    -6.73
#>  7    209   214      5  212.  (209,214]  59.7  -7.30 0.279    -7.86    -6.74
#>  8    214   257     43  236.  (214,257]  59.7  -7.35 0.269    -7.89    -6.82
#>  9    257   304     47  280.  (257,304]  59.7  -7.41 0.265    -7.94    -6.88
#> 10    304   308      4  306   (304,308]  59.7  -7.42 0.265    -7.95    -6.89
#> # ℹ 12 more rows
ped_info(ped) %>% add_hazard(pam, type = "response")
#> # A tibble: 22 × 10
#>    tstart  tend intlen intmid interval    age   hazard    se ci_lower ci_upper
#>     <dbl> <dbl>  <dbl>  <dbl> <fct>     <dbl>    <dbl> <dbl>    <dbl>    <dbl>
#>  1      0    27     27   13.5 (0,27]     59.7 0.000869 0.391 0.000397  0.00190
#>  2     27    33      6   30   (27,33]    59.7 0.000862 0.386 0.000398  0.00187
#>  3     33    55     22   44   (33,55]    59.7 0.000837 0.367 0.000402  0.00174
#>  4     55    62      7   58.5 (55,62]    59.7 0.000829 0.361 0.000402  0.00171
#>  5     62   139     77  100.  (62,139]   59.7 0.000747 0.310 0.000402  0.00139
#>  6    139   209     70  174   (139,209]  59.7 0.000681 0.280 0.000389  0.00119
#>  7    209   214      5  212.  (209,214]  59.7 0.000676 0.279 0.000387  0.00118
#>  8    214   257     43  236.  (214,257]  59.7 0.000640 0.269 0.000373  0.00110
#>  9    257   304     47  280.  (257,304]  59.7 0.000602 0.265 0.000355  0.00102
#> 10    304   308      4  306   (304,308]  59.7 0.000599 0.265 0.000353  0.00102
#> # ℹ 12 more rows
ped_info(ped) %>% add_cumu_hazard(pam)
#> # A tibble: 22 × 9
#>    tstart  tend intlen intmid interval    age cumu_hazard cumu_lower cumu_upper
#>     <dbl> <dbl>  <dbl>  <dbl> <fct>     <dbl>       <dbl>      <dbl>      <dbl>
#>  1      0    27     27   13.5 (0,27]     59.7      0.0235     0.0107     0.0513
#>  2     27    33      6   30   (27,33]    59.7      0.0286     0.0131     0.0625
#>  3     33    55     22   44   (33,55]    59.7      0.0470     0.0220     0.101 
#>  4     55    62      7   58.5 (55,62]    59.7      0.0528     0.0248     0.113 
#>  5     62   139     77  100.  (62,139]   59.7      0.110      0.0557     0.220 
#>  6    139   209     70  174   (139,209]  59.7      0.158      0.0829     0.303 
#>  7    209   214      5  212.  (209,214]  59.7      0.161      0.0849     0.309 
#>  8    214   257     43  236.  (214,257]  59.7      0.189      0.101      0.356 
#>  9    257   304     47  280.  (257,304]  59.7      0.217      0.118      0.404 
#> 10    304   308      4  306   (304,308]  59.7      0.220      0.119      0.408 
#> # ℹ 12 more rows
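
The cumu_hazard column can be related to the hazard column with a short base R check. The sketch below assumes a piecewise constant hazard, so that the cumulative hazard at the end of an interval is the running sum of hazard times interval length. The inputs are the rounded values printed in the first four rows of the type = "response" output above, so the result agrees with the printed cumu_hazard values only up to rounding.

```r
# Rounded hazards and interval lengths from the first four rows of the
# type = "response" example output.
hazard <- c(0.000869, 0.000862, 0.000837, 0.000829)
intlen <- c(27, 6, 22, 7)

# Piecewise constant hazard: accumulate hazard * interval length over intervals.
cumu_hazard <- cumsum(hazard * intlen)
cumu_hazard
```

This covers the point estimate only; how the cumu_lower and cumu_upper bounds are obtained is not shown here.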