Add (cumulative) hazard based on the provided data set and model. If ci = TRUE, confidence intervals (CI) are also added. Their width can be controlled via the se_mult argument. The method by which the CI are calculated can be specified by ci_type. This is a wrapper around predict.gam. When reference is specified, the (log-)hazard ratio is calculated.
add_hazard(newdata, object, ...)

# S3 method for default
add_hazard(
  newdata,
  object,
  reference = NULL,
  type = c("response", "link"),
  ci = TRUE,
  se_mult = 2,
  ci_type = c("default", "delta", "sim"),
  overwrite = FALSE,
  time_var = NULL,
  ...
)

add_cumu_hazard(
  newdata,
  object,
  ci = TRUE,
  se_mult = 2,
  overwrite = FALSE,
  time_var = NULL,
  interval_length = "intlen",
  ...
)
newdata
A data frame or list containing the values of the model covariates at which predictions are required. If this is not provided then predictions corresponding to the original data are returned. If newdata is provided then it should contain all the variables needed for prediction: a warning is generated if not. See details for use with linear.functional.terms.
object
A fitted gam object as produced by gam().
...
Further arguments passed to predict.gam and get_hazard.
reference
A data frame with number of rows equal to nrow(newdata) or one, or a named list with (partial) covariate specifications. See examples.
type
Either "response" or "link". The former calculates the hazard, the latter the log-hazard.
ci
Logical. Indicates whether confidence intervals should be calculated. Defaults to TRUE.
se_mult
Factor by which standard errors are multiplied for calculating the confidence intervals.
ci_type
The method by which standard errors/confidence intervals will be calculated. The default transforms the linear predictor at the respective intervals. "delta" calculates CIs based on the standard error calculated by the Delta method. "sim" draws the property of interest from its posterior based on the normal distribution of the estimated coefficients. See here for details and empirical evaluation.
overwrite
Should hazard columns be overwritten if already present in the data set? Defaults to FALSE. If TRUE, columns with names c("hazard", "se", "ci_lower", "ci_upper") will be overwritten.
time_var
Name of the variable used for the baseline hazard. If not given, defaults to "tend" for gam fits, else "interval". The latter is assumed to be a factor, the former numeric.
interval_length
The variable in newdata containing the interval lengths. Can be either a bare unquoted variable name or a character string. Defaults to "intlen".
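
The "sim" option of ci_type is described as drawing the property of interest from its posterior, based on the normal distribution of the estimated coefficients. The following is a minimal sketch of that general idea using only mgcv and simulated data. The toy data set, the number of draws (n_sim), and the 2.5%/97.5% quantiles are assumptions made for illustration; this is not the pammtools implementation.

```r
library(mgcv)
set.seed(1)

# Hypothetical toy data standing in for a PED-format data set (assumption)
n   <- 300
dat <- data.frame(tend = runif(n, 0, 10), age = rnorm(n, 60, 10))
dat$y <- rpois(n, exp(-1 + 0.05 * dat$tend + 0.02 * (dat$age - 60)))
fit <- gam(y ~ s(tend) + age, data = dat, family = poisson())

# Rows at which the hazard is wanted
nd <- data.frame(tend = c(2, 5, 8), age = 60)
# Design matrix that maps coefficients to the linear predictor (log-hazard)
X <- predict(fit, newdata = nd, type = "lpmatrix")

# Draw coefficient vectors from their approximate normal posterior
n_sim      <- 500                                 # assumed number of draws
beta_draws <- rmvn(n_sim, coef(fit), vcov(fit))   # n_sim x p matrix

# Hazard for every draw, then pointwise quantiles across draws
hazard_draws <- exp(X %*% t(beta_draws))          # nrow(nd) x n_sim matrix
ci_sim <- t(apply(hazard_draws, 1, quantile, probs = c(0.025, 0.975)))
colnames(ci_sim) <- c("ci_lower", "ci_upper")
ci_sim
```

Because the interval comes from quantiles of simulated hazards, it does not have to be symmetric around the point estimate, and the same recipe applies to any function of the coefficients.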
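library(pammtools)
# Transform the first 50 rows of the tumor data to piece-wise exponential data (PED) format
ped <- tumor[1:50, ] %>% as_ped(Surv(days, status) ~ age)
# Fit a Poisson GAM with a smooth baseline hazard over the interval end points
pam <- mgcv::gam(ped_status ~ s(tend) + age, data = ped, family = poisson(), offset = offset)
# Log-hazard (type = "link") with confidence intervals
ped_info(ped) %>% add_hazard(pam, type = "link")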
#> # A tibble: 22 × 10
#> tstart tend intlen intmid interval age hazard se ci_lower ci_upper
#> <dbl> <dbl> <dbl> <dbl> <fct> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 0 27 27 13.5 (0,27] 59.7 -7.05 0.391 -7.83 -6.27
#> 2 27 33 6 30 (27,33] 59.7 -7.06 0.386 -7.83 -6.28
#> 3 33 55 22 44 (33,55] 59.7 -7.09 0.367 -7.82 -6.35
#> 4 55 62 7 58.5 (55,62] 59.7 -7.10 0.361 -7.82 -6.37
#> 5 62 139 77 100. (62,139] 59.7 -7.20 0.310 -7.82 -6.58
#> 6 139 209 70 174 (139,209] 59.7 -7.29 0.280 -7.85 -6.73
#> 7 209 214 5 212. (209,214] 59.7 -7.30 0.279 -7.86 -6.74
#> 8 214 257 43 236. (214,257] 59.7 -7.35 0.269 -7.89 -6.82
#> 9 257 304 47 280. (257,304] 59.7 -7.41 0.265 -7.94 -6.88
#> 10 304 308 4 306 (304,308] 59.7 -7.42 0.265 -7.95 -6.89
#> # ℹ 12 more rows
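
When reference is specified, the description says that the (log-)hazard ratio is calculated. On the link scale this amounts to the difference between two log-hazards such as those printed above. The following is a minimal sketch of that idea using only mgcv and simulated data. The toy data set, the variable names, and the use of a design-matrix difference for the standard error are assumptions made for illustration, not a description of pammtools internals.

```r
library(mgcv)
set.seed(42)

# Hypothetical toy data standing in for a PED-format data set (assumption)
n   <- 300
dat <- data.frame(tend = runif(n, 0, 10), age = rnorm(n, 60, 10))
dat$y <- rpois(n, exp(-1 + 0.05 * dat$tend + 0.02 * (dat$age - 60)))
fit <- gam(y ~ s(tend) + age, data = dat, family = poisson())

nd  <- data.frame(tend = c(2, 5, 8), age = 70)  # rows of interest
ref <- data.frame(tend = c(2, 5, 8), age = 60)  # reference: same times, age = 60

# Design matrices; their difference isolates the covariates that changed
X      <- predict(fit, newdata = nd,  type = "lpmatrix")
X_ref  <- predict(fit, newdata = ref, type = "lpmatrix")
X_diff <- X - X_ref

# Log-hazard ratio and its standard error on the link scale
log_hr    <- drop(X_diff %*% coef(fit))
se_log_hr <- sqrt(rowSums((X_diff %*% vcov(fit)) * X_diff))

# Hazard ratio on the response scale
hr <- exp(log_hr)
data.frame(tend = nd$tend, log_hr = log_hr, se = se_log_hr, hr = hr)
```

Since both data frames share the same tend values, the smooth baseline cancels and the log-hazard ratio reduces to ten times the age coefficient in every row.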
ped_info(ped) %>% add_hazard(pam, type = "response")
#> # A tibble: 22 × 10
#> tstart tend intlen intmid interval age hazard se ci_lower ci_upper
#> <dbl> <dbl> <dbl> <dbl> <fct> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 0 27 27 13.5 (0,27] 59.7 0.000869 0.391 0.000397 0.00190
#> 2 27 33 6 30 (27,33] 59.7 0.000862 0.386 0.000398 0.00187
#> 3 33 55 22 44 (33,55] 59.7 0.000837 0.367 0.000402 0.00174
#> 4 55 62 7 58.5 (55,62] 59.7 0.000829 0.361 0.000402 0.00171
#> 5 62 139 77 100. (62,139] 59.7 0.000747 0.310 0.000402 0.00139
#> 6 139 209 70 174 (139,209] 59.7 0.000681 0.280 0.000389 0.00119
#> 7 209 214 5 212. (209,214] 59.7 0.000676 0.279 0.000387 0.00118
#> 8 214 257 43 236. (214,257] 59.7 0.000640 0.269 0.000373 0.00110
#> 9 257 304 47 280. (257,304] 59.7 0.000602 0.265 0.000355 0.00102
#> 10 304 308 4 306 (304,308] 59.7 0.000599 0.265 0.000353 0.00102
#> # ℹ 12 more rows
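
The link-scale and response-scale outputs above appear to be related in the way the default ci_type describes: the interval is built on the log-hazard scale and then exponentiated. The following small check uses the rounded values printed in row 1, so agreement can only be expected up to rounding. It is plain arithmetic, not a call into pammtools.

```r
# Row 1 of the type = "link" output above (rounded as printed)
log_hazard <- -7.05
se         <- 0.391
se_mult    <- 2       # default multiplier

# Interval on the log-hazard (link) scale
ci_link <- log_hazard + c(-1, 1) * se_mult * se
ci_link

# Exponentiating gives the interval on the hazard (response) scale
ci_response <- exp(ci_link)
ci_response
```

The results agree, up to rounding, with the ci_lower and ci_upper values printed in row 1 of both outputs. Note that the se column is identical in the two outputs, which suggests it is reported on the link scale in both cases.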
ped_info(ped) %>% add_cumu_hazard(pam)
#> # A tibble: 22 × 9
#> tstart tend intlen intmid interval age cumu_hazard cumu_lower cumu_upper
#> <dbl> <dbl> <dbl> <dbl> <fct> <dbl> <dbl> <dbl> <dbl>
#> 1 0 27 27 13.5 (0,27] 59.7 0.0235 0.0107 0.0513
#> 2 27 33 6 30 (27,33] 59.7 0.0286 0.0131 0.0625
#> 3 33 55 22 44 (33,55] 59.7 0.0470 0.0220 0.101
#> 4 55 62 7 58.5 (55,62] 59.7 0.0528 0.0248 0.113
#> 5 62 139 77 100. (62,139] 59.7 0.110 0.0557 0.220
#> 6 139 209 70 174 (139,209] 59.7 0.158 0.0829 0.303
#> 7 209 214 5 212. (209,214] 59.7 0.161 0.0849 0.309
#> 8 214 257 43 236. (214,257] 59.7 0.189 0.101 0.356
#> 9 257 304 47 280. (257,304] 59.7 0.217 0.118 0.404
#> 10 304 308 4 306 (304,308] 59.7 0.220 0.119 0.408
#> # ℹ 12 more rows
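
The cumulative hazard values above are consistent with treating the hazard as constant within each interval and summing hazard times interval length (the intlen column, see interval_length). The following check reuses the rounded values from the first five rows of the printed outputs. It is a sketch of the arithmetic under that assumption, not the package's code.

```r
# First five rows of the printed outputs above (rounded as printed)
intlen   <- c(27, 6, 22, 7, 77)
hazard   <- c(0.000869, 0.000862, 0.000837, 0.000829, 0.000747)
ci_lower <- c(0.000397, 0.000398, 0.000402, 0.000402, 0.000402)

# Piecewise-constant hazard: each interval contributes hazard * length
cumu_hazard <- cumsum(hazard * intlen)
cumu_lower  <- cumsum(ci_lower * intlen)

round(cumu_hazard, 4)
round(cumu_lower, 4)
```

Both vectors agree with the cumu_hazard and cumu_lower columns up to the rounding of the inputs. Whether the bounds are obtained this way for other ci_type settings is not stated in this page.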