```
library(dplyr)
library(mgcv)
library(pammtools)
library(ggplot2)
theme_set(theme_bw())
library(survival)
library(pec)
```

In this vignette we illustrate how to obtain model performance
measures for PAMMs, specifically the C-Index and (Integrated) Brier
Score (IBS). Both is achieved by providing an
`predictSurvProb`

extension for PAMMs, which allows the usage
of the ** pec** package (Mogensen, Ishwaran, and Gerds 2012) for
model evaluation.

Below we split the `tumor`

data contained within the
package and split it into a training and a test data set. The former is
used to train the models, the latter to obtain out of sample performance
measures.

In order to work with the `pec`

or `cindex`

functions from the ** pec** package, the models
have to be fit using the

`pamm`

function (which is a thin
wrapper around `mgcv::gam`

). Once the models are fit, the
prediction error curves (PEC) and C-Index can be computed similar to any
other model (see respective help pages `?pec::pec`

and
`?pec::cindex`

). Note that for technical reasons, evaluation
of `pamm`

objects should not start at exact 0 (see
`times`

and `start`

arguments below).```
data(tumor)
## split data into train and test data
n_train <- 400
train_idx <- sample(seq_len(nrow(tumor)), n_train)
test_idx <- setdiff(seq_len(nrow(tumor)), train_idx)
## data transformation
tumor_ped <- as_ped(tumor[train_idx, ], Surv(days, status)~.)
# some simple models for comparison
pam1 <- pamm(
formula = ped_status ~ s(tend) + charlson_score + age,
data = tumor_ped)
pam2 <- pamm(
formula = ped_status ~ s(tend) + charlson_score + age + metastases + complications,
data = tumor_ped)
pam3 <- pamm(
formula = ped_status ~s(tend, by = complications) + charlson_score + age +
metastases,
data = tumor_ped)
# calculate prediction error curves (on test data)
pec <- pec(
list(pam1 = pam1, pam2 = pam2, pam3 = pam3),
Surv(days, status) ~ 1, # formula for IPCW
data = tumor[test_idx, ], # new data not used for model fit
times = seq(.01, 1200, by = 10),
start = .01,
exact = FALSE
)
```

The results illustrate that no one model is not necessarily better
w.r.t. the prediction error for all time-points. For example
`pam3`

is better than `pam2`

in the beginning and
worse towards the end. Similarly, the integrated brier score (IBS) also
depends on the evaluation time.

```
# plot prediction error curve
plot(pec)
```

```
# calculate integrated brier score
crps(pec, times = quantile(tumor$days[tumor$status == 1], c(.25, .5, .75)))
```

```
##
## Integrated Brier score (crps):
##
## IBS[0.01;time=203) IBS[0.01;time=533) IBS[0.01;time=1120)
## Reference 0.061 0.109 0.159
## pam1 0.060 0.108 0.163
## pam2 0.056 0.101 0.153
## pam3 0.053 0.101 0.157
```

Exemplary, we calculate the C-Index, however, note the warning message and the cited literature.

```
cindex(
list(pam1 = pam1, pam2 = pam2, pam3 = pam3),
Surv(days, status) ~ 1,
data = tumor[test_idx, ],
eval.times = quantile(tumor$days[tumor$status == 1], c(.25, .5, .75)))
```

```
##
## The c-index for right censored event times
##
## Prediction models:
##
## pam1 pam2 pam3
## pam1 pam2 pam3
##
## Right-censored response of a survival model
##
## No.Observations: 376
##
## Pattern:
## Freq
## event 165
## right.censored 211
##
## Censoring model for IPCW: marginal model (Kaplan-Meier for censoring distribution)
##
## No data splitting: either apparent or independent test sample performance
##
## Estimated C-index in %
##
## $AppCindex
## time=203 time=533 time=1120
## pam1 64.5 60.3 58.2
## pam2 76.7 67.6 62.5
## pam3 76.1 65.6 56.0
```

```
## Warning in summary.Cindex(x, print = TRUE, ...): The C-index is not proper for t-year predictions. Blanche et al. (2018), Biostatistics, 20(2): 347--357.
##
## Consider using time-dependent AUC instead: riskRegression::Score
```

Mogensen, Ulla B., Hemant Ishwaran, and Thomas A. Gerds. 2012.
“Evaluating Random Forests for
Survival Analysis Using
Prediction Error Curves.”
*Journal of Statistical Software* 50 (11): 1–23. http://www.jstatsoft.org/v50/i11.