```
library(dplyr)
library(mgcv)
library(pammtools)
library(ggplot2)
theme_set(theme_bw())
library(survival)
library(pec)
```

In this vignette we illustrate how to obtain model performance measures for PAMMs, specifically the C-Index and (Integrated) Brier Score (IBS). Both is achieved by providing an `predictSurvProb`

extension for PAMMs, which allows the usage of the ** pec** package (Mogensen, Ishwaran, and Gerds 2012) for model evaluation.

Below we split the `tumor`

data contained within the package and split it into a training and a test data set. The former is used to train the models, the latter to obtain out of sample performance measures.

In order to work with the `pec`

or `cindex`

functions from the ** pec** package, the models have to be fit using the

`pamm`

function (which is a thin wrapper around `mgcv::gam`

). Once the models are fit, the prediction error curves (PEC) and C-Index can be computed similar to any other model (see respective help pages `?pec::pec`

and `?pec::cindex`

). Note that for technical reasons, evaluation of `pamm`

objects should not start at exact 0 (see `times`

and `start`

arguments below).```
data(tumor)
## split data into train and test data
n_train <- 400
train_idx <- sample(seq_len(nrow(tumor)), n_train)
test_idx <- setdiff(seq_len(nrow(tumor)), train_idx)
## data transformation
tumor_ped <- as_ped(tumor[train_idx, ], Surv(days, status)~.)
# some simple models for comparison
pam1 <- pamm(
formula = ped_status ~ s(tend) + charlson_score + age,
data = tumor_ped)
pam2 <- pamm(
formula = ped_status ~ s(tend) + charlson_score + age + metastases + complications,
data = tumor_ped)
pam3 <- pamm(
formula = ped_status ~s(tend, by = complications) + charlson_score + age +
metastases,
data = tumor_ped)
# calculate prediction error curves (on test data)
pec <- pec(
list(pam1 = pam1, pam2 = pam2, pam3 = pam3),
Surv(days, status) ~ 1, # formula for IPCW
data = tumor[test_idx, ], # new data not used for model fit
times = seq(.01, 1200, by = 10),
start = .01,
exact = FALSE
)
```

The results illustrate that no one model is not necessarily better w.r.t. the prediction error for all time-points. For example `pam3`

is better than `pam2`

in the beginning and worse towards the end. Similarly, the integrated brier score (IBS) also depends on the evaluation time.

```
# plot prediction error curve
plot(pec)
```

```
# calculate integrated brier score
crps(pec, times = quantile(tumor$days[tumor$status == 1], c(.25, .5, .75)))
```

```
##
## Integrated Brier score (crps):
##
## IBS[0.01;time=203) IBS[0.01;time=533) IBS[0.01;time=1120)
## Reference 0.063 0.125 0.178
## pam1 0.061 0.122 0.178
## pam2 0.058 0.116 0.172
## pam3 0.053 0.113 0.174
```

Exemplary, we calculate the C-Index, however, note the warning message and the cited literature.

```
cindex(
list(pam1 = pam1, pam2 = pam2, pam3 = pam3),
Surv(days, status) ~ 1,
data = tumor[test_idx, ],
eval.times = quantile(tumor$days[tumor$status == 1], c(.25, .5, .75)))
```

```
##
## The c-index for right censored event times
##
## Prediction models:
##
## pam1 pam2 pam3
## pam1 pam2 pam3
##
## Right-censored response of a survival model
##
## No.Observations: 376
##
## Pattern:
## Freq
## event 194
## right.censored 182
##
## Censoring model for IPCW: marginal model (Kaplan-Meier for censoring distribution)
##
## No data splitting: either apparent or independent test sample performance
##
## Estimated C-index in %
##
## $AppCindex
## time=203 time=533 time=1120
## pam1 63.1 58.4 55.2
## pam2 74.0 64.9 60.2
## pam3 78.1 63.2 52.9
```

```
## Warning in summary.Cindex(x, print = TRUE, ...): The C-index is not proper for t-year predictions. Blanche et al. (2018), Biostatistics, 20(2): 347--357.
##
## Consider using time-dependent AUC instead: riskRegression::Score
```

Mogensen, Ulla B., Hemant Ishwaran, and Thomas A. Gerds. 2012. “Evaluating Random Forests for Survival Analysis Using Prediction Error Curves.” *Journal of Statistical Software* 50 (11): 1–23. http://www.jstatsoft.org/v50/i11.