This is the general data transformation function provided by the pammtools package. Three main applications must be distinguished:

  1. Transformation of standard time-to-event data.

  2. Transformation of left-truncated time-to-event data.

  3. Transformation of time-to-event data with time-dependent covariates (TDC).

For the last of these, the type of effect to be estimated also matters for the data transformation step. In all cases, the data transformation is specified by a two-sided formula. For TDCs, the right-hand side of the formula can contain the formula specials concurrent and cumulative. See the data-transformation vignette for details.

as_ped(data, ...)

# S3 method for data.frame
as_ped(
  data,
  formula,
  cut = NULL,
  max_time = NULL,
  tdc_specials = c("concurrent", "cumulative"),
  censor_code = 0L,
  transition = character(),
  timescale = c("gap", "calendar"),
  min_events = 1L,
  ...
)

# S3 method for nested_fdf
as_ped(data, formula, ...)

# S3 method for list
as_ped(
  data,
  formula,
  tdc_specials = c("concurrent", "cumulative"),
  censor_code = 0L,
  ...
)

is.ped(x)

# S3 method for ped
as_ped(data, newdata, ...)

# S3 method for pamm
as_ped(data, newdata, ...)

as_ped_multistate(
  data,
  formula,
  cut = NULL,
  max_time = NULL,
  tdc_specials = c("concurrent", "cumulative"),
  censor_code = 0L,
  transition = character(),
  timescale = c("gap", "calendar"),
  min_events = 1L,
  ...
)

Arguments

data

Either an object inheriting from data frame or, in the case of time-dependent covariates, a list of data frames (typically of length 2), where the first data frame contains the time-to-event information and static covariates, while the second (and potentially further) data frames contain information on time-dependent covariates and the times at which they were observed.

...

Further arguments passed to the data.frame method and ultimately to survSplit.

formula

A two-sided formula with a Surv object on the left-hand side and the covariate specification on the right-hand side (RHS). The RHS can be an extended formula, which specifies how TDCs should be transformed using the specials concurrent and cumulative. The left-hand side can be in start-stop notation. This, however, is only used to create left-truncated data and does not support the full functionality.

cut

Split points used to partition the follow-up into intervals. If unspecified, all unique event times will be used.

max_time

If cut is unspecified, this will be the last possible event time. All event times after max_time will be administratively censored at max_time.

tdc_specials

A character vector. Names of potential specials in formula for concurrent and/or cumulative effects.

censor_code

Specifies the value of the status variable that indicates censoring. Often this will be 0, which is the default.

x

Any R object.

newdata

A new data set (data.frame) that contains the same variables that were used to create the PED object (data).
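The descriptions of cut and max_time above can be illustrated with a small base R sketch. It does not call pammtools and is not the package's implementation; the toy vectors `days` and `status` and the value of `max_time` are hypothetical and chosen only to show the two rules (default split points are the unique event times; events after max_time are censored at max_time).

```r
# Hypothetical toy data: four subjects, status 1 = event, 0 = censored
days   <- c(579, 1192, 308, 700)
status <- c(0L, 0L, 1L, 1L)

# Rule for `cut` when it is unspecified: use all unique observed event times
default_cut <- sort(unique(days[status == 1L]))
default_cut
# 308 700

# Rule for `max_time`: event times after max_time are administratively
# censored at max_time (illustrative value, an assumption of this sketch)
max_time    <- 500
days_cens   <- pmin(days, max_time)
status_cens <- ifelse(days > max_time, 0L, status)
days_cens
# 500 500 308 500
status_cens
# 0 0 1 0
```

In this sketch the fourth subject's event at day 700 is turned into a censored observation at day 500, while the event at day 308 is unaffected.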

Value

A data frame of class ped in piece-wise exponential data (PED) format.
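The offset column of the returned object holds the logarithm of the time a subject spent at risk in each interval. The following base R sketch reproduces the offset values printed in the Examples for the first three rows of tumor with cut = c(0, 500, 1000). It is a hand calculation for illustration, not the package's internal code; the helper `time_at_risk` is hypothetical.

```r
# Follow-up times of the first three tumor subjects (taken from the Examples)
days <- c(579, 1192, 308)

# Interval borders implied by cut = c(0, 500, 1000)
cut    <- c(0, 500, 1000)
tstart <- cut[-length(cut)]  # 0, 500
tend   <- cut[-1]            # 500, 1000

# Hypothetical helper: time spent at risk in each interval for follow-up time t
time_at_risk <- function(t) pmax(0, pmin(t, tend) - tstart)

# Offset = log(time at risk), only for intervals the subject actually entered
offsets <- lapply(days, function(t) {
  r <- time_at_risk(t)
  log(r[r > 0])
})
offsets
```

The first subject contributes log(500) and log(79), the second log(500) twice, and the third only log(308), which corresponds to the five rows of the first as_ped call in the Examples.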

Examples

tumor[1:3, ]
#> # A tibble: 3 × 9
#>    days status charlson_score   age sex    transfusion complications metastases
#>   <dbl>  <int>          <int> <int> <fct>  <fct>       <fct>         <fct>     
#> 1   579      0              2    58 female yes         no            yes       
#> 2  1192      0              2    52 male   no          yes           yes       
#> 3   308      1              2    74 female yes         no            yes       
#> # ℹ 1 more variable: resection <fct>
tumor[1:3, ] %>% as_ped(Surv(days, status)~ age + sex, cut = c(0, 500, 1000))
#>   id tstart tend   interval   offset ped_status age    sex
#> 1  1      0  500    (0,500] 6.214608          0  58 female
#> 2  1    500 1000 (500,1000] 4.369448          0  58 female
#> 3  2      0  500    (0,500] 6.214608          0  52   male
#> 4  2    500 1000 (500,1000] 6.214608          0  52   male
#> 5  3      0  500    (0,500] 5.730100          1  74 female
tumor[1:3, ] %>% as_ped(Surv(days, status)~ age + sex)
#>   id tstart tend interval offset ped_status age    sex
#> 1  1      0  308  (0,308] 5.7301          0  58 female
#> 2  2      0  308  (0,308] 5.7301          0  52   male
#> 3  3      0  308  (0,308] 5.7301          1  74 female
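if (FALSE) {
# Requires dplyr (select(), filter()) and the frailtyHL package (cgd data)
library(dplyr)
data("cgd", package = "frailtyHL")
# Keep only the rows for the first two episodes (enum 1 and 2)
cgd2 <- cgd %>%
  select(id, tstart, tstop, enum, status, age) %>%
  filter(enum %in% 1:2)
# Transform to PED format on the calendar timescale,
# using enum as the transition variable
ped_re <- as_ped_multistate(
  formula    = Surv(tstart, tstop, status) ~ age + enum,
  data       = cgd2,
  transition = "enum",
  timescale  = "calendar")
}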
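The usage section also lists is.ped(x), which the examples do not exercise. A minimal sketch, assuming pammtools and survival are installed and that the tumor data set ships with pammtools as in the examples above:

```r
library(survival)   # provides Surv()
library(pammtools)  # provides as_ped(), is.ped() and the tumor data

# Same transformation as in the first example, written without the pipe
ped <- as_ped(
  data    = tumor[1:3, ],
  formula = Surv(days, status) ~ age + sex,
  cut     = c(0, 500, 1000))

is.ped(ped)    # checks whether the object carries the ped class
is.ped(tumor)  # the untransformed data is not a ped object
```

A check like this can be useful before passing an object on to code that expects PED-formatted data.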