Trains a survival model using the Cox Proportional Hazard model. The output will always be a new column with the trained model’s predictions on the training data, as well as a saved and named model file that can be used in other projects for prediction of new data.

Usage

The following shows how the step can be used in a recipe.

Examples

  • Signature
General syntax for using the step in a recipe. It shows the inputs the step expects to receive and the outputs it will produce. For further details, see the sections below.
train_survival(ds: dataset, {
    "param": value,
    ...
}) -> (predicted: number, model: model_survival[ds])

Inputs & Outputs

The following are the inputs expected by the step and the outputs it produces. These are generally columns (ds.first_name), datasets (ds or ds[["first_name", "last_name"]]) or models (referenced by name e.g. "churn-clf").
ds
dataset
required
Should contain the target columns (see target parameter below) and the feature columns you wish to use in the model.
predicted
column[number]
required
Name for output column containing model predictions.
model
file[model_survival[ds]]
required
Zip file containing the trained model and associated information.
info
file.hidden
required

Configuration

The following parameters can be used to configure the behaviour of the step by including them in a json object as the last “input” to the step, i.e. step(..., {"param": "value", ...}) -> (output).

Parameters

  • CoxPH
model
string
default:"CoxPH"
Kind of survival model to train. “CoxPH” trains a lifelines Cox Proportional Hazard model.
target
array
Target variables. Exactly two names, identifying the target columns that contain, in this order:
  1. whether the event was observed (boolean) and
  2. the time (duration) to event or censoring (number).
Item 0
string (ds.column:boolean)
Item 1
string (ds.column:number)
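As a rough illustration of what the two target columns hold, the sketch below derives them from raw start and end dates. The record layout, the column names churned and tenure_days, and the cut-off date are all hypothetical and not part of the step. A record without an end date is treated as censored at the cut-off.

```python
from datetime import date

# Hypothetical raw records: a start date and, if the event happened, an end date.
records = [
    {"start": date(2023, 1, 1), "end": date(2023, 3, 2)},  # event observed
    {"start": date(2023, 2, 1), "end": None},              # no event yet: censored
]
observation_end = date(2023, 6, 30)  # assumed cut-off date of the data export


def to_targets(record, cutoff):
    """Return (observed, duration_in_days) for one record."""
    observed = record["end"] is not None
    stop = record["end"] if observed else cutoff
    return observed, (stop - record["start"]).days


rows = [
    dict(zip(("churned", "tenure_days"), to_targets(r, observation_end)))
    for r in records
]

# Order matters: the event indicator comes first, the duration second.
config = {"target": ["churned", "tenure_days"]}
```

Censored rows still carry information (the subject survived at least that long), which is why both columns are needed rather than the duration alone.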
predictions
object
Configure the kind of predictions to return.
kind
string
default:"median"
Kind of prediction. median returns the median survival time. percentile returns the survival time at the given percentile. expectation returns the expected survival time. survival_function returns the whole survival function (one series per sample).
Values must be one of the following:
  • median
  • percentile
  • expectation
  • survival_function
percentile
number
default:"0.5"
Percentile to use when kind is set to percentile.
Values must be in the following range:
0 ≤ percentile ≤ 1
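To make the prediction kinds concrete, the following sketch reads a median, a percentile and an expectation off a single survival curve. It assumes the usual convention that the survival time at percentile p is the first time at which the survival function is at or below p, so that median is the special case p = 0.5. The helper names and the toy curve are hypothetical, and the expectation is a simple step-function sum, not necessarily the numerical integration the step itself uses.

```python
def percentile_time(times, survival, p=0.5):
    """First time at which the survival curve is at or below p (inf if never)."""
    for t, s in zip(times, survival):
        if s <= p:
            return t
    return float("inf")


def expected_time(times, survival):
    """Area under a step survival curve starting at S(0) = 1, cut at the last time."""
    total, prev_t, prev_s = 0.0, 0.0, 1.0
    for t, s in zip(times, survival):
        total += prev_s * (t - prev_t)  # curve stays at prev_s until time t
        prev_t, prev_s = t, s
    return total


# Toy survival function for one sample (hypothetical values).
times = [1, 2, 3, 4, 5]
survival = [0.9, 0.7, 0.5, 0.3, 0.1]

median = percentile_time(times, survival)         # kind: "median"
p25 = percentile_time(times, survival, p=0.25)    # kind: "percentile", percentile: 0.25
expectation = expected_time(times, survival)      # kind: "expectation"
```

Note that a lower percentile corresponds to a later time, because the survival function decreases as time passes.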
times
[array, object]
Points in time to predict. Configures at which points to predict when kind is set to survival_function. Either an explicit array of durations, or an object specifying a duration step size and maximum duration.
  • Explicit times
  • Step enumeration
array[number]
Array of times/durations. Will predict the survival function at each of the durations. E.g. [1, 2, 3, 4, 5].
Item
number
Each item in array.
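The step-enumeration variant is described above only as a step size plus a maximum duration. One plausible reading is that it expands to an explicit array like the one just shown. The sketch below works under that assumption; the function name, its argument names and the choice to start at the first step rather than at zero are all hypothetical.

```python
def enumerate_times(step, max_duration):
    """Expand a step size and a maximum duration into explicit time points."""
    if step <= 0 or max_duration < step:
        raise ValueError("need 0 < step <= max_duration")
    count = int(max_duration // step)
    # Multiply rather than accumulate to avoid compounding float error.
    return [step * i for i in range(1, count + 1)]


explicit = enumerate_times(step=1, max_duration=5)
half_steps = enumerate_times(step=0.5, max_duration=2)
```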
params
object
Model parameters.
alpha
number
default:"0.05"
Significance level for the confidence intervals. The default of 0.05 yields 95% confidence intervals.
penalizer
number
default:"0.0"
Penalizer strength. Attach an L2 penalizer to the size of the coefficients during regression. This improves stability of the estimates and controls for high correlation between covariates.
Values must be in the following range:
0.0 ≤ penalizer < inf
l1_ratio
number
default:"0.0"
L1 vs L2 penalty ratio. Specify what ratio to assign to an L1 vs L2 penalty (lasso vs ridge). Same as scikit-learn convention: 0.0 is a pure L2 penalty, 1.0 a pure L1 penalty.
Values must be in the following range:
0.0 ≤ l1_ratio ≤ 1.0
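How penalizer and l1_ratio interact can be sketched with the elastic-net penalty in the scikit-learn convention the text refers to. This is an illustration of that convention and assumes the underlying fitter applies the same form; the function name and the coefficient values are hypothetical.

```python
def elastic_net_penalty(coefs, penalizer=0.0, l1_ratio=0.0):
    """penalizer * (l1_ratio * ||b||_1 + 0.5 * (1 - l1_ratio) * ||b||_2^2)."""
    if penalizer < 0 or not 0.0 <= l1_ratio <= 1.0:
        raise ValueError("penalizer must be >= 0 and l1_ratio must lie in [0, 1]")
    l1_norm = sum(abs(b) for b in coefs)
    l2_squared = sum(b * b for b in coefs)
    return penalizer * (l1_ratio * l1_norm + 0.5 * (1.0 - l1_ratio) * l2_squared)


coefs = [0.5, -1.0, 2.0]  # hypothetical regression coefficients

ridge = elastic_net_penalty(coefs, penalizer=0.1, l1_ratio=0.0)  # pure L2
lasso = elastic_net_penalty(coefs, penalizer=0.1, l1_ratio=1.0)  # pure L1
mixed = elastic_net_penalty(coefs, penalizer=0.1, l1_ratio=0.5)  # elastic net
```

With the default penalizer of 0.0 the penalty vanishes regardless of l1_ratio, so l1_ratio only has an effect once penalizer is positive.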
strata
array[string]
Columns to use in stratification. This is useful if a categorical covariate does not obey the proportional hazard assumption.
Item
string (ds.column:categorical)
Each item in array.
baseline_estimation_method
string
default:"breslow"
How the fitter should estimate the baseline.
Values must be one of the following:
  • breslow
  • spline
  • piecewise
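The documented ranges and allowed values can be checked before a configuration is pasted into a recipe. The sketch below is one way to do that. It assumes that percentile sits under predictions and that penalizer, l1_ratio, strata and baseline_estimation_method sit under params, which the flattened parameter list suggests but does not confirm. The validate helper and the column names are hypothetical.

```python
import json

KINDS = {"median", "percentile", "expectation", "survival_function"}
BASELINES = {"breslow", "spline", "piecewise"}


def validate(config):
    """Return a list of problems found in a train_survival configuration."""
    problems = []
    if len(config.get("target", [])) != 2:
        problems.append("target must name exactly two columns")
    predictions = config.get("predictions", {})
    if predictions.get("kind", "median") not in KINDS:
        problems.append("unknown predictions.kind")
    if not 0 <= predictions.get("percentile", 0.5) <= 1:
        problems.append("percentile must lie in [0, 1]")
    params = config.get("params", {})
    if params.get("penalizer", 0.0) < 0:
        problems.append("penalizer must be >= 0")
    if not 0.0 <= params.get("l1_ratio", 0.0) <= 1.0:
        problems.append("l1_ratio must lie in [0, 1]")
    if params.get("baseline_estimation_method", "breslow") not in BASELINES:
        problems.append("unknown baseline_estimation_method")
    return problems


# Hypothetical configuration; the nesting of keys is an assumption.
config = {
    "target": ["churned", "tenure_days"],
    "predictions": {"kind": "percentile", "percentile": 0.25},
    "params": {"penalizer": 0.1, "l1_ratio": 0.5, "strata": ["plan"]},
}

problems = validate(config)
config_json = json.dumps(config, indent=4)  # JSON object to pass as the step's last input
```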