Train and store a survival model to be loaded at a later point for prediction.
Trains a survival model using the Cox Proportional Hazard model.
The output is always a new column containing the trained model’s predictions on the training data, together with a saved, named model file that can be used in other projects to make predictions on new data.
The following shows how the step can be used in a recipe.
Examples
General syntax for using the step in a recipe. Shows the inputs the step expects to receive and the outputs it will produce. For further details see the sections below.
The following are the inputs expected by the step and the outputs it produces. These are generally columns (ds.first_name), datasets (ds or ds[["first_name", "last_name"]]) or models (referenced by name e.g. "churn-clf").
Inputs
Should contain the target columns (see target parameter below) and the feature columns you wish to use in the model.
Outputs
The following parameters can be used to configure the behaviour of the step by including them in a JSON object as the last “input” to the step, i.e. step(..., {"param": "value", ...}) -> (output).
Parameters
Kind of survival model to train. “CoxPH” trains a lifelines Cox Proportional Hazard model.
Configure the kind of predictions to return.
Properties
Kind of prediction. median returns the median survival time. percentile returns the survival time at the given percentile. expectation returns the expected survival time. survival_function returns the whole survival function (one series per sample).
Values must be one of the following:
median
percentile
expectation
survival_function
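To make the difference between these prediction kinds concrete, the following is a minimal sketch that derives each summary from a single sample's survival function. It is not the step's implementation: the helper names (time_at_survival, restricted_expectation) are hypothetical, the curve is made-up illustration data, and it assumes that the percentile p means the first time at which the survival probability drops to p or below, so that median corresponds to p = 0.5.

```python
import math
from typing import Sequence


def time_at_survival(times: Sequence[float], surv: Sequence[float], p: float) -> float:
    """Return the first time at which the survival probability is <= p.

    Assumption: if the curve never drops to p, the result is infinity.
    """
    for t, s in zip(times, surv):
        if s <= p:
            return t
    return math.inf


def restricted_expectation(times: Sequence[float], surv: Sequence[float]) -> float:
    """Area under the step-shaped survival curve from 0 to the last time point."""
    area, prev_t, prev_s = 0.0, 0.0, 1.0  # survival is 1.0 at time 0
    for t, s in zip(times, surv):
        area += prev_s * (t - prev_t)  # curve stays at prev_s until time t
        prev_t, prev_s = t, s
    return area


# Illustrative survival function for one sample (the survival_function output)
times = [1.0, 2.0, 3.0, 4.0]
surv = [0.9, 0.6, 0.4, 0.1]

median = time_at_survival(times, surv, 0.5)    # kind = median
p25 = time_at_survival(times, surv, 0.25)      # kind = percentile with 0.25
expected = restricted_expectation(times, surv)  # kind = expectation
```

With this curve the median is 3.0, the 0.25 percentile is 4.0, and the expectation, restricted to the observed horizon, is approximately 2.9.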
Percentile when kind is set to percentile.
Values must be in the following range:
Points in time to predict.
Configures at which points to predict when kind is set to survival_function. Either an explicit array of durations, or an object specifying a duration step size and maximum duration.
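The two accepted forms can be thought of as resolving to the same thing: a list of durations at which the survival function is evaluated. The sketch below illustrates this under stated assumptions. The key names "step" and "max" are hypothetical placeholders (the actual property names are not shown in this section), and starting the grid at one step rather than at zero is also an assumption.

```python
from typing import List, Mapping, Sequence, Union

TimeSpec = Union[Sequence[float], Mapping[str, float]]


def resolve_times(spec: TimeSpec) -> List[float]:
    """Turn either form of the time specification into a list of durations."""
    if isinstance(spec, Mapping):
        # Object form: hypothetical keys "step" and "max"
        step, max_t = spec["step"], spec["max"]
        if step <= 0:
            raise ValueError("step must be positive")
        n = int(max_t / step + 1e-9)  # number of whole steps up to max
        return [i * step for i in range(1, n + 1)]
    # Array form: use the durations as given, in ascending order
    return sorted(spec)


grid = resolve_times({"step": 6, "max": 24})
explicit = resolve_times([30, 7, 14])
```

Here grid is [6, 12, 18, 24] and explicit is [7, 14, 30].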
Options
Model parameters.
Properties
Level in the confidence intervals.
Penalizer strength. Attach an L2 penalizer to the size of the coefficients during regression. This improves stability of the estimates and controls for high correlation between covariates.
Values must be in the following range:
L1 vs L2 penalty ratio. Specifies the ratio to assign to the L1 vs the L2 penalty (lasso vs ridge). Same as the scikit-learn convention.
Values must be in the following range:
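As a rough illustration of how the penalizer strength and the L1 vs L2 ratio interact, the sketch below computes the elastic-net penalty term in the scikit-learn convention referred to above. The function and argument names (elastic_net_penalty, penalizer, l1_ratio) are assumptions chosen for readability, and the coefficients are made-up values; how the fitter combines this term with the likelihood is not covered here.

```python
from typing import Sequence


def elastic_net_penalty(coefs: Sequence[float], penalizer: float, l1_ratio: float) -> float:
    """Elastic-net penalty: penalizer * (l1_ratio * L1 + 0.5 * (1 - l1_ratio) * L2)."""
    l1 = sum(abs(b) for b in coefs)  # lasso term: sum of absolute values
    l2 = sum(b * b for b in coefs)   # ridge term: sum of squares
    return penalizer * (l1_ratio * l1 + 0.5 * (1.0 - l1_ratio) * l2)


coefs = [1.0, -2.0]
ridge_only = elastic_net_penalty(coefs, penalizer=0.1, l1_ratio=0.0)
lasso_only = elastic_net_penalty(coefs, penalizer=0.1, l1_ratio=1.0)
mixed = elastic_net_penalty(coefs, penalizer=0.1, l1_ratio=0.5)
```

A ratio of 0 gives a pure L2 (ridge) penalty, a ratio of 1 gives a pure L1 (lasso) penalty, and values in between blend the two. A penalizer of 0 switches the penalty off entirely.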
Columns to use in stratification. This is useful if a categorical covariate does not obey the proportional hazard assumption.
Array items
Each item in array.
How the fitter should estimate the baseline.
Values must be one of the following:
breslow
spline
piecewise