Use this step after a train_classification step to make sure the model’s predicted probabilities are well-calibrated.
Currently, only calibration of already fitted models is supported. Calibration should always be performed on new data that the model did not see during training. For more information, see the scikit-learn documentation.
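To make "well-calibrated" concrete, the following is a minimal sketch of a reliability check: predictions are grouped into equal-width bins, and in each bin the mean predicted probability is compared with the observed share of positives. The function names, the bin count and the sample data are hypothetical and are not part of this step; the sketch only illustrates the idea for a binary target.

```python
def reliability_table(probs, labels, n_bins=5):
    """Return (count, mean predicted probability, observed positive rate)
    for every non-empty equal-width probability bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((p, y))
    rows = []
    for members in bins:
        if not members:
            continue
        mean_pred = sum(p for p, _ in members) / len(members)
        pos_rate = sum(y for _, y in members) / len(members)
        rows.append((len(members), mean_pred, pos_rate))
    return rows


def expected_calibration_error(probs, labels, n_bins=5):
    """Average gap between predicted and observed rates, weighted by bin size."""
    total = len(probs)
    return sum(
        count / total * abs(mean_pred - pos_rate)
        for count, mean_pred, pos_rate in reliability_table(probs, labels, n_bins)
    )


# Hypothetical held-out predictions from an overconfident model:
# it predicts 0.9 for four rows, but only two of them are positive.
probs = [0.9, 0.9, 0.9, 0.9, 0.1, 0.1, 0.1, 0.1]
labels = [1, 1, 0, 0, 0, 0, 0, 0]
ece = expected_calibration_error(probs, labels)
```

A perfectly calibrated model would give a gap of zero in every bin; the larger the weighted gap, the more the predicted probabilities over- or understate the true rates.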
Usage
The following example shows how the step can be used in a recipe.
Assuming we have reserved a test set containing data that wasn’t used to train the model, we can simply pass it to this step to create a new, calibrated model:
Inputs & Outputs
The following are the inputs expected by the step and the outputs it produces. These are generally columns (ds.first_name), datasets (ds or ds[["first_name", "last_name"]]) or models (referenced by name, e.g. "churn-clf").
Inputs
Outputs
Configuration
The following parameters can be used to configure the behaviour of the step by including them in a JSON object as the last “input” to the step, i.e. step(..., {"param": "value", ...}) -> (output).
Parameters
Target variable.
Name of the column that contains your target values (labels).
Calibration method.
Method to use for calibration. isotonic is a non-parametric method that fits a piecewise-constant, non-decreasing function to the predicted probabilities. sigmoid (Platt’s method) is a parametric method that fits a logistic function to the predicted probabilities.
Isotonic calibration is not advised with too few calibration samples (far fewer than 1,000), since it tends to overfit.
Values must be one of the following: isotonic, sigmoid.
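The difference between the two methods can be sketched in plain Python. This is an illustration under stated assumptions, not the implementation used by this step or by scikit-learn: the isotonic fit uses the pool-adjacent-violators algorithm and predicts with a step function, and the sigmoid fit uses simple gradient descent on the log loss without the target smoothing of Platt’s original method. All function names, hyperparameters and data are hypothetical.

```python
import math


def fit_isotonic(scores, labels):
    """Pool adjacent violators: fit a piecewise-constant, non-decreasing
    map from raw score to calibrated probability."""
    # Each block is [label_sum, count, lowest_score, highest_score].
    blocks = []
    for s, y in sorted(zip(scores, labels)):
        blocks.append([float(y), 1, s, s])
        # Merge backwards while scores tie or block means decrease.
        while len(blocks) > 1 and (
            blocks[-2][3] == blocks[-1][2]
            or blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            label_sum, count, _, highest = blocks.pop()
            blocks[-1][0] += label_sum
            blocks[-1][1] += count
            blocks[-1][3] = highest
    # Each step starts at the block's lowest score and has the block mean as value.
    return [(b[2], b[0] / b[1]) for b in blocks]


def predict_isotonic(steps, score):
    """Return the value of the last step starting at or below `score`."""
    value = steps[0][1]  # scores below the first step are clipped to it
    for start, step_value in steps:
        if score >= start:
            value = step_value
    return value


def fit_sigmoid(scores, labels, lr=0.5, n_iter=5000):
    """Fit p = 1 / (1 + exp(-(a * score + b))) by gradient descent on log loss."""
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(n_iter):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            p = 1.0 / (1.0 + math.exp(-(a * s + b)))
            grad_a += (p - y) * s
            grad_b += p - y
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def predict_sigmoid(params, score):
    a, b = params
    return 1.0 / (1.0 + math.exp(-(a * score + b)))


# Hypothetical held-out scores from an already fitted model, with true labels.
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 1, 0, 0, 1, 1, 1, 1]

steps = fit_isotonic(scores, labels)
params = fit_sigmoid(scores, labels)
```

The isotonic fit has one free value per step, so it can follow the calibration data very closely, which is why it tends to overfit on small samples. The sigmoid fit has only two parameters and is therefore more constrained.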