calibrate_classification
Calibrate a classification model.
Usually employed after the train_classification step to make sure the model’s predicted probabilities are well-calibrated.
Note that we currently support only the calibration of already fitted models. Calibration should always be performed on new data that was not seen during training. For more information, see the scikit-learn documentation.
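To make "well-calibrated" concrete, the following is a minimal sketch in plain Python. It is not part of the step: the function name expected_calibration_error, the bin count and the toy data are assumptions made for illustration only. It groups held-out predictions into probability bins and measures how far the mean predicted probability in each bin is from the observed share of positives.

```python
def expected_calibration_error(probs, labels, n_bins=10):
    """Weighted mean gap between predicted probability and observed frequency."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        # A probability of exactly 1.0 is clamped into the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))

    total = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_p = sum(p for p, _ in members) / len(members)
        frac_pos = sum(y for _, y in members) / len(members)
        ece += (len(members) / total) * abs(mean_p - frac_pos)
    return ece


# Hypothetical held-out rows: the model predicts 0.9 for five rows,
# but only three of them are actually positive.
probs = [0.9, 0.9, 0.9, 0.9, 0.9, 0.1, 0.1, 0.1, 0.1, 0.1]
labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
print(expected_calibration_error(probs, labels))
```

A value near zero means predicted probabilities match observed frequencies; larger values suggest the model would benefit from calibration.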
Usage
The following example shows how the step can be used in a recipe.
Assuming we have reserved a test set containing data that wasn’t used to train the model, we can pass it to this step to create a new, calibrated model:
Inputs & Outputs
The following are the inputs expected by the step and the outputs it produces. These are generally columns (ds.first_name), datasets (ds or ds[["first_name", "last_name"]]) or models (referenced by name e.g. "churn-clf").
Configuration
The following parameters can be used to configure the behaviour of the step by including them in a JSON object as the last “input” to the step, i.e. step(..., {"param": "value", ...}) -> (output).
Target variable. Name of the column that contains your target values (labels).
Calibration method.
Method to use for calibration. isotonic is a non-parametric method that fits a piecewise-constant, non-decreasing function to the predicted probabilities. sigmoid (Platt’s method) is a parametric method that fits a logistic function to the predicted probabilities.
It is not advised to use isotonic calibration with too few calibration samples (much fewer than 1,000), since it tends to overfit.
Values must be one of the following:
isotonic
sigmoid
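The difference between the two methods can be sketched in plain Python. This is an illustration under stated assumptions, not the step's implementation: the function names, the toy scores and the gradient-descent settings are all hypothetical. The isotonic fit uses the pool-adjacent-violators idea to build a non-decreasing step function, while the sigmoid fit learns only two parameters of a logistic curve.

```python
import math
from bisect import bisect_left


def fit_isotonic(scores, labels):
    """Pool-adjacent-violators fit of a non-decreasing step function."""
    blocks = []  # each block: [sum of labels, row count, largest score covered]
    for score, label in sorted(zip(scores, labels)):
        blocks.append([float(label), 1, score])
        # Merge while the previous block has a higher mean (a violation)
        # or covers the same score (a tie).
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
            or blocks[-2][2] == blocks[-1][2]
        ):
            total, count, upper = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = upper
    uppers = [b[2] for b in blocks]
    values = [b[0] / b[1] for b in blocks]
    return uppers, values


def predict_isotonic(model, score):
    uppers, values = model
    # First block whose upper bound is >= score; scores above the
    # last bound fall back to the last block.
    idx = bisect_left(uppers, score)
    return values[min(idx, len(values) - 1)]


def fit_sigmoid(scores, labels, lr=0.5, steps=2000):
    """Fit p = 1 / (1 + exp(-(a * score + b))) by gradient descent on log loss."""
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            p = 1.0 / (1.0 + math.exp(-(a * s + b)))
            grad_a += (p - y) * s
            grad_b += p - y
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def predict_sigmoid(model, score):
    a, b = model
    return 1.0 / (1.0 + math.exp(-(a * score + b)))


# Hypothetical uncalibrated scores and true labels from held-out data.
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 1, 0, 0, 1, 1, 0, 1]

iso_model = fit_isotonic(scores, labels)
sig_model = fit_sigmoid(scores, labels)
for s in (0.15, 0.45, 0.85):
    print(s, predict_isotonic(iso_model, s), predict_sigmoid(sig_model, s))
```

Because the isotonic fit can introduce a new step wherever the data demand it, it has far more freedom than the two-parameter sigmoid, which is consistent with the advice above to avoid it when few calibration samples are available.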