Usually employed after the train_classification step to make sure the model’s predicted probabilities are well calibrated. Currently only already fitted models can be calibrated, and calibration should always be performed on new data not seen during training. For more information see the scikit-learn documentation.

Usage

The following example shows how the step can be used in a recipe.

Examples

Assuming we have reserved a test set containing data that wasn’t used to train the model, we can simply pass it to this step to create a new, calibrated model:
calibrate(ds_test, "model", {"target": "is_churn", "method": "isotonic"}) -> ("calibrated_model")
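For readers who want to see the hold-out workflow outside of a recipe, here is a rough Python sketch of the same idea using scikit-learn. It is not the step's implementation. The synthetic data, the LogisticRegression stand-in for the trained model and the predict_calibrated helper are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.isotonic import IsotonicRegression
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a churn dataset (hypothetical data).
X, y = make_classification(n_samples=4000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)

# 1. Fit the classifier on the training split only.
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# 2. Fit the calibrator on predictions for data the model has not seen.
raw_test = model.predict_proba(X_test)[:, 1]
calibrator = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
calibrator.fit(raw_test, y_test)


def predict_calibrated(features):
    """Map the model's raw positive-class probabilities through the calibrator."""
    return calibrator.predict(model.predict_proba(features)[:, 1])


calibrated_test = predict_calibrated(X_test)
```

The key design point is that the calibrator only ever sees the held-out split. Fitting it on the training data would calibrate against probabilities the model has already been optimised to produce.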

Inputs & Outputs

The following are the inputs expected by the step and the outputs it produces. These are generally columns (ds.first_name), datasets (ds or ds[["first_name", "last_name"]]) or models (referenced by name e.g. "churn-clf").
  • ds (dataset, required): A dataset containing features and target columns for data that has not already been used to train the model.
  • model (file[model_classification[ds]], required): A trained classification model to calibrate.
  • model_out (file[model_classification[ds]], required): A zip file containing the calibrated model.
  • info (file.hidden, required)

Configuration

The following parameters can be used to configure the behaviour of the step by including them in a JSON object as the last “input” to the step, i.e. step(..., {"param": "value", ...}) -> (output).

Parameters

  • target (string, ds.column:category|boolean; required): Target variable. Name of the column that contains your target values (labels).
  • method (string; default: "isotonic"): Calibration method. Method to use for calibration. isotonic is a non-parametric method that fits a piecewise-constant, non-decreasing function to the predicted probabilities. sigmoid (Platt’s method) is a parametric method that fits a logistic function to the predicted probabilities. Isotonic calibration is not advised with too few calibration samples (far fewer than 1,000), since it tends to overfit. Values must be one of the following: isotonic, sigmoid.
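To make the difference between the two methods concrete, here is a minimal pure-Python sketch. It assumes binary labels and raw scores in [0, 1]. The function names, the toy data and the gradient-descent settings are illustrative assumptions, and neither function is what the step or scikit-learn runs internally. fit_isotonic uses the pool-adjacent-violators idea to build a non-decreasing step function, while fit_sigmoid fits a two-parameter logistic curve by plain gradient descent on the log loss (a simplification of Platt's method).

```python
import math


def fit_isotonic(scores, labels):
    """Return [(upper_score, value), ...] describing a non-decreasing step function."""
    # Pool identical scores first so each distinct score is one weighted point.
    pooled = {}
    for s, y in zip(scores, labels):
        total, count = pooled.get(s, (0.0, 0))
        pooled[s] = (total + y, count + 1)

    blocks = []  # each block: [label_sum, count, highest_score_in_block]
    for s in sorted(pooled):
        total, count = pooled[s]
        blocks.append([total, count, s])
        # Merge backwards while the previous block's mean exceeds the current one.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count, upper = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = upper
    return [(upper, total / count) for total, count, upper in blocks]


def predict_isotonic(steps, score):
    """Look up the step whose upper bound covers the score (clipped at the ends)."""
    for upper, value in steps:
        if score <= upper:
            return value
    return steps[-1][1]


def fit_sigmoid(scores, labels, lr=0.5, n_iter=2000):
    """Fit p = 1 / (1 + exp(-(a * score + b))) by gradient descent on the log loss."""
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(n_iter):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            p = 1.0 / (1.0 + math.exp(-(a * s + b)))
            grad_a += (p - y) * s
            grad_b += p - y
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def predict_sigmoid(params, score):
    a, b = params
    return 1.0 / (1.0 + math.exp(-(a * score + b)))


# Toy calibration set (hypothetical): raw scores and observed outcomes.
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 1, 0, 0, 1, 1, 1, 1]

iso_steps = fit_isotonic(scores, labels)
sig_params = fit_sigmoid(scores, labels)
```

The isotonic fit has one free value per step, so it can follow any monotone shape but needs enough samples to avoid memorising noise. The sigmoid fit has only two parameters, which makes it more stable on small calibration sets at the cost of assuming a logistic shape.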