Explains the predictions of a trained machine learning model. Currently the only supported method is SHAP, which provides a unified measure of feature importance and feature effects. For more information see the SHAP documentation.

method
string
default: "shap"

Explanation method.

Values must be one of the following:

  • shap
positive_class
[string, null]
required

Positive class. Name/label of the target class to generate explanations for if the model is a classifier.

space
string
default: "probability"

Explanation space. The space in which to calculate the explanations. "raw" corresponds to the internal prediction space of the model, e.g. log-odds in the case of a CatBoost classifier. "normalized" will re-normalize the explanations to the range [0, 1] for each feature, so that the SHAP values of all features in a single row sum to 1.0. "probability" will convert SHAP values to probabilities by rescaling the sum of SHAP values for each row such that it equals the difference between the base probability and the model's prediction.

Values must be one of the following:

  • raw
  • normalized
  • probability
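As a rough illustration of the "probability" space, raw SHAP values can be rescaled per row so that they jointly account for the gap between the base probability and the model's predicted probability. This is a minimal sketch only; the function name, signature, and zero-sum handling are assumptions, not the step's actual implementation.

```python
import numpy as np

def rescale_to_probability(shap_row, prediction, base_probability):
    """Rescale one row of raw SHAP values so they sum to the gap
    between the predicted probability and the base probability
    (illustrative sketch of the "probability" space option)."""
    total = shap_row.sum()
    target = prediction - base_probability
    if total == 0:
        # No attribution to distribute; return zeros of matching shape.
        return np.zeros_like(shap_row)
    return shap_row * (target / total)

row = np.array([1.2, -0.4, 0.6])  # raw (e.g. log-odds) SHAP values
scaled = rescale_to_probability(row, prediction=0.9, base_probability=0.5)
print(scaled.sum())  # ≈ 0.4, i.e. prediction minus base probability
```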
base_probability
[number, null]

Base probability. The base probability to use when converting SHAP values to probabilities. If not provided, the mean of the model's predictions on the dataset will be used. Only relevant if space is set to "probability".

topn
[integer, null]
default: 5

Top N features. Number of top features to include in the explanation. If not provided, all features will be included.

round
[integer, null]
default: 4

Round numerical explanations. How many decimal places to round the explanations to. If not provided or null, explanations will not be rounded.

output
string
default: "json"

Output format of the explanations. If json, the default, explanations will be JSON-encoded. For each row in the dataset, the explanation consists of an array containing one object for each of the topn features, with each object in turn containing the feature name ('name'), the feature value ('data'), and the SHAP value ('value') (e.g. "[{'name': 'events', 'data': 5071, 'value': 0.15}, {...}, ...]"). The resulting JSON-encoded output column can be processed further in Graphext using the extract_json_values step.
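For downstream processing outside Graphext, a cell of the JSON-encoded explanation column can be parsed with any standard JSON library. The cell value below is illustrative, not real output:

```python
import json

# Hypothetical cell value from the json-encoded explanation column.
cell = ('[{"name": "events", "data": 5071, "value": 0.15},'
        ' {"name": "age", "data": 34, "value": 0.08}]')

explanation = json.loads(cell)
top_feature = explanation[0]  # objects are ordered by importance
print(top_feature["name"], top_feature["value"])  # prints: events 0.15
```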

If verbose, explanations will be rendered as human-readable text, using a configurable template. The default template is shown in the format parameter below.

If columns, the explanations will be returned as separate columns. The first column will contain in each row a list of the feature names of the topn features, sorted descending by SHAP value. The second (optional) column will contain the corresponding SHAP values in the same order. A third (optional) column will contain the corresponding feature values.

Values must be one of the following:

  • columns
  • json
  • verbose
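For a single row, the columns output might look like the following aligned lists (the variable names here are assumptions for illustration, not the step's actual column labels):

```python
# One row of hypothetical "columns" output: three aligned lists,
# sorted descending by SHAP value.
names = ["events", "age", "country"]   # topn feature names
shaps = [0.15, 0.08, -0.02]            # corresponding SHAP values (optional column)
values = [5071, 34, "DE"]              # corresponding feature values (optional column)
```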
format
[string, null]
default: "{name}(={data}): {shap}"

Verbal explanation format. A template string to generate a human-readable explanation (applicable only if parameter "output": "verbose"). The template can contain placeholders for the feature name ({name}), the SHAP value ({shap}), and the feature value ({data}). The default format is "{name}(={data}): {shap}". An even more verbose explanation format could be "{name} has a SHAP value of {shap} and a feature value of {data}", for example. The topn features will be converted using this format and then concatenated using the separator parameter below.

separator
[string]
default: ", "

Verbal explanation separator. A string to separate the explanations of the topn features (applicable only if parameter "output": "verbose").
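Assuming the template uses Python str.format-style placeholders (an assumption about the implementation), the verbose output for one row could be assembled roughly like this:

```python
# Sketch of assembling the verbose output from the default template
# and separator; placeholder semantics are assumed, not confirmed.
template = "{name}(={data}): {shap}"
separator = ", "

top_features = [
    {"name": "events", "data": 5071, "shap": 0.15},
    {"name": "age", "data": 34, "shap": 0.08},
]

text = separator.join(template.format(**f) for f in top_features)
print(text)  # prints: events(=5071): 0.15, age(=34): 0.08
```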