Explain a prediction model.
Examples
Inputs and outputs are generally columns (ds.first_name), datasets (ds or ds[["first_name", "last_name"]]) or models (referenced by name, e.g. "churn-clf").
Inputs
Outputs
Configuration parameters are passed as a JSON object in the last argument of the step, i.e. step(..., {"param": "value", ...}) -> (output).
Parameters
-1
1
0
None (null) will not round.
If json, the default, explanations will be JSON-encoded. For each row in the dataset, the explanation consists of an array containing one object for each of the topn features, with each object in turn containing the feature name, the SHAP value, and the feature value (e.g. "[{'name': 'events', 'data': 5071, 'value': 0.15}, {...}, ...]"). The resulting JSON-encoded output column can be processed further in Graphext using the extract_json_values step.
If verbose, explanations will be more verbal, using a configurable template to generate a human-readable explanation. The default format is shown in the format parameter below.
If columns, the explanations will be returned as separate columns. The first column will contain in each row a list of the feature names of the topn features, sorted descending by SHAP value. The second (optional) column will contain the corresponding SHAP values in the same order. A third (optional) column will contain the corresponding feature values.
Values must be one of the following: columns, json, verbose
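To relate the json and columns layouts described above, here is a minimal Python sketch that decodes one json-encoded explanation and splits it into the three parallel lists of the columns layout. The sample cell, the feature names other than "events", and the helper name to_columns are hypothetical; the sketch also assumes the cell holds valid JSON with double-quoted keys.

```python
import json

# Hypothetical cell from a json-encoded explanation column.
# "value" is the SHAP value, "data" is the feature value.
cell = (
    '[{"name": "plan", "data": "pro", "value": -0.08},'
    ' {"name": "events", "data": 5071, "value": 0.15},'
    ' {"name": "age", "data": 42, "value": 0.03}]'
)


def to_columns(encoded):
    """Split one json-encoded explanation into three parallel lists,
    sorted descending by SHAP value (the "columns" layout)."""
    items = sorted(json.loads(encoded), key=lambda item: item["value"], reverse=True)
    names = [item["name"] for item in items]
    shap_values = [item["value"] for item in items]
    feature_values = [item["data"] for item in items]
    return names, shap_values, feature_values


names, shap_values, feature_values = to_columns(cell)
print(names)
print(shap_values)
print(feature_values)
```

Inside Graphext the same decoding would normally be done with the extract_json_values step; the sketch only illustrates how the two layouts correspond.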
If true, and the output parameter is "json", entries in each output row are flat lists of objects, each containing the name, the value and the SHAP contribution of a feature in the dataset. Additional information, such as the sum of remaining SHAP contributions (when topn or groups is set, see include_tail below) or the base value, will be included with the special names "<tail>" and "<base>", respectively, as if they were features themselves.
If false, each output row will contain an object instead, where proper SHAP values are nested under the "shap_values" key, while the tail and base value are top-level key-value pairs.
If include_tail is true, the sum of SHAP values of features not included in the topn items or groups will also be included in the output.
If true, the base value of the model will be included in the output. Note that this value is usually identical for all rows in the dataset.
The format parameter applies only if "output": "verbose". The template can contain placeholders for the feature name, the SHAP value, and the feature value (data). The default format is ”(=): ”. An even more verbose explanation format could be "{name} has a SHAP value of {value} and a feature value of {data}", for example. The topn features will be converted using this format and then concatenated using the separator parameter below.
The separator parameter joins the topn features (applicable only if "output": "verbose").
raw
normalized
probability
space is set to "probability".
shap
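The flat layout with its "<tail>" and "<base>" entries, and the verbose template with its separator, can be sketched in Python as follows. This is an illustration under stated assumptions, not the step's implementation: the function explain_row, the sample row, the base value 0.2, the "; " separator and the None placeholders in the "data" field of the special entries are all hypothetical.

```python
def explain_row(contributions, base_value, topn, fmt, separator):
    """Build a flat explanation and a verbose string for one row.

    contributions: list of dicts with keys "name", "data" (feature value)
    and "value" (SHAP value).
    """
    # Keep the topn features, sorted descending by SHAP value.
    ranked = sorted(contributions, key=lambda c: c["value"], reverse=True)
    top, rest = ranked[:topn], ranked[topn:]

    # The tail is the sum of the SHAP values that were cut off.
    tail = sum(c["value"] for c in rest)

    # Flat layout: tail and base value appear as if they were features.
    # Using None for their "data" field is an assumption of this sketch.
    flat = top + [
        {"name": "<tail>", "data": None, "value": tail},
        {"name": "<base>", "data": None, "value": base_value},
    ]

    # Verbose layout: format each top feature, then join with the separator.
    verbose = separator.join(fmt.format(**c) for c in top)
    return flat, verbose


# Hypothetical SHAP contributions for a single row.
row = [
    {"name": "events", "data": 5071, "value": 0.15},
    {"name": "plan", "data": "pro", "value": -0.08},
    {"name": "age", "data": 42, "value": 0.03},
    {"name": "tenure", "data": 7, "value": 0.02},
]

flat, verbose = explain_row(
    row,
    base_value=0.2,
    topn=2,
    fmt="{name} has a SHAP value of {value} and a feature value of {data}",
    separator="; ",
)
print(verbose)
for entry in flat:
    print(entry)
```

Because SHAP values are additive, the base value plus the topn contributions plus the tail equals the sum over all features, so the tail keeps the truncated explanation consistent with the full one.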