General syntax for using the step in a recipe. It shows the inputs the step expects and the outputs it produces. For further details, see the sections below.
The following are the inputs expected by the step and the outputs it produces. These are generally
columns (ds.first_name), datasets (ds or ds[["first_name", "last_name"]]) or models (referenced
by name e.g. "churn-clf").
The following parameters can be used to configure the behaviour of the step by including them in
a JSON object as the last “input” to the step, i.e. step(..., {"param": "value", ...}) -> (output).
A list of threshold configurations.
A categorical column can have two kinds of thresholds determining whether specific categories will be
hidden from its view in the UI: a minimum number of rows in the current selection below which a category
will be hidden, or a minimum number of rows in the whole dataset (everything).
The thresholds parameter should be a list containing 1 or 2 objects: the configuration of a selection
threshold, and/or the configuration of a threshold for everything.
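As a rough illustration of the shape described above, the following Python sketch builds a thresholds list and checks that it contains one or two objects, with at most one of each kind. The key name `type` and the helper `validate_thresholds` are assumptions made for illustration. Only `value` and the two kinds of threshold (selection and everything) come from the description above, so the actual field names should be taken from the parameter reference.

```python
import json

# Hypothetical key names: "type" is an assumption, not the step's documented schema.
ALLOWED_KINDS = {"selection", "everything"}


def validate_thresholds(thresholds):
    """Check that the list holds 1 or 2 configurations, at most one per kind."""
    if not 1 <= len(thresholds) <= 2:
        raise ValueError("thresholds must contain 1 or 2 objects")
    kinds = [t["type"] for t in thresholds]
    unknown = set(kinds) - ALLOWED_KINDS
    if unknown:
        raise ValueError(f"unknown threshold kind(s): {sorted(unknown)}")
    if len(set(kinds)) != len(kinds):
        raise ValueError("at most one threshold of each kind is allowed")
    return thresholds


params = {
    "thresholds": validate_thresholds([
        {"type": "selection", "value": 5},    # minimum rows in the current selection
        {"type": "everything", "value": 20},  # minimum rows in the whole dataset
    ])
}

# The resulting JSON object would be passed as the last "input" to the step.
params_json = json.dumps(params)
```

Validating the list before serialising it keeps a malformed configuration from reaching the step at all.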
Configure categories to be discarded (hidden) in terms of their occurrence in the whole dataset.
Categories with a number (or percentage) of rows in the whole dataset less than value will be discarded (hidden from the variable’s filter view).
Configure categories to be discarded (hidden) in terms of their occurrence in the current selection.
Categories with a number (or percentage) of rows in the current selection less than value will be discarded (hidden from the variable’s filter view).
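The rule described for both kinds of threshold can be sketched in Python as follows. This is a minimal illustration of the "fewer rows than value" comparison, not the step's actual implementation. The function `hidden_categories` and its `as_percentage` and `categories` arguments are hypothetical names introduced here.

```python
from collections import Counter


def hidden_categories(values, threshold, as_percentage=False, categories=None):
    """Return the categories whose row count (or percentage of rows) is below threshold.

    `values` holds one category label per row, taken either from the whole
    dataset or from the current selection. `categories` optionally lists every
    known category, so that categories with zero rows in `values` are checked too.
    """
    counts = Counter(values)
    total = len(values)
    candidates = set(counts) if categories is None else set(categories)
    hidden = set()
    for category in candidates:
        count = counts[category]  # Counter returns 0 for missing keys
        if as_percentage:
            amount = 100.0 * count / total if total else 0.0
        else:
            amount = count
        if amount < threshold:
            hidden.add(category)
    return hidden


# Whole dataset ("everything") and a subset of rows standing in for the current selection.
everything = ["a"] * 6 + ["b"] * 3 + ["c"] * 1
selection = ["a"] * 3 + ["b"] * 1

# Percentage-based threshold over the whole dataset: hide categories below 20% of rows.
hidden_everything = hidden_categories(everything, threshold=20, as_percentage=True)

# Count-based threshold over the selection: hide categories with fewer than 2 selected rows.
hidden_selection = hidden_categories(selection, threshold=2, categories=set(everything))
```

With this sample data, `c` falls below the whole-dataset threshold, while `b` and `c` fall below the selection threshold.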