Includes or excludes rows of the input dataset based on the values of a selected text or list column. Depending on the configuration, rows are kept or dropped in the output dataset if the column contains any or all of the specified values.

“Containment” here means that a text in a text column contains one or more of the specified substrings (words), or that a list in a list column contains one or more elements matching the specified values. See below for illustrative examples.
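The containment rule described above can be sketched in plain Python. This is only an illustration of the semantics, not the step's actual implementation: the function names `contains` and `filter_rows`, the parameters `mode` and `keep`, and the sample rows are all hypothetical, and matching is assumed to be case-sensitive with missing values treated as containing nothing.

```python
from typing import Dict, List, Sequence, Union

# A cell is either a text, a list of strings, or missing (assumption).
Cell = Union[str, Sequence[str], None]


def contains(cell: Cell, values: Sequence[str], mode: str = "any") -> bool:
    """Check whether a cell contains any or all of the given values.

    For a text cell, `v in cell` is a substring test; for a list cell,
    it is an element-membership test. Missing cells never match.
    """
    if cell is None:
        return False
    hits = [v in cell for v in values]
    return any(hits) if mode == "any" else all(hits)


def filter_rows(rows: List[Dict], column: str, values: Sequence[str],
                mode: str = "any", keep: bool = True) -> List[Dict]:
    """Keep (keep=True) or drop (keep=False) the rows whose column matches."""
    return [row for row in rows if contains(row[column], values, mode) == keep]


# Hypothetical sample data with one text column and one list column.
rows = [
    {"id": 1, "text": "red apple pie", "tags": ["fruit", "dessert"]},
    {"id": 2, "text": "green apple", "tags": ["fruit"]},
    {"id": 3, "text": "chocolate cake", "tags": ["dessert"]},
]

# Text column: keep rows whose text contains any of the substrings.
any_match = filter_rows(rows, "text", ["apple", "pie"], mode="any")

# Text column: keep rows whose text contains all of the substrings.
all_match = filter_rows(rows, "text", ["apple", "pie"], mode="all")

# List column: drop rows whose list contains the element "fruit".
dropped = filter_rows(rows, "tags", ["fruit"], keep=False)
```

In this sketch, `any_match` holds rows 1 and 2, `all_match` holds only row 1, and `dropped` holds only row 3.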

Usage

The following examples show how the step can be used in a recipe.

Inputs & Outputs

The following are the inputs expected by the step and the outputs it produces. These are generally columns (ds.first_name), datasets (ds or ds[["first_name", "last_name"]]) or models (referenced by name, e.g. "churn-clf").

Configuration

The following parameters can be used to configure the behaviour of the step by including them in a JSON object as the last “input” to the step, i.e. step(..., {"param": "value", ...}) -> (output).