is_missing
Check for missing values in a given column.
This step checks each row of the input column to determine if the value is missing (null or NaN). The result is a new boolean column, where each row indicates whether the corresponding element in the input column is missing.
The step can work with single-valued and multi-valued columns, and the output can be configured to be either boolean (true/false), numeric (0/1) or categorical (custom labels).
- For single-valued columns: Each row in the output column will be true if the corresponding value in the input column is missing, and false otherwise.
- For multi-valued columns: Each row in the output column will be true if the corresponding sub-list in the input column is empty, and false otherwise.
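The two rules above can be sketched in plain Python. This is only an illustration of the described semantics, not the step's actual implementation: the function names and the `multivalued` flag are hypothetical, and the sketch assumes that every row of a multi-valued column is a list.

```python
import math


def is_missing_value(value):
    """Return True if a single value counts as missing (None or NaN)."""
    if value is None:
        return True
    return isinstance(value, float) and math.isnan(value)


def is_missing_column(values, multivalued=False):
    """Flag each row of a column as missing (True) or not (False).

    Hypothetical helper: for multi-valued columns a row is flagged
    when its sub-list is empty; each row is assumed to be a list.
    """
    if multivalued:
        return [len(row) == 0 for row in values]
    return [is_missing_value(v) for v in values]


# Single-valued column: None and NaN are flagged, 0.0 is not.
single = is_missing_column([1.5, None, float("nan"), 0.0])

# Multi-valued column: only the empty sub-list is flagged.
multi = is_missing_column([["a", "b"], [], ["c"]], multivalued=True)
```

Note that a valid value such as 0.0 or an empty string is not treated as missing in this sketch; only null and NaN are.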
Usage
The following examples show how the step can be used in a recipe.
Check for missing values in a numeric column.
Inputs & Outputs
The following are the inputs expected by the step and the outputs it produces. These are generally columns (ds.first_name), datasets (ds or ds[["first_name", "last_name"]]) or models (referenced by name e.g. "churn-clf").
Configuration
The following parameters can be used to configure the behaviour of the step by including them in a JSON object as the last "input" to the step, i.e. step(..., {"param": "value", ...}) -> (output).
Output type. The data type of the output column.
- 'boolean': Output is true/false indicating missing or not.
- 'number': Output is 0/1 indicating missing or not.
- 'category': Output is specified by params["labels"]["true"] and params["labels"]["false"].
Values must be one of the following: boolean, number, category.
Labels for the true and false categories. An object mapping the “true” and “false” categories to custom labels.
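As a rough illustration of how the three output types relate, the following Python sketch converts a list of boolean flags into each form. The function name `format_flags` and the argument name `out_type` are assumptions made for this example and are not taken from the step's documented parameter names; only the "labels" object with its "true" and "false" keys comes from the description above.

```python
def format_flags(flags, out_type, labels=None):
    """Convert boolean missing-flags to the requested output type.

    Hypothetical helper: `out_type` is an assumed name for the
    output type setting; `labels` mirrors the labels object.
    """
    if out_type == "boolean":
        return list(flags)
    if out_type == "number":
        return [1 if flag else 0 for flag in flags]
    if out_type == "category":
        if labels is None or "true" not in labels or "false" not in labels:
            raise ValueError('category output needs labels for "true" and "false"')
        return [labels["true"] if flag else labels["false"] for flag in flags]
    raise ValueError(f"unknown out_type: {out_type!r}")


flags = [False, True, False]
as_numbers = format_flags(flags, "number")
as_labels = format_flags(
    flags, "category", {"true": "missing", "false": "present"}
)
```

In this sketch the category output fails loudly when either label is absent; whether the real step falls back to default labels instead is not stated in the source.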