Check for missing values in a given column.
This step checks each row of the input column to determine if the value is missing (null or NaN). The result is a new boolean column, where each row indicates whether the corresponding element in the input column is missing.
The step can work with single-valued and multi-valued columns, and the output can be configured to be either boolean (true/false), numeric (0/1) or categorical (custom labels).
For single-valued columns, the output is true if the corresponding value in the input column is missing, and false otherwise.
For multi-valued columns, the output is true if the corresponding sub-list in the input column is empty, and false otherwise.
The following examples show how the step can be used in a recipe.
Examples
Check for missing values in a numeric column.
Check for missing values in a text column and set output type to numeric.
General syntax for using the step in a recipe. It shows the inputs the step expects to receive and the outputs it will produce. For further details, see the sections below.
The following are the inputs expected by the step and the outputs it produces. These are generally columns (ds.first_name), datasets (ds or ds[["first_name", "last_name"]]) or models (referenced by name, e.g. "churn-clf").
Inputs
The input column to check for missing values.
Outputs
The output column indicating the presence of missing values in the input column.
The following parameters can be used to configure the behaviour of the step by including them in a JSON object as the last “input” to the step, i.e. step(..., {"param": "value", ...}) -> (output).
Parameters
Output type. The data type of the output column.
Values must be one of the following:
boolean
number
category
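The semantics described on this page can be approximated in plain Python as follows. This is only an illustrative sketch, not the step's actual implementation: the function names is_missing and check_missing, the out_type argument, and the default labels "missing" and "present" are all hypothetical, and treating a null multi-valued cell as missing is an assumption.

```python
import math


def is_missing(value):
    """Return True if a single cell counts as missing (assumed semantics)."""
    # Multi-valued cell: an empty sub-list counts as missing.
    if isinstance(value, (list, tuple)):
        return len(value) == 0
    # Single-valued cell: null (None) or NaN counts as missing.
    if value is None:
        return True
    return isinstance(value, float) and math.isnan(value)


def check_missing(column, out_type="boolean", labels=("missing", "present")):
    """Flag missing cells in a column, given here as a plain list of values.

    out_type mirrors the three documented output types; the label names
    are hypothetical placeholders for the custom category labels.
    """
    flags = [is_missing(v) for v in column]
    if out_type == "boolean":
        return flags
    if out_type == "number":
        return [int(f) for f in flags]  # True -> 1, False -> 0
    if out_type == "category":
        missing_label, present_label = labels
        return [missing_label if f else present_label for f in flags]
    raise ValueError(f"unknown output type: {out_type!r}")


ages = [34.0, float("nan"), 51.0]   # numeric column with one NaN
names = ["Ada", None, "Grace"]      # text column with one null
tags = [["a"], [], ["b", "c"]]      # multi-valued column with one empty list

print(check_missing(ages))               # [False, True, False]
print(check_missing(names, "number"))    # [0, 1, 0]
print(check_missing(tags, "category"))   # ['present', 'missing', 'present']
```

The sketch keeps the per-cell test separate from the output conversion, so the same flags feed all three output types.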