is_missing

This step checks each row of the input column to determine if the value is missing (null or NaN). The result is a new boolean column, where each row indicates whether the corresponding element in the input column is missing.

The step works with both single-valued and multi-valued columns, and the output can be configured to be boolean (true/false), numeric (0/1), or categorical (custom labels).

For single-valued columns: Each row in the output column will be true if the corresponding value in the input column is missing, and false otherwise.
For multi-valued columns: Each row in the output column will be true if the corresponding sub-list in the input column is empty, and false otherwise.
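The two rules above can be illustrated with a minimal Python sketch. This is not the step's actual implementation. The function names and the `multivalued` flag are hypothetical, and treating a null sub-list as missing is an assumption not stated in this page.

```python
import math


def is_missing_value(value):
    """Return True for None or a float NaN (the "null or NaN" case)."""
    if value is None:
        return True
    return isinstance(value, float) and math.isnan(value)


def is_missing_column(column, multivalued=False):
    """Return one boolean per row, following the rules described above."""
    if multivalued:
        # Multi-valued rule: an empty sub-list counts as missing.
        # Assumption: a null (None) sub-list is treated as missing too.
        return [row is None or len(row) == 0 for row in column]
    # Single-valued rule: null or NaN counts as missing.
    return [is_missing_value(value) for value in column]


single = [1.5, None, float("nan"), 0.0]
multi = [["a", "b"], [], ["c"]]

print(is_missing_column(single))                   # [False, True, True, False]
print(is_missing_column(multi, multivalued=True))  # [False, True, False]
```

Note that `0.0` is a valid value in this sketch and is not flagged as missing.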

Usage

The following examples show how the step can be used in a recipe.

Examples

Check for missing values in a numeric column.

is_missing(ds.numeric_col) -> (ds.numeric_col_missing)

Check for missing values in a text column and set output type to numeric.

is_missing(ds.string_col, {"out_type": "number"}) -> (ds.string_col_missing)
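To make the effect of the output type concrete, here is a rough Python sketch of how boolean flags could map to the three output kinds mentioned earlier. Only the `"number"` value of `out_type` appears in this page. The `"boolean"` and `"category"` values, the `labels` argument, and the choice of 1 for missing are assumptions made for illustration.

```python
def convert_output(flags, out_type="boolean", labels=("missing", "present")):
    """Convert a list of boolean missing-flags to the requested output kind."""
    if out_type == "boolean":  # hypothetical value name
        return list(flags)
    if out_type == "number":  # value taken from the example above
        # Assumption: 1 marks a missing value, 0 a present one.
        return [1 if flag else 0 for flag in flags]
    if out_type == "category":  # hypothetical value name
        # Hypothetical custom labels: (label if missing, label if present).
        return [labels[0] if flag else labels[1] for flag in flags]
    raise ValueError(f"unsupported out_type: {out_type!r}")


flags = [False, True, False]
print(convert_output(flags, "number"))    # [0, 1, 0]
print(convert_output(flags, "category"))  # ['present', 'missing', 'present']
```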

General syntax for using the step in a recipe. It shows the inputs the step expects and the outputs it produces. For further details, see the sections below.

is_missing(column: column, {
    "param": value,
    ...
}) -> (result: column)

Inputs & Outputs

The following are the inputs expected by the step and the outputs it produces. These are generally columns (ds.first_name), datasets (ds or ds[["first_name", "last_name"]]) or models (referenced by name e.g. "churn-clf").

Inputs

Outputs

Configuration

The following parameters can be used to configure the behaviour of the step by including them in a json object as the last “input” to the step, i.e. step(..., {"param": "value", ...}) -> (output).

Parameters
