filter_missing

By default, keeps only those rows where the values in the selected columns are not missing (non-NaN). With the exclude parameter the row selection can be inverted, so that only rows with missing values in the selected columns are returned.
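The row selection described above can be sketched in pandas. This is only an illustration of the semantics, not the step's actual implementation: the Python function, its `exclude=False` default, and the sample data are hypothetical, and the sketch assumes that inverting the selection returns exactly the rows the default mode drops (those with a missing value in at least one selected column).

```python
import pandas as pd


def filter_missing(df, columns, exclude=False):
    """Hypothetical pandas analogue of the step, for illustration only."""
    # True for rows in which every selected column has a value.
    complete = df[columns].notna().all(axis=1)
    # Assumption: exclude=True returns the complement of the default selection.
    return df[~complete] if exclude else df[complete]


# Made-up sample data; "age" is not among the selected columns.
ds = pd.DataFrame({
    "name": ["Ana", None, "Cy", "Dee"],
    "address": ["1 Main St", "2 Oak Ave", None, "4 Elm Rd"],
    "age": [31, 45, None, 28],
})

# Default: keep rows where neither "address" nor "name" is missing.
ds_filtered = filter_missing(ds, ["address", "name"])

# Inverted: keep only rows with a missing "address" or "name".
ds_missing = filter_missing(ds, ["address", "name"], exclude=True)
```

Under these assumptions, missing values in columns that were not selected (here "age") have no effect on which rows are kept.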

Usage

The following example shows how the step can be used in a recipe.

Examples

To keep only those rows where neither “address” nor “name” is missing

filter_missing(ds, {"columns": ["address", "name"]}) -> (ds_filtered)

General syntax for using the step in a recipe. It shows the inputs the step expects and the outputs it produces. For further details see the sections below.

filter_missing(ds_in: dataset, {
    "param": value,
    ...
}) -> (ds_out: dataset)

Inputs & Outputs

The following are the inputs expected by the step and the outputs it produces. These are generally columns (ds.first_name), datasets (ds or ds[["first_name", "last_name"]]) or models (referenced by name e.g. "churn-clf").

Inputs

Outputs

Configuration

The following parameters can be used to configure the behaviour of the step by including them in a JSON object as the last “input” to the step, i.e. step(..., {"param": "value", ...}) -> (output).

Parameters
