Allowed elements in the formula are column names, common operators, and values for comparison (strings must be enclosed in single quotes; see the example below). For more details about valid formulas see here.

Usage

The following examples show how the step can be used in a recipe.

Examples

The first example keeps those rows where the “salary” column is either “low” or “high”:
filter_with_formula(ds, {
  "formula": "salary == 'low' or salary == 'high'"
}) -> (ds_filtered)

Inputs & Outputs

The following are the inputs expected by the step and the outputs it produces. These are generally columns (ds.first_name), datasets (ds or ds[["first_name", "last_name"]]), or models (referenced by name, e.g. "churn-clf").
  • ds_in (dataset, required): An input dataset to filter.
  • ds_out (dataset, required): A new dataset containing the same columns as the input dataset but only those rows passing the filter query.

Configuration

The following parameters configure the behaviour of the step. Include them in a JSON object passed as the last “input” to the step, i.e. step(..., {"param": "value", ...}) -> (output).

Parameters

formula (string, required)
A formula describing the matching operation to perform on each row.
Example values:
  • salary == 'low' or salary == 'high'
  • number_project >= 3 and number_project <= 4
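
To make the row-by-row semantics concrete, the following Python sketch mimics what such a formula does to a small table. It is only an illustration, not the step's actual implementation: the helper filter_rows and the sample rows are hypothetical, and the sketch assumes the formula behaves like a boolean Python expression in which column names stand for the current row's values.

```python
from typing import Any

Row = dict[str, Any]


def filter_rows(rows: list[Row], formula: str) -> list[Row]:
    """Keep the rows for which `formula` evaluates to a truthy value.

    Hypothetical helper: assumes the formula is a valid Python boolean
    expression whose names are column names.
    """
    code = compile(formula, "<formula>", "eval")
    kept = []
    for row in rows:
        # Column names resolve to this row's values; builtins are disabled.
        if eval(code, {"__builtins__": {}}, row):
            kept.append(row)
    return kept


# Made-up sample data using the column names from the examples above.
rows = [
    {"salary": "low", "number_project": 2},
    {"salary": "medium", "number_project": 3},
    {"salary": "high", "number_project": 4},
    {"salary": "medium", "number_project": 5},
]

by_salary = filter_rows(rows, "salary == 'low' or salary == 'high'")
by_projects = filter_rows(rows, "number_project >= 3 and number_project <= 4")

print([r["salary"] for r in by_salary])
print([r["number_project"] for r in by_projects])
```

As in the step itself, the output keeps every column and only drops rows. The sketch uses eval purely for brevity; it should not be applied to untrusted formulas.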