
Filter values

Filter rows where a column matches the specified values exactly.


For example, to create a new dataset keeping only those rows where values in the "salary" column are either "low" or "high":

filter_values(ds, {"column": "salary", "values": ["low", "high"]}) -> (ds_filtered)

Or, using the exclude parameter to drop rows where "salary" values are either "low" or "high":

filter_values(ds, {"column": "salary", "values": ["low", "high"], "exclude": true}) -> (ds_filtered)
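The two calls above behave roughly like the following Python sketch. It only illustrates the semantics described on this page and is not the step's actual implementation: the helper name `filter_rows`, the representation of a dataset as a list of dicts, and the sample rows are all assumptions made for the example.

```python
def filter_rows(rows, column, values, exclude=False):
    """Keep rows whose `column` value is in `values`.

    With exclude=True the condition is inverted, so only rows whose
    value is NOT in `values` are kept. (Hypothetical helper.)
    """
    wanted = set(values)
    # A row is kept when "it matches" differs from "exclude":
    # matches are kept if exclude is False, non-matches if it is True.
    return [row for row in rows if (row[column] in wanted) != exclude]


# Made-up sample data standing in for the input dataset `ds`.
ds = [
    {"name": "Ana", "salary": "low"},
    {"name": "Ben", "salary": "medium"},
    {"name": "Cleo", "salary": "high"},
]

kept = filter_rows(ds, "salary", ["low", "high"])
dropped = filter_rows(ds, "salary", ["low", "high"], exclude=True)
```

In this sketch `kept` holds the "low" and "high" rows, while `dropped` holds only the "medium" row. In both cases the rows keep all of their original columns.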


The following are the step's expected inputs and outputs and their specific types.

filter_values(ds_in: dataset, {"param": value}) -> (ds_out: dataset)

Here the object {"param": value} is optional in most cases and, if present, may contain any of the parameters described in the corresponding section below.


ds_in: dataset

An input dataset to filter.


ds_out: dataset

A new dataset containing the same columns as the input dataset but only those rows passing the filter condition.


column: string

Name of the column to be matched against the specified values.

values: number | string | array[number | string]

Only rows matching these values exactly will be included in the resulting dataset. May be a single value or a list of values to be matched.

Example parameter values:

  • "the"
  • ["the", "cat"]
  • 2
  • [2, 3]
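Because values accepts either a single value or a list, code that mimics the step usually has to normalise the two forms first. A minimal Python sketch of that idea follows. The function name `as_value_list` is hypothetical, and how the step itself handles this internally is not documented here.

```python
def as_value_list(values):
    """Return `values` as a list, wrapping a single number or string.

    Strings are checked explicitly because a Python string is itself
    iterable and would otherwise be split into characters.
    (Hypothetical helper, for illustration only.)
    """
    if isinstance(values, (str, int, float)):
        return [values]
    return list(values)


single = as_value_list("the")          # one string becomes a one-item list
several = as_value_list(["the", "cat"])  # a list is passed through
number = as_value_list(2)
numbers = as_value_list([2, 3])
```

After normalisation, a row matches if its column value equals any item in the list.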

exclude: boolean = false

If true, only rows not matching the specified values will be included in the resulting dataset.