Filter values¶
Filter rows where a column matches the specified values exactly.
Usage¶
The following are the step's expected inputs and outputs and their specific types.
filter_values(ds_in: dataset, {"param": value}) -> (ds_out: dataset)
where the object {"param": value} is optional in most cases and, if present, may contain any of the parameters described in the corresponding section below.
Example¶
To create a new dataset keeping only those rows where values in the "salary" column are either "low" or "high":
filter_values(ds, {"column": "salary", "values": ["low", "high"]}) -> (ds_filtered)
More examples
Or, using the exclude parameter to drop rows where "salary" values are either "low" or "high":
filter_values(ds, {"column": "salary", "values": ["low", "high"], "exclude": true}) -> (ds_filtered)
Inputs¶
ds_in: dataset
An input dataset to filter.
Outputs¶
ds_out: dataset
A new dataset containing the same columns as the input dataset but only those rows passing the filter condition.
Parameters¶
column: string
Name of the column to be matched against the specified values.
values: number | string | array[number | string]
Only rows matching these values exactly will be included in the resulting dataset. May be a single value or a list of values to be matched.
Example parameter values:
"the"
["the", "cat"]
2
[2, 3]
exclude: boolean = False
If true, only rows not matching the specified values will be included in the resulting dataset.
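To illustrate how the column, values and exclude parameters interact, here is a minimal Python sketch of the filtering logic. It is not the step's actual implementation: the function name filter_values_sketch and the list-of-dicts representation of a dataset are hypothetical, and "exact" matching is modeled with plain Python equality, which is an assumption.

```python
def filter_values_sketch(rows, column, values, exclude=False):
    """Return the rows whose `column` value is (or is not) one of `values`.

    Hypothetical helper for illustration only; a dataset is modeled
    as a list of dicts, one dict per row.
    """
    # A single value is treated like a one-element list.
    allowed = values if isinstance(values, list) else [values]
    # Keep a row when its match status differs from `exclude`:
    # exclude=False keeps matching rows, exclude=True keeps the rest.
    return [row for row in rows if (row[column] in allowed) != exclude]


# Made-up sample data, mirroring the "salary" example above.
ds = [
    {"name": "Ann", "salary": "low"},
    {"name": "Bob", "salary": "medium"},
    {"name": "Cy", "salary": "high"},
]

kept = filter_values_sketch(ds, "salary", ["low", "high"])
dropped = filter_values_sketch(ds, "salary", ["low", "high"], exclude=True)
single = filter_values_sketch(ds, "salary", "medium")
```

With this sample data, kept holds the rows for Ann and Cy, while dropped and single each hold only the row for Bob. In every case the output rows keep all of the input columns, as described under Outputs.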