Filter range¶
Filter rows based on the numeric values in a given column.
Keeps or drops rows whose numeric values fall within a given range, i.e. are greater than or equal to a minimum and/or less than or equal to a maximum.
Usage¶
The following are the step's expected inputs and outputs and their specific types.
filter_range(ds_in: dataset, {"param": value}) -> (ds_out: dataset)
where the object {"param": value} is optional in most cases and, if present, may contain any of the parameters described in the corresponding section below.
Example¶
The following example creates a new dataset including only those rows whose satisfaction_level is between 0.6 and 0.9 (inclusive).
filter_range(ds, {"column": "satisfaction_level", "min": 0.6, "max": 0.9}) -> (ds_filtered)
More examples
Using the exclude parameter, the next example creates a dataset including only those rows whose satisfaction_level falls outside the range 0.6–0.9.
filter_range(ds, {"column": "satisfaction_level", "min": 0.6, "max": 0.9, "exclude": true}) -> (ds_filtered)
Inputs¶
ds_in: dataset
An input dataset to filter.
Outputs¶
ds_out: dataset
A new dataset containing the same columns as the input dataset but only those rows passing the filter condition.
Parameters¶
column: string
Name of the column to apply the filter to.
exclude: boolean = False
If true, values within the specified range will be excluded from the resulting dataset.
max: number
Largest value in the selected column that passes the filter (inclusive). At least one of max and min must be specified.
min: number
Smallest value in the selected column that passes the filter (inclusive). At least one of min and max must be specified.
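
To make the interaction of these parameters concrete, here is a minimal plain-Python sketch of the semantics described above. It is not the step's actual implementation: the function name filter_range_sketch, the argument names min_value and max_value, and the representation of a dataset as a list of dicts are all assumptions made for illustration. The sketch also assumes every value in the column is numeric, and it does not model how the real step treats missing values.

```python
def filter_range_sketch(rows, column, min_value=None, max_value=None, exclude=False):
    """Illustrative stand-in for filter_range; names and data layout are hypothetical."""
    # At least one bound is required, mirroring the min/max parameter descriptions.
    if min_value is None and max_value is None:
        raise ValueError("specify at least one of min_value or max_value")

    def in_range(value):
        # Both bounds are inclusive; a missing bound leaves that side open.
        above_min = min_value is None or value >= min_value
        below_max = max_value is None or value <= max_value
        return above_min and below_max

    # exclude=False keeps in-range rows; exclude=True keeps the out-of-range rows.
    return [row for row in rows if in_range(row[column]) != exclude]


rows = [{"satisfaction_level": v} for v in (0.5, 0.6, 0.75, 0.9, 0.95)]
kept = filter_range_sketch(rows, "satisfaction_level", min_value=0.6, max_value=0.9)
outside = filter_range_sketch(rows, "satisfaction_level", min_value=0.6, max_value=0.9, exclude=True)
at_least = filter_range_sketch(rows, "satisfaction_level", min_value=0.9)

print([r["satisfaction_level"] for r in kept])      # [0.6, 0.75, 0.9]
print([r["satisfaction_level"] for r in outside])   # [0.5, 0.95]
print([r["satisfaction_level"] for r in at_least])  # [0.9, 0.95]
```

The first two calls correspond to the two examples given earlier; the third shows a one-sided range, where only a minimum is given and the upper side is left open.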