Filter topn¶
Sort a dataset by selected columns and pick the first N rows (or exclude them).
Usage¶
The following are the step's expected inputs and outputs and their specific types.
filter_topn(ds_in: dataset, {"param": value}) -> (ds_out: dataset)
where the object {"param": value} is optional in most cases and, if present, may contain any of the parameters described in the corresponding section below.
Inputs¶
ds_in: dataset
An input dataset to filter.
Outputs¶
ds_out: dataset
A new dataset containing the same columns as the input dataset but only those rows passing the filter condition.
Parameters¶
n: integer
Number of leading rows to keep after sorting, or to drop when exclude is true.
sort_by: array[string] | string
One or more columns to sort by before picking the first n rows. May be a column name or a list of column names.
Example parameter values:
"salary"
["salary", "time_spend_company", "last_evaluation"]
exclude: boolean = False
If true, the first n rows after sorting will be excluded from the resulting dataset.
ascending: boolean = False
Whether to sort in ascending order rather than descending. With the default (False), rows are sorted in descending order, so the largest values come first.
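To make the interaction between the parameters concrete, the following is a minimal Python sketch of the behaviour described above. It is only an illustrative model, not the step's actual implementation: the function name filter_topn_sketch, the list-of-dicts representation of a dataset, and the sample rows are all assumptions made for this example. How the real step treats ties, missing values, and the row order of the output is not specified here and is not modelled.

```python
from typing import Any, Dict, List, Sequence, Union

Row = Dict[str, Any]


def filter_topn_sketch(
    rows: Sequence[Row],
    n: int,
    sort_by: Union[str, Sequence[str]],
    exclude: bool = False,
    ascending: bool = False,
) -> List[Row]:
    """Hypothetical model of the documented parameters (not the real step)."""
    # sort_by may be a single column name or a list of column names.
    columns = [sort_by] if isinstance(sort_by, str) else list(sort_by)
    # Sort by the listed columns in order; descending unless ascending=True.
    ordered = sorted(
        rows,
        key=lambda row: tuple(row[c] for c in columns),
        reverse=not ascending,
    )
    # Keep the first n rows, or drop them when exclude=True.
    return ordered[n:] if exclude else ordered[:n]


# Made-up sample data, using column names from the parameter examples.
employees = [
    {"name": "Ana", "salary": 5200, "time_spend_company": 3},
    {"name": "Ben", "salary": 4100, "time_spend_company": 6},
    {"name": "Cho", "salary": 5200, "time_spend_company": 5},
    {"name": "Dev", "salary": 3000, "time_spend_company": 2},
]

cols = ["salary", "time_spend_company"]
top2 = filter_topn_sketch(employees, n=2, sort_by=cols)
rest = filter_topn_sketch(employees, n=2, sort_by=cols, exclude=True)
lowest = filter_topn_sketch(employees, n=1, sort_by="salary", ascending=True)
```

In this sketch, top2 holds the two highest-paid rows (with time_spend_company breaking the salary tie), rest holds everything except those two, and lowest holds the single lowest-paid row.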