
Filter topn

Sort a dataset by selected columns and pick the first N rows (or exclude them).

Usage

The following are the step's expected inputs and outputs and their specific types.

filter_topn(ds_in: dataset, {"param": value}) -> (ds_out: dataset)

where the object {"param": value} is optional in most cases and, if present, may contain any of the parameters described in the Parameters section below.

Inputs


ds_in: dataset

An input dataset to filter.

Outputs


ds_out: dataset

A new dataset containing the same columns as the input dataset but only the rows selected by the filter: the first n rows after sorting, or all remaining rows if exclude is true.

Parameters


n: integer

How many of the leading rows to keep after sorting (or to drop, if exclude is true).


sort_by: array[string] | string

One or more columns to sort by before picking the first n rows. May be a column name or a list of column names.

Example parameter values:

  • "salary"
  • ["salary", "time_spend_company", "last_evaluation"]
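
To illustrate how a single name and a list of names could be treated alike, here is a minimal Python sketch. It is not the step's actual implementation: the helper names (as_column_list, sort_key), the sample rows, and the assumption that several columns are compared in order (earlier columns take priority, later ones break ties) are all hypothetical.

```python
def as_column_list(sort_by):
    """Accept either a single column name or a list of names (assumed behavior)."""
    if isinstance(sort_by, str):
        return [sort_by]
    return list(sort_by)


def sort_key(row, columns):
    """Build a tuple key so earlier columns take priority over later ones."""
    return tuple(row[col] for col in columns)


# Made-up rows for illustration only.
rows = [
    {"name": "a", "salary": 3, "time_spend_company": 2},
    {"name": "b", "salary": 5, "time_spend_company": 1},
    {"name": "c", "salary": 5, "time_spend_company": 4},
]

columns = as_column_list(["salary", "time_spend_company"])
# reverse=True mirrors the documented default of descending order.
ranked = sorted(rows, key=lambda r: sort_key(r, columns), reverse=True)
names = [r["name"] for r in ranked]
```

Rows "b" and "c" tie on salary, so in this sketch the second column decides their relative order.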

exclude: boolean = False

If true, the first n rows after sorting will be excluded from the resulting dataset.


ascending: boolean = False

Whether to sort in ascending order rather than descending. With the default (descending), the rows with the largest values come first.
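
The interaction of n, sort_by, exclude and ascending can be sketched in plain Python as below. This is an approximation under stated assumptions, not the step's real code: the function name filter_topn_sketch and the sample data are invented, rows are modeled as dictionaries, and the sketch assumes the result is returned in sorted order and that ties keep their input order.

```python
def filter_topn_sketch(rows, n, sort_by, exclude=False, ascending=False):
    """Hypothetical emulation of the step on a list of dict rows."""
    # sort_by may be one column name or a list of column names.
    columns = [sort_by] if isinstance(sort_by, str) else list(sort_by)
    ordered = sorted(
        rows,
        key=lambda r: tuple(r[c] for c in columns),
        reverse=not ascending,  # descending unless ascending=True
    )
    # Keep the first n rows, or everything after them when excluding.
    return ordered[n:] if exclude else ordered[:n]


# Made-up rows for illustration only.
rows = [
    {"id": 1, "salary": 40},
    {"id": 2, "salary": 90},
    {"id": 3, "salary": 65},
    {"id": 4, "salary": 20},
]

top2 = filter_topn_sketch(rows, n=2, sort_by="salary")
rest = filter_topn_sketch(rows, n=2, sort_by="salary", exclude=True)
lowest = filter_topn_sketch(rows, n=1, sort_by="salary", ascending=True)
```

In this sketch, top2 holds the two highest salaries, rest holds the rows left over once those two are removed, and lowest holds the single smallest salary.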