# filter_topn

> Sort a dataset by selected columns and pick the first N rows (or exclude them). 

## Usage

The following examples show how the step can be used in a recipe.

<Accordion title="Examples" icon="code" defaultOpen="true">
  <Tabs>
    <Tab title="Example 1">
      Keep the 10 rows with the highest salary (sorting is descending by default)

      ```stan theme={null}
      filter_topn(ds, {"n": 10, "sort_by": "salary"}) -> (ds_filtered)
      ```
    </Tab>

    <Tab title="Example 2">
      Exclude the 5 rows with the earliest dates, i.e. the first 5 rows when sorting by date in ascending order

      ```stan theme={null}
      filter_topn(ds, {"n": 5, "sort_by": "date", "ascending": true, "exclude": true}) -> (ds_filtered)
      ```
    </Tab>

    <Tab title="Signature">
      General syntax for using the step in a recipe. It shows the inputs the step expects and the outputs it produces. For further details see the sections below.

      ```stan theme={null}
      filter_topn(ds_in: dataset, {
          "param": value,
          ...
      }) -> (ds_out: dataset)
      ```
    </Tab>
  </Tabs>
</Accordion>

## Inputs & Outputs

The following are the inputs expected by the step and the outputs it produces. These are generally
columns (`ds.first_name`), datasets (`ds` or `ds[["first_name", "last_name"]]`) or models (referenced
by name e.g. `"churn-clf"`).

<Accordion title="Inputs" icon="right-to-bracket">
  <ParamField path="ds_in" type="dataset" required>
    An input dataset to filter.
  </ParamField>
</Accordion>

<Accordion title="Outputs" icon="right-from-bracket">
  <ParamField path="ds_out" type="dataset" required>
    A new dataset containing the same columns as the input dataset but only those rows passing the filter condition.
  </ParamField>
</Accordion>

## Configuration

The following parameters configure the behaviour of the step. Include them in
a JSON object passed as the last "input" to the step, i.e. `step(..., {"param": "value", ...}) -> (output)`.

<Accordion title="Parameters" defaultOpen="true" icon="sliders">
  <ParamField path="n" type="integer" required>
    How many of the leading rows to keep after sorting (or to drop, if `exclude` is `true`).
  </ParamField>

  <ParamField path="sort_by" type="[array, string]" required>
    One or more columns to sort by before picking the first n rows.
    May be a column name or a list of column names.

    <Accordion title="Options">
      <Tabs>
        <Tab title="array">
          <ParamField path="{_}" type="array[string]">
            A list of column names to sort by.

            <Accordion title="Array items">
              <ParamField path="Item" type="string (ds_in.column)">
                The name of a column in the input dataset.
              </ParamField>
            </Accordion>
          </ParamField>
        </Tab>

        <Tab title="string">
          <ParamField path="{_}" type="string (ds_in.column)">
            The name of a single column in the input dataset to sort by.
          </ParamField>
        </Tab>
      </Tabs>
    </Accordion>

    <Accordion title="Examples">
      * `"salary"`
      * `["salary", "time_spend_company", "last_evaluation"]`
    </Accordion>
  </ParamField>

  <ParamField path="exclude" type="boolean" default="false">
    If `true`, the first n rows after sorting will be *excluded* from the resulting dataset.
  </ParamField>

  <ParamField path="ascending" type="boolean" default="false">
    If `true`, sort in ascending order. By default rows are sorted in descending order, so the largest values come first.
  </ParamField>
</Accordion>
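
To make the interaction between `n`, `sort_by`, `exclude` and `ascending` concrete, the following is a minimal plain-Python sketch of the behaviour described above. It is not Graphext's implementation: the function operating on a list of dictionaries and the `employees` sample data are hypothetical stand-ins for the step and a dataset. The sketch also assumes that rows come back in sorted order and that ties keep their input order; how the actual step orders its output or handles ties and missing values is not specified on this page.

```python
def filter_topn(rows, n, sort_by, exclude=False, ascending=False):
    """Illustrative stand-in for the step, operating on a list of dicts."""
    # `sort_by` may be a single column name or a list of column names.
    columns = [sort_by] if isinstance(sort_by, str) else list(sort_by)

    # Sort descending unless `ascending` is true (mirrors the default of false).
    ordered = sorted(
        rows,
        key=lambda row: tuple(row[col] for col in columns),
        reverse=not ascending,
    )

    # Keep the first n rows, or drop them when `exclude` is true.
    return ordered[n:] if exclude else ordered[:n]


# Hypothetical sample data, for illustration only.
employees = [
    {"name": "Ana", "salary": 50},
    {"name": "Ben", "salary": 70},
    {"name": "Cai", "salary": 60},
    {"name": "Dee", "salary": 40},
]

top2 = filter_topn(employees, n=2, sort_by="salary")
rest = filter_topn(employees, n=2, sort_by="salary", exclude=True)
lowest = filter_topn(employees, n=1, sort_by=["salary"], ascending=True)

print([row["name"] for row in top2])    # the two highest salaries
print([row["name"] for row in rest])    # everything except the two highest
print([row["name"] for row in lowest])  # the single lowest salary
```

Note that `exclude` does not reverse the sort: it drops the same leading rows that would otherwise be kept, so `top2` and `rest` together always contain every input row exactly once.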
