
# configure_discarded_categories

> Configures a minimum number of rows in a category below which the category will be hidden from the variable's filter view. 

Hides categories with a low number of rows from the filter panel. The `thresholds` parameter defines the minimum count or percentage below which a category is hidden. This is useful for decluttering filters when there are many infrequent categories.

This is a UI configuration step that affects how the project is displayed in Graphext. It applies to the dataset referenced in its inputs. If your recipe produces multiple datasets (e.g. a filtered dataset that is then passed to `create_project` alongside the original), add a separate configure step for each dataset you want to configure.
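
As a minimal sketch of the multi-dataset case, assume the recipe holds the original dataset `ds` and a second, filtered dataset called `ds_filtered`. That name is hypothetical, and the step producing it is omitted. Each dataset then gets its own configure step; the column and threshold values are borrowed from the usage example further down.

```stan theme={null}
configure_discarded_categories(ds.cluster, { "thresholds": [{ "target": "EVERYTHING", "reference": "PERCENTAGE", "value": 20 }] })
configure_discarded_categories(ds_filtered.cluster, { "thresholds": [{ "target": "EVERYTHING", "reference": "PERCENTAGE", "value": 20 }] })
```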

## Usage

The following example shows how the step can be used in a recipe.

<Accordion title="Examples" icon="code" defaultOpen="true">
  <Tabs>
    <Tab title="Example 1">
      ```stan theme={null}
      configure_discarded_categories(ds.cluster, { "thresholds": [{ "target": "EVERYTHING", "reference": "PERCENTAGE", "value": 20 }] })
      ```
    </Tab>

    <Tab title="Signature">
      General syntax for using the step in a recipe. Shows the inputs the step expects and the outputs it produces. For further details see the sections below.

      ```stan theme={null}
      configure_discarded_categories(column: category|list[category]|text, {
          "param": value,
          ...
      })
      ```
    </Tab>
  </Tabs>
</Accordion>

## Inputs & Outputs

The following are the inputs expected by the step and the outputs it produces. These are generally
columns (`ds.first_name`), datasets (`ds` or `ds[["first_name", "last_name"]]`) or models (referenced
by name e.g. `"churn-clf"`).

<Accordion title="Inputs" icon="right-to-bracket">
  <ParamField path="column" type="column[category|list[category]|text]" required>
    The column to configure.
  </ParamField>
</Accordion>

<Accordion title="Outputs" icon="right-from-bracket" />

## Configuration

The following parameters can be used to configure the behaviour of the step by including them in
a JSON object passed as the last "input" to the step, i.e. `step(..., {"param": "value", ...}) -> (output)`.

<Accordion title="Parameters" defaultOpen="true" icon="sliders">
  <ParamField path="thresholds" type="[array, array, array, array]" required>
    A list of threshold configurations.
    A categorical column can have two kinds of thresholds determining whether specific categories will be
    hidden from its view in the UI: a minimum number of rows in the current *selection* below which a category
    will be hidden, or a minimum number of rows in the *whole dataset* (*everything*).

    The `thresholds` parameter should be a list containing 1 or 2 objects: the configuration of a *selection*
    threshold, and/or the configuration of a threshold for *everything*.
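
    As an illustrative sketch of the two-object form, the call below sets one threshold for *everything* and one for the *selection*. The column `ds.cluster` is taken from the usage example. The values `10` and `5` are arbitrary assumptions, and the percentage is assumed to be on a 0-100 scale, as the `20` in the usage example suggests.

    ```stan theme={null}
    configure_discarded_categories(ds.cluster, {
        "thresholds": [
            { "target": "EVERYTHING", "reference": "ABSOLUTE", "value": 10 },
            { "target": "SELECTION", "reference": "PERCENTAGE", "value": 5 }
        ]
    })
    ```

    Read against the option descriptions below, the first object hides categories with fewer than 10 rows in the whole dataset, and the second hides categories that make up less than 5% of the rows in the current selection.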

    <Accordion title="Options">
      <Tabs>
        <Tab title="array">
          <ParamField path="{_}" type="array">
            A list with a single threshold object targeting the *whole dataset*.

            <Accordion title="Array items">
              <ParamField path="Item 0" type="object">
                Configure categories to be discarded (hidden) in terms of their occurrence in the *whole dataset*.
                Categories with a number (or percentage) of rows in the *whole dataset* less than `value` will be discarded (hidden from the variable's filter view).

                <Accordion title="Properties">
                  <ParamField path="target" type="string" default="EVERYTHING">
                    Whether to apply the threshold to the current selection of rows or all rows in the dataset.
                  </ParamField>

                  <ParamField path="reference" type="string">
                    Whether to interpret the threshold value as an absolute (count) or percentage of rows.

                    Values must be one of the following:

                    * `ABSOLUTE`
                    * `PERCENTAGE`
                  </ParamField>

                  <ParamField path="value" type="number">
                    Categories less frequent than this value will be discarded (hidden).
                  </ParamField>
                </Accordion>
              </ParamField>
            </Accordion>
          </ParamField>
        </Tab>

        <Tab title="array">
          <ParamField path="{_}" type="array">
            A list with a single threshold object targeting the *current selection*.

            <Accordion title="Array items">
              <ParamField path="Item 0" type="object">
                Configure categories to be discarded (hidden) in terms of their occurrence in the *current selection*.
                Categories with a number (or percentage) of rows in the current selection less than `value` will be discarded (hidden from the variable's filter view).

                <Accordion title="Properties">
                  <ParamField path="target" type="string" default="SELECTION">
                    Whether to apply the threshold to the current selection of rows or all rows in the dataset.
                  </ParamField>

                  <ParamField path="reference" type="string">
                    Whether to interpret the threshold value as an absolute (count) or percentage of rows.

                    Values must be one of the following:

                    * `ABSOLUTE`
                    * `PERCENTAGE`
                  </ParamField>

                  <ParamField path="value" type="number">
                    Categories less frequent than this value will be discarded (hidden).
                  </ParamField>
                </Accordion>
              </ParamField>
            </Accordion>
          </ParamField>
        </Tab>

        <Tab title="array">
          <ParamField path="{_}" type="array">
            A list with two threshold objects: the *whole dataset* first, then the *current selection*.

            <Accordion title="Array items">
              <ParamField path="Item 0" type="object">
                Configure categories to be discarded (hidden) in terms of their occurrence in the *whole dataset*.
                Categories with a number (or percentage) of rows in the *whole dataset* less than `value` will be discarded (hidden from the variable's filter view).

                <Accordion title="Properties">
                  <ParamField path="target" type="string" default="EVERYTHING">
                    Whether to apply the threshold to the current selection of rows or all rows in the dataset.
                  </ParamField>

                  <ParamField path="reference" type="string">
                    Whether to interpret the threshold value as an absolute (count) or percentage of rows.

                    Values must be one of the following:

                    * `ABSOLUTE`
                    * `PERCENTAGE`
                  </ParamField>

                  <ParamField path="value" type="number">
                    Categories less frequent than this value will be discarded (hidden).
                  </ParamField>
                </Accordion>
              </ParamField>

              <ParamField path="Item 1" type="object">
                Configure categories to be discarded (hidden) in terms of their occurrence in the *current selection*.
                Categories with a number (or percentage) of rows in the current selection less than `value` will be discarded (hidden from the variable's filter view).

                <Accordion title="Properties">
                  <ParamField path="target" type="string" default="SELECTION">
                    Whether to apply the threshold to the current selection of rows or all rows in the dataset.
                  </ParamField>

                  <ParamField path="reference" type="string">
                    Whether to interpret the threshold value as an absolute (count) or percentage of rows.

                    Values must be one of the following:

                    * `ABSOLUTE`
                    * `PERCENTAGE`
                  </ParamField>

                  <ParamField path="value" type="number">
                    Categories less frequent than this value will be discarded (hidden).
                  </ParamField>
                </Accordion>
              </ParamField>
            </Accordion>
          </ParamField>
        </Tab>

        <Tab title="array">
          <ParamField path="{_}" type="array">
            A list with two threshold objects: the *current selection* first, then the *whole dataset*.

            <Accordion title="Array items">
              <ParamField path="Item 0" type="object">
                Configure categories to be discarded (hidden) in terms of their occurrence in the *current selection*.
                Categories with a number (or percentage) of rows in the current selection less than `value` will be discarded (hidden from the variable's filter view).

                <Accordion title="Properties">
                  <ParamField path="target" type="string" default="SELECTION">
                    Whether to apply the threshold to the current selection of rows or all rows in the dataset.
                  </ParamField>

                  <ParamField path="reference" type="string">
                    Whether to interpret the threshold value as an absolute (count) or percentage of rows.

                    Values must be one of the following:

                    * `ABSOLUTE`
                    * `PERCENTAGE`
                  </ParamField>

                  <ParamField path="value" type="number">
                    Categories less frequent than this value will be discarded (hidden).
                  </ParamField>
                </Accordion>
              </ParamField>

              <ParamField path="Item 1" type="object">
                Configure categories to be discarded (hidden) in terms of their occurrence in the *whole dataset*.
                Categories with a number (or percentage) of rows in the *whole dataset* less than `value` will be discarded (hidden from the variable's filter view).

                <Accordion title="Properties">
                  <ParamField path="target" type="string" default="EVERYTHING">
                    Whether to apply the threshold to the current selection of rows or all rows in the dataset.
                  </ParamField>

                  <ParamField path="reference" type="string">
                    Whether to interpret the threshold value as an absolute (count) or percentage of rows.

                    Values must be one of the following:

                    * `ABSOLUTE`
                    * `PERCENTAGE`
                  </ParamField>

                  <ParamField path="value" type="number">
                    Categories less frequent than this value will be discarded (hidden).
                  </ParamField>
                </Accordion>
              </ParamField>
            </Accordion>
          </ParamField>
        </Tab>
      </Tabs>
    </Accordion>
  </ParamField>
</Accordion>
