Order categories

fast step 

(Re-)order the categories/levels of a categorical column.

If the transformation succeeds, the output is always an ordinal (ordered Category) column, even if the input was an unordered Category or not a Category at all. That is, re-ordering the levels is taken to mean that their order matters.

If the input column is already a Category:

  • If unordered (non-ordinal): levels will be ordered in the given order (converted to ordinal)
  • If ordered (ordinal): levels will be re-ordered only

In both cases, the specified categories must match the existing ones exactly: only re-ordering is allowed, not deleting or adding categories.

If the column is not already a Category:

  • If the column contains lists, the new column will be identical to input (re-ordering not supported).
  • Otherwise, if the parameter force_categorical is true, the column will be converted to a Category and ordered as specified. If it is false, the new column will be identical to the input (no ordering performed).
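
The rules above can be approximated outside the recipe editor with pandas. The following is only a sketch under stated assumptions, not the step's actual implementation: the function name order_categories_sketch is hypothetical, raising an error on mismatched categories is an assumption (the documentation only says the categories have to match), and using the unique values in order of appearance when no list is given is likewise an assumption.

```python
import pandas as pd


def order_categories_sketch(col, categories=None, force_categorical=True):
    """Hypothetical pandas analogue of the rules above; not the step's real code."""
    if isinstance(col.dtype, pd.CategoricalDtype):
        existing = list(col.cat.categories)
        wanted = existing if categories is None else list(categories)
        # Only re-ordering is allowed: same levels, no additions or deletions.
        if len(wanted) != len(set(wanted)) or set(wanted) != set(existing):
            # Assumption: a mismatch is treated as an error in this sketch.
            raise ValueError("categories must match the existing levels")
        # Works for both unordered and ordered input; the result is always ordered.
        return col.cat.reorder_categories(wanted, ordered=True)

    # Columns containing lists are passed through untouched.
    if col.map(lambda v: isinstance(v, list)).any():
        return col

    # Non-categorical input is only converted when force_categorical is true.
    if not force_categorical:
        return col

    # Assumption: with no explicit list, use unique values in order of appearance.
    levels = list(col.dropna().unique()) if categories is None else list(categories)
    return col.astype(pd.CategoricalDtype(categories=levels, ordered=True))
```

Calling order_categories_sketch(pd.Series(["S", "M", "L"], dtype="category"), ["L", "M", "S"]) returns an ordered categorical whose levels run from "L" to "S", while the values themselves are unchanged.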

Usage


The following are the step's expected inputs and outputs and their specific types.

Step signature
order_categories(input: column, {"param": value}) -> (output: column)

where the object {"param": value} is optional in most cases and, if present, may contain any of the parameters described in the Parameters section below.

Example

To arrange the levels "small", "medium", "large" ("S", "M", "L") in reverse order:

Example call (in recipe editor)
order_categories(ds.cat_col, {"categories": ["L", "M", "S"]}) -> (ds.ordinal)
More examples

Converting a text column containing the strings "low", "medium" and "high" to an ordinal column:

Example call (in recipe editor)
order_categories(ds.text, {
    "categories": ["low", "medium", "high"],
    "force_categorical": true
}) -> (ds.ordinal)
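
To illustrate what an ordinal result provides, here is a rough pandas equivalent of this text-to-ordinal conversion. It is a sketch of the concept only; the sample values are made up, and pandas is used as a stand-in rather than as the engine behind the step.

```python
import pandas as pd

# Made-up sample data standing in for ds.text.
text = pd.Series(["medium", "low", "high", "low"])

# Declare the levels in the desired order and mark them as ordered.
dtype = pd.CategoricalDtype(categories=["low", "medium", "high"], ordered=True)
ordinal = text.astype(dtype)

# Sorting follows the declared level order rather than alphabetical order.
sorted_values = ordinal.sort_values().tolist()
# → ["low", "low", "medium", "high"]

# Ordered categories also support comparisons against a level.
above_low = (ordinal > "low").tolist()
# → [True, False, True, False]
```

With an unordered Category, the comparison against "low" would not be defined, which is why the step always produces an ordinal column.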

Inputs


input: column

A column to re-order or convert to ordinal (ordered Category).

Outputs


output: column

An ordinal column.

Parameters


categories: array | null

List of categories in the desired order. If null, the unique values already present in the input column will be used.


force_categorical: boolean = True

Whether to convert non-categorical input columns to Category before ordering. If false, such columns are returned unchanged.