# equal

> Check the row-wise equality of all input columns. 

For each row, checks whether all values in that row are equal. The result is a boolean column
indicating equality for each row as `true` or `false`.

Note that if the types of the input columns are not compatible, the result will be `false` for all
rows. Compatibility here means that the input columns must be:

* all numeric or boolean (the latter being interpreted as 0.0/1.0), OR
* all string-like (categorical or text), OR
* all list-like

By default, missing values (NaNs) in the same location are considered equal in this step. See the
parameter `keep_nans` below to control how the presence of NaNs affects the result.

When comparing numeric columns, the parameters `rel_tol` and `abs_tol` can be used to check
for approximate equality. The desired tolerance (precision) can be expressed as a proportion
of a reference value, as an absolute maximum difference, or both. More specifically,
the condition used to check for numeric equality between values `a` and `b` is:

`absolute(a - b) <= (rel_tol * absolute(b) + abs_tol)`.

Also see the parameter descriptions below, or the corresponding
[numpy documentation](https://numpy.org/doc/stable/reference/generated/numpy.isclose.html)
for further details.
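The tolerance condition above can be sketched in plain Python. This is only an illustration of the documented formula, not the step's actual implementation; the helper name `is_close` is hypothetical.

```python
def is_close(a: float, b: float, rel_tol: float = 0.0, abs_tol: float = 0.0) -> bool:
    """Hypothetical helper mirroring the documented condition:
    absolute(a - b) <= rel_tol * absolute(b) + abs_tol
    """
    return abs(a - b) <= rel_tol * abs(b) + abs_tol


# With both tolerances at 0 (the defaults), the comparison is exact.
exact = is_close(0.1 + 0.2, 0.3)                    # False: 0.1 + 0.2 is not exactly 0.3
approx = is_close(0.1 + 0.2, 0.3, abs_tol=1e-9)     # True: the tiny difference is tolerated

# The relative term is scaled by b only, so the order of arguments can matter.
forward = is_close(100.0, 111.0, rel_tol=0.1)       # True: 11 <= 0.1 * 111
backward = is_close(111.0, 100.0, rel_tol=0.1)      # False: 11 > 0.1 * 100
```

Because `rel_tol` is multiplied by `absolute(b)` alone, the formula is not symmetric in `a` and `b`, which the last two calls illustrate.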

## Usage

The following examples show how the step can be used in a recipe.

<Accordion title="Examples" icon="code" defaultOpen="true">
  <Tabs>
    <Tab title="Example 1">
      To check exact equality of numeric columns `num1` and `num2`:

      ```stan theme={null}
      equal(ds.num1, ds.num2) -> (ds.num1_num2_eq)
      ```
    </Tab>

    <Tab title="Example 2">
      To check *approximate* equality of numeric columns `num1` and `num2`, treating differences of less than 0.001 as "equal", use the `abs_tol` parameter. Note that because numbers are stored with limited precision, it is safer to use e.g. 0.0011 or even 0.002 to approximate equality to three decimals:

      ```stan theme={null}
      equal(ds.num1, ds.num2, {"abs_tol": 0.001}) -> (ds.aprox_eq)
      ```
    </Tab>

    <Tab title="Signature">
      General syntax for using the step in a recipe. Shows the inputs the step expects to receive and the outputs it will produce. For further details see the sections below.

      ```stan theme={null}
      equal(*columns: column, {
          "param": value,
          ...
      }) -> (result: boolean)
      ```
    </Tab>
  </Tabs>
</Accordion>
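The precision caveat mentioned in Example 2 can be reproduced in plain Python. The sketch below uses a tolerance of 0.1 rather than 0.001 and a hypothetical helper `within_abs_tol`; it illustrates floating-point behaviour in general and is not taken from the step's implementation.

```python
def within_abs_tol(a: float, b: float, abs_tol: float) -> bool:
    """Hypothetical helper: absolute-tolerance check only."""
    return abs(a - b) <= abs_tol


# 1.1 - 1.0 is stored as 0.10000000000000009, slightly more than 0.1,
# so a tolerance of exactly 0.1 rejects a difference that looks like 0.1.
nominal = within_abs_tol(1.1, 1.0, abs_tol=0.1)    # False
padded = within_abs_tol(1.1, 1.0, abs_tol=0.11)    # True
```

Padding the tolerance slightly, as the example suggests, avoids rejecting values that differ by nominally the tolerance itself.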

## Inputs & Outputs

The following are the inputs expected by the step and the outputs it produces. These are generally
columns (`ds.first_name`), datasets (`ds` or `ds[["first_name", "last_name"]]`) or models (referenced
by name e.g. `"churn-clf"`).

<Accordion title="Inputs" icon="right-to-bracket">
  <ParamField path="*columns" type="column">
    One or more columns to check for equality.
  </ParamField>
</Accordion>

<Accordion title="Outputs" icon="right-from-bracket">
  <ParamField path="result" type="column[boolean]" required>
    Output column indicating row-wise equality of the input columns.
  </ParamField>
</Accordion>

## Configuration

The following parameters can be used to configure the behaviour of the step by including them in
a json object as the last "input" to the step, i.e. `step(..., {"param": "value", ...}) -> (output)`.

<Accordion title="Parameters" defaultOpen="true" icon="sliders">
  <ParamField path="abs_tol" type="number" default="0">
    Absolute tolerance.
    The absolute (positive) difference of two numbers must be smaller than or equal to this value
    for them to be considered equal.
  </ParamField>

  <ParamField path="rel_tol" type="number" default="0">
    Relative tolerance.
    The absolute (positive) difference of two numbers `a` and `b` must be smaller than or equal
    to `rel_tol * absolute(b)` for them to be considered equal.
  </ParamField>

  <ParamField path="keep_nans" type="[boolean, string]" default="false">
    Whether to maintain missing values (NaNs) in the result.
    The possible values are `{true, false, "any", "all"}`:

    * If `false`: use default NaN comparison. I.e. `NaN == value => false` but `NaN == NaN => true`.
      Note that this means the result will never contain any NaNs.

    * If `true` or `"any"`: the result will be NaN if *any* value in a row is NaN.

    * If `"all"`: the result will be NaN if *all* values in a row are NaN.

    Values must be one of the following:

    * `"any"`
    * `"all"`
    * `true`
    * `false`
  </ParamField>
</Accordion>
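The `keep_nans` options can be sketched for a single row in plain Python. This is a minimal illustration under stated assumptions, not the step's implementation: the function `row_equal` is hypothetical, `None` stands in for a missing (NaN) result, and rows not covered by the `"any"` or `"all"` rule are assumed to fall back to the default comparison.

```python
import math
from typing import Optional, Sequence, Union


def row_equal(row: Sequence[float], keep_nans: Union[bool, str] = False) -> Optional[bool]:
    """Hypothetical per-row check; assumes a non-empty row of floats."""
    nan_flags = [math.isnan(v) for v in row]

    # true / "any": a single NaN makes the result missing.
    if keep_nans in (True, "any") and any(nan_flags):
        return None
    # "all": the result is missing only if every value is NaN.
    if keep_nans == "all" and all(nan_flags):
        return None

    # Default comparison (assumed fallback): NaN equals NaN, NaN never equals a value.
    first, first_is_nan = row[0], nan_flags[0]
    return all(
        (is_nan and first_is_nan) or (not is_nan and not first_is_nan and v == first)
        for v, is_nan in zip(row, nan_flags)
    )


nan = float("nan")
default_mixed = row_equal([1.0, nan])          # False
default_both = row_equal([nan, nan])           # True
any_mixed = row_equal([1.0, nan], "any")       # None (missing)
all_both = row_equal([nan, nan], "all")        # None (missing)
```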
