equal

For each row, checks whether all values in that row are equal. The result is a boolean column indicating equality for each row as true or false.
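As a rough illustration of the row-wise check, here is a minimal pure-Python sketch. The function name `rows_all_equal` and the representation of columns as plain lists are assumptions made for this example, not part of the step itself.

```python
def rows_all_equal(*columns):
    """Return one boolean per row: True when every value in that row is equal.

    Hypothetical helper for illustration only; columns are plain Python lists.
    """
    # zip(*columns) walks the columns in parallel, yielding one row at a time.
    return [all(value == row[0] for value in row) for row in zip(*columns)]


first = [1, 2, 3]
second = [1, 5, 3]
third = [1, 2, 3]

# Only the middle row contains differing values.
print(rows_all_equal(first, second, third))  # [True, False, True]
```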

Note that if the types of the input columns are not compatible, the result will be False for all rows. Compatible here means that the input columns must be one of the following:

all numeric or boolean (booleans being interpreted as 0.0/1.0), OR
all string-like (categorical or text), OR
all list-like.
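The compatibility rule can be pictured with the following sketch. It is only an approximation under stated assumptions: the helpers `column_kind` and `compatible` are hypothetical, columns are plain Python lists, and the way the step actually infers column types may differ.

```python
def column_kind(values):
    """Classify a column as 'numeric', 'string' or 'list'; None if mixed or unknown.

    Hypothetical helper: booleans are grouped with numbers, mirroring the
    0.0/1.0 interpretation described above.
    """
    kinds = set()
    for value in values:
        if value is None:
            continue  # missing entries do not determine the kind
        if isinstance(value, (bool, int, float)):
            kinds.add("numeric")
        elif isinstance(value, str):
            kinds.add("string")
        elif isinstance(value, (list, tuple)):
            kinds.add("list")
        else:
            kinds.add("other")
    # A column has a usable kind only if all its values agree on one.
    if len(kinds) == 1 and "other" not in kinds:
        return kinds.pop()
    return None


def compatible(*columns):
    """True when all columns share the same single kind."""
    kinds = {column_kind(column) for column in columns}
    return len(kinds) == 1 and None not in kinds


print(compatible([1, 2.5], [True, False]))   # True: numeric and boolean mix
print(compatible([1, 2], ["a", "b"]))        # False: numeric vs. string
print(compatible([[1], [2]], [[1], [3]]))    # True: both list-like
```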

By default, missing values (NaNs) in the same location are considered equal in this step. However, check the parameter keep_nans below to control how the presence of NaNs affects the result.
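The default treatment of missing values can be sketched as follows. This covers only the default described above (NaNs in the same location count as equal); it does not model keep_nans, and the helper names are assumptions made for the example.

```python
import math


def is_nan(value):
    """True only for float NaN values."""
    return isinstance(value, float) and math.isnan(value)


def values_equal(a, b):
    """Equality in which two NaNs compare as equal (hypothetical helper)."""
    # In plain Python, NaN != NaN, so the both-missing case is handled explicitly.
    if is_nan(a) and is_nan(b):
        return True
    return a == b


def rows_equal_nan_aware(*columns):
    """Row-wise equality over plain-list columns, treating aligned NaNs as equal."""
    return [all(values_equal(row[0], v) for v in row) for row in zip(*columns)]


nan = float("nan")
left = [1.0, nan, 3.0]
right = [1.0, nan, nan]

# Row 2 has NaN in both columns (equal); row 3 has NaN in only one (not equal).
print(rows_equal_nan_aware(left, right))  # [True, True, False]
```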

Also, when performing numeric comparisons, the parameters rel_tol and abs_tol can be used to check for approximate equality. The desired tolerance (precision) can be expressed as a proportion of a reference value (rel_tol), as an absolute maximum difference (abs_tol), or both. More specifically, the equation used to check for numeric equality between values a and b is:

absolute(a - b) <= (rel_tol * absolute(b) + abs_tol).

See also the parameter descriptions below, or the corresponding NumPy documentation, for further details.
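The tolerance equation above can be tried out directly with a small sketch. The function name `approx_equal` and the default values of 0.0 for both tolerances are assumptions for illustration; the step's actual defaults are not stated here.

```python
def approx_equal(a, b, rel_tol=0.0, abs_tol=0.0):
    """Apply absolute(a - b) <= (rel_tol * absolute(b) + abs_tol).

    Hypothetical helper; the 0.0 defaults are assumed, not taken from the step.
    """
    return abs(a - b) <= (rel_tol * abs(b) + abs_tol)


# Absolute tolerance absorbs floating-point rounding noise.
print(approx_equal(0.1 + 0.2, 0.3))                 # False
print(approx_equal(0.1 + 0.2, 0.3, abs_tol=1e-9))   # True

# Relative tolerance is scaled by b, so the check is not symmetric.
print(approx_equal(100.0, 110.0, rel_tol=0.091))    # True:  10 <= 10.01
print(approx_equal(110.0, 100.0, rel_tol=0.091))    # False: 10 >  9.1
```

Because the relative term is multiplied by absolute(b), b acts as the reference value, and swapping the two arguments can change the outcome when the difference is close to the tolerance.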

Usage

The following examples show how the step can be used in a recipe.

Examples

Inputs & Outputs

The following are the inputs expected by the step and the outputs it produces. These are generally columns (ds.first_name), datasets (ds or ds[["first_name", "last_name"]]) or models (referenced by name e.g. "churn-clf").

Inputs

Outputs

Configuration

The following parameters can be used to configure the behaviour of the step by including them in a JSON object as the last “input” to the step, i.e. step(..., {"param": "value", ...}) -> (output).

Parameters
