For example, to multiply column A by two and add column Col B, you would write:

calculate(ds[["A", "Col B"]], {
    "formula": "2 * A + `Col B`"
}) -> (ds.result)

For details on valid operators and other syntax, see the pandas eval() documentation, in particular the sections on supported syntax and on eval() applied to DataFrames.
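As a rough illustration of the calculation described above (twice A plus Col B), the following sketch evaluates an equivalent expression directly with pandas. It assumes the step hands its formula to DataFrame.eval() more or less unchanged, which this page suggests but does not state; the sample data is made up.

```python
import pandas as pd

# Hypothetical sample data; only the column names mirror the example above.
df = pd.DataFrame({"A": [1, 2, 3], "Col B": [10, 20, 30]})

# Backticks let the expression refer to a column whose name contains a space.
result = df.eval("2 * A + `Col B`")

print(result.tolist())  # [12, 24, 36]
```

The expression yields a single Series with one value per row, which corresponds to the single output column the step produces.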

Note that assignments in the formula are not supported, since the result is always returned as a new column. Avoid formulas such as "c = a + b"; to return the result as a column, write "a + b" instead.
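The difference between the two kinds of formula can be seen in plain pandas. This is only a sketch of pandas behaviour, not of the step's internals: how the step itself reacts to an assignment is not described here, and the data is invented for the illustration.

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# A bare expression evaluates to a single Series (one new column of values).
as_expression = df.eval("a + b")

# An assignment makes pandas return a whole DataFrame copy with column "c" added.
as_assignment = df.eval("c = a + b")

print(type(as_expression).__name__)  # Series
print(type(as_assignment).__name__)  # DataFrame
```

Since the step expects one column as its result, only the bare expression fits that shape.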

If the name of an input column contains spaces, as in the example above, it must be enclosed in backticks.

Usage

The following example shows how the step can be used in a recipe.

Inputs & Outputs

The following are the inputs expected by the step and the outputs it produces. These are generally columns (ds.first_name), datasets (ds or ds[["first_name", "last_name"]]) or models (referenced by name e.g. "churn-clf").

Configuration

The following parameters can be used to configure the behaviour of the step by including them in a JSON object as the last “input” to the step, i.e. step(..., {"param": "value", ...}) -> (output).