Given a distribution name together with its scale and loc parameters, the step draws random samples from that distribution and adds them to the data. Optionally, the relative parameter applies a further scaling to the samples, either based on the standard deviation of the column or proportionally to each point, in order to preserve the underlying structure of the data. The computation is then carried out as follows:

new value = original value + relative scaling factor * random sample from the distribution.

If the relative parameter is not given or is set to abs, the relative scaling factor is 1, so the random sample is added to the original value unscaled.
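
The formula above can be sketched in Python with NumPy. This is only an illustration under stated assumptions, not the step's actual implementation: the function name add_noise, the argument name distribution, the supported distribution names, and the relative values "std" and "point" are hypothetical. Only scale, loc, relative and the value abs come from the description above. Whether the per-point factor uses the signed value or its magnitude is also an assumption (the magnitude is used here).

```python
import numpy as np


def add_noise(values, distribution="normal", loc=0.0, scale=1.0,
              relative="abs", seed=None):
    """Sketch of: new value = original value + relative factor * random sample."""
    values = np.asarray(values, dtype=float)
    rng = np.random.default_rng(seed)

    # Draw one random sample per data point from the named distribution.
    # The set of supported names here is an assumption.
    if distribution == "normal":
        samples = rng.normal(loc=loc, scale=scale, size=values.shape)
    elif distribution == "laplace":
        samples = rng.laplace(loc=loc, scale=scale, size=values.shape)
    else:
        raise ValueError(f"unsupported distribution: {distribution!r}")

    # Choose the relative scaling factor.
    if relative is None or relative == "abs":
        factor = 1.0                 # no extra scaling (documented default)
    elif relative == "std":          # hypothetical name: column standard deviation
        factor = values.std()
    elif relative == "point":        # hypothetical name: proportional to each point
        factor = np.abs(values)      # assumption: magnitude of each value
    else:
        raise ValueError(f"unsupported relative mode: {relative!r}")

    return values + factor * samples


if __name__ == "__main__":
    column = [10.0, 12.0, 11.0, 50.0]
    print(add_noise(column, scale=0.1, relative="std", seed=0))
```

With relative scaling, the same scale setting produces noise that grows with the spread of the column (or with the size of each value), which is what keeps the perturbed data structurally similar to the original.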

Usage

The following examples show how the step can be used in a recipe.

Inputs & Outputs

The following are the inputs expected by the step and the outputs it produces. These are generally columns (ds.first_name), datasets (ds or ds[["first_name", "last_name"]]) or models (referenced by name e.g. "churn-clf").

Configuration

The following parameters can be used to configure the behaviour of the step by including them in a JSON object as the last “input” to the step, i.e. step(..., {"param": "value", ...}) -> (output).