
Add noise to a column with numbers or lists of numbers.

Given a distribution name with `scale` and `loc` parameters, the step can optionally apply an additional scaling to the sampled noise through the `relative` parameter, either based on the standard deviation of the column or proportionally to each point. This helps preserve the underlying structure of the data. The computation is then carried out as follows:

``````
new value = original value + relative scaling factor * random sample from the distribution
``````

If this `relative` parameter is not given or is set to `abs`, then the relative scaling factor is 1.
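The formula above can be sketched in Python with NumPy. This is a hypothetical re-implementation for illustration only, not the step's actual code: the function name `add_noise_sketch` is made up, the `"std"` mode is assumed to use the population standard deviation, and a numeric `relative` is assumed to scale the noise in proportion to each value's magnitude.

```python
import numpy as np

def add_noise_sketch(values, relative="abs", loc=0.0, scale=1.0, seed=None):
    """Illustrative sketch of: new = original + factor * sample (normal noise only)."""
    values = np.asarray(values, dtype=float)
    rng = np.random.default_rng(seed)
    sample = rng.normal(loc=loc, scale=scale, size=values.shape)

    if relative == "abs":
        factor = 1.0                  # no extra scaling
    elif relative == "std":
        factor = values.std()         # assumption: population std of the column
    else:
        # Assumption: a number scales the noise proportionally to each point.
        factor = float(relative) * np.abs(values)

    return values + factor * sample

column = [10.0, 20.0, 30.0, 40.0]
noisy_abs = add_noise_sketch(column, relative="abs", seed=0)
noisy_std = add_noise_sketch(column, relative="std", seed=0)
```

With the same seed, the noise added in `"std"` mode is the `"abs"` noise multiplied by the column's standard deviation, so columns with a wider spread receive proportionally larger perturbations.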

## Usage

The following are the step's expected inputs and outputs and their specific types.

Step signature
``````
add_noise(input_column: number|list[number], {
    "param": value
}) -> (result: column)
``````

where the object `{"param": value}` is optional in most cases and if present may contain any of the parameters described in the corresponding section below.

#### Example

Add white noise to a column of embeddings

Example call (in recipe editor)
``````
add_noise(ds.embeddings) -> (ds.embeddings_with_noise)
``````
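As a rough picture of what this call does to a column of lists, the sketch below adds default white noise (normal, `loc` 0, `scale` 1) to each embedding. It assumes that every element of every list receives its own independent sample, which the step's documentation does not state explicitly; the data is made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical column of embeddings: one list of numbers per row.
embeddings = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]

# Assumption: one independent normal sample per list element.
noisy = [
    (np.asarray(vec) + rng.normal(loc=0.0, scale=1.0, size=len(vec))).tolist()
    for vec in embeddings
]
```

Each output row keeps the length of its input row, so the result can be used wherever the original embeddings were.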
More examples

Add std-dependent noise to a numerical column

Example call (in recipe editor)
``````
add_noise(ds.number, {"relative": "std"}) -> (ds.number_with_noise)
``````

## Inputs

input_column: column:number|list[number]

The original column.

## Outputs

result: column

The input column with noise applied.

## Parameters

relative: number | string = "abs"

Scaling mode. Use "abs" (the default) for no additional scaling, "std" to scale the noise by the column's standard deviation, or a number to scale the sampled noise.

String values must be one of: `"std"`, `"abs"`

dist_name: string = "normal"

Distribution that the noise is sampled from.

Must be one of: `"gumbel"`, `"laplace"`, `"logistic"`, `"normal"`
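The four allowed names match samplers available on NumPy's random `Generator`, each taking `loc`, `scale` and `size`. The sketch below shows one way to dispatch on the name; whether the step uses NumPy internally is an assumption, and `sample_noise` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(42)

# Each documented dist_name has a same-named sampler on NumPy's Generator.
samplers = {
    "gumbel": rng.gumbel,
    "laplace": rng.laplace,
    "logistic": rng.logistic,
    "normal": rng.normal,
}

def sample_noise(dist_name="normal", loc=0.0, scale=1.0, size=5):
    """Draw `size` samples from the named distribution (hypothetical helper)."""
    if dist_name not in samplers:
        raise ValueError(f"dist_name must be one of {sorted(samplers)}")
    return samplers[dist_name](loc=loc, scale=scale, size=size)

draws = {name: sample_noise(name) for name in samplers}
```

Laplace and logistic noise have heavier tails than normal noise, and Gumbel noise is asymmetric, so the choice affects how often large perturbations occur.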

loc: number = 0.0

Location ("centre") of the chosen distribution. For the normal distribution this is the mean.

scale: number = 1.0

Scale (spread or "width") of the distribution. For the normal distribution this is the standard deviation.
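Under the standard parameterization of these distributions (assumed here to be the one the step uses, as in NumPy), `scale` equals the standard deviation only for the normal distribution. The sketch below lists the conversion factors for the other three and checks one of them empirically; `theoretical_std` is a hypothetical helper.

```python
import math
import numpy as np

def theoretical_std(dist_name, scale):
    """Standard deviation implied by `scale` for each supported distribution."""
    factors = {
        "normal": 1.0,                        # std = scale
        "laplace": math.sqrt(2),              # std = scale * sqrt(2)
        "logistic": math.pi / math.sqrt(3),   # std = scale * pi / sqrt(3)
        "gumbel": math.pi / math.sqrt(6),     # std = scale * pi / sqrt(6)
    }
    return factors[dist_name] * scale

rng = np.random.default_rng(0)
empirical = rng.laplace(loc=0.0, scale=2.0, size=200_000).std()
expected = theoretical_std("laplace", 2.0)
```

If you need noise with a specific standard deviation from a non-normal distribution, divide the target by the corresponding factor to get the `scale` to pass.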

seed: number | null

Seed for the random number generator. Set it to get reproducible results.
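The effect of a seed can be illustrated with NumPy's `default_rng`. This is a generic sketch of seeded sampling, not the step's own implementation, and `noise` is a hypothetical helper.

```python
import numpy as np

def noise(seed=None, size=3):
    """Draw normal noise from a generator created with the given seed."""
    return np.random.default_rng(seed).normal(loc=0.0, scale=1.0, size=size)

# Same seed: identical samples on every run.
same_a = noise(seed=123)
same_b = noise(seed=123)

# No seed: fresh entropy each time, so the samples differ between calls.
unseeded_a = noise()
unseeded_b = noise()
```

Fixing the seed is useful when a recipe must produce the same noisy column each time it runs, for example when comparing downstream results.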