derive_column
Derive a new column with a custom JS script.
Supports any JS script using ECMAScript 2023 syntax. The script should include a return
statement that returns either a value or null / undefined.
The script has access to a row object that represents a row in the dataset, with the column names as keys.
Lists are supported both as inputs and outputs.
It’s important to handle null values correctly, either by checking for null explicitly (e.g. if (row.col != null) { ... }) or by using the JS optional chaining operator (?.).
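The two null-handling styles mentioned above can be sketched as follows. This is a minimal illustration, not the step's own code: the column name `name` is hypothetical, and the function wrappers exist only so the snippet runs outside the step, where presumably only the function body would be supplied as the script.

```javascript
// Hypothetical wrapper: inside the step, `row` would be provided for you.
function nameLengthExplicit(row) {
  // `!= null` is false for both null and undefined, so one check covers both.
  if (row.name != null) {
    return row.name.length;
  }
  return null;
}

// Same idea with optional chaining: the expression short-circuits to
// undefined when row.name is null or undefined.
function nameLengthChained(row) {
  return row.name?.length;
}
```

Both variants avoid a TypeError on missing values; the first returns null and the second returns undefined, which the description above treats as equally acceptable results.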
Usage
The following examples show how the step can be used in a recipe.
Examples
The following example joins all values in a list of numbers with ’|’ as separator:
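A minimal sketch of such a script, assuming a hypothetical list column named `numbers`. The function wrapper is only there to make the snippet runnable on its own; in the step, the body would presumably be the script.

```javascript
// Hypothetical wrapper around the script body; `numbers` is an assumed column name.
function joinNumbers(row) {
  // Return null when the list itself is missing.
  if (row.numbers == null) return null;
  // Array.prototype.join converts each element to text and inserts the separator.
  return row.numbers.join("|");
}
```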
The following example computes the sum for a list of numbers:
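A minimal sketch of such a script, again assuming a hypothetical list column named `numbers` and using a function wrapper purely for runnability. Treating null elements as 0 is a choice made for this illustration, not documented behaviour.

```javascript
// Hypothetical wrapper around the script body; `numbers` is an assumed column name.
function sumNumbers(row) {
  if (row.numbers == null) return null;
  // Null or undefined elements contribute 0 to the total (an assumption of this sketch).
  return row.numbers.reduce((total, n) => total + (n ?? 0), 0);
}
```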
The following example adds a prefix to a category:
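A minimal sketch of such a script, assuming a hypothetical column named `category` and an arbitrary prefix `cat_`. The function wrapper stands in for the script body.

```javascript
// Hypothetical wrapper around the script body; column name and prefix are assumptions.
function addPrefix(row) {
  // Leave missing categories as null instead of producing "cat_null".
  if (row.category == null) return null;
  return `cat_${row.category}`;
}
```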
The following example extracts a regex from a text:
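A minimal sketch of such a script, assuming a hypothetical text column named `text` and, as an arbitrary pattern, the first hashtag in the text. The function wrapper stands in for the script body.

```javascript
// Hypothetical wrapper around the script body; `text` and the pattern are assumptions.
function extractHashtag(row) {
  // Optional chaining skips the match entirely when the text is missing.
  const match = row.text?.match(/#(\w+)/);
  // match is null (no match) or undefined (no text); otherwise group 1 holds the tag.
  return match ? match[1] : null;
}
```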
The following example extracts the domain from a URL column:
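A minimal sketch of such a script, assuming a hypothetical column named `url` whose values are strings. It relies on the standard `URL` class; whether that global is available in the step's JS runtime is an assumption that should be verified.

```javascript
// Hypothetical wrapper around the script body; `url` is an assumed column name.
function extractDomain(row) {
  if (row.url == null) return null;
  try {
    // The URL constructor throws a TypeError on strings it cannot parse.
    return new URL(row.url).hostname;
  } catch {
    return null;
  }
}
```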
The following example extracts the year component from a Date column:
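A minimal sketch of such a script, assuming a hypothetical column named `date`. How Date values are passed to the script is not stated here, so the sketch accepts anything the `Date` constructor can parse (a Date object or an ISO 8601 string) and reads the year in UTC.

```javascript
// Hypothetical wrapper around the script body; `date` is an assumed column name.
function extractYear(row) {
  if (row.date == null) return null;
  const parsed = new Date(row.date);
  // An unparseable input yields an invalid Date whose timestamp is NaN.
  if (Number.isNaN(parsed.getTime())) return null;
  return parsed.getUTCFullYear();
}
```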
General syntax for using the step in a recipe. It shows the inputs the step expects to receive and the outputs it will produce. For further details, see the sections below.
Inputs & Outputs
The following are the inputs expected by the step and the outputs it produces. These are generally
columns (ds.first_name), datasets (ds or ds[["first_name", "last_name"]]) or models (referenced
by name, e.g. "churn-clf").
Inputs
An input dataset.
Outputs
The column resulting from evaluating the script.
Configuration
The following parameters can be used to configure the behaviour of the step by including them in
a JSON object as the last “input” to the step, i.e. step(..., {"param": "value", ...}) -> (output).
Parameters
The JavaScript code to execute.
Examples
- For example, to multiply by 2 every row with a value:
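A minimal sketch of such a script, assuming a hypothetical numeric column named `value`. The function wrapper only makes the snippet runnable on its own; the body is what would presumably be passed as the script.

```javascript
// Hypothetical wrapper around the script body; `value` is an assumed column name.
function doubleValue(row) {
  // Rows without a value stay null instead of becoming 0 or NaN.
  return row.value != null ? row.value * 2 : null;
}
```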
Output column type. Select the desired type using a shortened yet fully specified name.
Values must be one of the following:
boolean
category
date
number
text
url
list[number]
list[category]
list[url]
list[date]
list[boolean]