Each element of an exploded list results in a new row in the resulting dataset, i.e. the transformation creates a taller dataset than the original, but one with the same number of columns. Note: to unpack lists into separate columns, see the step unpack_list instead.
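
To make the row-multiplying behaviour concrete, here is a minimal plain-Python sketch of the idea. It is not the step's actual implementation: the helper `explode_rows` and the sample columns `id` and `tags` are hypothetical names chosen for illustration, and a dataset is modelled simply as a list of dicts.

```python
def explode_rows(rows, column):
    """Return one output row per element of the list stored in `column`."""
    out = []
    for row in rows:
        # Note: in this sketch a row with an empty list yields no output rows;
        # the real step may treat that case differently.
        for element in row[column]:
            new_row = dict(row)        # copy, so the other columns are repeated as-is
            new_row[column] = element  # replace the list with a single element
            out.append(new_row)
    return out


rows = [
    {"id": 1, "tags": ["a", "b"]},
    {"id": 2, "tags": ["c"]},
]
exploded = explode_rows(rows, "tags")
for row in exploded:
    print(row)
```

Two input rows become three output rows, while the set of columns stays the same.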

Usage

The following shows how the step can be used in a recipe.

Signature

General syntax for using the step in a recipe. Shows the inputs the step expects to receive and the outputs it will produce. For further details see the sections below.
explode(ds: dataset, {
    "param": value,
    ...
}) -> (ds_out: dataset)

Inputs & Outputs

The following are the inputs expected by the step and the outputs it produces. These are generally columns (ds.first_name), datasets (ds or ds[["first_name", "last_name"]]) or models (referenced by name e.g. "churn-clf").
ds
dataset
required
An input dataset having at least one column containing lists.
ds_out
dataset
required
A taller output dataset having no list columns.

Configuration

The following parameters can be used to configure the behaviour of the step by including them in a json object as the last “input” to the step, i.e. step(..., {"param": "value", ...}) -> (output).

Parameters

explode_by
[null, string, array]
The list of columns to explode. Any list columns in the input dataset not mentioned here will not be exploded, i.e. will remain list columns in the output dataset. If null, attempts to explode all columns containing lists.
just_keep
[null, string, array]
Columns to keep in the output dataset. Specifies which non-exploded columns should be included in the output dataset. If null (default), all non-exploded columns will be included. If a string, only that column will be included. If an array of strings, only those columns will be included. Note that columns specified in explode_by will always be included regardless of this parameter.
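
The interplay between explode_by and just_keep described above can be sketched in plain Python as follows. This is only an illustration of the documented rules, not the step's own code: the helper `resolve_columns` and the sample column names are hypothetical, and a row is modelled as a dict.

```python
def resolve_columns(row, explode_by=None, just_keep=None):
    """Return (columns to explode, other columns to keep) for a sample row."""
    # explode_by: null -> all list columns, string -> one column, array -> those columns
    if explode_by is None:
        to_explode = [c for c, v in row.items() if isinstance(v, list)]
    elif isinstance(explode_by, str):
        to_explode = [explode_by]
    else:
        to_explode = list(explode_by)

    # just_keep only filters the non-exploded columns;
    # exploded columns are always part of the output.
    others = [c for c in row if c not in to_explode]
    if just_keep is None:
        kept = others
    elif isinstance(just_keep, str):
        kept = [c for c in others if c == just_keep]
    else:
        kept = [c for c in others if c in just_keep]
    return to_explode, kept


row = {"id": 1, "name": "x", "tags": ["a", "b"], "scores": [1, 2]}
print(resolve_columns(row))                # all list columns exploded, the rest kept
print(resolve_columns(row, "tags"))        # "scores" stays behind as a list column
print(resolve_columns(row, "tags", "id"))  # only "id" is kept next to "tags"
```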
parallel
boolean
default:"true"
Whether to explode the selected columns together. If true, assumes that all columns to be exploded contain lists of the same length in any given row. In this case, a row containing two lists with 5 elements each will produce 5 rows with matching elements in the output dataset. If false, on the other hand, the columns are exploded iteratively, column by column. A row containing two lists with 5 elements each will therefore produce 25 rows in the output dataset: exploding the first column produces 5 rows, and when these rows are exploded again using the second column, each produces 5 rows in turn.
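
The difference between the two modes can be sketched in plain Python for a single row. This is an illustrative approximation only, not the step's implementation: the helpers `explode_parallel` and `explode_iterative` and the columns `letters` and `numbers` are hypothetical. Parallel exploding behaves like `zip`, while column-by-column exploding behaves like a Cartesian product.

```python
from itertools import product


def explode_parallel(row, columns):
    """Pair up the i-th elements of each list (like parallel = true)."""
    lists = [row[c] for c in columns]
    out = []
    # zip silently stops at the shortest list; the step itself assumes equal lengths.
    for values in zip(*lists):
        new_row = dict(row)
        new_row.update(zip(columns, values))
        out.append(new_row)
    return out


def explode_iterative(row, columns):
    """Explode column by column (like parallel = false): every combination appears."""
    lists = [row[c] for c in columns]
    out = []
    for values in product(*lists):
        new_row = dict(row)
        new_row.update(zip(columns, values))
        out.append(new_row)
    return out


row = {"id": 1, "letters": list("abcde"), "numbers": [1, 2, 3, 4, 5]}
together = explode_parallel(row, ["letters", "numbers"])
one_by_one = explode_iterative(row, ["letters", "numbers"])
print(len(together), len(one_by_one))
```

With two lists of 5 elements each, the first variant yields 5 rows and the second 25, matching the counts described above.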
query
string
The Graphext advanced query used to identify the rows to select prior to the grouping.