
# explode

> Explode (extract) items from column(s) of lists into separate rows. 

Each element of an exploded list results in a new *row* in the resulting dataset, i.e. the transformation creates a *taller* dataset than the original, but one with the same number of columns (unless `just_keep` is used to drop some of them).
Note: to unpack lists into separate *columns*, see the step `unpack_list` instead.
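
To make the row-wise behaviour concrete, the following is a minimal plain-Python sketch of exploding a single list column. It is an illustration only, not Graphext's implementation: the helper `explode_column` and the sample rows are hypothetical, and the handling of empty lists (here they simply produce no rows) is an assumption that this page does not specify.

```python
# Illustrative sketch only; `explode_column` is a hypothetical helper,
# not part of Graphext.
def explode_column(rows, column):
    """Return one output row per element of each row's list in `column`."""
    exploded = []
    for row in rows:
        for item in row[column]:
            new_row = dict(row)     # copy the other columns unchanged
            new_row[column] = item  # replace the list with a single element
            exploded.append(new_row)
    return exploded


rows = [
    {"title": "A", "tags": ["x", "y"]},
    {"title": "B", "tags": ["z"]},
]

exploded = explode_column(rows, "tags")
for row in exploded:
    print(row)
```

The two input rows become three output rows (two for `"A"`, one for `"B"`), while the set of columns stays the same.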

## Usage

The following examples show how the step can be used in a recipe.

<Accordion title="Examples" icon="code" defaultOpen="true">
  <Tabs>
    <Tab title="Example 1">
      Explode all list columns into separate rows

      ```stan theme={null}
      explode(ds) -> (ds_exploded)
      ```
    </Tab>

    <Tab title="Example 2">
      Explode only the tags column, keeping specific columns

      ```stan theme={null}
      explode(ds, {"explode_by": "tags", "just_keep": ["title", "tags"]}) -> (ds_exploded)
      ```
    </Tab>

    <Tab title="Signature">
      General syntax for using the step in a recipe. Shows the inputs the step expects and the outputs it produces. For further details see the sections below.

      ```stan theme={null}
      explode(ds: dataset, {
          "param": value,
          ...
      }) -> (ds_out: dataset)
      ```
    </Tab>
  </Tabs>
</Accordion>

## Inputs & Outputs

The following are the inputs expected by the step and the outputs it produces. These are generally
columns (`ds.first_name`), datasets (`ds` or `ds[["first_name", "last_name"]]`) or models (referenced
by name e.g. `"churn-clf"`).

<Accordion title="Inputs" icon="right-to-bracket">
  <ParamField path="ds" type="dataset" required>
    An input dataset having at least one column containing lists.
  </ParamField>
</Accordion>

<Accordion title="Outputs" icon="right-from-bracket">
  <ParamField path="ds_out" type="dataset" required>
    A taller output dataset in which the exploded columns no longer contain lists. When all list columns are exploded, the output has *no* list columns.
  </ParamField>
</Accordion>

## Configuration

The following parameters can be used to configure the behaviour of the step by including them in
a JSON object as the last "input" to the step, i.e. `step(..., {"param": "value", ...}) -> (output)`.

<Accordion title="Parameters" defaultOpen="true" icon="sliders">
  <ParamField path="explode_by" type="[null, string, array]">
    The column(s) to explode.
    Any list columns in the input dataset not mentioned here will not be exploded, i.e. they will
    remain list columns in the output dataset. If `null`, the step attempts to explode *all* columns
    containing lists.

    <Accordion title="Options">
      <Tabs>
        <Tab title="null">
          <ParamField path="{_}" type="null">
            null.
          </ParamField>
        </Tab>

        <Tab title="string">
          <ParamField path="{_}" type="string (ds.column)">
            string.
          </ParamField>
        </Tab>

        <Tab title="array">
          <ParamField path="{_}" type="array[string]">
            array.

            <Accordion title="Array items">
              <ParamField path="Item" type="string (ds.column)">
                Each item in array.
              </ParamField>
            </Accordion>
          </ParamField>
        </Tab>
      </Tabs>
    </Accordion>
  </ParamField>

  <ParamField path="just_keep" type="[null, string, array]">
    Columns to keep in the output dataset.
    Specifies which non-exploded columns should be included in the output dataset. If `null` (default),
    all non-exploded columns will be included. If a string, only that column will be included. If an array
    of strings, only those columns will be included. Note that columns specified in `explode_by` will always
    be included regardless of this parameter.

    <Accordion title="Options">
      <Tabs>
        <Tab title="null">
          <ParamField path="{_}" type="null">
            null.
          </ParamField>
        </Tab>

        <Tab title="string">
          <ParamField path="{_}" type="string (ds.column)">
            string.
          </ParamField>
        </Tab>

        <Tab title="array">
          <ParamField path="{_}" type="array[string]">
            array.

            <Accordion title="Array items">
              <ParamField path="Item" type="string (ds.column)">
                Each item in array.
              </ParamField>
            </Accordion>
          </ParamField>
        </Tab>
      </Tabs>
    </Accordion>
  </ParamField>

  <ParamField path="parallel" type="boolean" default="true">
    Whether to explode the selected columns together.
    If `true`, assumes that, in any given row, the lists in all columns to be exploded have the
    same length. In this case, if a row contains two lists with 5 elements each, this will
    produce *5* rows with matching elements in the output dataset.

    If `false`, on the other hand, the step explodes iteratively, column by column. A row containing
    two lists with 5 elements each will therefore produce *25* rows in the output dataset. That is,
    exploding the first column produces 5 rows, and when these rows are exploded again
    using the second column, each produces 5 rows in turn.
  </ParamField>

  <ParamField path="query" type="string">
    The *Graphext advanced query* used to identify the rows to select prior to exploding.
  </ParamField>
</Accordion>
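
The difference between `"parallel": true` and `"parallel": false` can be sketched in plain Python. This is a hedged illustration of the semantics described above, not Graphext's implementation: `explode_rows` is a hypothetical helper, column-by-column exploding is modelled as a per-row Cartesian product (which yields the same rows in the same order), and raising an error on unequal list lengths in parallel mode is an assumption, since this page does not say how that case is handled.

```python
from itertools import product


# Illustrative sketch only; `explode_rows` is a hypothetical helper,
# not part of Graphext.
def explode_rows(rows, columns, parallel=True):
    """Explode several list columns, either together or column by column."""
    out = []
    for row in rows:
        lists = [row[c] for c in columns]
        if parallel:
            # Parallel mode: pair up elements at the same position.
            # Assumption: unequal lengths are treated as an error here.
            if len({len(values) for values in lists}) > 1:
                raise ValueError("parallel explode needs lists of equal length")
            combos = zip(*lists)
        else:
            # Column-by-column mode: every element of the first list is
            # combined with every element of the next, and so on.
            combos = product(*lists)
        for combo in combos:
            new_row = dict(row)                  # keep non-exploded columns
            new_row.update(zip(columns, combo))  # one element per exploded column
            out.append(new_row)
    return out


rows = [{"id": 1, "letters": list("abcde"), "numbers": [1, 2, 3, 4, 5]}]

parallel_rows = explode_rows(rows, ["letters", "numbers"], parallel=True)
sequential_rows = explode_rows(rows, ["letters", "numbers"], parallel=False)

print(len(parallel_rows))    # 5
print(len(sequential_rows))  # 25
```

With two lists of 5 elements each, parallel mode pairs `"a"` with `1`, `"b"` with `2`, and so on, whereas column-by-column mode combines every letter with every number.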
