# split_string

> Split a single text column into two.

The values of a text column will be split in two at the first occurrence of a given pattern, returning two new text columns.
For example, splitting a text column on the comma character (",") will produce two new columns: the first containing everything
before the first comma encountered in each text, and the second containing all text encountered after the comma.

If the specified split pattern does not occur in any of the input texts, the first output column will contain
the original texts, and the second column will contain only missing values (NaN).

## Usage

The following example shows how the step can be used in a recipe.

<Accordion title="Examples" icon="code" defaultOpen="true">
  <Tabs>
    <Tab title="Example 1">
      For example, to split each text at the first comma encountered when searching from the left:

      ```stan theme={null}
      split_string(ds.text, {"pattern": ","}) -> (ds.text_left, ds.text_right)
      ```
    </Tab>

    <Tab title="Signature">
      General syntax for using the step in a recipe. It shows the inputs the step expects to receive and the outputs it will produce. For further details, see the sections below.

      ```stan theme={null}
      split_string(input: text|category, {
          "param": value,
          ...
      }) -> (output_left: text, output_right: text)
      ```
    </Tab>
  </Tabs>
</Accordion>

## Inputs & Outputs

The following are the inputs expected by the step and the outputs it produces. These are generally
columns (`ds.first_name`), datasets (`ds` or `ds[["first_name", "last_name"]]`) or models (referenced
by name e.g. `"churn-clf"`).

<Accordion title="Inputs" icon="right-to-bracket">
  <ParamField path="input" type="column[text|category]" required>
    A text column to split.
  </ParamField>
</Accordion>

<Accordion title="Outputs" icon="right-from-bracket">
  <ParamField path="output_left" type="column[text]" required>
    A text column containing the part to the left of the given split pattern.
  </ParamField>

  <ParamField path="output_right" type="column[text]" required>
    A text column containing the part to the right of the given split pattern.
  </ParamField>
</Accordion>

## Configuration

The following parameters can be used to configure the behaviour of the step by including them in
a JSON object as the last "input" to the step, i.e. `step(..., {"param": "value", ...}) -> (output)`.

<Accordion title="Parameters" defaultOpen="true" icon="sliders">
  <ParamField path="pattern" type="string" default=" ">
    A pattern of characters indicating where to split each text. By default, a single space (" ") is used.
  </ParamField>

  <ParamField path="right" type="boolean" default="false">
    Whether to search for the pattern starting from the right instead of starting from the left (default).
  </ParamField>
</Accordion>
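
As a sketch of how the two parameters combine, the call below reuses the column names from the usage example and sets `right` so that the search starts from the end of each text. Assuming the split then happens at the last comma, a text such as `a,b,c` would presumably yield `a,b` in the left column and `c` in the right one. This is inferred from the parameter descriptions above, not from confirmed output.

```stan theme={null}
split_string(ds.text, {"pattern": ",", "right": true}) -> (ds.text_left, ds.text_right)
```

Since `pattern` has a default value, omitting it and passing only `{"right": true}` should split on the last space instead.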
