The values of a text column are split in two at the first occurrence of a given pattern, producing two new text columns. For example, splitting a text column on the comma character (",") produces two new columns: the first contains everything before the first comma in each text, and the second contains everything after that comma.

If the specified split pattern is not found in any of the input texts, the first output column will contain the original text, and the second column will contain only missing values (NaN).
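The splitting behaviour described above can be illustrated with a minimal Python sketch. This is not the step's actual implementation: the function name `split_first` is hypothetical, the pattern is assumed to be a literal substring rather than a regular expression, and a text without the pattern is assumed to yield NaN in the second output.

```python
import math

def split_first(texts, pattern):
    """Split each text at the first occurrence of `pattern` (hypothetical helper).

    Assumes `pattern` is a non-empty literal substring, not a regular expression.
    Returns two lists: the text before the pattern and the text after it.
    """
    before, after = [], []
    for text in texts:
        # str.partition splits at the first occurrence only
        head, sep, tail = text.partition(pattern)
        before.append(head)
        # sep is empty when the pattern was not found: keep the original
        # text in the first output and mark the second as missing (NaN)
        after.append(tail if sep else math.nan)
    return before, after

first, second = split_first(["Doe, John, Jr.", "Smith"], ",")
```

Only the first comma is used as the split point, so any later commas remain in the second output.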

Usage

The following example shows how the step can be used in a recipe.

Inputs & Outputs

The following are the inputs expected by the step and the outputs it produces. These are generally columns (ds.first_name), datasets (ds or ds[["first_name", "last_name"]]) or models (referenced by name e.g. "churn-clf").

Configuration

The following parameters can be used to configure the behaviour of the step by including them in a JSON object as the last “input” to the step, i.e. step(..., {"param": "value", ...}) -> (output).