slice
Extract a range/slice of elements from a column of texts or lists.
Using start, stop and step to define a range of indices, the corresponding range of elements is extracted from each text or list in the input column.
Note: indices start at 0, and a stop of 3 means that elements up to, but not including, the element at index 3 will be extracted. In particular, specifying only "stop": 3 (leaving start at its default of 0) extracts 3 elements in total.
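The indexing rules above match Python's own slicing, so they can be illustrated with a short sketch. This is only an analogy and not the step's implementation: slice_column is a hypothetical helper, and how the step treats values shorter than the requested range is assumed here to follow Python (the value is simply truncated).

```python
def slice_column(column, start=0, stop=None):
    """Apply the same half-open range [start, stop) to every value in a column.

    Hypothetical helper for illustration only; it mimics the documented
    start/stop semantics using Python slicing.
    """
    return [value[start:stop] for value in column]


texts = ["graphext", "slice", "ab"]
lists = [[10, 20, 30, 40], [1, 2], []]

# "stop": 3 with the default start of 0 keeps indices 0, 1 and 2.
first_three_chars = slice_column(texts, stop=3)
first_three_items = slice_column(lists, stop=3)

# "start": 1, "stop": 4 keeps indices 1, 2 and 3.
middle_chars = slice_column(texts, start=1, stop=4)

print(first_three_chars)  # ['gra', 'sli', 'ab']
print(first_three_items)  # [[10, 20, 30], [1, 2], []]
print(middle_chars)       # ['rap', 'lic', 'b']
```

Because the stop index is excluded, the number of extracted elements is at most stop minus start.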
Usage
The following shows how the step can be used in a recipe.
Examples
General syntax for using the step in a recipe. Shows the inputs the step expects to receive and the outputs it will produce. For further details see the sections below.
Inputs & Outputs
The following are the inputs expected by the step and the outputs it produces. These are generally columns (ds.first_name), datasets (ds or ds[["first_name", "last_name"]]) or models (referenced by name, e.g. "churn-clf").
Inputs
A column containing texts or lists to extract a range of characters or elements from.
Outputs
Contains the extracted slices. The type depends on the out_type parameter, and needs to be consistent with the transformation.
Configuration
The following parameters can be used to configure the behaviour of the step by including them in a JSON object as the last “input” to the step, i.e. step(..., {"param": "value", ...}) -> (output).
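As a rough illustration of how such a parameter object relates to a range of indices, the sketch below parses a JSON object and turns it into a Python slice. The helper name slice_from_params is hypothetical, and the fallbacks used when stop or step are omitted (no upper bound, step of 1) are assumptions; only the default of 0 for start is stated in this document.

```python
import json


def slice_from_params(params_json):
    """Build a Python slice from a JSON parameter object.

    Hypothetical helper for illustration; it is not part of the recipe
    language. Omitted "stop" and "step" fall back to Python's own
    defaults, which is an assumption about the step's behaviour.
    """
    params = json.loads(params_json)
    return slice(params.get("start", 0), params.get("stop"), params.get("step"))


# The JSON object that would be passed as the last "input" to the step.
index_range = slice_from_params('{"start": 1, "stop": 4}')

column = ["alpha", "beta"]
extracted = [value[index_range] for value in column]

print(extracted)  # ['lph', 'eta']
```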
Parameters
start: Index of the first element to be extracted from each text or list in the input column.
stop: Index at which to stop including elements from each text or list. The element at the stop index is not included. For example, "start": 0, "stop": 3 includes all elements up to but not including the element at index 3, thus extracting a total of 3 elements.
step: Step size used to move from the start index to the stop index. E.g., if "step": 2, only every second element from the range [start, stop) is returned, with the element at the stop index itself excluded.
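The effect of a step size can be sketched with Python's extended slicing, which uses the same start, stop and step triple. As before this is an analogy under the assumption that the step behaves like Python slicing, and slice_with_step is a hypothetical helper rather than anything provided by the recipe language.

```python
def slice_with_step(column, start=0, stop=None, step=1):
    """Extract every step-th element of [start, stop) from each value.

    Hypothetical helper for illustration only.
    """
    return [value[start:stop:step] for value in column]


letters = ["abcdefgh"]
numbers = [[0, 1, 2, 3, 4, 5, 6]]

# "start": 0, "stop": 6, "step": 2 visits indices 0, 2 and 4.
every_second = slice_with_step(letters, start=0, stop=6, step=2)

# "start": 1, "stop": 7, "step": 3 visits indices 1 and 4.
every_third = slice_with_step(numbers, start=1, stop=7, step=3)

print(every_second)  # ['ace']
print(every_third)   # [[1, 4]]
```

Note that the element at the stop index is never visited, even when a multiple of the step lands exactly on it.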
out_type: Type of the output column, selected by name. Values must be one of the following:
category
date
number
boolean
url
sex
text
list[number]
list[category]
list[url]
list[boolean]
list[date]