If only a single input column is provided, the result is a text column by default, even if that column contains lists.

If multiple columns are passed and any of them contains lists, the result is also a column of lists. In this case, each output list contains the result of concatenating all elements in the corresponding row, whether these elements are themselves lists or not.

If none of the multiple input columns contains lists, the result is a text column. Each input column is converted to a string representation if necessary, and then concatenated using the given separator and prefix and/or postfix.

You can change this default behavior by explicitly setting an out_type in the step's parameters.
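The default rules above can be sketched in plain Python. This is only an illustrative model, not the step's actual implementation: the function names are hypothetical, columns are modelled as Python lists of row values, and the assumption that list elements are flattened into the output list is one reading of the description.

```python
def has_lists(column):
    """True if any value in the column is itself a list."""
    return any(isinstance(value, list) for value in column)


def default_output_kind(columns):
    """Return "text" or "list" following the default rules described above.

    `columns` is a list of columns; each column is a list of row values.
    (Hypothetical helper, used only for illustration.)
    """
    if len(columns) == 1:
        # A single input column always yields text, even if it holds lists.
        return "text"
    if any(has_lists(column) for column in columns):
        return "list"
    return "text"


def concat_row_as_list(row_values):
    """Combine one row's values into a single list (assumed flattening)."""
    result = []
    for value in row_values:
        if isinstance(value, list):
            result.extend(value)  # list elements are merged in order
        else:
            result.append(value)  # scalar values become single elements
    return result


print(default_output_kind([["x", "y"], [["a"], ["b", "c"]]]))
print(concat_row_as_list(["x", ["a", "b"]]))
```

Setting out_type explicitly would bypass the decision modelled by default_output_kind.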

Usage

The following examples show how the step can be used in a recipe.

Examples

The following example combines first names, last names and a title to create a new column with values in the form “Dr. first_name last_name”:
concatenate(ds.first_name, ds.last_name, {
  "separator": " ",
  "prefix": "Dr. "
}) -> (ds.title_fullname)
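To make the expected result of this example concrete, the following Python snippet mimics it on two made-up rows. The sample names are invented for illustration, and the snippet only models the described behaviour (prefix, then values joined by the separator); it is not how the step is implemented.

```python
# Hypothetical sample data standing in for ds.first_name and ds.last_name.
first_names = ["Ada", "Alan"]
last_names = ["Lovelace", "Turing"]

separator = " "
prefix = "Dr. "

# Join each row's values with the separator, then prepend the prefix once.
title_fullname = [
    prefix + separator.join([first, last])
    for first, last in zip(first_names, last_names)
]

print(title_fullname)  # ['Dr. Ada Lovelace', 'Dr. Alan Turing']
```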

Inputs & Outputs

The following are the inputs expected by the step and the outputs it produces. These are generally columns (ds.first_name), datasets (ds or ds[["first_name", "last_name"]]) or models (referenced by name e.g. "churn-clf").
*columns: column
One or more columns to concatenate.

result: column, required
Column containing the result of the concatenation.

Configuration

The following parameters can be used to configure the behaviour of the step by including them in a JSON object as the last “input” to the step, i.e. step(..., {"param": "value", ...}) -> (output).

Parameters

separator: [string, null]
The separator to place between the values of the individual columns when concatenating as text.

prefix: [string, null]
A prefix to prepend to the result of the concatenation (or to the single column if only one was provided).

postfix: [string, null]
A postfix to append to the result of the concatenation (or to the single column if only one was provided).

nan_as: [string, null], default: ""
How to represent missing values (NaN) in the concatenated result. If a “nan_as” value is specified, it is used to fill in missing values during concatenation. With "nan_as": null, the concatenation produces a missing value in every row where at least one of the columns to be concatenated has a missing value.

out_type: [string, null]
The semantic data type of the output column. Note that if this type is not compatible with the result of the concatenation, the output may consist of missing values (NaNs) only. Values must be one of the following: category, date, number, currency, url, boolean, text, list[category], list[date], list[number], list[currency], list[url], list[boolean].
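The interaction between nan_as, separator, prefix and postfix for text output can be modelled roughly as follows. This is a sketch under stated assumptions rather than the step's real code: concat_text_row and is_missing are hypothetical helpers, missing values are represented as None or float NaN, and an empty-string default is assumed for separator, prefix and postfix (the documentation only states the default for nan_as).

```python
import math


def is_missing(value):
    """Treat None and float NaN as missing (an assumption of this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def concat_text_row(values, separator="", prefix="", postfix="", nan_as=""):
    """Concatenate one row's values as text, mimicking the documented nan_as rule."""
    if any(is_missing(v) for v in values):
        if nan_as is None:
            # "nan_as": null -> the whole result is missing for this row.
            return None
        # Otherwise fill each missing value with the nan_as string.
        values = [nan_as if is_missing(v) else v for v in values]
    return prefix + separator.join(str(v) for v in values) + postfix


print(concat_text_row(["Ada", None], separator=" "))               # default nan_as=""
print(concat_text_row(["Ada", None], separator=" ", nan_as="?"))   # explicit fill value
print(concat_text_row(["Ada", None], separator=" ", nan_as=None))  # missing result
```

Note that with the default empty nan_as the separator is still emitted in this model, so a row with a missing last name ends in a trailing space; whether the real step behaves the same way is not stated in the documentation.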