join parameter controls whether only the
common columns are kept (inner) or all columns are kept (outer). In the latter case, rows will have missing
values (NaNs) wherever a column existed in only one of the two datasets.
Usage
The following example shows how the step can be used in a recipe.
Examples
To append the rows of dataset ds_right to the dataset ds_left, keeping all columns from both datasets:
Inputs & Outputs
The following are the inputs expected by the step and the outputs it produces. These are generally columns (ds.first_name), datasets (ds or ds[["first_name", "last_name"]]) or models (referenced
by name, e.g. "churn-clf").
Inputs
Outputs
A dataset containing the rows of both ds_left and ds_right,
as well as an additional column original_index indicating the index of each row in its original dataset.
Configuration
The following parameters can be used to configure the behaviour of the step by including them in a JSON object as the last “input” to the step, i.e. step(..., {"param": "value", ...}) -> (output).
Parameters
Whether to concatenate using an “inner” or “outer” join of columns.
When "inner", only common columns will be kept. When "outer", all columns will be kept.
Values must be one of the following: inner, outer
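The difference between the two join values can be illustrated with a small plain-Python sketch. This is not the step's implementation: the function name concat_rows_sketch, the list-of-dicts representation of a dataset, and the 0-based original_index are all assumptions made for illustration, and every row within a dataset is assumed to have the same columns.

```python
import math


def concat_rows_sketch(left, right, join="outer"):
    """Append the rows of `right` to `left`; each dataset is a list of dict rows.

    Hypothetical helper for illustration only, not the step itself.
    """
    if join not in ("inner", "outer"):
        raise ValueError('join must be "inner" or "outer"')

    # Assumption: all rows of a dataset share the columns of its first row.
    left_cols = list(left[0]) if left else []
    right_cols = list(right[0]) if right else []

    if join == "inner":
        # Keep only the columns present in both datasets.
        columns = [c for c in left_cols if c in right_cols]
    else:
        # Keep every column from either dataset, left columns first.
        columns = left_cols + [c for c in right_cols if c not in left_cols]

    result = []
    for rows in (left, right):
        for index, row in enumerate(rows):
            # Columns missing from this row's dataset are filled with NaN.
            new_row = {c: row.get(c, math.nan) for c in columns}
            # Position of the row in the dataset it came from (0-based here).
            new_row["original_index"] = index
            result.append(new_row)
    return result


ds_left = [{"name": "Ada", "age": 36}, {"name": "Alan", "age": 41}]
ds_right = [{"name": "Grace", "city": "New York"}]

outer = concat_rows_sketch(ds_left, ds_right, join="outer")
inner = concat_rows_sketch(ds_left, ds_right, join="inner")
```

With "outer", the result has the columns name, age and city, and the rows from ds_left hold NaN in city while the row from ds_right holds NaN in age. With "inner", only name survives, since it is the only column the two datasets share.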