append_rows
Add rows from one dataset to another.
That is, the step vertically concatenates two datasets, appending the rows of the second to the end of the first.
When the two datasets contain different columns, the join parameter controls whether only the common columns are kept (inner), or all columns (outer). In the latter case, rows will have missing values (NaNs) where a column only existed in one of the two datasets.
Usage
The following example shows how the step can be used in a recipe.
Examples
To append the rows of dataset ds_right to the dataset ds_left, keeping all columns from both datasets:
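A minimal sketch of such a call, following the general step(..., {"param": "value", ...}) -> (output) form used in recipes. The output name ds_all is a hypothetical choice for illustration only:

```
append_rows(ds_left, ds_right, {"join": "outer"}) -> (ds_all)
```

With this setting, a column that exists in only one of the two inputs would hold missing values (NaNs) in the rows that come from the other dataset.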
General syntax for using the step in a recipe, showing the inputs the step expects and the outputs it produces. For further details see the sections below.
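Going by the inputs, outputs and configuration described in the sections below, the call presumably takes roughly the following shape. The output name ds_out is a placeholder rather than a documented name, and the trailing configuration object is assumed to be optional:

```
append_rows(ds_left, ds_right, {"param": "value", ...}) -> (ds_out)
```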
Inputs & Outputs
The following are the inputs expected by the step and the outputs it produces. These are generally columns (ds.first_name), datasets (ds or ds[["first_name", "last_name"]]) or models (referenced by name, e.g. "churn-clf").
Inputs
Outputs
A dataset containing the rows of both ds_left and ds_right, as well as an additional column original_index indicating the index of each row in its original dataset.
Configuration
The following parameters can be used to configure the behaviour of the step by including them in a json object as the last “input” to the step, i.e. step(..., {"param": "value", ...}) -> (output).
Parameters
The join parameter determines whether to concatenate using an “inner” or “outer” join of columns. When "inner", only common columns will be kept. When "outer", all columns will be kept.
Values must be one of the following:
inner
outer
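For comparison, here is a sketch of a call that keeps only the columns common to both datasets, again following the step(..., {"param": "value", ...}) -> (output) form. The output name ds_common is hypothetical:

```
append_rows(ds_left, ds_right, {"join": "inner"}) -> (ds_common)
```

Because only shared columns survive an inner join, appending the rows should not introduce missing values here, at the cost of dropping any column present in just one input.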