For each row, this step iterates over the IDs in the targets_in column. If an ID also exists in the source column, the corresponding rows are connected, optionally with the specified attributes.

The targets_in column may contain either one target ID per row or lists of target IDs. In either case, any additional attribute columns should be of the same kind. That is, if each row specifies multiple links via lists in targets_in, the attribute columns should also contain lists of the same length, so that each link can be assigned its corresponding attribute. If the lengths of the lists containing target IDs and attributes do not match, the attributes for the links in that row will be missing. If attributes are single-valued (not lists), all links specified in that row will have the same attribute value.

Note that the types of the values in source and targets_in, which identify the nodes/rows to be linked, should also match. Ideally, either both columns have numeric values or both have string-like (categorical) values. However, as long as one can be converted safely to the other, linking will work as expected (e.g. source IDs could be specified as numbers [0, 1, 2] and target IDs as strings [“3”, “2”, “1”] without the step failing).

The step will generate at least a target column and a weight column, as well as one more column for each attribute column passed as input. If link attribute columns were passed, the weight_column parameter should be used to identify the column containing link weights (importances). If there is no such column, the parameter value should be null, in which case a new weights column will be generated automatically (see parameters below).
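The linking rules above can be illustrated with a small Python sketch. It is not the step's actual implementation: the function name link_rows_sketch, the helper as_list, and the choice to compare IDs as strings (to mimic "safe conversion" between numeric and string IDs) are all assumptions made for illustration only.

```python
from typing import Any, List, Optional, Sequence, Tuple


def as_list(value: Any) -> List[Any]:
    """Treat a single target ID as a one-element list."""
    return list(value) if isinstance(value, (list, tuple)) else [value]


def link_rows_sketch(
    source: Sequence[Any],
    targets_in: Sequence[Any],
    attr_in: Optional[Sequence[Any]] = None,
) -> Tuple[List[List[Any]], List[List[Any]]]:
    """Hypothetical sketch of the linking rules for one attribute column."""
    # Assumption: IDs are compared as strings, so 2 and "2" match.
    known_ids = {str(s) for s in source}
    targets_out, attr_out = [], []
    for row, cell in enumerate(targets_in):
        targets = as_list(cell)
        if attr_in is None:
            # No attribute column was passed.
            values = [None] * len(targets)
        elif isinstance(attr_in[row], (list, tuple)):
            values = list(attr_in[row])
            if len(values) != len(targets):
                # Mismatched list lengths: attributes for this row are missing.
                values = [None] * len(targets)
        else:
            # Single-valued attribute: every link in the row shares it.
            values = [attr_in[row]] * len(targets)
        # Only targets that also exist in the source column are linked.
        kept = [(t, v) for t, v in zip(targets, values) if str(t) in known_ids]
        targets_out.append([t for t, _ in kept])
        attr_out.append([v for _, v in kept])
    return targets_out, attr_out


source = [0, 1, 2]                       # numeric source IDs
targets = [["1", "2"], "0", ["0", "9"]]  # string target IDs; "9" is unknown
weights = [[0.5, 0.25], 2.0, [1.0]]      # last row has a mismatched length
targets_out, weights_out = link_rows_sketch(source, targets, weights)
print(targets_out)
print(weights_out)
```

In this sketch the third row keeps only the link to "0", and its weight is missing (None) because the weight list is shorter than the target list.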

Usage

The following example shows how the step can be used in a recipe.

Examples

  • Example 1
In the following example we connect the rows/nodes identified in the column link_source to the rows/nodes specified in the column link_targets, which contains lists of such link targets. Additionally, we use the columns link_weights and links_are_reciprocal (which contain lists of the same length as the target lists) to add attributes to the created links: the weight of each link and whether it is unidirectional or bidirectional.
link_rows(ds.link_source, ds.link_targets, ds.link_weights, ds.links_are_reciprocal) -> (ds.targets_out, ds.weights_out, ds.are_reciprocal_out)

Inputs & Outputs

The following are the inputs expected by the step and the outputs it produces. These are generally columns (ds.first_name), datasets (ds or ds[["first_name", "last_name"]]) or models (referenced by name e.g. "churn-clf").
source
column[number|category]
required
A column of (numerical or categorical) IDs identifying the nodes/rows acting as the source of a link. These need to be compatible with the IDs in the targets_in column. E.g. if these are Twitter handles, then the targets must also be Twitter handles.
targets_in
column[number|category|list[number]|list[category]]
required
A column containing IDs, or lists of IDs, corresponding to link targets.
*attrs_in
column
One or more optional attributes for the links. Must be lists of the same length as those in targets_in if the latter contains lists. If an attribute column has a single value per row, it is assumed that all targets in that row have the same attribute value.
targets_out
column
required
A column containing new lists of IDs corresponding to link targets.
*attrs_out
column
If optional inputs were provided, new weight/attribute columns between connected nodes.

Configuration

The following parameters can be used to configure the behaviour of the step by including them in a JSON object as the last “input” to the step, i.e. step(..., {"param": "value", ...}) -> (output).

Parameters

weight_column
[string, null]
required
Name of the column acting as the weights of the links. Must refer to one of the optional columns passed to the step. If null, an extra output column will be created containing a weight of 1.0 for each link defined in the target column (unless a weight_factor is applied, in which case the weights will have the corresponding value, see below).
weight_factor
number
default:"1.0"
Multiply link weights by this number.
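How the two parameters interact can be sketched in Python as follows. This is only an illustration under stated assumptions, not the step's implementation: the function resolve_weights and its arguments are hypothetical, and the sketch assumes that weight_factor also scales weights taken from an existing weight column.

```python
from typing import Dict, List, Optional


def resolve_weights(
    targets_out: List[List[str]],
    attrs: Dict[str, List[List[float]]],
    weight_column: Optional[str] = None,
    weight_factor: float = 1.0,
) -> List[List[float]]:
    """Hypothetical sketch of how link weights could be derived."""
    if weight_column is None:
        # No weight column: every link gets 1.0, scaled by weight_factor.
        return [[1.0 * weight_factor for _ in row] for row in targets_out]
    if weight_column not in attrs:
        raise KeyError(f"{weight_column!r} is not one of the attribute columns")
    # Assumption: existing weights are multiplied by weight_factor as well.
    return [[w * weight_factor for w in row] for row in attrs[weight_column]]


links = [["a", "b"], ["c"]]
generated = resolve_weights(links, {}, weight_column=None, weight_factor=0.5)
scaled = resolve_weights(
    links,
    {"link_weights": [[2.0, 4.0], [1.0]]},
    weight_column="link_weights",
    weight_factor=3.0,
)
print(generated)
print(scaled)
```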