For each node in a network, group and aggregate over its neighbours.
Examples
Given a dataset products, where each row represents a supermarket product (having at least a price and aisle column), and containing a targets column representing connections between similar products, the following example calculates, for each product, aggregations over its neighbouring products.
Inputs and outputs can be columns (ds.first_name), datasets (ds or ds[["first_name", "last_name"]]) or models (referenced by name, e.g. "churn-clf").
Inputs
Outputs
step(..., {"param": "value", ...}) -> (output)
Parameters
list) the encountered order is important.
Properties
"gx_weight"
Options
If null, no sorting is applied.
Examples
A sum aggregation of column A calculates a single total by adding up all the values in A belonging to each group.
Possible aggregation functions accepted as func parameters are:
n, size or count: calculate number of rows in group
sum: sum total of values
mean: take mean of values
max: take max of values
min: take min of values
mode: find most frequent value (returns first mode if multiple exist)
first: take first item found
last: take last item found
unique: collect a list of unique values
n_unique: count the number of unique values
list: collect a list of all values
concatenate: convert all values to text and concatenate them into one long text
concat_lists: concatenate lists in all rows into a single larger list
count_where: number of rows in which the column matches a value; needs parameter value with the value that you want to count
percent_where: percentage of the column where the column matches a value; needs parameter value with the value that you want to count
For count_where and percent_where an additional value parameter is required.
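The aggregation functions listed above can be illustrated with a minimal Python sketch. This is not the step's implementation; the function name aggregate is hypothetical. Two details are assumptions because the text does not specify them: concatenate joins values without a separator, and percent_where returns a number on a 0 to 100 scale. Empty groups are not handled.

```python
from collections import Counter
from statistics import mean


def aggregate(values, func, value=None):
    """Apply one named aggregation to the values of a single group (sketch)."""
    values = list(values)
    if func in ("n", "size", "count"):
        return len(values)
    if func == "sum":
        return sum(values)
    if func == "mean":
        return mean(values)
    if func == "max":
        return max(values)
    if func == "min":
        return min(values)
    if func == "mode":
        # most_common keeps first-encountered order among ties,
        # so the first mode is returned if several exist
        return Counter(values).most_common(1)[0][0]
    if func == "first":
        return values[0]
    if func == "last":
        return values[-1]
    if func == "unique":
        # dict.fromkeys removes duplicates while keeping encounter order
        return list(dict.fromkeys(values))
    if func == "n_unique":
        return len(set(values))
    if func == "list":
        return values
    if func == "concatenate":
        # assumption: no separator between the text values
        return "".join(str(v) for v in values)
    if func == "concat_lists":
        return [item for sub in values for item in sub]
    if func in ("count_where", "percent_where"):
        if value is None:
            raise ValueError(f"{func} requires a 'value' argument")
        matches = sum(1 for v in values if v == value)
        # assumption: percentage expressed on a 0-100 scale
        return matches if func == "count_where" else 100 * matches / len(values)
    raise ValueError(f"unknown aggregation: {func}")
```

For instance, aggregate(["dairy", "bakery", "dairy"], "count_where", value="dairy") counts the two matching rows in the group.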
Item properties
Each item names its aggregation function in the "func" parameter. If the aggregation function accepts further arguments, like the "value" parameter in case of count_where and percent_where, these need to be provided also.
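As a rough illustration of this rule, the sketch below checks that an aggregation item carries the arguments its function needs. Only the "func" and "value" keys come from the text; the helper check_item, the example items and the "dairy" value are hypothetical, and a real item may need further keys that are not shown here.

```python
# Aggregations that need an extra "value" argument, per the text above.
NEEDS_VALUE = {"count_where", "percent_where"}


def check_item(item):
    """Return the item unchanged if it has the arguments its function needs."""
    if "func" not in item:
        raise ValueError('item is missing the "func" parameter')
    if item["func"] in NEEDS_VALUE and "value" not in item:
        raise ValueError(f'{item["func"]} requires a "value" parameter')
    return item


# A plain aggregation only names its function.
mean_item = check_item({"func": "mean"})

# count_where and percent_where also carry the value to match (hypothetical value).
dairy_share = check_item({"func": "percent_where", "value": "dairy"})
```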
Properties
n
size
count
sum
mean
n_unique
count_where
percent_where
concatenate
max
min
first
last
mode
concat_lists
unique
list
Examples
With "directed": false, in contrast, i.e. when links are undirected, it is assumed that the link A→B is always identical to B→A (i.e. A↔B always). This is usually the case when links represent a similarity between nodes.
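The effect of undirected links on neighbour aggregation can be sketched in plain Python. This is an illustration under stated assumptions, not the step's implementation: the products table, its values and the helper names are hypothetical, and the directed case is assumed to count only outgoing links (those listed in a row's targets) as neighbours.

```python
from statistics import mean

# Hypothetical products table: each row has a price, an aisle and a
# targets list naming the similar products it links to.
products = {
    "milk":   {"price": 1.0, "aisle": "dairy", "targets": ["yogurt"]},
    "yogurt": {"price": 2.0, "aisle": "dairy", "targets": ["cheese"]},
    "cheese": {"price": 6.0, "aisle": "dairy", "targets": []},
}


def neighbours(rows, directed):
    """Map each node to its neighbours; undirected links count both ways."""
    nbrs = {node: [] for node in rows}
    for source, row in rows.items():
        for target in row["targets"]:
            if target not in nbrs[source]:
                nbrs[source].append(target)
            # Undirected: A→B implies B→A, so add the reverse link too.
            if not directed and source not in nbrs[target]:
                nbrs[target].append(source)
    return nbrs


def mean_neighbour_price(rows, directed):
    """Mean price over each node's neighbours (None if it has none)."""
    result = {}
    for node, nbr in neighbours(rows, directed).items():
        result[node] = mean(rows[n]["price"] for n in nbr) if nbr else None
    return result
```

In this sketch cheese has no outgoing links, so it has no neighbours when links are treated as directed; with undirected links it gains yogurt as a neighbour through the reversed yogurt→cheese link.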