At the moment the only supported clustering algorithm is Louvain. Louvain tries to identify the communities in a network by optimizing the modularity of the whole network, a measure of the density of edges inside communities relative to edges between communities. The result is a column of cluster IDs (integers), where the value -1 is reserved for nodes in very small clusters, which are grouped into a “noise” cluster.
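To make the notion of modularity concrete, the following is a minimal Python sketch that computes it for a given partition of a small undirected, unweighted graph. It is an illustration only, not the step's implementation: the function name `modularity` and the toy graph are assumptions made for this example, and the Louvain algorithm itself (which searches for a partition that maximizes this score) is not shown.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Modularity of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community_of: dict mapping every node to its community ID.
    """
    m = len(edges)
    intra_edges = defaultdict(int)   # edges with both ends in the same community
    degree_sum = defaultdict(int)    # total degree of the nodes in each community
    for u, v in edges:
        degree_sum[community_of[u]] += 1
        degree_sum[community_of[v]] += 1
        if community_of[u] == community_of[v]:
            intra_edges[community_of[u]] += 1
    # Q = sum over communities of (observed intra-edge fraction - expected fraction).
    return sum(
        intra_edges[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Hypothetical toy graph: two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

two_triangles = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
one_big_cluster = {node: 0 for node in range(6)}

q_split = modularity(edges, two_triangles)      # 5/14, about 0.357
q_merged = modularity(edges, one_big_cluster)   # 0.0
```

Splitting the graph into its two triangles scores higher than lumping all nodes together, which is the kind of partition a modularity-optimizing method would prefer.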
Usage
The following example shows how the step can be used in a recipe.
Examples
The following configuration allows for smallish clusters and considers fewish data points as noise:
Inputs & Outputs
The following are the inputs expected by the step and the outputs it produces. These are generally columns (ds.first_name), datasets (ds or ds[["first_name", "last_name"]]) or models (referenced by name, e.g. "churn-clf").
Inputs
Outputs
A column containing cluster tags.
Configuration
The following parameters can be used to configure the behaviour of the step by including them in a JSON object as the last “input” to the step, i.e. step(..., {"param": "value", ...}) -> (output).
Parameters
Clustering algorithm to use.
Only Louvain is currently supported. Values must be one of the following:
louvain
The higher this value, the bigger the clusters. Values must be in the following range:
The larger the value, the more conservative the clustering.
Clusters with this number of nodes or fewer will be considered noise. Values must be in the following range:
The Graphext advanced query syntax used to select rows.
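The effect of the noise threshold described above can be illustrated with a short Python sketch. It assumes the rule is exactly as stated (clusters with that number of nodes or fewer are relabeled -1) and is not the step's actual code; the function name `mark_small_clusters_as_noise` and the sample cluster IDs are hypothetical.

```python
from collections import Counter


def mark_small_clusters_as_noise(cluster_ids, noise):
    """Replace the ID of every cluster with `noise` nodes or fewer by -1."""
    sizes = Counter(cluster_ids)
    return [-1 if sizes[c] <= noise else c for c in cluster_ids]


# Hypothetical cluster IDs for six nodes: cluster 0 has 3 nodes, 1 has 2, 2 has 1.
cluster_ids = [0, 0, 0, 1, 1, 2]
strict = mark_small_clusters_as_noise(cluster_ids, noise=1)   # only cluster 2 becomes noise
lenient = mark_small_clusters_as_noise(cluster_ids, noise=2)  # clusters 1 and 2 become noise
```

Raising the threshold moves more nodes into the -1 “noise” cluster, so a lower value keeps more small clusters as clusters in their own right.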