Cluster subnetwork

fast step  network, graph, louvain, community detection

Identify clusters in the network by filtering the input dataset.

At the moment the only supported clustering algorithm is Louvain. Louvain tries to identify the communities in a network by optimizing the modularity of the whole network, a measure that compares the density of edges inside communities with the density of edges between communities. The result is a column of cluster IDs (integers). The value -1 is reserved for nodes in very small clusters, which are grouped into a single "noise" cluster.
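To make the modularity criterion concrete, here is a minimal Python sketch that scores a given partition of a small weighted graph using the standard Newman modularity formula. It illustrates only the quantity Louvain optimizes, not the optimization itself, and not necessarily how this step computes it. The function name `modularity` and the toy graph are assumptions made for illustration.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman modularity of an undirected, weighted graph for a given partition.

    edges: iterable of (u, v, weight) tuples, each undirected edge listed once.
    community_of: dict mapping node -> community id.
    """
    total_weight = 0.0             # m: sum of all edge weights
    internal = defaultdict(float)  # weight of edges inside each community
    degree = defaultdict(float)    # summed weighted degree per community
    for u, v, w in edges:
        total_weight += w
        degree[community_of[u]] += w
        degree[community_of[v]] += w
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += w
    if total_weight == 0:
        return 0.0
    # Q = sum over communities of: internal share - (degree share)^2
    return sum(
        internal[c] / total_weight - (degree[c] / (2 * total_weight)) ** 2
        for c in degree
    )


# Hypothetical toy graph: two triangles joined by a single bridge edge (2-3).
edges = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
         (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
         (2, 3, 1.0)]
two_triangles = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_blob = {node: "a" for node in range(6)}

q_split = modularity(edges, two_triangles)
q_merged = modularity(edges, one_blob)
```

On this toy graph the two-triangle partition scores higher than the single merged cluster, which is why a modularity-based method would prefer it.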

Usage


The following are the step's expected inputs and outputs and their specific types.

Step signature
cluster_subnetwork(ds_in: dataset, {"param": value}) -> (cluster: column)

where the object {"param": value} is optional in most cases and if present may contain any of the parameters described in the corresponding section below.

Example

The following configuration uses a resolution below the default (0.3 instead of 0.5), which favours smaller clusters, and treats clusters of 5 nodes or fewer as noise:

Example call (in recipe editor)
cluster_subnetwork(ds, {
  "targets": "targets",
  "weights": "weights",
  "resolution": 0.3,
  "noise": 5
}) -> (ds.cluster)

Inputs


ds_in: dataset

An input dataset to use as source of the network.

Outputs


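cluster: column

A column of integer cluster IDs. The value -1 is reserved for nodes assigned to the "noise" cluster.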

Parameters


targets: string

Name of column containing the link targets. The source of each link is implied by the row index.


weights: string

Name of column containing the link weights.
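As a rough illustration of how two such columns can describe a network, the Python sketch below turns them into an explicit edge list. It assumes, purely for illustration, that each row holds a list of target row indices and a parallel list of weights. The actual cell format expected by this step is not specified here, and `rows_to_edges` is a hypothetical helper, not part of the product.

```python
def rows_to_edges(targets, weights):
    """Build (source, target, weight) triples from two parallel columns.

    Assumption: targets[i] is a list of row indices linked from row i,
    and weights[i] is the list of matching link weights.
    """
    edges = []
    for source, (row_targets, row_weights) in enumerate(zip(targets, weights)):
        if len(row_targets) != len(row_weights):
            raise ValueError(f"row {source}: targets and weights differ in length")
        for target, weight in zip(row_targets, row_weights):
            # The source is never stored; it is the position of the row itself.
            edges.append((source, target, weight))
    return edges


# Hypothetical three-row dataset: row 0 links to rows 1 and 2, row 1 links to row 0.
targets = [[1, 2], [0], []]
weights = [[0.9, 0.4], [0.9], []]
edges = rows_to_edges(targets, weights)
```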


query: string

A query in Graphext's advanced query syntax, used to select the rows to include.


algorithm: string = "louvain"

Clustering algorithm to use.

Must be one of: "louvain"


resolution: number = 0.5

The higher this value, the bigger the clusters.

Range: 0 < resolution ≤ 1


noise: integer = 1

The larger the value, the more conservative the clustering. Clusters with this number of nodes or fewer are considered noise.

Range: 0 ≤ noise < inf
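The effect of the noise threshold can be sketched in a few lines of Python: every cluster whose size is at most `noise` is relabelled as -1. This is a minimal illustration of the rule described above, not the step's actual implementation. `mark_noise` is a hypothetical helper, and whether the remaining cluster IDs are renumbered afterwards is not stated here, so the sketch leaves them unchanged.

```python
from collections import Counter


def mark_noise(labels, noise=1):
    """Relabel members of clusters with `noise` nodes or fewer as -1."""
    sizes = Counter(labels)  # cluster id -> number of nodes in that cluster
    return [-1 if sizes[label] <= noise else label for label in labels]


# Hypothetical raw cluster IDs: sizes are 3 (cluster 0), 2 (cluster 1), 1 (cluster 2).
labels = [0, 0, 0, 1, 1, 2]

default_result = mark_noise(labels)          # noise=1: only the singleton becomes noise
strict_result = mark_noise(labels, noise=2)  # noise=2: the pair is treated as noise too
```

With `noise=0` no cluster qualifies, so all labels are kept as they are.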