Link session items

network · basket analysis · association rules

Link items (e.g. products) in sessions (baskets) if one item makes the presence of the other in the same session more likely.

A link (or association) A->B is created between items A and B if the presence of A makes the presence of B in the same session N times more likely.

For further details about the algorithm see association rule learning.

Example

The following call creates links between pairs of items A and B, if:

  • A occurs in at least 7 sessions
  • B occurs in at least 25% of sessions containing A
  • The presence of A in a session makes the presence of B in the same session at least twice as likely.

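Note that the last condition limits the overall frequency of B. Lift is the frequency of B in sessions containing A divided by the background frequency of B in all sessions, so a minimum lift of 2 means that the frequency of B, in sessions already containing A, must be at least twice its background frequency. For a rule that only just meets the 25% confidence threshold, B must therefore occur in no more than 12.5% (half of 25%) of all sessions; a rule with higher confidence tolerates a proportionally more frequent B.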

As an example, the percentage of shopping baskets containing milk (item B) may be 10%. However, amongst those baskets already containing cereal (item A), the percentage containing milk is likely to be higher. If milk occurred in, say, 30% of baskets also containing cereal, then the lift of the rule cereal->milk would be 3: buying cereal makes buying milk 3 times more likely.
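The arithmetic of the cereal and milk example can be checked with a few lines of Python. This is only an illustrative sketch on made-up toy baskets; the function name `lift` and the data are assumptions for the example and are not part of the step itself.

```python
def lift(baskets, antecedent, consequent):
    """Lift of the rule antecedent -> consequent over a list of baskets (sets)."""
    n_total = len(baskets)
    n_antecedent = sum(1 for b in baskets if antecedent in b)
    n_consequent = sum(1 for b in baskets if consequent in b)
    n_both = sum(1 for b in baskets if antecedent in b and consequent in b)
    if n_antecedent == 0 or n_consequent == 0:
        raise ValueError("both items must occur in at least one basket")
    # lift = P(B | A) / P(B) = (n_both / n_antecedent) / (n_consequent / n_total),
    # rearranged so the integer counts are combined before the single division.
    return (n_both * n_total) / (n_antecedent * n_consequent)


# Hypothetical toy data: 100 baskets, milk in 10 of them (10%), cereal in 10,
# and 3 of the 10 cereal baskets also contain milk (30%).
baskets = (
    [{"cereal", "milk"}] * 3
    + [{"cereal"}] * 7
    + [{"milk"}] * 7
    + [{"bread"}] * 83
)

print(lift(baskets, "cereal", "milk"))  # 3.0
```

With these counts the confidence of cereal->milk is 30% and the background frequency of milk is 10%, giving the lift of 3 described above.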

link_session_items(items.id, sessions.item_ids, {
  "min_support": 7,
  "min_confidence": 25,
  "min_lift": 2
}) -> (links)
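To make the three thresholds concrete, the following Python sketch applies them to pairs of items in a small, made-up set of sessions. It is not the step's actual implementation: the function name `link_pairs`, the tuple output format and the toy data are all assumptions chosen for illustration, and only pairwise rules A->B are considered.

```python
from collections import Counter
from itertools import permutations


def link_pairs(sessions, min_support=7, min_confidence=25, min_lift=2):
    """Return (A, B, lift) for every ordered pair passing all three thresholds."""
    sessions = [set(s) for s in sessions]
    n_sessions = len(sessions)
    # Number of sessions containing each item, and each ordered pair of items.
    item_counts = Counter(item for s in sessions for item in s)
    pair_counts = Counter(
        pair for s in sessions for pair in permutations(sorted(s), 2)
    )
    links = []
    for (a, b), n_ab in pair_counts.items():
        n_a, n_b = item_counts[a], item_counts[b]
        confidence_pct = 100 * n_ab / n_a          # % of A's sessions containing B
        lift = (n_ab * n_sessions) / (n_a * n_b)   # confidence / background frequency of B
        if n_a >= min_support and confidence_pct >= min_confidence and lift >= min_lift:
            links.append((a, b, round(lift, 2)))
    return sorted(links)


# Hypothetical toy data: 100 sessions; cereal in 8, milk in 20, both in 4.
sessions = (
    [["cereal", "milk"]] * 4
    + [["cereal"]] * 4
    + [["milk"]] * 16
    + [["bread"]] * 76
)

print(link_pairs(sessions))  # [('cereal', 'milk', 2.5)]
```

Only cereal->milk is linked here: it has a confidence of 50% and a lift of 2.5. The reverse rule milk->cereal has the same lift, but its confidence is only 20% (4 of 20 milk sessions), so it fails the 25% minimum confidence.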

Usage

The following are the step's expected inputs and outputs and their specific types.

link_session_items(
    item_id: category|number,
    session_item_ids: category, 
    {
        "param": value
    }
) -> (links: dataset)

where the object {"param": value} is optional in most cases and if present may contain any of the parameters described in the corresponding section below.

Inputs


item_id: column:category|number

Quantitative or categorical Column containing the IDs of items to analyze.


session_item_ids: column:category

A categorical Column containing lists of IDs corresponding to the items that occur in the same session, basket, etc.

Outputs


links: dataset

A new Dataset containing links between associated items.

Parameters


min_support: number | integer = 10

Minimum Support. Minimum support of a rule antecedent. If it is < 1, it is taken as a proportion of sessions; otherwise it is expected to be a positive integer representing the number of sessions. Create link A->B only if A occurred in at least this many sessions.
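The two ways of reading min_support can be sketched as a small check. This is an assumed interpretation written for illustration only: the helper name `meets_min_support` is hypothetical, and the sketch assumes that a proportion is measured against the total number of sessions.

```python
def meets_min_support(antecedent_count, n_sessions, min_support):
    """Check an antecedent's session count against min_support.

    Assumption: values below 1 are proportions of all sessions,
    anything else is an absolute number of sessions.
    """
    if min_support < 1:
        return antecedent_count / n_sessions >= min_support
    return antecedent_count >= min_support


# An item found in 10 of 200 sessions (5%).
print(meets_min_support(10, 200, 0.05))  # True: 5% meets a 5% minimum
print(meets_min_support(10, 200, 0.10))  # False: 5% is below a 10% minimum
print(meets_min_support(10, 200, 7))     # True: 10 sessions meets a minimum of 7
```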


min_confidence: number = 20

Minimum Confidence. Expressed as a percentage. Include link A->B only if B occurred in at least this percentage of the sessions containing A.

Range: 0 ≤ min_confidence ≤ 100


min_lift: number | null

Minimum Lift. Expressed as a multiplier/ratio. Include link A->B only if A makes the presence of B in the same session at least this many times more likely.


weight_metric: string = "rule_lift_pct"

Metric for link weight.

Must be one of: "itemset_support_abs", "itemset_support_pct", "filter_metric_abs", "filter_metric_pct", "antecedent_support_abs", "antecedent_support_pct", "consequent_support_abs", "consequent_support_pct", "rule_confidence_pct", "rule_lift_abs", "rule_lift_pct"