Link session items¶
network • basket analysis • association rules
Link items (e.g. products) in sessions (baskets) if one item makes the presence of the other in the same session more likely.
A link (or association) A->B is created between items A and B if the presence of A makes the presence of B in the same session N times more likely.
For further details about the algorithm see e.g. association rule learning.
Usage¶
The following are the step's expected inputs and outputs and their specific types.
link_session_items(
items: category|number,
sessions: list[category]|list[number],
{
"param": value
}
) -> (targets: column, weights: column)
where the object {"param": value} is optional in most cases and, if present, may contain any of the parameters described in the corresponding section below.
Example¶
The following call creates links between pairs of items A and B, if:
- A occurs in at least 7 sessions
- B occurs in at least 25% of sessions containing A
- The presence of A in a session makes the presence of B in the same session at least twice as likely.
Note that, for a rule sitting exactly at the minimum confidence of 25%, the last condition is equivalent to saying that the overall frequency of B across all sessions must be at most 12.5% (half of 25%). In other words, a minimum lift of 2 means that the frequency of B in sessions already containing A must be at least twice the background frequency of B in general.
As an example, the percentage of shopping baskets containing milk (item B) may be 10%. However, amongst those baskets already containing cereal, the percentage containing milk is likely to be higher. If milk occurred in, say, 30% of baskets that also contain cereal, then the lift of the rule cereal->milk would be 3: buying cereal makes buying milk 3 times more likely.
link_session_items(items.id, sessions.item_ids, {
"min_support": 7,
"min_confidence": 25,
"min_lift": 2
}) -> (items.targets, items.weights)
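The arithmetic behind the cereal->milk example can be checked with a short Python sketch. It is not the step's implementation: the function name `rule_metrics` and the toy `baskets` data are made up for illustration, and only pairwise rules A->B are considered.

```python
from collections import Counter
from itertools import permutations


def rule_metrics(sessions):
    """Map each ordered pair (a, b) to (support of a, confidence %, lift)."""
    n_sessions = len(sessions)
    item_counts = Counter()  # number of sessions containing each item
    pair_counts = Counter()  # number of sessions containing both a and b
    for session in sessions:
        unique = sorted(set(session))  # count each item once per session
        item_counts.update(unique)
        pair_counts.update(permutations(unique, 2))

    metrics = {}
    for (a, b), both in pair_counts.items():
        # Confidence: share of sessions with a that also contain b.
        confidence_pct = 100 * both / item_counts[a]
        # Lift = P(b | a) / P(b), rearranged to avoid rounding noise.
        lift = both * n_sessions / (item_counts[a] * item_counts[b])
        metrics[(a, b)] = (item_counts[a], confidence_pct, lift)
    return metrics


# Hypothetical data: 100 baskets, 10 with milk, 10 with cereal, 3 with both.
baskets = (
    [["cereal", "milk"]] * 3
    + [["cereal"]] * 7
    + [["milk"]] * 7
    + [["bread"]] * 83
)
support, confidence_pct, lift = rule_metrics(baskets)[("cereal", "milk")]
print(support, confidence_pct, lift)  # 10 30.0 3.0
```

With the thresholds of the example call (support 7, confidence 25, lift 2), this toy rule would qualify on all three counts.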
Inputs¶
items: column:category|number
A column containing the IDs of items to analyze.
sessions: column:list[category]|list[number]
A column containing lists of IDs corresponding to items in the same session, basket, etc.
Outputs¶
targets: column
A column containing, for each item, a list of IDs (row numbers) identifying the other items it will be linked to.
weights: column
A column containing, for each item, a list of weights indicating the "importance" of each link to the other items identified in the targets column.
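To make the shape of the two output columns concrete, here is a minimal Python sketch that turns a set of links into per-item lists. The helper `build_link_columns` is hypothetical, and the use of zero-based row numbers is an assumption; the step's actual numbering and ordering may differ.

```python
def build_link_columns(item_ids, links):
    """Build per-row target and weight lists from {(a, b): weight} links."""
    # Assumption: a target is identified by the zero-based row of its item.
    row_of = {item: row for row, item in enumerate(item_ids)}
    targets = [[] for _ in item_ids]
    weights = [[] for _ in item_ids]
    for (a, b), weight in links.items():
        if a in row_of and b in row_of:
            targets[row_of[a]].append(row_of[b])  # link a -> b
            weights[row_of[a]].append(weight)     # same position as its target
    return targets, weights


items = ["cereal", "milk", "bread"]
targets, weights = build_link_columns(items, {("cereal", "milk"): 3.0})
print(targets)  # [[1], [], []]
print(weights)  # [[3.0], [], []]
```

Each weight sits at the same position as the target it describes, so the two lists for a given item always have equal length.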
Parameters¶
min_support: number | integer = 10
Minimum Support. Minimum support of a rule antecedent: create link A->B only if A occurred in at least this many sessions. A value < 1 is taken as a proportion; any other value is expected to be a positive integer representing the count.
min_confidence: number = 20
Minimum Confidence. Expressed as a percentage. Include link A->B only if B occurred in at least this percentage of sessions also containing A.
Range: 0 ≤ min_confidence ≤ 100
min_lift: number | null
Minimum Lift. Expressed as a multiplier/ratio. Include link A->B only if A makes the presence of B in the same session at least this many times more likely.
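The three thresholds above can be read as successive filters on a candidate rule A->B. The following Python sketch illustrates that reading with the documented defaults. The function names are hypothetical, and rounding a proportional min_support up to a whole session count is an assumption rather than documented behaviour.

```python
import math


def resolve_min_support(min_support, n_sessions):
    """Turn min_support into an absolute session count."""
    if min_support < 1:
        # Assumption: a proportion is scaled by the session count, rounded up.
        return math.ceil(min_support * n_sessions)
    return int(min_support)


def passes_filters(support_a, confidence_pct, lift, n_sessions,
                   min_support=10, min_confidence=20, min_lift=None):
    """Return True if a rule A->B survives all three thresholds."""
    if support_a < resolve_min_support(min_support, n_sessions):
        return False  # antecedent A is too rare
    if confidence_pct < min_confidence:
        return False  # B is too rare among sessions containing A
    if min_lift is not None and lift < min_lift:
        return False  # A does not make B sufficiently more likely
    return True


# A rule with support 10, confidence 30% and lift 3, out of 100 sessions.
print(passes_filters(10, 30.0, 3.0, 100,
                     min_support=7, min_confidence=25, min_lift=2))  # True
```

Leaving min_lift as None mirrors the null default: in this sketch no lift filter is applied unless a value is given.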
weight_metric: string = "rule_lift_pct"
Metric for link weight.
Must be one of:
- "itemset_support_abs"
- "itemset_support_pct"
- "filter_metric_abs"
- "filter_metric_pct"
- "antecedent_support_abs"
- "antecedent_support_pct"
- "consequent_support_abs"
- "consequent_support_pct"
- "rule_confidence_pct"
- "rule_lift_abs"
- "rule_lift_pct"