link_session_items
Link items (e.g. products) in sessions (baskets) if one item makes the presence of the other in the same session more likely.
A link (or association) A->B is created between items A and B if the presence of A makes the presence of B in the same session N times more likely.
For further details about the algorithm see e.g. association rule learning.
Usage
The following example shows how the step can be used in a recipe.
The following call creates links between pairs of items A and B, if:
- A occurs in at least 7 sessions
- B occurs in at least 25% of sessions containing A
- The presence of A in a session makes the presence of B in the same session at least twice as likely.
Note that, for a rule sitting exactly at the 25% confidence threshold, the last condition is equivalent to requiring that the overall frequency of B across all sessions is at most 12.5% (half of 25%). In other words, a minimum lift of 2 means that the frequency of B, in sessions already containing A, must be at least twice the background frequency of B in general.
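The three conditions above can be sketched in plain Python. This is only an illustration of the filtering logic, not the step's actual implementation: the function name `find_links`, its parameter names and the toy `baskets` data are all assumptions made for this example.

```python
from collections import Counter
from itertools import permutations


def find_links(sessions, min_support=7, min_confidence=25.0, min_lift=2.0):
    """Return {(A, B): lift} for every ordered pair passing all three filters.

    Hypothetical helper: thresholds mirror the example above (7 sessions,
    25% confidence, lift of 2).
    """
    sessions = [set(s) for s in sessions]  # ignore repeated items within a session
    n = len(sessions)
    item_counts = Counter(item for s in sessions for item in s)
    # Count co-occurrences in both directions, A->B and B->A.
    pair_counts = Counter(p for s in sessions for p in permutations(sorted(s), 2))

    links = {}
    for (a, b), n_ab in pair_counts.items():
        n_a, n_b = item_counts[a], item_counts[b]
        confidence_pct = 100.0 * n_ab / n_a  # share of A-sessions also containing B
        lift = (n_ab * n) / (n_a * n_b)      # confidence divided by frequency of B
        if n_a >= min_support and confidence_pct >= min_confidence and lift >= min_lift:
            links[(a, b)] = lift
    return links


# Toy data: 100 baskets, made up for illustration.
baskets = (
    [{"cereal", "milk"}] * 4 + [{"cereal"}] * 4 + [{"milk"}] * 16 + [{"bread"}] * 76
)
links = find_links(baskets)
print(links)
```

In this toy data only cereal->milk survives: cereal occurs in 8 baskets, half of which contain milk, and milk occurs in 20% of all baskets, giving a lift of 2.5. The reverse rule milk->cereal has the same lift but is dropped because only 20% of milk baskets contain cereal, which is below the 25% confidence threshold.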
As an example, the percentage of shopping baskets containing milk (item B) may be 10%. However, amongst those baskets already containing cereal, the percentage containing milk is likely to be higher. If milk occurred in e.g. 30% of baskets that also contain cereal, then the lift of the rule cereal->milk would be 3: buying cereal makes buying milk 3 times more likely.
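The arithmetic of the milk and cereal example can be checked with a few lines of Python. The helper `rule_metrics` and the basket counts below are hypothetical, chosen only so that milk appears in 10% of all baskets and in 30% of cereal baskets.

```python
def rule_metrics(sessions, antecedent, consequent):
    """Compute confidence, consequent frequency and lift for antecedent->consequent."""
    n = len(sessions)
    n_a = sum(1 for s in sessions if antecedent in s)
    n_b = sum(1 for s in sessions if consequent in s)
    n_ab = sum(1 for s in sessions if antecedent in s and consequent in s)
    if n_a == 0 or n_b == 0:
        raise ValueError("both items must occur in at least one session")
    return {
        "confidence": n_ab / n_a,          # P(B | A)
        "consequent_frequency": n_b / n,   # P(B)
        # Lift = P(B | A) / P(B), written with integers to avoid rounding noise.
        "lift": (n_ab * n) / (n_a * n_b),
    }


# 100 made-up baskets: 10 contain cereal (3 of them with milk), 7 contain only milk.
baskets = (
    [{"cereal", "milk"}] * 3 + [{"cereal"}] * 7 + [{"milk"}] * 7 + [{"bread"}] * 83
)
metrics = rule_metrics(baskets, "cereal", "milk")
print(metrics)
```

Here the confidence is 0.3, the background frequency of milk is 0.1, and the lift is therefore 3.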
Inputs & Outputs
The following are the inputs expected by the step and the outputs it produces. These are generally columns (ds.first_name), datasets (ds or ds[["first_name", "last_name"]]) or models (referenced by name e.g. "churn-clf").
Configuration
The following parameters can be used to configure the behaviour of the step by including them in a json object as the last “input” to the step, i.e. step(..., {"param": "value", ...}) -> (output).
Minimum Support. Minimum support of a rule antecedent. If it is < 1 it is interpreted as a proportion; otherwise it must be a positive integer representing a count. Create link A->B only if A occurred in at least this many sessions.
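One way to read the two interpretations of minimum support is sketched below. The function `resolve_min_support` is hypothetical, and two details are assumptions not confirmed by this page: that a proportion is taken relative to the total number of sessions, and that the resulting count is rounded up.

```python
import math


def resolve_min_support(min_support, n_sessions):
    """Turn a minimum-support setting into an absolute session count.

    Assumption: values below 1 are a proportion of all sessions, rounded up.
    """
    if min_support <= 0:
        raise ValueError("min_support must be positive")
    if min_support < 1:
        return math.ceil(min_support * n_sessions)
    if int(min_support) != min_support:
        raise ValueError("min_support >= 1 must be a whole number")
    return int(min_support)


print(resolve_min_support(0.25, 40))  # proportion of 40 sessions
print(resolve_min_support(7, 40))     # absolute count, returned unchanged
```

With 40 sessions, a setting of 0.25 resolves to 10 sessions, while a setting of 7 is used as is.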
Minimum Confidence. Expressed as a percentage. Include link A->B only if B occurred in at least this percentage of sessions also containing A.
Values must be in the following range:
Minimum Lift. Expressed as a multiplier/ratio. Include link A->B only if A makes the presence of B in the same session at least this many times more likely.
Metric for link weight.
Values must be one of the following:
itemset_support_abs
itemset_support_pct
filter_metric_abs
filter_metric_pct
antecedent_support_abs
antecedent_support_pct
consequent_support_abs
consequent_support_pct
rule_confidence_pct
rule_lift_abs
rule_lift_pct