Relabel categories based on the top terms (top_terms) within each category. The step takes two columns as inputs: one with the old_labels, which can be single- or multi-valued categories, and one with the top_terms for each data point. The replacement labels depend on the specified rank method, which can be TFIDF, BACKGROUND, FOREGROUND, UPLIFT, ORDINAL, or ALPHANUM, and on the number of top terms considered (specified by top_n).
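The idea of deriving a label from ranked terms can be illustrated with a minimal Python sketch. It is not the step's implementation: the helper name relabel_by_top_terms, the sep argument and the scoring formula are assumptions made for illustration. The score is a simple TF-IDF computed over categories, which may differ from the step's own TFIDF rank method. Only single-valued old labels are handled.

```python
import math
from collections import Counter, defaultdict


def relabel_by_top_terms(old_labels, top_terms, top_n=2, sep=", "):
    """Map each old label to a new label built from its top_n best-scoring terms.

    Hypothetical helper: the TF-IDF variant below is an assumption and is not
    taken from the step's documentation.
    """
    # Count how often each term occurs among the rows of each category.
    term_counts = defaultdict(Counter)
    for label, terms in zip(old_labels, top_terms):
        term_counts[label].update(terms)

    # Number of categories in which each term appears at least once.
    category_freq = Counter()
    for counts in term_counts.values():
        category_freq.update(counts.keys())

    n_categories = len(term_counts)
    mapping = {}
    for label, counts in term_counts.items():
        total = sum(counts.values())
        # Term share within the category, damped for terms common to many categories.
        scores = {
            term: (count / total) * math.log(1 + n_categories / category_freq[term])
            for term, count in counts.items()
        }
        # Highest score first; ties are broken alphabetically for determinism.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        # Keep the old label when a category has no terms at all.
        mapping[label] = sep.join(ranked[:top_n]) if ranked else label
    return mapping


# Made-up example data: one old label and one list of top terms per data point.
old_labels = ["a", "a", "b", "b"]
top_terms = [
    ["price", "cheap"],
    ["price", "deal"],
    ["delivery", "late"],
    ["delivery", "price"],
]

mapping = relabel_by_top_terms(old_labels, top_terms, top_n=2)
new_labels = [mapping[label] for label in old_labels]
```

In this sketch, a larger top_n yields longer and more specific labels, while a smaller one yields shorter labels that are more likely to collide across categories.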
Examples
Steps can reference columns (ds.first_name), datasets (ds or ds[["first_name", "last_name"]]) or models (referenced by name, e.g. "churn-clf").
Inputs
Outputs
step(..., {"param": "value", ...}) -> (output)
Parameters
TFIDF
BACKGROUND
FOREGROUND
UPLIFT
ORDINAL
ALPHANUM