Use language models to calculate an embedding for each text in the provided column.
The resulting embeddings can then be used in steps such as link_embeddings, for example, to create a network of texts connected by similarity.
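To illustrate the idea (this is a sketch of the general technique, not the platform's actual implementation), the snippet below links texts whose embedding vectors exceed a cosine-similarity threshold; the vectors and the threshold are made up for the example.

```python
import numpy as np

# Toy embeddings: one row per text (in practice these come from a language model).
embeddings = np.array([
    [1.0, 0.0, 0.1],
    [0.9, 0.1, 0.0],   # similar to the first text
    [0.0, 1.0, 0.2],   # different topic
])

# Normalize rows so dot products become cosine similarities.
unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
sim = unit @ unit.T

# Connect pairs of texts whose similarity exceeds a threshold.
threshold = 0.8
edges = [(i, j)
         for i in range(len(sim))
         for j in range(i + 1, len(sim))
         if sim[i, j] > threshold]
print(edges)  # [(0, 1)]
```

Only the first two texts end up linked, since their vectors point in nearly the same direction.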
In this step, embeddings of texts are calculated using pre-trained neural language models, especially those using the popular transformer architecture (e.g. BERT-based models).
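Transformer models such as BERT produce one vector per token; a common way to derive a single embedding for a whole text is to mean-pool the token vectors. The sketch below illustrates just that pooling step, with random vectors standing in for real model output (which the platform would compute internally).

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a transformer's output: one 768-dimensional vector per token.
# A real BERT-based model would produce these from the input text.
num_tokens, hidden_size = 7, 768
token_vectors = rng.normal(size=(num_tokens, hidden_size))

# Mean pooling: average over the token axis to get one vector per text.
text_embedding = token_vectors.mean(axis=0)
print(text_embedding.shape)  # (768,)
```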
Unlike embed_text, which uses a different, appropriate spaCy model for each language in the text column, this step always uses a single model to calculate embeddings. This means the model should be multilingual if you have mixed languages; otherwise you need to choose the correct model for your (single) language.

Examples
Step parameters may refer to columns (e.g. ds.first_name), datasets (ds or ds[["first_name", "last_name"]]) or models (referenced by name, e.g. "churn-clf").
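Putting this together, a call to this step might look as follows. The column names and the "model" parameter are hypothetical, chosen only to illustrate the syntax; the actual parameter names are listed under Parameters below.

```
step(ds.text, {"model": "bert-base-multilingual-cased"}) -> (ds.embedding)
```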
Inputs
Outputs
step(..., {"param": "value", ...}) -> (output)
Parameters
Examples