Detect the language used for each text in the input column.
Examples
Step inputs can be columns (ds.first_name), datasets (ds or ds[["first_name", "last_name"]]) or models (referenced by name, e.g. "churn-clf").
Inputs
Outputs
Parameters are passed as a dictionary after the inputs: step(..., {"param": "value", ...}) -> (output).
Parameters
"lingua": https://github.com/pemistahl/lingua-py
"fasttext": https://fasttext.cc/docs/en/language-identification.html
"langdetect": https://github.com/Mimino666/langdetect
"langid": https://github.com/saffsd/langid.py
Allowed values: lingua, fasttext, langdetect, langid
If set to true or null, detection is restricted to the languages for which we have spaCy models, because this is the most common use of language detection in Graphext (e.g. applying the correct spaCy language model to extract keywords).
If set to false, detection of all languages supported by the selected model is allowed.
If set to a list of ISO 639-1 codes, only these languages are detected (if supported by the model).
Array items
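The three cases for the language restriction (true or null, false, or a list of ISO 639-1 codes) can be illustrated with a small Python sketch. This is not Graphext's implementation: the function name resolve_languages, the SPACY_LANGUAGES set and the example model languages are all hypothetical and exist only to show the selection logic.

```python
from typing import Iterable, List, Optional, Set, Union

# Hypothetical placeholder: ISO 639-1 codes assumed to have a spaCy model.
SPACY_LANGUAGES: Set[str] = {"en", "es", "fr", "de"}


def resolve_languages(
    setting: Union[bool, List[str], None],
    model_languages: Iterable[str],
    spacy_languages: Iterable[str] = SPACY_LANGUAGES,
) -> Set[str]:
    """Return the set of language codes the detector may report.

    setting: True/None, False, or a list of ISO 639-1 codes.
    model_languages: codes supported by the selected detection model.
    """
    supported = set(model_languages)

    # true or null: only languages that also have a spaCy model.
    if setting is None or setting is True:
        return supported & set(spacy_languages)

    # false: every language the selected model supports.
    if setting is False:
        return supported

    # A bare string would be iterated character by character, so reject it.
    if isinstance(setting, str):
        raise TypeError("expected a list of ISO 639-1 codes, not a string")

    # Explicit list: keep only the codes the model actually supports.
    return {code for code in setting if code in supported}


if __name__ == "__main__":
    # Hypothetical set of languages supported by some detection model.
    model = {"en", "es", "ja", "ru"}
    print(sorted(resolve_languages(None, model)))
    print(sorted(resolve_languages(False, model)))
    print(sorted(resolve_languages(["ja", "xx"], model)))
```

Codes in an explicit list that the model does not support are silently dropped here, mirroring the "if supported by the model" qualifier. A real implementation might warn or raise instead.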