Categorize people into fields of occupation using their bios (biographies).
The categorization uses a predefined lookup table that maps keywords to fields of occupation. For example, a bio is categorized as “journalists” if its text contains any of the following words: “periodista”, “journalist”, “journalism”, “periodismo”, “news”, “noticia”, “noticias”.
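The keyword lookup described above can be sketched in Python as follows. This is only an illustration of the idea, not the step's actual implementation: the names KEYWORDS_BY_FIELD and categorize_bio are hypothetical, only the “journalists” keywords come from the description above, and case-insensitive whole-word matching is an assumption.

```python
import re

# Hypothetical lookup table. Only the "journalists" keywords are taken from
# the description above; the real table used by the step is not shown here.
KEYWORDS_BY_FIELD = {
    "journalists": [
        "periodista", "journalist", "journalism", "periodismo",
        "news", "noticia", "noticias",
    ],
}


def categorize_bio(bio, table=KEYWORDS_BY_FIELD):
    """Return the sorted fields whose keywords occur in the bio.

    Assumption: matching is case-insensitive and on whole words.
    """
    # Lowercase the bio and split it into a set of word tokens.
    words = set(re.findall(r"\w+", bio.lower()))
    # A field matches if at least one of its keywords is among the tokens.
    return sorted(
        field
        for field, keywords in table.items()
        if words.intersection(keywords)
    )
```

Under these assumptions a bio can receive zero, one or several fields, which is consistent with the output being one or more fields of occupation per bio.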
Possible categories currently are:
The following example shows how the step can be used in a recipe.
Examples
This step has no configuration parameters, so simply use
General syntax for using the step in a recipe. It shows the inputs the step expects to receive and the outputs it will produce. For further details see the sections below.
The following are the inputs expected by the step and the outputs it produces. These are generally columns (ds.first_name), datasets (ds or ds[["first_name", "last_name"]]) or models (referenced by name e.g. "churn-clf").
Inputs
A column containing biographies (e.g. from social network profiles).
Outputs
A column containing one or more fields of occupation for each bio.
The following parameters can be used to configure the behaviour of the step by including them in a JSON object as the last “input” to the step, i.e. step(..., {"param": "value", ...}) -> (output).
Parameters
This step doesn’t expect any configuration.