Label bios


Categorize people into fields of occupation using their bios (biographies).

The categorization uses a predefined lookup table that maps keywords to fields of occupation. For example, a bio is categorized as "journalists" if its text contains any of the following words: "periodista", "journalist", "journalism", "periodismo", "news", "noticia", "noticias".

The currently available categories are:

  • journalists
  • business
  • developers
  • marketing
  • travel
  • photography
  • university
  • seo
  • blogging
  • sports
  • politics
  • social sciences
  • medical
  • entertainment
  • art design
  • economics
  • videogames
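
The keyword lookup described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the step's actual implementation: the `KEYWORDS` table and the `label_bio` function are hypothetical names, only the "journalists" keywords come from this page (the "developers" keywords are invented for the example), and matching on lowercased whole words is an assumption about how "contains" is interpreted. Returning an empty list for a bio with no matches is likewise an assumption.

```python
import re

# Hypothetical lookup table. Only the "journalists" keywords are taken from
# the documentation; the "developers" keywords are made up for illustration.
KEYWORDS = {
    "journalists": [
        "periodista", "journalist", "journalism", "periodismo",
        "news", "noticia", "noticias",
    ],
    "developers": ["developer", "programmer", "desarrollador"],
}


def label_bio(bio: str) -> list[str]:
    """Return every category whose keyword list shares a word with the bio."""
    # Assumption: case-insensitive, whole-word matching.
    words = set(re.findall(r"\w+", bio.lower()))
    return [
        category
        for category, keywords in KEYWORDS.items()
        if words.intersection(keywords)
    ]


bios = ["Periodista y developer en Madrid", "Coffee lover"]
labels = [label_bio(bio) for bio in bios]
print(labels)
```

A single bio can receive several labels, which is why each result is a list of categories rather than a single value.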


The following are the step's expected inputs and outputs and their specific types.

Step signature
label_bios(bios: text) -> (labels: list[category])

This step has no configuration parameters, so no {"param": value} object is needed. Simply call it with the input and output columns:

Example call (in recipe editor)
label_bios(ds.text) -> (ds.field_of_occupation)


bios: column:text

A column containing biographies (e.g. from social network profiles).


labels: column:list[category]

A column containing one or more fields of occupation for each bio.