
Extract text features

NLP, text

Parse and process texts to extract multiple features at once.

In effect, this step combines all of the following steps into one:

  • embed_text
  • extract_emoji
  • extract_entities
  • extract_hashtags
  • extract_keywords
  • extract_mentions
  • infer_sentiment
  • tokenize

Note that this step does not currently allow detailed configuration of each extracted feature. For that level of control, use one or more of the individual steps listed above.
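To make the idea of "several extraction steps in one pass" concrete, here is a minimal Python sketch. It is not how the step is implemented: the regular expressions are deliberately simplified, the function names (including extract_features_sketch) are hypothetical, and only three of the features are covered (hashtags, mentions, tokens). Sentiment, embeddings, keywords, emoji and entities need language models and are left out.

```python
import re
from typing import Dict, List

# Simplified patterns, chosen for illustration only (an assumption,
# not the patterns used by the actual steps).
HASHTAG_RE = re.compile(r"#(\w+)")
MENTION_RE = re.compile(r"@(\w+)")
TOKEN_RE = re.compile(r"\w+")


def extract_hashtags(text: str) -> List[str]:
    """Return hashtags without the leading '#'."""
    return HASHTAG_RE.findall(text)


def extract_mentions(text: str) -> List[str]:
    """Return mentions without the leading '@'."""
    return MENTION_RE.findall(text)


def tokenize(text: str) -> List[str]:
    """Return lowercased word tokens."""
    return [token.lower() for token in TOKEN_RE.findall(text)]


def extract_features_sketch(text: str) -> Dict[str, List[str]]:
    """Hypothetical combined step: run each extractor once and
    return one named output per feature."""
    return {
        "Hashtags": extract_hashtags(text),
        "Mentions": extract_mentions(text),
        "Tokens": tokenize(text),
    }


row = extract_features_sketch("Thanks @alice for the talk! #NLP #data")
# row["Hashtags"] is ["NLP", "data"]
# row["Mentions"] is ["alice"]
```

The combined function exposes no per-extractor options, which mirrors the trade-off described above: one call is convenient, while the individual steps are the place for fine-grained configuration.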

Usage


The following are the step's expected inputs and outputs and their specific types.

Step signature
extract_text_features(
    text: text,
    lang: category, 
    {
        "param": value
    }
) -> (
    Sentiment: number,
    Embedding: list[number],
    Hashtags: list[category],
    Mentions: list[category],
    Keywords: list[category],
    Tokens: list[category],
    Emoji: list[category],
    People: list[category],
    Groups: list[category],
    Organizatons: list[category],
    GPEs: list[category],
    Locations: list[category],
    Products: list[category],
    Events: list[category],
    Money: list[category]
)

where the object {"param": value} is optional in most cases and, if present, may contain any of the parameters described in the Parameters section below.

Inputs


text: column:text


lang: column:category

Outputs


Sentiment: column:number


Embedding: column:list[number]


Hashtags: column:list[category]


Mentions: column:list[category]


Keywords: column:list[category]


Tokens: column:list[category]


Emoji: column:list[category]


People: column:list[category]


Groups: column:list[category]


Organizatons: column:list[category]


GPEs: column:list[category]


Locations: column:list[category]


Products: column:list[category]


Events: column:list[category]


Money: column:list[category]

Parameters