Group a dataset of tweets by author and calculate relevant author statistics.
Similar to an aggregate step, but with a predefined set of aggregation functions. See the ds_out argument below for the columns generated in the resulting dataset.
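As a rough illustration of the grouping described above, the following plain-Python sketch groups tweet rows by author and derives a few of the output columns. It is not the step's implementation: the sample rows, the `aggregate_authors` helper, and the choice to join texts with a single space are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical sample rows, using the input column names the step expects.
tweets = [
    {"author_id": "1", "author_handler": "ada", "id": "100",
     "date": "2021-01-01", "retweets": 3, "favorites": 5, "text": "hello"},
    {"author_id": "1", "author_handler": "ada", "id": "101",
     "date": "2021-01-02", "retweets": 1, "favorites": 0, "text": "world"},
    {"author_id": "2", "author_handler": "bob", "id": "102",
     "date": "2021-01-02", "retweets": 0, "favorites": 2, "text": "hi"},
]


def aggregate_authors(rows):
    """Group tweet rows by author_id and compute per-author statistics."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["author_id"]].append(row)

    authors = []
    for author_id, group in groups.items():
        authors.append({
            "author_id": author_id,
            "handler": group[0]["author_handler"],
            "tweet_count": len(group),
            "dates": [r["date"] for r in group],
            "tweet_ids": [r["id"] for r in group],
            "retweets": sum(r["retweets"] for r in group),
            "favorites": sum(r["favorites"] for r in group),
            # Assumption: texts are concatenated with a single space.
            "tweet_text": " ".join(r["text"] for r in group),
        })
    return authors


authors = aggregate_authors(tweets)
```

Each element of `authors` corresponds to one row of the resulting dataset, with one entry per distinct `author_id`.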
Examples
Inputs and outputs are generally columns (e.g. ds.first_name), datasets (ds or ds[["first_name", "last_name"]]) or models (referenced by name, e.g. "churn-clf").
Inputs
Outputs
author_id: Official Twitter ID
tweet_count: Number of tweets by this author
handler: Official Twitter handle
name: User name
pic: Link to user’s profile picture
links: A list of links mentioned by the user
dates: A list of dates of published tweets by this author
tweet_ids: The official Twitter IDs of the tweets published by the author
retweets: The number of retweets received
favorites: The number of favorites received
mention_ids: List of other accounts (IDs) the author has mentioned
mention_names: List of other accounts (names) the author has mentioned
rp_user_ids: List of other accounts (IDs) the author has replied to
rp_user_names: List of other accounts (names) the author has replied to
mentions: The count of mentions received
replies: The count of replies received
tweet_text: The text of the author’s tweets, concatenated.

Parameters are passed in an object as the last argument to the step, i.e. step(..., {"param": "value", ...}) -> (output).
Parameters
When this option is enabled, accounts that are mentioned or replied to (columns mention_ids, mention_names and/or rp_user_id, rp_user_name) will be added as rows in the result, even if they have no tweets in the original dataset. The result will also include mentions and replies columns recording how many times each account was mentioned or replied to.

Column names in the input dataset can be mapped to the expected names with an object such as {"your_column": "author_id"}. The expected column names are [author_id, author_handler, author_name, author_avatar, links, date, id, retweets, favorites, mention_ids, mention_names, rp_user_id, rp_user_name, text].
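The two behaviours described above, renaming columns to the expected names and adding mentioned or replied-to accounts as extra rows, can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the sample rows, the helper names `rename_columns` and `add_mentions_and_replies`, the single reply target per tweet, and the `tweet_count` of 0 for accounts without tweets are hypothetical and not taken from the step itself.

```python
from collections import Counter

# Hypothetical input whose columns do not use the expected names.
raw_rows = [
    {"user": "1", "mentioned": ["2", "3"], "replied_to": None},
    {"user": "2", "mentioned": ["3"], "replied_to": "1"},
]

# Mapping in the {"your_column": "expected_name"} form.
column_map = {
    "user": "author_id",
    "mentioned": "mention_ids",
    "replied_to": "rp_user_id",
}


def rename_columns(row, mapping):
    """Rename keys found in the mapping; leave all other keys unchanged."""
    return {mapping.get(key, key): value for key, value in row.items()}


def add_mentions_and_replies(rows):
    """Count mentions/replies received and add referenced accounts as rows."""
    tweet_counts, mentions, replies = Counter(), Counter(), Counter()
    for row in rows:
        tweet_counts[row["author_id"]] += 1
        mentions.update(row.get("mention_ids") or [])
        # Assumption: each tweet replies to at most one account.
        if row.get("rp_user_id") is not None:
            replies[row["rp_user_id"]] += 1

    # Authors first, then accounts that were only mentioned or replied to.
    accounts = list(tweet_counts)
    for account in list(mentions) + list(replies):
        if account not in accounts:
            accounts.append(account)

    return [
        {
            "author_id": account,
            # Assumption: accounts without tweets get a count of 0.
            "tweet_count": tweet_counts[account],
            "mentions": mentions[account],
            "replies": replies[account],
        }
        for account in accounts
    ]


rows = [rename_columns(row, column_map) for row in raw_rows]
result = add_mentions_and_replies(rows)
```

In this sample, account "3" never tweets but is mentioned twice, so it still appears as a row in `result`.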
.Pattern properties