Replace parts of text detected with a regular expression.
In the simplest case, a fixed pattern is replaced with a fixed string, e.g. {"pattern": "hi", "replacement": "hello"}.
However, using capturing groups in the pattern and replacement parameters allows for much greater flexibility.
For example, if a column of texts includes Twitter mentions of the form “@abc”, the regular expression "pattern": "@(\\w*)" will match these mentions and save the actual name, without the “@” character, in a capturing group. Using the replacement string "replacement": "{1}" will then replace each matched mention with only the name part of the Twitter handle, effectively removing the “@” tags from all mentions. Note that \w* also matches zero characters, so every “@” in the text is matched and removed, including a bare “@” that is not followed by a name; use \w+ to require at least one name character and leave a bare “@” untouched.
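The mention example can be tried outside the step with Python's re module. This is only a sketch: it assumes the step's "{1}" group references behave like Python's \g<1> backreferences, and the helper names to_python_replacement and replace_regex are hypothetical, not part of the step's API.

```python
import re


def to_python_replacement(replacement: str) -> str:
    """Translate "{1}"-style group references into Python's "\\g<1>" form.

    Assumption: the step's "{n}" syntax refers to the n-th capturing group.
    """
    return re.sub(r"\{(\d+)\}", r"\\g<\1>", replacement)


def replace_regex(text: str, pattern: str, replacement: str) -> str:
    """Replace every match of `pattern` in `text` using `replacement`."""
    return re.sub(pattern, to_python_replacement(replacement), text)


# Fixed pattern, fixed replacement.
print(replace_regex("hi there", "hi", "hello"))  # hello there

# Capturing group: keep the name, drop the "@".
print(replace_regex("thanks @abc and @xyz_1!", r"@(\w*)", "{1}"))  # thanks abc and xyz_1!

# With \w+ a bare "@" is not matched and therefore stays in place.
print(replace_regex("mail me @ home", r"@(\w+)", "{1}"))  # mail me @ home
```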
To further familiarize yourself with the regex language also see these references:
Examples
Steps can reference columns (ds.first_name), datasets (ds or ds[["first_name", "last_name"]]) or models (referenced by name, e.g. "churn-clf").
Inputs
Outputs
text: if input is text and parameter "as_category": false
category: if input is not a column of lists and "as_category": true
list: if input is a column of lists
step(..., {"param": "value", ...}) -> (output).
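The output-type rules listed under Outputs can be written as a small decision function. This is an illustrative sketch only: the function name output_type and the string labels for input types are hypothetical. A non-text, non-list input with "as_category": false is not covered by the rules, so the sketch raises an error there rather than guessing.

```python
def output_type(input_type: str, as_category: bool) -> str:
    """Return the documented output type for a given input type.

    `input_type` is a hypothetical label such as "text", "category" or "list".
    """
    if input_type == "list":
        # A column of lists always yields a column of lists.
        return "list"
    if as_category:
        # Any non-list input becomes a category when as_category is true.
        return "category"
    if input_type == "text":
        return "text"
    # Not covered by the documented rules.
    raise ValueError(f"undocumented combination: {input_type!r}, as_category=False")


print(output_type("text", False))  # text
print(output_type("text", True))   # category
print(output_type("list", True))   # list
```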
Parameters
pattern
may include (numbered) regex capturing groups, which allow this step to use parts of a match to format how matches are replaced in the output via the replacement parameter (see below).
Array items
a, ascii, debug, i, ignorecase, l, locale, m, multiline, s, dotall, x, verbose
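The allowed values above match the names of the flags in Python's re module. The sketch below shows how such a list of names could be combined into a single flag value. It assumes that the short and long names are aliases (as re.I and re.IGNORECASE are in Python) and that the step combines them with a bitwise OR. FLAG_NAMES and combine_flags are hypothetical names used for illustration.

```python
import re

# Assumed mapping from the documented names to Python's re flags.
FLAG_NAMES = {
    "a": re.ASCII, "ascii": re.ASCII,
    "debug": re.DEBUG,
    "i": re.IGNORECASE, "ignorecase": re.IGNORECASE,
    "l": re.LOCALE, "locale": re.LOCALE,
    "m": re.MULTILINE, "multiline": re.MULTILINE,
    "s": re.DOTALL, "dotall": re.DOTALL,
    "x": re.VERBOSE, "verbose": re.VERBOSE,
}


def combine_flags(names):
    """OR together the re flags named in `names` (0 means no flags)."""
    flags = 0
    for name in names:
        flags |= FLAG_NAMES[name]
    return flags


# Case-insensitive replacement: "Hi" matches the pattern "hi".
print(re.sub("hi", "hello", "Hi there", flags=combine_flags(["i"])))  # hello there

# Multiline: "^" matches at the start of every line, not only the first.
print(re.sub(r"^a", "b", "a\na", flags=combine_flags(["multiline"])))
```

Note that in Python the locale flag is only valid for bytes patterns, so it is left out of the usage lines above.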