filter_rows
Filter rows using Graphext’s advanced query syntax (similar to Elasticsearch).
Usage
The following examples show how the step can be used in a recipe.
This simple query creates a new dataset that includes only those rows where the ‘age’ column is greater than 18:
Filter clients who are legally adults.
Select clients who are exactly 19 years old.
Filter clients who are over 27 years old but below the mean age.
Select all clients belonging to the cool class.
Select clients belonging to the 4 most frequent classes.
Filter clients aged 18 who earn more than 50 dollars monthly on average.
Filter fire dates in 2020.
Select rows where the text column contains both “he” and “she”.
Filter rows where ‘Average Monthly Hours’ is greater than 5 and less than 7.
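The row-selection logic behind a few of the examples above can be illustrated in plain Python. This is only a sketch of the filtering semantics, not Graphext query syntax: the sample data, the column names and the helper keep_rows are hypothetical and do not come from the step itself.

```python
from statistics import mean

# Hypothetical sample data; column names loosely mirror the examples above.
clients = [
    {"name": "Ana", "age": 17, "class": "cool"},
    {"name": "Ben", "age": 19, "class": "calm"},
    {"name": "Cleo", "age": 28, "class": "cool"},
    {"name": "Dev", "age": 52, "class": "keen"},
]


def keep_rows(rows, predicate):
    """Return a new list with only the rows for which predicate(row) is true."""
    return [row for row in rows if predicate(row)]


# Rows where 'age' is greater than 18.
adults = keep_rows(clients, lambda r: r["age"] > 18)

# Rows where 'age' is over 27 but below the mean age of all rows.
mean_age = mean(r["age"] for r in clients)
over_27_below_mean = keep_rows(clients, lambda r: 27 < r["age"] < mean_age)

# Rows belonging to the 'cool' class.
cool = keep_rows(clients, lambda r: r["class"] == "cool")

print([r["name"] for r in adults])
print([r["name"] for r in over_27_below_mean])
print([r["name"] for r in cool])
```

In each case the original rows are left untouched and a new, smaller collection is returned, which matches the idea of the step creating a new dataset that contains only the matching rows.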
General syntax for using the step in a recipe. It shows the inputs the step expects to receive and the outputs it will produce. For further details see the sections below.
Inputs & Outputs
The following are the inputs expected by the step and the outputs it produces. These are generally columns (ds.first_name), datasets (ds or ds[["first_name", "last_name"]]) or models (referenced by name, e.g. "churn-clf").
Configuration
The following parameters can be used to configure the behaviour of the step by including them in a JSON object as the last “input” to the step, i.e. step(..., {"param": "value", ...}) -> (output).
The Graphext advanced query used to identify the rows to keep.