| Step | Fast | Description |
| --- | --- | --- |
| `filter_containing` | | Filter rows containing any or all of the specified values |
| `filter_duplicate_nodes` | | Remove duplicate nodes in a network |
| `filter_duplicates` | | Filter duplicate rows, keeping only the first or last of each set of duplicates found |
| `filter_missing` | | Filter rows based on missing values in one or more columns |
| `filter_range` | | Filter rows based on the numeric values in a given column |
| `filter_row_numbers` | | Filter rows by row number |
| `filter_rows` | | Filter rows using Graphext's advanced query syntax (similar to Elasticsearch) |
| `filter_sample` | | Randomly sample the dataset, optionally within groups (can be used to balance a dataset) |
| `filter_topn` | | Sort a dataset by selected columns and pick the first N rows (or exclude them) |
| `filter_values` | | Filter rows where a column matches the specified values exactly |
| `filter_with_formula` | | Filter rows using a (pandas-compatible) formula |
| `upsample` | | Upsample a dataset given a weight column |
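
For readers who think in pandas, several of the steps listed above have rough analogues there. The sketch below only illustrates the kind of row filtering each step describes; it is not Graphext's implementation or step syntax, and the DataFrame, its column names (`city`, `price`), and all thresholds are made up for the example.

```python
import pandas as pd

# Hypothetical example data; the column names are assumptions for illustration.
df = pd.DataFrame(
    {
        "city": ["Madrid", "Paris", "Madrid", "Rome", None],
        "price": [10.0, 25.0, 10.0, 40.0, 15.0],
    }
)

# Rough analogue of filter_duplicates: keep the first row of each duplicate set.
deduplicated = df.drop_duplicates(keep="first")

# Rough analogue of filter_missing: drop rows with a missing value in "city".
no_missing = df.dropna(subset=["city"])

# Rough analogue of filter_range: keep rows with "price" in [10, 30] (inclusive).
in_range = df[df["price"].between(10, 30)]

# Rough analogue of filter_topn: sort by "price" descending, keep the first 2 rows.
top_2 = df.sort_values("price", ascending=False).head(2)

# Rough analogue of filter_values: keep rows whose "city" equals one of the values.
matching = df[df["city"].isin(["Madrid", "Rome"])]

# Rough analogue of filter_with_formula: filter with a pandas query expression.
by_formula = df.query("price >= 15 and price < 30")
```

Each result is a new DataFrame and `df` itself is left unchanged, so the filters can be compared side by side or chained one after another.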