Prepare
Filter
Step | Fast | Description |
---|---|---|
filter_containing | | Filter rows containing any or all of a number of specified values |
filter_duplicate_nodes | | Remove duplicate nodes in a network |
filter_duplicates | | Filter duplicate rows, keeping only the first or last row of each set of duplicates |
filter_missing | | Filter rows based on missing values in one or more columns |
filter_range | | Filter rows based on the numeric values in a given column |
filter_row_numbers | | Filter rows by row number |
filter_rows | | Filter rows using Graphext's advanced query syntax (similar to Elasticsearch) |
filter_sample | | Randomly sample the dataset, optionally within groups (can be used to balance a dataset) |
filter_topn | | Sort a dataset by selected columns and pick the first N rows (or exclude them) |
filter_values | | Filter rows where a column matches the specified values exactly |
filter_with_formula | | Filter rows using a (pandas-compatible) formula |
upsample | | Upsample a dataset given a weight column |
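
To make the descriptions above more concrete, here is a minimal plain-Python sketch of the row-level behaviour that a few of these steps describe (duplicates, missing values, numeric range, top N). It is not Graphext's implementation or its step syntax: the function names, their parameters, and the sample rows are all hypothetical, and choices such as inclusive range bounds and descending sort order are assumptions made for illustration only.

```python
# Hypothetical sample data; None stands in for a missing value.
rows = [
    {"id": 1, "city": "Madrid", "score": 7.5},
    {"id": 2, "city": "Madrid", "score": None},
    {"id": 3, "city": "Lisbon", "score": 9.0},
    {"id": 4, "city": "Lisbon", "score": 4.0},
    {"id": 5, "city": "Paris", "score": 6.0},
]


def drop_duplicates(rows, key, keep="first"):
    """Keep only the first or last row of each set of rows sharing `key`."""
    ordered = rows if keep == "first" else list(reversed(rows))
    seen, kept = set(), []
    for row in ordered:
        if row[key] not in seen:
            seen.add(row[key])
            kept.append(row)
    # Restore the original row order when scanning from the end.
    return kept if keep == "first" else list(reversed(kept))


def drop_missing(rows, columns):
    """Drop rows with a missing value in any of the given columns."""
    return [r for r in rows if all(r[c] is not None for c in columns)]


def in_range(rows, column, low, high):
    """Keep rows whose numeric value lies in [low, high] (bounds assumed inclusive)."""
    return [r for r in rows if r[column] is not None and low <= r[column] <= high]


def top_n(rows, column, n):
    """Sort by `column` (descending, an assumption) and keep the first n rows."""
    present = [r for r in rows if r[column] is not None]
    return sorted(present, key=lambda r: r[column], reverse=True)[:n]


def ids(rows):
    """Helper for inspecting results."""
    return [r["id"] for r in rows]
```

For example, `ids(drop_duplicates(rows, "city", keep="last"))` keeps the last row seen for each city, while `ids(top_n(rows, "score", 2))` returns the two highest-scoring rows. The actual steps may treat missing values, ties, and range bounds differently.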