Surgically precise selection and filtering
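Logical operators: AND, OR, NOT
Comparison operators: <, >, <=, >=
Text matching operators: REGEX, SUBSTR, FUZZY
Frequency operators: TOP, FREQ
Sorting modes for TOP: BACKGROUND, FOREGROUND, UPLIFT, TFIDF, ORDINAL
Null values: NULL
Statistical values: MIN, MAX, MEAN, P25, MEDIAN, P75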
The NULL operator exclusively returns null rows.
age value greater than 10 and less than 55:
Text and category columns support the TOP, FREQ, FUZZY, SUBSTR and REGEX operators. They work similarly since both are based on text: categories are just very short expressions, whereas text tends to be longer.
A plain term such as ball will return all rows that contain the word ball by itself.
FUZZY(ball) will return all rows that contain the word ball, whether ball is part of other words or appears by itself. FUZZY is case insensitive and normalizes all input (a.k.a. ASCII folding) before searching.
SUBSTR(ball) will return only rows that contain ball as part of a word, but not by itself.
REGEX() will accept a regular expression string to match more complex patterns.
Selecting FUZZY(ball) yields more results than just ball.
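The differences between the four text matching forms can be emulated in a few lines of Python. This is only a sketch of the semantics described above, not the tool's implementation: the helper names (fold, words, match_term, match_fuzzy, match_substr, match_regex) are hypothetical, the plain term is assumed to be case sensitive, and SUBSTR is assumed to match when at least one longer word contains the term.

```python
import re
import unicodedata


def fold(text):
    """Lowercase and strip accents (a simple form of ASCII folding)."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).lower()


def words(text):
    """Split text into word tokens."""
    return re.findall(r"\w+", text)


def match_term(row, term):
    """Plain term: the term appears as a word by itself (assumed case sensitive)."""
    return term in words(row)


def match_fuzzy(row, term):
    """FUZZY(term): the folded term appears anywhere, inside a word or alone."""
    return fold(term) in fold(row)


def match_substr(row, term):
    """SUBSTR(term): the term is part of a longer word, not a word by itself."""
    return any(term in w and w != term for w in words(row))


def match_regex(row, pattern):
    """REGEX(pattern): the regular expression matches somewhere in the row."""
    return re.search(pattern, row) is not None


rows = ["the ball is red", "I play football", "Bállroom dancing", "no match here"]
plain = [r for r in rows if match_term(r, "ball")]
fuzzy = [r for r in rows if match_fuzzy(r, "ball")]
substr = [r for r in rows if match_substr(r, "ball")]
regex = [r for r in rows if match_regex(r, r"foot|red")]
```

In this toy data the fuzzy match keeps three rows (including the accented one, thanks to folding), while the plain term keeps only the row where ball stands alone.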
FREQ selects those terms whose frequency is greater than or equal to n: FREQ(10000) selects those values whose frequency is greater than or equal to 10K.
TOP selects the top n terms by frequency: TOP(10) selects the top 10.
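As a rough illustration of the two frequency operators, the sketch below reproduces their described behavior on a small list of category values. The function names freq and top and the sample column are made up for the example; the real operators run inside the query engine.

```python
from collections import Counter


def freq(values, n):
    """FREQ(n): values whose frequency is greater than or equal to n."""
    counts = Counter(values)
    return {v for v, c in counts.items() if c >= n}


def top(values, n):
    """TOP(n): the n most frequent values, most frequent first."""
    return [v for v, _ in Counter(values).most_common(n)]


column = ["cable"] * 5 + ["plug"] * 3 + ["fuse"] * 2 + ["lamp"]
frequent = freq(column, 3)   # values seen at least 3 times
top_two = top(column, 2)     # the two most frequent values
```

FREQ sets a threshold on the count, so the size of its result depends on the data, whereas TOP always returns at most n values.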
We can change TOP’s behavior by saying TOP(10, FOREGROUND), which would select the top 10 out of the current selection we have made.
TOP(10, UPLIFT) selects the top 10 after sorting them by how different the frequency is between the selection and the whole dataset. These operators are the same as the ones mentioned in the sorting section.
Since ELECTRONIC_CABLE has ~17K occurrences, it was left out of the greater than 20K selection.
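The effect of the second argument of TOP can be sketched as follows. Several things here are assumptions rather than documented behavior: that BACKGROUND (counts over the whole dataset) is the default, that FOREGROUND counts only the current selection, and in particular the UPLIFT score, which this sketch takes to be the value's share within the selection minus its share within the whole dataset. The source only says UPLIFT sorts by how different the two frequencies are, so the exact formula may differ.

```python
from collections import Counter


def top(dataset, selection, n, mode="BACKGROUND"):
    """Sketch of TOP(n, mode) for three of the sorting modes."""
    background = Counter(dataset)     # counts over the whole dataset
    foreground = Counter(selection)   # counts over the current selection
    if mode == "BACKGROUND":
        scores = dict(background)
    elif mode == "FOREGROUND":
        scores = dict(foreground)
    elif mode == "UPLIFT":
        # Assumed score: share within the selection minus share within the dataset.
        scores = {
            v: foreground[v] / len(selection) - background[v] / len(dataset)
            for v in foreground
        }
    else:
        raise ValueError(f"mode not covered by this sketch: {mode}")
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:n]


dataset = ["a"] * 6 + ["b"] * 3 + ["c"] * 1
selection = ["a", "b", "b", "c"]   # rows currently selected
by_background = top(dataset, selection, 1, "BACKGROUND")
by_foreground = top(dataset, selection, 1, "FOREGROUND")
by_uplift = top(dataset, selection, 2, "UPLIFT")
```

Here "a" dominates the whole dataset, but "b" and "c" are over-represented in the selection, so they rank first under the assumed uplift score.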
Text or category notation examples
department column exactly contains “engineering”:
text column exactly matches “he” and “she”:
department column:
Simply selecting a range in the little plot will create a query string with the greater/less than and AND operators.
For example, >= P25 AND <= P75 returns all rows that are both greater than or equal to P25 and less than or equal to P75. This effectively returns the interquartile range.
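To make the interquartile query concrete, here is a minimal Python sketch that computes P25 and P75 for a small list of ages and keeps the values between them. The sample data is invented, and the percentile method (the standard library's "inclusive" method) is an assumption; the tool may compute percentiles differently.

```python
import statistics

ages = [18, 22, 25, 31, 38, 44, 52, 60, 71]

# quantiles(..., n=4) returns the three cut points P25, MEDIAN and P75.
p25, median, p75 = statistics.quantiles(ages, n=4, method="inclusive")

# Equivalent of the query ">= P25 AND <= P75".
selected = [a for a in ages if p25 <= a <= p75]
```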
In numeric columns, numbers can also be specified using scientific notation. The following two numbers are both valid and represent the same number: 145000 and 1.45e5.
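The equivalence of the two spellings is easy to check in Python, which parses scientific notation the same way (shown only to illustrate the notation, not how the tool parses queries):

```python
# Both spellings parse to the same number.
plain = float("145000")
scientific = float("1.45e5")
same = plain == scientific
```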
Queries in numeric columns support the =, >, >=, < and <= operators.
Numerical notation examples
A date range is written as >= 2018-06-05T11:33:48.554Z AND <= 2021-06-26T05:56:18.172Z, which means anything after June 5th, 2018 at 11:33:48 and before June 26th, 2021 at 05:56:18, effectively returning dates within that time interval. As with numbers, simply creating a range in the cross filter will generate this query for you to adjust, in case more precision is needed.
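The date range query above behaves like an inclusive comparison between timestamps. The following Python sketch applies the same bounds to a few invented sample dates; parse_iso is a hypothetical helper, and treating the trailing Z as UTC follows the ISO 8601 convention rather than anything stated in the source.

```python
from datetime import datetime


def parse_iso(timestamp):
    """Parse an ISO 8601 timestamp, treating a trailing Z as UTC."""
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


start = parse_iso("2018-06-05T11:33:48.554Z")
end = parse_iso("2021-06-26T05:56:18.172Z")

dates = [
    "2018-01-01T00:00:00.000Z",
    "2019-07-15T09:30:00.000Z",
    "2021-06-26T05:56:18.172Z",
    "2022-03-01T12:00:00.000Z",
]

# Equivalent of ">= 2018-06-05T11:33:48.554Z AND <= 2021-06-26T05:56:18.172Z".
in_range = [d for d in dates if start <= parse_iso(d) <= end]
```

Because both bounds are inclusive, a date exactly equal to the upper bound is kept.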
Here are some examples of valid date notation:
To select all rows whose “date” field falls into the year 2019:
Date notation examples