Rediscover data exploration interactively
Cross filters are one of the most powerful tools Graphext offers. They are natural to use because they show the distribution of each of your variables, and they also let you explore different combinations of values, making the whole interface reactive.
For example, in this dataset holding transactions from an e-commerce store, we can filter the transactions made between 2020 and 2022:
This leaves us with 72% of the data, 1.1M rows out of the 1.6M we have in total.
Notice the relative scale on the right, spanning from 0 to 12% (really it’s more like ~13%). It now tells us what share of our data falls in each of the bars (called bins).
This (or any) selection affects every other cross filter. This is what makes them so powerful: they all behave like a single system that keeps you informed about the distributions of your variables.
When selecting the category GIFT_CARD, we see a very prominent decrease in sales from the end of 2021 onwards.
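The linked behaviour described above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not Graphext's implementation: the toy rows, the column names and the helper functions `cross_filter` and `distribution` are all hypothetical.

```python
from collections import Counter

# Hypothetical toy rows; column names are assumptions for illustration only.
rows = [
    {"year": 2019, "category": "GIFT_CARD"},
    {"year": 2020, "category": "GIFT_CARD"},
    {"year": 2021, "category": "SHOES"},
    {"year": 2021, "category": "GIFT_CARD"},
    {"year": 2022, "category": "SHOES"},
    {"year": 2023, "category": "SHOES"},
]

def cross_filter(rows, selections):
    """Keep the rows that satisfy every active selection (one predicate per column)."""
    return [r for r in rows if all(pred(r[col]) for col, pred in selections.items())]

def distribution(rows, column):
    """Share of the given rows that falls on each value of `column`."""
    counts = Counter(r[column] for r in rows)
    total = len(rows)
    return {value: n / total for value, n in counts.items()} if total else {}

# Selecting a range on one variable...
selections = {"year": lambda y: 2020 <= y <= 2022}
selected = cross_filter(rows, selections)

# ...changes both the fraction of data kept and the distribution of every other variable.
coverage = len(selected) / len(rows)
category_shares = distribution(selected, "category")
```

Adding a second entry to `selections` (for example on `category`) narrows the rows further, which is the sense in which all filters act as one system.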
It is worth noting that using cross filters affects the whole state of the application, meaning that Graph and Plot also react to whatever you are selecting.
Cross filters can also be sorted and searched, making surgically precise questions a breeze to answer.
You can sort categorical and text variables in several ways. The default is “by everything”, which simply sorts the values by frequency in descending order, so the most common items appear first.
You also have these other methods available:
A practical example
For example, say we select a specific demographic: women between the ages of 18 and 24.
We can see how the Category column changes based on this selection, and we can sort it by the most relevant data, that is, the values that differ most from the whole dataset without any selection.
If we sort Category based on Uplift, we see some interesting results:
Corsets, jewelry and hair extensions come up among the most distinctive results for this specific subset of data, which indeed makes sense.
Remember we are not sorting by frequency (which is the default), but rather by how different this distribution is from that of the original dataset, with no filters.
These results must be taken with a grain of salt, since most of these bars represent tens or hundreds of datapoints, which are completely dwarfed by the million or so datapoints we have. While promising, they represent a very small portion of our population. Take this into account in your own research.
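To make the idea of sorting by difference concrete, here is a minimal sketch. It assumes uplift is the share of a value inside the selection divided by its share in the whole dataset; the text above does not give Graphext's exact formula, so treat this definition, the `uplift_ranking` helper and the `min_count` parameter as hypothetical. The toy data is invented.

```python
from collections import Counter

def uplift_ranking(all_values, selected_values, min_count=0):
    """Rank values by (share in selection) / (share in whole dataset).

    Assumes `selected_values` is a subset of `all_values`.
    Returns (value, ratio, count_in_selection) tuples, highest ratio first.
    """
    overall = Counter(all_values)
    selected = Counter(selected_values)
    n_all, n_sel = len(all_values), len(selected_values)
    ranking = []
    for value, count in selected.items():
        if count < min_count:
            continue  # skip bars backed by too few rows to be trusted
        ratio = (count / n_sel) / (overall[value] / n_all)
        ranking.append((value, ratio, count))
    return sorted(ranking, key=lambda item: item[1], reverse=True)

# Hypothetical toy data: the whole dataset and the rows left after a selection.
all_categories = ["shoes"] * 50 + ["books"] * 40 + ["corsets"] * 10
selected_categories = ["shoes"] * 10 + ["books"] * 4 + ["corsets"] * 6

ranking = uplift_ranking(all_categories, selected_categories)
trusted = uplift_ranking(all_categories, selected_categories, min_count=5)
```

Here the rarest category ranks first even though it is not the most frequent one, and the `min_count` guard is one simple way to act on the caveat about small bars.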
Clicking the little magnifying glass in a cross filter will allow you to search through the different values it holds:
This popup allows you to select any segment belonging to that column. You can choose different rules for searching, like exact match or contains. Your choices are simply translated into an advanced filter query.
This magnifying glass is only available in text-based variables, like text or category. In numerical or date variables, you can access it via the options menu → Custom query selection.
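As a rough illustration of how search rules such as exact match or contains pick out values, consider the sketch below. The rule names, the case-insensitive default and the `matches` and `search_values` helpers are assumptions made for this example; they are not Graphext's query syntax.

```python
def matches(value, term, rule="contains", case_sensitive=False):
    """Check one value against a search term under a hypothetical rule name."""
    if not case_sensitive:
        value, term = value.lower(), term.lower()
    if rule == "exact":
        return value == term
    if rule == "contains":
        return term in value
    raise ValueError(f"unknown rule: {rule}")

def search_values(values, term, rule="contains"):
    """Return the distinct values that satisfy the rule, in first-seen order."""
    distinct = dict.fromkeys(values)  # dicts preserve insertion order
    return [v for v in distinct if matches(v, term, rule)]

# Hypothetical category values.
categories = ["GIFT_CARD", "GIFT_WRAP", "SHOES", "GIFT_CARD"]

contains_hits = search_values(categories, "gift")
exact_hits = search_values(categories, "gift_card", rule="exact")
```

The values returned by a search like this are what would then be selected in the cross filter, exactly as if you had clicked their bars.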
Just in case you missed it, you can group, pin and rearrange variables, so the most important information is always where you want it to be.