Segments
Group selections for easy access
What is a Segment?
A segment can be thought of as a group of cross filters. Segments let us save an arbitrary number of filters on several columns that together make semantic sense.
Demographics segment in the Titanic dataset, giving a broader perspective on the age and gender distribution of people.
Creating a Segment
Creating a segment involves making a selection of an arbitrary number of variables, and then “saving” that selection.
Step by step
First, make any selection you are interested in, and then head to the Save selection menu, just under the big figure showing the currently selected rows, in the top left corner of the screen.
If you have already created a segment, it will appear in the menu. Choose it to save the current selection to the existing segment. Otherwise, create a new segment by clicking the “New Segmentation” button.
In the video, we save three gender-age demographic groups for easier access: men in the 0–25, 25–50, and 50–90 age ranges.
These are saved as new categories inside a new variable, which can be used to select that data in a single click or even to train a machine learning model.
Practical example
For example, you could save demographic data in a more approachable way. In the Titanic dataset, age and gender are stored in two different variables. If we wanted a coarse perspective on these two factors, such as Young vs. Old people and Men vs. Women, we could create a “Demographics” segment.
By creating the four categories that result from combining these two factors, we now have a very quick way to reach the “Young Men” category, which selects all male passengers under the age of 25.
This is a simple example, but we could compose an arbitrarily complex filter, which would make reaching for these specific rows much easier.
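To make the idea of a saved, composed filter concrete, here is a minimal Python sketch. It is not the tool's implementation: the sample rows, the `age` and `sex` column names, the category names other than “Young Men”, and the `select` helper are all assumptions made for illustration.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]

# Hypothetical sample rows; the column names "age" and "sex" are assumptions.
passengers: List[Row] = [
    {"name": "A", "age": 22, "sex": "male"},
    {"name": "B", "age": 38, "sex": "female"},
    {"name": "C", "age": 4, "sex": "female"},
    {"name": "D", "age": 54, "sex": "male"},
]

# Each category of the "Demographics" segment stands for a saved
# combination of filters on two columns (illustrative names and cutoffs).
demographics: Dict[str, Callable[[Row], bool]] = {
    "Young Men": lambda r: r["sex"] == "male" and r["age"] < 25,
    "Young Women": lambda r: r["sex"] == "female" and r["age"] < 25,
    "Old Men": lambda r: r["sex"] == "male" and r["age"] >= 25,
    "Old Women": lambda r: r["sex"] == "female" and r["age"] >= 25,
}


def select(rows: List[Row], segment: Dict[str, Callable[[Row], bool]],
           category: str) -> List[Row]:
    """Return the rows matched by one saved category of a segment."""
    predicate = segment[category]
    return [row for row in rows if predicate(row)]


# Reaching for a category is a single lookup instead of rebuilding two filters.
young_men = select(passengers, demographics, "Young Men")
```

The same pattern would extend to more complex filters: each category stays a single named predicate, however many columns it combines.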
How it works
As we can see, this process just involves creating a new multivalued column that assigns the name of the segment to the selected rows. If the row was present in the initial filter, the row gets that category in the column.
Some rows may match several filters simultaneously, which is why the column is multivalued: each row can store an array of several category names.
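The mechanism described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the product's code: the `apply_segment` function, the “Age groups” column, and the two overlapping categories are hypothetical.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


def apply_segment(rows: List[Row], column: str,
                  categories: Dict[str, Callable[[Row], bool]]) -> List[Row]:
    """Return copies of the rows with a new multivalued column.

    Each row receives the list of category names whose filter it matches,
    so a row matching several filters stores several names.
    """
    result = []
    for row in rows:
        labels = [name for name, matches in categories.items() if matches(row)]
        result.append({**row, column: labels})
    return result


# Hypothetical rows and two deliberately overlapping filters.
rows = [{"age": 10}, {"age": 30}, {"age": 70}]
categories = {
    "Under 40": lambda r: r["age"] < 40,
    "Adults": lambda r: r["age"] >= 18,
}

segmented = apply_segment(rows, "Age groups", categories)
# The 30-year-old matches both filters, so its "Age groups" value
# holds both category names.
```

Storing a list per row, rather than a single label, is what lets overlapping selections coexist in one segment without one overwriting the other.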