A 10-minute tutorial walking you through the basics you need to know about recipes.
Writing a recipe is one way to instruct Graphext to build a project or visualization from some tabular data (the others being the Wizard and the application of pre-defined recipes). A recipe itself is nothing more than a number of steps, which are functions that accept some data and output new, transformed or enriched data. A recipe can have an arbitrary number of such steps and can generate an arbitrary number of intermediate datasets, but its output must always be a single dataset that serves as the basis for visual exploration in the resulting project.
When you open the Recipe Editor for the first time in a newly created project, the initial dataset is made available by default under the name `ds`, and so the simplest possible recipe is simply `create_project`, which accepts a dataset as input and has no output. This is a special case: since the result of this step is the creation of a project, it doesn’t generate any output that can be further processed inside the recipe.
In practice, however, you’ll almost always want to transform or enrich your dataset in some way, so you’ll add one or more of the many steps available in Graphext before the final step of project creation.
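The idea of a recipe as a chain of steps can be sketched in Python. This is only a conceptual model, not Graphext recipe syntax: the dataset representation (a list of dicts), the two step functions, and `run_recipe` are all hypothetical names chosen for illustration.

```python
from typing import Callable

# Assumption for illustration: a dataset is a plain list of rows (dicts).
# This is not how Graphext stores data.
Dataset = list[dict]
Step = Callable[[Dataset], Dataset]


def add_name_length(ds: Dataset) -> Dataset:
    """Hypothetical enrichment step: adds a column derived from another."""
    return [{**row, "name_length": len(row["name"])} for row in ds]


def keep_long_names(ds: Dataset) -> Dataset:
    """Hypothetical transformation step: keeps rows with longer names."""
    return [row for row in ds if row["name_length"] > 3]


def run_recipe(ds: Dataset, steps: list[Step]) -> Dataset:
    """Apply each step in order; the last result is the single output."""
    for step in steps:
        ds = step(ds)
    return ds


ds = [{"name": "Ada"}, {"name": "Grace"}]
result = run_recipe(ds, [add_name_length, keep_long_names])
print(result)  # [{'name': 'Grace', 'name_length': 5}]
```

Each step receives the output of the previous one, and whatever the last step returns plays the role of the single dataset the project is built from.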
Steps
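In general, the syntax for adding a step is very simple and always of the same form: the step’s name with its inputs and any parameters comes first and, after the arrow (`->`), you provide names for the outputs that the step will generate. Like the inputs, the outputs may be one or more columns or datasets.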
To differentiate between input datasets and columns, column names need to be prefixed with the name of the dataset they belong to, while datasets can be referred to by their name only. In other words, `ds` refers to the dataset with the name “ds”, and to pick out a specific column you’d use either `ds.my_column` or `ds["my_column"]`. The two forms are generally interchangeable, but the latter is required if a column name contains spaces.
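Why the bracket form is needed for names with spaces can be seen by analogy in Python, where the same two access styles exist. The `Dataset` class below is a toy stand-in written for this illustration; it is not Graphext's implementation and says nothing about how the Recipe Editor resolves names internally.

```python
class Dataset:
    """Toy stand-in for a dataset holding named columns."""

    def __init__(self, columns: dict[str, list]):
        self._columns = columns

    def __getitem__(self, name: str) -> list:
        # Bracket form: works for any column name, including ones with spaces.
        return self._columns[name]

    def __getattr__(self, name: str) -> list:
        # Dot form: only usable when the column name is a valid identifier.
        try:
            return self._columns[name]
        except KeyError:
            raise AttributeError(name) from None


ds = Dataset({"my_column": [1, 2], "my other column": [3, 4]})
print(ds.my_column == ds["my_column"])  # True
print(ds["my other column"])            # [3, 4]
```

Writing `ds.my other column` would not even parse, which is the same reason the quoted bracket form is the one to use for column names containing spaces.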
To give an example, a simple step that splits the texts in a given column in two at the first comma would take that column as its input and `{"pattern": ","}` as its parameters.
Usually, when you start typing the beginning of a step’s name in the Recipe
Editor, the rest of the step’s signature will be autocompleted, including
the default names of any outputs it creates. So you only need to change the
names if you don’t like the default ones (or if they clash with other outputs
you may have generated already).
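To make the comma-splitting step described above more concrete, here is a rough Python sketch of what such a step computes. It is not recipe syntax and not Graphext's implementation: the function name `split_at_first` is hypothetical, and how whitespace or texts without a comma are handled is an assumption of this sketch.

```python
def split_at_first(text: str, pattern: str = ",") -> tuple[str, str]:
    """Split text in two at the first occurrence of pattern."""
    # str.partition splits at the first match only; if the pattern is
    # absent, the second part is an empty string.
    left, _, right = text.partition(pattern)
    return left, right


column = ["Madrid, Spain", "Paris, France, Europe", "no comma"]

# One input column produces two output columns.
pairs = [split_at_first(text) for text in column]
left_col = [left for left, _ in pairs]
right_col = [right for _, right in pairs]

print(left_col)   # ['Madrid', 'Paris', 'no comma']
print(right_col)  # [' Spain', ' France, Europe', '']
```

The `pattern` argument plays the role of the step's parameter, and the two returned lists correspond to the named outputs you would give the step.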
Parameters
Parameters let you configure how a step will process its inputs. For those familiar with JSON or JavaScript, the syntax of parameters corresponds to a valid JSON object. For those who are not, it’s simply a number of quoted parameter names and corresponding values in between curly braces. E.g., in `{"pattern": ","}`, the parameters of the comma-splitting example, `"pattern"` is the parameter’s name and `","` its value.
In general, all parameter names must be quoted strings, while values may be
- quoted strings
- numbers
- lists of numbers or strings
- another, nested object in curly braces, following the above rules
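One way to check that a parameters object follows these rules is to parse it with any JSON parser. The sketch below uses Python's standard `json` module; only `"pattern"` comes from this tutorial, and every other parameter name is made up purely to show each kind of value.

```python
import json

# Only "pattern" appears in the tutorial; the other names are hypothetical
# and do not correspond to any particular Graphext step.
params_text = """
{
    "pattern": ",",
    "max_items": 2,
    "weights": [0.5, 1.5],
    "labels": ["left", "right"],
    "options": {"trim": "both"}
}
"""

params = json.loads(params_text)  # raises json.JSONDecodeError if invalid

for name, value in params.items():
    print(f"{name}: {type(value).__name__}")
```

A common mistake this catches is an unquoted parameter name such as `{pattern: ","}`, which is accepted in JavaScript but is not valid JSON.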