Steps

Steps are functions that tell Graphext how to process your data, pipeline-style. They live in the recipe as an ordered list of operations to perform on your data. Usually the output of one step feeds into the next, although this is not strictly necessary. Some steps output a new column, some only set metadata about a column, and others export your data, train a model, and much more.

Steps are written in a low-code language specific to Graphext recipes that resembles a mix of JavaScript and Python. The recipe editor helps by auto-completing step names and auto-filling their inputs. Here’s an example of a recipe with only two steps:

extract_json_values(ds.products, {
    "path": "name",
    "type": "category"
  }) -> (ds.productName)

create_project(ds)

The recipe contains two steps: extract_json_values and create_project. The create_project step is a special one whose only purpose is to instantiate the dataset (ds) and make it available for other steps to process.

The extract_json_values step takes two inputs: ds.products and a dictionary-like object of options. ds.products is a column on ds, made available by create_project. The step creates a new column on ds called productName containing the results of the transformation.

This is a powerful and relatively easy way of processing your data and having it readily available. It can sometimes be considerably faster than booting up a classic Python/R notebook, while still providing much of the same functionality. The complete list of available steps lives in the API Docs, also linked at the top of this page. Do not hesitate to reach out if you need any help!
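For readers coming from a notebook background, the idea behind the extract_json_values call in the recipe above can be sketched in plain Python. This is only an illustration and not Graphext's implementation: the function body, the dataset modelled as a dictionary of columns, and the sample product values are all assumptions made for the example, and the "type" option is accepted but ignored here.

```python
import json


def extract_json_values(column, options):
    """Hypothetical stand-in for the recipe step: pull one key out of each JSON value."""
    key = options["path"]
    values = []
    for raw in column:
        # Accept either a JSON string or an already-parsed object.
        record = json.loads(raw) if isinstance(raw, str) else raw
        # Rows that are not JSON objects, or that lack the key, yield None.
        values.append(record.get(key) if isinstance(record, dict) else None)
    return values


# Assumption: the dataset is modelled as a plain dict mapping column names to lists.
ds = {
    "products": [
        '{"name": "Laptop", "price": 999}',
        '{"name": "Mouse", "price": 25}',
    ]
}

# Rough equivalent of: extract_json_values(ds.products, {...}) -> (ds.productName)
ds["productName"] = extract_json_values(
    ds["products"], {"path": "name", "type": "category"}
)

print(ds["productName"])  # ['Laptop', 'Mouse']
```

As in the recipe, the step reads one column, applies a transformation configured by an options object, and writes the result to a new column on the same dataset.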
