Group data by specified columns and apply aggregation functions to each group.
The following examples show how the step can be used in a recipe.
Examples
This example groups the dataset by an exact match on the category column and a date component (month level) on the date column, and then aggregates the count of sales and the sum of revenue:
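A hedged sketch of how this example might be written. The step name group_dataset is a placeholder, and the aggregation item keys (name, column, type) and the date param key component are inferred from the parameter descriptions below rather than confirmed:

```
group_dataset(ds, {
  "by": [
    {"by": "category", "groupingType": "EXACT"},
    {"by": "date", "groupingType": "DATE_COMPONENT", "name": "month", "param": {"component": "MONTH"}}
  ],
  "aggregations": [
    {"name": "sales_count", "column": "sales", "type": "COUNT"},
    {"name": "revenue_sum", "column": "revenue", "type": "SUM"}
  ]
}) -> (ds_grouped)
```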
This example uses the simplified by parameter to group by an exact match on category. The aggregation calculates the average of revenue for each group:
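A hedged sketch of the simplified form, again using the placeholder step name group_dataset and assumed aggregation item keys:

```
group_dataset(ds, {
  "by": ["category"],
  "aggregations": [
    {"name": "avg_revenue", "column": "revenue", "type": "AVG"}
  ]
}) -> (ds_grouped)
```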
This example groups by category and creates a sorted list of values based on the sortCol column:
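A hedged sketch of this example. The field used to name the sort column is not documented in the extracted schema, so sortBy below is purely illustrative, as is the placeholder step name:

```
group_dataset(ds, {
  "by": ["category"],
  "aggregations": [
    {"name": "sorted_values", "column": "values", "type": "LIST", "sortBy": "sortCol"}
  ]
}) -> (ds_grouped)
```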
General syntax for using the step in a recipe. Shows the inputs and outputs the step is expected to receive and will produce respectively. For further details see the sections below.
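A minimal sketch of the general call shape, using group_dataset as a placeholder for the actual step name:

```
group_dataset(ds, {
  "by": [...],
  "aggregations": [...]
}) -> (ds_aggregated)
```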
The following are the inputs expected by the step and the outputs it produces. These are generally columns (ds.first_name), datasets (ds or ds[["first_name", "last_name"]]) or models (referenced by name, e.g. "churn-clf").
Inputs
The input dataset containing the columns to group by and apply aggregations on.
Outputs
A dataset containing the aggregated results based on the grouping operations.
The following parameters can be used to configure the behaviour of the step by including them in a JSON object as the last “input” to the step, i.e. step(..., {"param": "value", ...}) -> (output).
Parameters
Columns to group by.
An array specifying the columns used for grouping. The by parameter can be either:
A simple array of column names (e.g. ["column1", "column2"]), which defaults to EXACT grouping.
An array of objects, each with by, groupingType, optional name and optional param properties.
Array items
Column name to group by.
Column to group by.
Name of the output column. It is optional and defaults to the column name (by
parameter).
Type of grouping operation. You can group by exact value match, a date component, a range of numerical values, or quantiles.
Values must be one of the following:
EXACT
DATE_COMPONENT
RANGE
QUANTILES
Grouping parameters.
Options
Date component to group by.
Date component to group by (only for DATE_COMPONENT
grouping type).
Values must be one of the following:
MILLISECOND
SECOND
MINUTE
HOUR
YEAR_DAY
MONTH_DAY
WEEK_DAY
WEEK
WEEK_OF_YEAR
MONTH
QUARTER
YEAR
Timezone to use for date grouping. The timezone to apply when grouping by date component.
Date component to group by.
Date component to group by (only for DATE_COMPONENT
grouping type).
Values must be one of the following:
MILLISECOND
SECOND
MINUTE
HOUR
YEAR_DAY
MONTH_DAY
WEEK_DAY
WEEK
WEEK_OF_YEAR
MONTH
QUARTER
YEAR
Timezone to use for date grouping. The timezone to apply when grouping by date component.
Date interval to use. The interval unit to apply when grouping by range.
Values must be one of the following:
MILLISECOND
SECOND
MINUTE
HOUR
DAY
WEEK
MONTH
QUARTER
YEAR
Count of intervals in each range.
Number of intervals in each range when groupingType is RANGE.
Number of bins. Number of bins to divide the data into.
Values must be in the following range:
Use pretty bins. Whether to adjust bin edges to be more human-readable.
Range size as a number for RANGE grouping, or number of quantiles.
Specify a range size directly as a number when groupingType is RANGE, or the number of quantiles when groupingType is QUANTILES.
Null is accepted when no param is needed for the grouping type.
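To illustrate, the by items below sketch one possible param shape per grouping type. The key names component, timezone, interval and intervalCount are assumptions inferred from the option descriptions above, not confirmed field names, and the comments are for readability only:

```
// DATE_COMPONENT: group by a date part, optionally in a given timezone (assumed keys)
{"by": "date", "groupingType": "DATE_COMPONENT", "param": {"component": "MONTH", "timezone": "Europe/Madrid"}}

// RANGE on a date column: an interval unit and a count of intervals per range (assumed keys)
{"by": "date", "groupingType": "RANGE", "param": {"interval": "WEEK", "intervalCount": 2}}

// RANGE on a numeric column: the range size given directly as a number
{"by": "revenue", "groupingType": "RANGE", "param": 100}

// QUANTILES: the number of quantiles given directly as a number
{"by": "revenue", "groupingType": "QUANTILES", "param": 4}

// EXACT: no param is needed, so null is accepted
{"by": "category", "groupingType": "EXACT", "param": null}
```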
Aggregation functions to apply. An array specifying the aggregation functions to apply on each group. The array can be empty, in which case no aggregations are performed, but the dataset is still grouped by the specified columns.
Array items
Name of the output column.
Column on which the aggregation is applied. If null, the aggregation applies to the entire group.
Type of aggregation function.
The type of aggregation function to perform on the specified column. Includes support for standard aggregations (e.g. SUM, COUNT) as well as element-wise aggregations.
Notes:
PERCENT_OF_ROWS_WHERE: Computes the percentage of rows within each group where a condition is true.
PERCENT_OF_ROWS: Computes the percentage relative to the total number of rows across all groups.
Values must be one of the following:
COUNT
MIN
MAX
SUM
AVG
VARIANCE
STDEV
FIRST
LAST
P25
P50
P75
COUNT_WHERE
NUMBER_OF_ROWS
NUMBER_OF_ROWS_WHERE
PERCENT_OF_ROWS
PERCENT_OF_ROWS_WHERE
METRIC
MODE
UNIQUE_VALUES
LIST_UNIQUE
LIST
CONCATENATE
ELEMENT_COUNT
ELEMENT_MIN
ELEMENT_MAX
ELEMENT_SUM
ELEMENT_AVG
ELEMENT_VARIANCE
ELEMENT_STDEV
ELEMENT_FIRST
ELEMENT_LAST
The Graphext advanced query used to identify the rows to select prior to grouping.
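As a closing hedged illustration, an aggregations array mixing row counts, percentiles and list-style functions, assuming the item keys name, column and type described above:

```
"aggregations": [
  {"name": "n_rows", "column": null, "type": "NUMBER_OF_ROWS"},
  {"name": "revenue_p50", "column": "revenue", "type": "P50"},
  {"name": "unique_categories", "column": "category", "type": "LIST_UNIQUE"}
]
```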