- duration: The time interval between the start and end dates in the specified unit (default: days).
- observed: A boolean column indicating whether the event was observed (i.e., whether the end date falls on or before the observation end date).

- If either start_date or end_date is missing (null), observed will be false and duration will be null.
- Otherwise, duration is calculated as the interval between start_date and end_date.
- If end_date is not null, observed will be true if end_date <= observation_end; otherwise, it will be false.
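The rules above can be sketched for a single row in plain Python. This is only an illustration of the described behaviour, not the step's implementation: the function name duration_and_observed and the absolute argument are hypothetical, the duration is returned as fractional days, and the missing-date rule is assumed to take precedence over the observation check.

```python
from datetime import datetime
from typing import Optional, Tuple

SECONDS_PER_DAY = 86_400


def duration_and_observed(
    start_date: Optional[datetime],
    end_date: Optional[datetime],
    observation_end: datetime,
    absolute: bool = False,
) -> Tuple[Optional[float], bool]:
    """Return (duration in days, observed) for one row (hypothetical helper)."""
    # Missing start or end date: null duration, not observed (assumed to take precedence).
    if start_date is None or end_date is None:
        return None, False

    # Duration is the interval between start and end, expressed here in fractional days.
    duration = (end_date - start_date).total_seconds() / SECONDS_PER_DAY
    if absolute:
        duration = abs(duration)

    # The event counts as observed only if it ended on or before the cutoff.
    observed = end_date <= observation_end
    return duration, observed


cutoff = datetime(2024, 1, 1)
print(duration_and_observed(datetime(2023, 1, 1), datetime(2023, 1, 31), cutoff))  # (30.0, True)
print(duration_and_observed(datetime(2023, 12, 1), datetime(2024, 2, 1), cutoff))  # (62.0, False)
print(duration_and_observed(datetime(2023, 1, 1), None, cutoff))                   # (None, False)
```

Whether the step itself returns whole or fractional units is not stated here, so treat the numeric type as an assumption of the sketch.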
Usage
The following examples show how the step can be used in a recipe.
Examples
Calculate the duration and observation status between a start date and end date.
Inputs & Outputs
The following are the inputs expected by the step and the outputs it produces. These are generally columns (ds.first_name), datasets (ds or ds[["first_name", "last_name"]]) or models (referenced by name, e.g. "churn-clf").
Inputs
Outputs
Configuration
The following parameters can be used to configure the behaviour of the step by including them in a JSON object as the last "input" to the step, i.e. step(..., {"param": "value", ...}) -> (output).
Parameters
Observation end date.
The cutoff date to determine if the event was observed (e.g., churn or any other event).
Absolute value for duration.
Whether to use absolute values for the calculated duration.
Unit for duration.
The unit of measurement for the duration. Allowed values are:
- "Y", "year"
- "Q", "quarter"
- "M", "month"
- "W", "week"
- "D", "day"
- "h", "hour"
- "m", "minute"
- "s", "second"
- "ms", "millisecond"

The unit name can be spelled in singular or plural and is case-insensitive.

Values must be one of the following:
- Y, year, Year, years, Years
- Q, quarter, Quarter, quarters, Quarters
- M, month, Month, months, Months
- W, week, Week, weeks, Weeks
- D, day, Day, days, Days
- h, hour, Hour, hours, Hours
- m, minute, Minute, minutes, Minutes
- s, second, Second, seconds, Seconds
- ms, millisecond, Millisecond, milliseconds, Milliseconds
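As an illustration of how the accepted spellings might be resolved, the following Python sketch maps a unit string to its canonical name. The helper normalize_unit is hypothetical and not part of the step. It assumes that the short codes are matched exactly, since "M" (month) and "m" (minute) differ only in case, and that only the full names are case-insensitive and may be singular or plural.

```python
# Short codes are matched exactly (assumption): "M" is month, "m" is minute.
_ABBREVIATIONS = {
    "Y": "year",
    "Q": "quarter",
    "M": "month",
    "W": "week",
    "D": "day",
    "h": "hour",
    "m": "minute",
    "s": "second",
    "ms": "millisecond",
}
_FULL_NAMES = set(_ABBREVIATIONS.values())


def normalize_unit(unit: str) -> str:
    """Map an accepted unit spelling to its canonical singular name (hypothetical helper)."""
    if unit in _ABBREVIATIONS:
        return _ABBREVIATIONS[unit]

    # Full names: ignore case and drop a trailing plural "s".
    name = unit.lower()
    if name.endswith("s"):
        name = name[:-1]
    if name in _FULL_NAMES:
        return name

    raise ValueError(f"Unsupported duration unit: {unit!r}")


print(normalize_unit("M"))      # month
print(normalize_unit("m"))      # minute
print(normalize_unit("Weeks"))  # week
```

Note that this sketch also accepts spellings such as "YEARS", which goes slightly beyond the values enumerated above.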