time_interval
Calculates the duration of a time interval between two dates (datetimes/timestamps).
The dates can be specified either as two datetime columns (in which case the second is subtracted from the first), or as a single column and a reference date provided as a parameter (see since or until in the parameters below).
If only one column is provided as input, one of since or until must be specified as a reference date.
Usage
The following examples show how the step can be used in a recipe.
To get the positive number of hours since the last login of a user as of now:
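As an illustration of the computation only (this is plain Python, not recipe syntax), the following minimal sketch assumes the step is configured with hours as the unit, a positive-only result, and "now" as the until reference. The function name hours_since and the sample timestamps are hypothetical, and a fixed "now" is used so the result is reproducible.

```python
from datetime import datetime


def hours_since(last_login: datetime, now: datetime) -> float:
    """Positive number of hours between last_login and now."""
    delta = now - last_login  # mirrors the documented `until - date1`
    return abs(delta.total_seconds()) / 3600


# Hypothetical column values; "now" is fixed to keep the example reproducible.
now = datetime(2024, 1, 2, 12, 0)
last_logins = [datetime(2024, 1, 1, 12, 0), datetime(2024, 1, 2, 9, 30)]

hours = [hours_since(t, now) for t in last_logins]
print(hours)  # [24.0, 2.5]
```

Taking the absolute value corresponds to asking for a positive duration regardless of which date came first.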
Inputs & Outputs
The following are the inputs expected by the step and the outputs it produces. These are generally columns (ds.first_name), datasets (ds or ds[["first_name", "last_name"]]) or models (referenced by name, e.g. "churn-clf").
Configuration
The following parameters can be used to configure the behaviour of the step by including them in a JSON object as the last “input” to the step, i.e. step(..., {"param": "value", ...}) -> (output).
Unit for duration. The unit of measurement for the duration. Allowed values are:
- “Y”, “year”
- “Q”, “quarter”
- “M”, “month”
- “W”, “week”
- “D”, “day”
- “h”, “hour”
- “m”, “minute”
- “s”, “second”
- “ms”, “millisecond”
Unit names can be written in singular or plural, with or without an initial capital letter. The short forms are case-sensitive: “M” means month, while “m” means minute.
Values must be one of the following:
Y
year
Year
years
Years
Q
quarter
Quarter
quarters
Quarters
M
month
Month
months
Months
W
week
Week
weeks
Weeks
D
day
Day
days
Days
h
hour
Hour
hours
Hours
m
minute
Minute
minutes
Minutes
s
second
Second
seconds
Seconds
ms
millisecond
Millisecond
milliseconds
Milliseconds
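To make the naming rules concrete, here is a minimal Python sketch that maps any accepted spelling to a canonical unit name. The helper normalize_unit is hypothetical and not part of the step. It is slightly more lenient than the list above, in that it accepts any capitalisation of the full names.

```python
# Short forms are matched exactly, because "M" (month) and "m" (minute)
# differ only by case.
ABBREVIATIONS = {
    "Y": "year", "Q": "quarter", "M": "month", "W": "week", "D": "day",
    "h": "hour", "m": "minute", "s": "second", "ms": "millisecond",
}
NAMES = set(ABBREVIATIONS.values())


def normalize_unit(unit: str) -> str:
    """Return the canonical singular, lowercase name for a unit spelling."""
    if unit in ABBREVIATIONS:
        return ABBREVIATIONS[unit]
    name = unit.lower()
    # Accept the plural form by dropping a trailing "s".
    if name.endswith("s") and name[:-1] in NAMES:
        name = name[:-1]
    if name in NAMES:
        return name
    raise ValueError(f"unknown unit: {unit!r}")


print(normalize_unit("Hours"))  # hour
print(normalize_unit("M"))      # month
print(normalize_unit("m"))      # minute
```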
Return type of interval duration.
Whether the interval duration should always be returned as positive, independent of whether date1 occurred before or after date2.
Date start reference for intervals.
The reference date relative to which intervals are calculated when only one column is specified as input. The result will be date1 - since, i.e. intervals will be positive if dates in the column are more recent than the reference date (and negative otherwise). The date must be either a valid date string (preferably with the month before the day, e.g. “2021-12-31”) or the constant “now”.
Date ending reference for intervals.
The reference date relative to which intervals are calculated when only one column is specified as input. The result will be until - date1, i.e. intervals will be positive if the reference date is more recent than the dates in the column (and negative otherwise). The date must be either a valid date string (preferably with the month before the day, e.g. “2021-12-31”) or the constant “now”.
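The two reference parameters differ only in the direction of the subtraction. The following Python sketch illustrates the sign convention for a single column value, measured in days. The function names days_since and days_until and the sample dates are hypothetical.

```python
from datetime import date


def days_since(d: date, since: date) -> int:
    """date1 - since: positive when d is more recent than the reference."""
    return (d - since).days


def days_until(d: date, until: date) -> int:
    """until - date1: positive when the reference is more recent than d."""
    return (until - d).days


reference = date(2021, 12, 31)
signup = date(2022, 1, 10)  # hypothetical column value, 10 days after the reference

print(days_since(signup, reference))  # 10
print(days_until(signup, reference))  # -10
```

For the same column and reference date, the two results are equal in magnitude and opposite in sign.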