observed_duration
Calculate the duration between two dates and determine whether an event was observed on or before a specified observation date.
This step calculates the duration between a start date and an end date and determines whether an event was observed. The output consists of two columns:
duration
: The time interval between the start and end dates in the specified unit (default: days).
observed
: A boolean column indicating whether the event was observed (i.e., whether the end date falls on or before the observation date).
This is particularly useful for preparing input data for survival analysis, such as Kaplan-Meier curves, where the event observation (censoring) status and duration are key inputs.
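As a rough illustration of how these two columns feed a survival analysis, the sketch below computes a Kaplan-Meier (product-limit) estimate from a list of durations and observed flags. The function name kaplan_meier and the sample values are hypothetical and not part of this step; rows with observed set to false are treated as censored, which is the usual convention.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each distinct event time.

    Illustrative only: a plain product-limit estimator, not this step's code.
    """
    pairs = sorted(zip(durations, observed))
    at_risk = len(pairs)      # subjects still under observation
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        events = 0            # observed events at time t
        leaving = 0           # events plus censored rows at time t
        while i < len(pairs) and pairs[i][0] == t:
            events += int(pairs[i][1])
            leaving += 1
            i += 1
        if events:
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical durations (in days) and observed flags.
curve = kaplan_meier([5, 8, 8, 12], [True, True, False, True])
```

Censored rows never lower the survival estimate directly; they only reduce the number of subjects at risk for later event times.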
- If either start_date or end_date is missing (null), observed will be false, and duration will be null.
- Otherwise, the duration is calculated as the interval between start_date and end_date.
- If end_date is not null, observed will be true if end_date <= observation_end; otherwise, it will be false.
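The rules above can be sketched for a single row in plain Python. This is a minimal illustration under stated assumptions, not the step's actual implementation: the function signature is hypothetical, the duration is always expressed in days, and the absolute flag mirrors the "absolute value" option described under Configuration.

```python
from datetime import date, timedelta
from typing import Optional, Tuple


def observed_duration(
    start: Optional[date],
    end: Optional[date],
    observation_end: date,
    absolute: bool = False,
) -> Tuple[Optional[float], bool]:
    """Return (duration_in_days, observed) for one row (hypothetical helper)."""
    # Missing start or end: no duration, and the event counts as not observed.
    if start is None or end is None:
        return None, False

    # Interval between the two dates, expressed in days.
    days = (end - start) / timedelta(days=1)
    if absolute:
        days = abs(days)

    # Observed only if the end date falls on or before the cutoff.
    return days, end <= observation_end


cutoff = date(2023, 2, 1)
row_observed = observed_duration(date(2023, 1, 1), date(2023, 1, 11), cutoff)
row_censored = observed_duration(date(2023, 1, 1), date(2023, 3, 1), cutoff)
row_missing = observed_duration(None, date(2023, 1, 5), cutoff)
```

In this sketch a row whose end date lies after the cutoff still gets a duration; only its observed flag is false.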
Usage
The following examples show how the step can be used in a recipe.
Calculate the duration and observation status between a start date and end date.
Inputs & Outputs
The following are the inputs expected by the step and the outputs it produces. These are generally columns (ds.first_name), datasets (ds or ds[["first_name", "last_name"]]) or models (referenced by name e.g. "churn-clf").
Configuration
The following parameters can be used to configure the behaviour of the step by including them in a JSON object as the last “input” to the step, i.e. step(..., {"param": "value", ...}) -> (output).
Observation end date. The cutoff date to determine if the event was observed (e.g., churn or any other event).
Absolute value for duration. Whether to use absolute values for the calculated duration.
Unit for duration. The unit of measurement for the duration. Allowed values are:
- “Y”, “year”
- “Q”, “quarter”
- “M”, “month”
- “W”, “week”
- “D”, “day”
- “h”, “hour”
- “m”, “minute”
- “s”, “second”
- “ms”, “millisecond”
The unit name can be spelled in singular or plural and is case-insensitive.
Values must be one of the following:
Y, year, Year, years, Years
Q, quarter, Quarter, quarters, Quarters
M, month, Month, months, Months
W, week, Week, weeks, Weeks
D, day, Day, days, Days
h, hour, Hour, hours, Hours
m, minute, Minute, minutes, Minutes
s, second, Second, seconds, Seconds
ms, millisecond, Millisecond, milliseconds, Milliseconds
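One way to read the accepted spellings is as a small normalization step, sketched below. The helper normalize_unit and its lookup table are hypothetical and not part of the step. The sketch assumes that the short abbreviations are matched exactly, since "M" (month) and "m" (minute) differ only by case, while full unit names are matched case-insensitively in singular or plural form.

```python
# Abbreviations are matched exactly: "M" is month, "m" is minute.
ABBREVIATIONS = {
    "Y": "year", "Q": "quarter", "M": "month", "W": "week", "D": "day",
    "h": "hour", "m": "minute", "s": "second", "ms": "millisecond",
}
UNIT_NAMES = set(ABBREVIATIONS.values())


def normalize_unit(unit: str) -> str:
    """Map an accepted spelling to a canonical singular name (hypothetical)."""
    if unit in ABBREVIATIONS:
        return ABBREVIATIONS[unit]
    # Full names: ignore case and drop a trailing plural "s".
    name = unit.lower()
    if name.endswith("s"):
        name = name[:-1]
    if name in UNIT_NAMES:
        return name
    raise ValueError(f"Unsupported duration unit: {unit!r}")


canonical = [normalize_unit(u) for u in ["M", "m", "Months", "ms", "Seconds"]]
```

Checking the abbreviation table first keeps "s" and "ms" from being mistaken for plural endings.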