observed_duration
Calculate the duration between two dates and determine whether an event was observed on or before a specified observation date.
This step calculates the duration between a start date and an end date and determines whether an event was observed. The output consists of two columns:
duration
: The time interval between the start and end dates in the specified unit (default: days).
observed
: A boolean column indicating whether the event was observed (i.e., whether the end date falls on or before the observation date).
This is particularly useful for preparing input data for survival analysis, such as Kaplan-Meier curves, where the event observation (censoring) status and duration are key inputs.
- If either start_date or end_date is missing (null), observed will be false, and duration will be null.
- Otherwise, the duration is calculated as the interval between start_date and end_date.
- If end_date is not null, observed will be true if end_date <= observation_end; otherwise, it will be false.
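The rules above can be sketched in plain Python. This is an illustrative sketch only, not the step's actual implementation: the function name and signature are assumptions, only the default unit (days) is handled, and the duration is not capped at observation_end because the rules above do not mention capping.

```python
from datetime import date
from typing import Optional, Tuple


def observed_duration(
    start_date: Optional[date],
    end_date: Optional[date],
    observation_end: date,
) -> Tuple[Optional[int], bool]:
    """Return (duration_in_days, observed) for one row (hypothetical helper)."""
    # A missing start or end date gives a null duration and observed = False.
    if start_date is None or end_date is None:
        return None, False
    # Duration is the interval between the two dates, expressed in days.
    duration = (end_date - start_date).days
    # The event counts as observed when it ends on or before observation_end.
    observed = end_date <= observation_end
    return duration, observed


cutoff = date(2024, 6, 30)
rows = [
    (date(2024, 1, 1), date(2024, 1, 31)),  # ends before the cutoff -> (30, True)
    (date(2024, 6, 1), date(2024, 7, 15)),  # ends after the cutoff  -> (44, False)
    (date(2024, 3, 1), None),               # no end date recorded   -> (None, False)
]
results = [observed_duration(start, end, cutoff) for start, end in rows]
```

Each resulting pair corresponds to the duration and observed columns, which is the shape of input that survival-analysis tools such as Kaplan-Meier estimators typically expect.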