This step calculates the duration between a start date and an end date and determines whether an event was observed. The output consists of two columns:

  • duration: The time interval between the start and end dates in the specified unit (default: days).
  • observed: A boolean column indicating whether the event was observed (i.e., whether the end date falls on or before the observation end date).

This is particularly useful for preparing input data for survival analysis, such as Kaplan-Meier curves, where the event observation (censoring) status and duration are key inputs.

  • If either start_date or end_date is missing (null), duration is null and observed is false.
  • Otherwise, duration is the interval between start_date and end_date, expressed in the chosen unit.
  • When both dates are present, observed is true if end_date <= observation_end and false otherwise.
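
The rules above can be sketched in plain Python. This is only an illustration, not the step's actual implementation: the function name duration_and_observed, the float return type, and the restriction to fixed-length units (calendar units such as year, quarter, and month are left out) are all assumptions made for the example.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple

# Assumption: only fixed-length units are handled in this sketch.
# Calendar units (year, quarter, month) need calendar arithmetic and are omitted.
_UNIT_LENGTH = {
    "week": timedelta(weeks=1),
    "day": timedelta(days=1),
    "hour": timedelta(hours=1),
    "minute": timedelta(minutes=1),
    "second": timedelta(seconds=1),
    "millisecond": timedelta(milliseconds=1),
}


def duration_and_observed(
    start: Optional[datetime],
    end: Optional[datetime],
    observation_end: datetime,
    unit: str = "day",
    absolute: bool = False,
) -> Tuple[Optional[float], bool]:
    """Hypothetical helper returning (duration, observed) for one row."""
    # Rule 1: a missing date gives a null duration and observed = false.
    if start is None or end is None:
        return None, False
    # Rule 2: duration is the interval between the two dates in the chosen unit.
    duration = (end - start) / _UNIT_LENGTH[unit]
    if absolute:
        duration = abs(duration)
    # Rule 3: the event counts as observed only if it ended by the cutoff.
    return duration, end <= observation_end


cutoff = datetime(2024, 1, 1)
rows = [
    (datetime(2023, 1, 1), datetime(2023, 1, 31)),  # ended before the cutoff
    (datetime(2023, 6, 1), datetime(2024, 3, 1)),   # ended after the cutoff
    (datetime(2023, 6, 1), None),                   # no end date recorded
]
results = [duration_and_observed(s, e, cutoff) for s, e in rows]
```

In this sketch the second row still gets a duration, because the rules only make observed depend on the cutoff; whether the real step returns fractional or whole-unit durations is not stated in this reference.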
observation_end
string
required

Observation end date. The cutoff date to determine if the event was observed (e.g., churn or any other event).

absolute
boolean

Absolute value for duration. Whether to use absolute values for the calculated duration.

unit
string
default: "days"

Unit for duration. The unit of measurement for the duration. Allowed values are:

  • “Y”, “year”
  • “Q”, “quarter”
  • “M”, “month”
  • “W”, “week”
  • “D”, “day”
  • “h”, “hour”
  • “m”, “minute”
  • “s”, “second”
  • “ms”, “millisecond”

Full unit names can be spelled in singular or plural and are case-insensitive. The short abbreviations are case-sensitive: “M” means month, while “m” means minute.

Values must be one of the following:

  • Y, year, Year, years, Years
  • Q, quarter, Quarter, quarters, Quarters
  • M, month, Month, months, Months
  • W, week, Week, weeks, Weeks
  • D, day, Day, days, Days
  • h, hour, Hour, hours, Hours
  • m, minute, Minute, minutes, Minutes
  • s, second, Second, seconds, Seconds
  • ms, millisecond, Millisecond, milliseconds, Milliseconds
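
One way to map the accepted spellings onto a single canonical name is sketched below. The function normalize_unit is hypothetical and not part of the step. It assumes that abbreviations are matched exactly (so "M" and "m" stay distinct) and that full names are matched case-insensitively with an optional plural "s".

```python
# Abbreviations are matched exactly, because "M" (month) and "m" (minute)
# differ only by case.
_ABBREVIATIONS = {
    "Y": "year",
    "Q": "quarter",
    "M": "month",
    "W": "week",
    "D": "day",
    "h": "hour",
    "m": "minute",
    "s": "second",
    "ms": "millisecond",
}
_NAMES = set(_ABBREVIATIONS.values())


def normalize_unit(unit: str) -> str:
    """Hypothetical helper: return the canonical singular, lowercase unit name."""
    if unit in _ABBREVIATIONS:
        return _ABBREVIATIONS[unit]
    # Full names: ignore case, then drop a trailing plural "s" if present.
    name = unit.lower()
    if name.endswith("s") and name[:-1] in _NAMES:
        name = name[:-1]
    if name not in _NAMES:
        raise ValueError(f"Unsupported unit: {unit!r}")
    return name
```

Under these assumptions, normalize_unit("Days") and normalize_unit("D") both give "day", while normalize_unit("M") gives "month" and normalize_unit("m") gives "minute".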