Cohorts are useful to determine a grouping of data across a date dimension. For example, how many users remain active n months after initial signup, etc.
We currently support 2 types of input data.
select COHORT(<Date Column>, <Cohort Date Definition>, <Cohort Period Definition>), <Cohort Operation>
group by <Cohort Date>, <Cohort Period>
Note:
Input data needs to be sorted by Date ascending order.
Cohort Period returns a number (ie: the period) or a date. Example:
a. 1m: Cohort Period as number
b. (1m): Cohort Period as Date
Example 1: If we already have the cohort date populated
select
cohort(
Transaction Date,
Register Date as Cohort Date,
1m as Cohort Period),
sum(Amount) as Total Amount
Example 2: If we only have transactional events like the following example:
select
cohort(
Transaction Date,
Transaction Date as Cohort Date where Event Type = Sign Up group by User ID,
1m as Cohort Period),
sum(Amount) as Total Amount
where Event Type = Purchase
Example 3: Cohorts can be used in combination with transpose to flatten the result based on date
select
cohort(
Transaction Date,
Transaction Date as Cohort Date where Event Type = Sign Up group by User ID,
1m as Cohort Period),
sum(Amount) as Total Amount
where Event Type = Purchase
group by Cohort Date, Cohort Period;
Example 4: A common cohort is retention in percentage format which can be computed as follows:
select
cohort(
Transaction Date,
Transaction Date as Cohort Date where Event Type = Sign Up group by User ID,
1m as Cohort Period,
Cohort Count),
count(distinct(User ID)) as Retention
where Event Type = Purchase and Cohort Date is not null
group by Cohort Date, Cohort Period;
select Cohort Date, Cohort Period, Cohort Count,
Retention * 100 / Cohort Count as Retention Percent