Q41👥Cohort & Retention★★★☆☆Window/CTESafe to Run
Classic Acquisition Cohort
Dialect:bigquerypostgresqlsnowflakemysql
#cohort#acquisition#retention#classic
Kapan Pakai Query Ini
Ini adalah query dasar cohort analysis yang wajib kamu kuasai. Pakai query ini ketika kamu mau tahu: dari semua user yang signup di bulan X, berapa persen yang balik lagi di bulan ke-1, ke-2, dan seterusnya? Cocok untuk laporan retention bulanan, evaluasi health produk, atau benchmarking antar cohort.
Yang Perlu Diganti
your_dataset.users→ nama tabel user (berisiuser_id,created_at)your_dataset.events→ nama tabel aktivitas (berisiuser_id,event_date)event_date→ kolom tanggal aktivitas kamucreated_at→ kolom tanggal signup user
Query
WITH user_cohorts AS ( -- Step 1: tentukan cohort bulan signup setiap user SELECT user_id, DATE_TRUNC(created_at, MONTH) AS cohort_month FROM your_dataset.users ), user_activities AS ( -- Step 2: join aktivitas dengan cohort user SELECT uc.user_id, uc.cohort_month, DATE_DIFF(DATE_TRUNC(e.event_date, MONTH), uc.cohort_month, MONTH) AS period_number FROM user_cohorts uc JOIN your_dataset.events e ON uc.user_id = e.user_id WHERE e.event_date >= uc.cohort_month ), cohort_sizes AS ( -- Step 3: hitung ukuran setiap cohort (total user saat signup) SELECT cohort_month, COUNT(DISTINCT user_id) AS cohort_size FROM user_cohorts GROUP BY cohort_month ), cohort_activity AS ( -- Step 4: hitung user aktif per cohort per periode SELECT cohort_month, period_number, COUNT(DISTINCT user_id) AS active_users FROM user_activities GROUP BY cohort_month, period_number ) -- Step 5: gabungkan dan hitung retention rate SELECT ca.cohort_month, cs.cohort_size, ca.period_number, ca.active_users, ROUND(ca.active_users / cs.cohort_size * 100, 1) AS retention_rate FROM cohort_activity ca JOIN cohort_sizes cs ON ca.cohort_month = cs.cohort_month WHERE ca.period_number <= 12 ORDER BY ca.cohort_month, ca.period_number;
Contoh Output
| cohort_month | cohort_size | period_number | active_users | retention_rate |
|---|---|---|---|---|
| 2024-01-01 | 3,840 | 0 | 3,840 | 100.0 |
| 2024-01-01 | 3,840 | 1 | 1,497 | 39.0 |
| 2024-01-01 | 3,840 | 2 | 998 | 26.0 |
| 2024-02-01 | 4,120 | 0 | 4,120 | 100.0 |
| 2024-02-01 | 4,120 | 1 | 1,689 | 41.0 |
Variasi
- Ganti unit ke
WEEKdiDATE_TRUNCdanDATE_DIFFuntuk cohort mingguan. Berguna kalau produk kamu punya weekly engagement cycle - Filter
WHERE period_number = 0untuk hanya lihat ukuran cohort tanpa retention - Tambah
WHERE cohort_month >= DATE_SUB(CURRENT_DATE(), INTERVAL 12 MONTH)untuk batasi ke 12 bulan terakhir
Variasi Dialect
PostgreSQL:
-- DATE_TRUNC pakai string, DATE_DIFF tidak ada — pakai subtraction DATE_TRUNC('month', created_at) AS cohort_month, (DATE_TRUNC('month', e.event_date)::date - DATE_TRUNC('month', uc.created_at)::date) / 30 AS period_number -- Catatan: ini approx. Lebih akurat: EXTRACT(YEAR FROM AGE(...))*12 + EXTRACT(MONTH FROM AGE(...))
Snowflake:
DATE_TRUNC('MONTH', created_at) AS cohort_month, DATEDIFF('month', uc.cohort_month, DATE_TRUNC('MONTH', e.event_date)) AS period_number
MySQL:
DATE_FORMAT(created_at, '%Y-%m-01') AS cohort_month, TIMESTAMPDIFF(MONTH, uc.cohort_month, DATE_FORMAT(e.event_date, '%Y-%m-01')) AS period_number