Cohort Retention Analysis
50,000 active users and the count looks flat — are you healthy, or quietly dying? Cohort retention is the only way to know.
What you'll learn
- What a cohort is and why blending everyone together hides the truth
- How to read a retention curve and what flattening means
- Why a flat total active-user count can mask a leaky bucket business
- How to spot product-market fit from a retention curve shape
Before you start
Your dashboard shows 50,000 active users this month. Same as last month. Same as the month before. The number looks stable, so everything must be fine — right?
Not necessarily. That flat headline can hide two completely different realities: a thriving business with loyal users, or a leaking bucket where thousands of churned users are quietly replaced by thousands of new signups every single month. Cohort retention analysis is the tool that tells you which world you actually live in.
What is a Cohort?
A cohort is a group of customers defined by when they first started — for example, “everyone who signed up in January.” Instead of blending all your users together into one big pool, you track each cohort separately over time.
The “blending” approach is like weighing a bucket of water every day without noticing that half the water is leaking out and someone is refilling it. The weight stays the same, but the bucket is broken. Cohorts let you watch the original water — where it goes, and how fast.
Retention and Churn
Retention is the share of a cohort still active after a given number of months. If you start with 1,000 January signups and 600 are still using the product in February (month 1), your month-1 retention is 60%.
Churn is the flip side: the share of the original cohort who have left by that month. Measured at the same month, retention and churn always add up to 100%. If month-1 retention is 60%, month-1 churn is 40%.
These two numbers are always two ways of describing the same fact:
Retention rate = Active users in cohort / Original cohort size
Churn rate = 1 - Retention rate
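The two formulas above can be sketched in a few lines of Python. The function names `retention_rate` and `churn_rate` are illustrative choices, not part of any library, and the inputs are the January numbers from this lesson.

```python
def retention_rate(active_users: int, cohort_size: int) -> float:
    """Share of the original cohort that is still active."""
    if cohort_size <= 0:
        raise ValueError("cohort_size must be positive")
    return active_users / cohort_size


def churn_rate(active_users: int, cohort_size: int) -> float:
    """Share of the original cohort that has left so far."""
    return 1 - retention_rate(active_users, cohort_size)


# January cohort: 1,000 signups, 600 still active in month 1.
jan_retention = retention_rate(600, 1000)
jan_churn = churn_rate(600, 1000)
print(f"month-1 retention: {jan_retention:.0%}")  # month-1 retention: 60%
print(f"month-1 churn: {jan_churn:.0%}")          # month-1 churn: 40%
```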
The Retention Curve
Take a January cohort of 1,000 users. Here is what their retention looks like over five months:
| Month since signup | Active users | Retention |
|---|---|---|
| 0 (signup) | 1,000 | 100% |
| 1 | 600 | 60% |
| 2 | 450 | 45% |
| 3 | 380 | 38% |
| 4 | 350 | 35% |
| 5 | 340 | 34% |
Plot those numbers and you get a retention curve — a line that drops steeply in the first month or two, then bends and flattens out. The shape of that bend carries enormous information.
Is a steep early drop a bad sign? Not necessarily. The first-month drop is almost always the largest, because many users sign up out of curiosity and never come back. What matters more is what happens after that. If the curve flattens (as in the January example above, where retention settles around 34–35%), it means you have found a group of genuinely loyal users who keep coming back month after month. Practitioners call this a product-market fit signal: evidence that the product delivers real, repeatable value to at least a subset of users.
A curve that never stops falling — one that keeps dropping toward zero — tells the opposite story: every user eventually leaves, which means the product has not found a loyal core yet.
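One rough way to see the flattening in numbers is to look at how many percentage points the curve loses each month. The sketch below uses the January cohort from the table. The 2-point threshold (`FLAT_THRESHOLD_PP`) is an arbitrary assumption for illustration, not an industry standard.

```python
# January cohort, active users in months 0 through 5 (from the table above).
active_by_month = [1000, 600, 450, 380, 350, 340]

cohort_size = active_by_month[0]
retention_pct = [round(100 * n / cohort_size) for n in active_by_month]

# Percentage points lost between consecutive months.
monthly_drop = [prev - cur for prev, cur in zip(retention_pct, retention_pct[1:])]

# Hypothetical rule of thumb: treat the curve as flattening once the
# most recent monthly drop is at most 2 percentage points.
FLAT_THRESHOLD_PP = 2
is_flattening = monthly_drop[-1] <= FLAT_THRESHOLD_PP

print(retention_pct)  # [100, 60, 45, 38, 35, 34]
print(monthly_drop)   # [40, 15, 7, 3, 1]
print(is_flattening)  # True
```

The shrinking drops (40, 15, 7, 3, 1) are the numerical version of the bend in the curve. A curve that keeps falling toward zero would show drops that never get small relative to what is left.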
Reading the Cohort Retention Table
Real analysts track multiple cohorts side by side in a triangle-shaped table. Rows are cohorts (by signup month); columns are months since signup.
| Cohort | Month 0 | Month 1 | Month 2 | Month 3 |
|---|---|---|---|---|
| Jan | 100% | 60% | 45% | 38% |
| Feb | 100% | 57% | 43% | 36% |
| Mar | 100% | 62% | 47% | — |
Reading across a row shows you how a single cohort ages. Reading down a column shows you whether newer cohorts retain better or worse than older ones at the same age — a way to spot whether product improvements are actually helping.
The triangle shape (dashes at the bottom-right) just means those data points are in the future; the March cohort has not yet reached month 3.
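As a minimal sketch of where such a table comes from, the code below builds one from a tiny, made-up activity log. The record layout `(user_id, signup_month, active_month)`, the function name `cohort_table`, and the sample data are all assumptions for illustration. It also assumes every user is active in their own signup month, so month 0 defines the cohort size.

```python
from collections import defaultdict

# Hypothetical activity log: (user_id, signup_month, active_month).
# Months are calendar indexes: 0 = Jan, 1 = Feb, 2 = Mar.
activity = [
    ("u1", 0, 0), ("u1", 0, 1), ("u1", 0, 2),
    ("u2", 0, 0), ("u2", 0, 1),
    ("u3", 1, 1), ("u3", 1, 2),
    ("u4", 1, 1),
]


def cohort_table(records):
    """Return {signup_month: {months_since_signup: retention_fraction}}."""
    active = defaultdict(lambda: defaultdict(set))
    for user_id, signup, month in records:
        active[signup][month - signup].add(user_id)

    table = {}
    for signup, by_age in active.items():
        size = len(by_age[0])  # users seen in month 0 form the cohort
        table[signup] = {
            age: len(users) / size for age, users in sorted(by_age.items())
        }
    return table


table = cohort_table(activity)

# Reading across a row: how the Jan cohort ages.
print(table[0])  # {0: 1.0, 1: 1.0, 2: 0.5}

# Reading down a column: month-1 retention for each cohort that has reached it.
month1 = {cohort: row[1] for cohort, row in table.items() if 1 in row}
print(month1)  # {0: 1.0, 1: 0.5}
```

The Feb cohort has no entry for month 2 because that month has not happened yet in the log. That missing cell is what produces the triangle.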
The Diagram: What the Curve Actually Looks Like
[Figure: January cohort (1,000 users). The retention curve drops steeply, then flattens; the highlighted tail is the loyal core.]
The steep section (months 0–2) is normal; virtually every product sees it. The critical question is always: does the curve eventually flatten, or does it keep falling?
The Killer Insight: Why a Flat Total Can Lie
Now we can fully answer the opening question.
Imagine your product acquires 5,000 new users every month. But every month, it also loses 5,000 of its existing users — users who signed up in previous cohorts and eventually churned. The total active-user count stays locked at 50,000. The dashboard looks healthy.
Month N: 50,000 active users
+5,000 new signups this month
-5,000 churned from previous cohorts
= 50,000 active users next month
The headline is flat. The business is a leaky bucket. You will need to spend more and more on acquisition just to stand still — and the moment acquisition slows, total users collapse. Only cohort retention curves reveal this, because they show you whether the users you acquired six months ago are still around.
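A small simulation makes the fragility concrete. This sketch assumes a fixed 10% of the active base churns each month (5,000 out of 50,000, matching the example above). That constant rate is a simplification for illustration, and `simulate_total` is a hypothetical helper name.

```python
def simulate_total(start_total, monthly_signups, monthly_churn_rate):
    """Return total active users at the start and after each month.

    Assumes, for illustration only, that a fixed share of the
    active base churns every month.
    """
    total = start_total
    totals = [total]
    for signups in monthly_signups:
        churned = round(total * monthly_churn_rate)
        total = total + signups - churned
        totals.append(total)
    return totals


# Acquisition holds at 5,000 per month: the headline stays flat.
steady = simulate_total(50_000, [5_000] * 3, 0.10)
print(steady)    # [50000, 50000, 50000, 50000]

# Acquisition halves to 2,500 per month: the total starts sliding at once.
slowdown = simulate_total(50_000, [2_500] * 3, 0.10)
print(slowdown)  # [50000, 47500, 45250, 43225]
```

In both runs the product loses the same share of users each month. Only the acquisition input changed, which is why the flat total says nothing about retention.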
Putting the Numbers Together
Using the January cohort (1,000 users, month-5 retention of 34%):
Active in month 5 = 1,000 x 0.34 = 340 users
Churned by month 5 = 1,000 - 340 = 660 users
If you have ten such cohorts of 1,000 each (all at month 5), you have 3,400 active users — but 6,600 users have already left. That attrition has to be offset by new cohorts just to hold the total flat.
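The same arithmetic as a short script, using only the numbers from this section. The variable names are illustrative.

```python
cohort_size = 1_000
month5_retention = 0.34
n_cohorts = 10  # ten identical cohorts, all observed at month 5

active_per_cohort = round(cohort_size * month5_retention)
churned_per_cohort = cohort_size - active_per_cohort

total_active = n_cohorts * active_per_cohort
total_churned = n_cohorts * churned_per_cohort

print(active_per_cohort, churned_per_cohort)  # 340 660
print(total_active, total_churned)            # 3400 6600
```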
Next
Once you know how many users from each cohort stick around — and for how long — you can assign a dollar value to each of them. The next lesson turns retention curves into customer lifetime value (CLV): the total revenue a single customer is expected to generate before churning.
Practice this in an interview
A retention drop investigation requires distinguishing between an acquisition-mix shift (newer cohorts are lower quality) and a genuine product regression (existing cohorts are performing worse). The two look identical in aggregate retention but have completely different fixes. Cohort analysis, plotting the D30 survival curve for each weekly acquisition cohort, is the first move.
Engagement is multi-dimensional: breadth (how many users engage), depth (how much they do per session), and frequency (how often they return). A robust engagement framework stacks these three layers into a metric hierarchy and links them to retention curves, because engagement that does not predict long-term retention is usually noise.
A metric drop investigation starts by confirming the drop is real — ruling out logging bugs and metric-definition changes — before hypothesising causes. Then segment by platform, geography, user cohort, and funnel step to isolate where the drop is concentrated, which points to the most likely root cause.
Lagging indicators (revenue, annual retention, NPS) measure outcomes after they have occurred — they are accurate but slow. Leading indicators (D1 retention, feature adoption rate, time-to-value) correlate with future outcomes and are available faster, making them suitable for early experiment decisions. A robust metrics system pairs both, with the leading metric as the experiment signal and the lagging metric as the validation gate.