Cohort analysis: how to actually read a retention curve

Your monthly active user count just ticked up 8% month-over-month. Your retention dashboard shows 61% of users returning in the last 30 days. Both numbers look healthy. Meanwhile, every cohort of users you have acquired in the past six months has quietly churned to near zero by week eight.

This is not a hypothetical. It is one of the most common misreadings in product analytics, and it has a name — Simpson’s paradox-style mixing — where an influx of newly acquired users inflates an aggregate metric even as the underlying experience for every individual cohort deteriorates. Cohort analysis is the tool that separates these signals.

Why aggregate retention lies

Suppose you acquire 1,000 new users in January, another 1,000 in February, and another 1,000 in March. If each cohort retains 50% of users by month one, 30% by month two, and 20% by month three, then in April those three cohorts contribute 200 + 300 + 500 = 1,000 active users between them: as many as a full month of signups, which can look perfectly stable on a dashboard. But every single cohort is following the same steep decay. The constant flow of new acquisition is masking the churn.
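The arithmetic in this example can be reproduced with a short sketch. The function name active_from_cohorts and the dictionary layout are illustrative assumptions rather than part of any analytics tool; the cohort sizes and retention rates are the ones from the example.

```python
# Retention by months elapsed since signup, taken from the example above.
RETENTION = {0: 1.00, 1: 0.50, 2: 0.30, 3: 0.20}

# Signup month index (Jan = 1) -> users acquired that month.
COHORT_SIZES = {1: 1000, 2: 1000, 3: 1000}


def active_from_cohorts(calendar_month, cohort_sizes, retention):
    """Return {signup_month: users still active in calendar_month}."""
    active = {}
    for signup_month, size in cohort_sizes.items():
        elapsed = calendar_month - signup_month
        if elapsed < 0:
            continue  # this cohort has not signed up yet
        # Assumption: beyond the last listed period, the last known rate persists.
        rate = retention.get(elapsed, retention[max(retention)])
        active[signup_month] = round(size * rate)
    return active


april = active_from_cohorts(4, COHORT_SIZES, RETENTION)
print(april)                # per-cohort contributions in April
print(sum(april.values()))  # blended total across the three cohorts
```

Printing the per-cohort breakdown next to the blended total makes the masking visible: the total looks steady while each cohort's own contribution keeps shrinking.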

The fix is simple in concept: stop blending users who joined at different times. Group users by the period in which they first joined (their acquisition cohort), then track each group separately over time. You stop measuring “everyone active this month” and start measuring “of the users who joined in January, how many are still active by month three?”

How to build a cohort retention table

The standard format is a triangle table. Each row is a cohort — typically users grouped by the week or month they first signed up. Each column is a time period elapsed since signup (week 0, week 1, week 2, and so on). Each cell contains the percentage of that cohort still active at that elapsed period.

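Period 0 is always 100%, because every user is active in the period they signed up. From there, the numbers generally fall. Under an exact-period definition of “active”, a cell can occasionally tick back up when lapsed users return, so read a rise as reactivation rather than as a data error.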

Here is what a four-cohort monthly table might look like:

Figure: A cohort retention triangle. Each row is a signup cohort; each column is months since first sign-up. The Jan cohort’s plateau at ~32% is a product-market fit signal. The Feb cohort’s continued decline is a warning.

The table already tells a story. The January cohort stabilises around 32–34% by month four. The February cohort is still shedding users. If you were only looking at your blended “monthly active” metric, you would miss both signals entirely.
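A table of this shape can be built from raw activity records along the following lines. This is a minimal sketch in plain Python, not a reference implementation: the function names (month_index, cohort_table), the 'YYYY-MM' month format, and the toy data are all assumptions made for illustration. It also assumes that every user has an activity record in their signup month, which is what makes period 0 equal 100%.

```python
from collections import defaultdict


def month_index(ym):
    """Convert 'YYYY-MM' to a running month count so months can be subtracted."""
    year, month = ym.split("-")
    return int(year) * 12 + int(month) - 1


def cohort_table(signups, activity):
    """signups: {user_id: 'YYYY-MM'}; activity: list of (user_id, 'YYYY-MM').

    Returns {cohort_month: [pct active at elapsed period 0, 1, 2, ...]}.
    """
    cohort_users = defaultdict(set)
    for user, signup in signups.items():
        cohort_users[signup].add(user)

    # (cohort, elapsed period) -> distinct users active in that cell
    cells = defaultdict(set)
    for user, month in activity:
        signup = signups[user]
        elapsed = month_index(month) - month_index(signup)
        if elapsed >= 0:
            cells[(signup, elapsed)].add(user)

    last_observed = max(month_index(month) for _, month in activity)
    table = {}
    for cohort, users in sorted(cohort_users.items()):
        # Newer cohorts have fewer observable periods, which produces the triangle.
        n_periods = last_observed - month_index(cohort) + 1
        table[cohort] = [
            round(100 * len(cells[(cohort, k)]) / len(users), 1)
            for k in range(n_periods)
        ]
    return table


# Hypothetical toy data: four users in two monthly cohorts.
signups = {"a": "2024-01", "b": "2024-01", "c": "2024-02", "d": "2024-02"}
activity = [
    ("a", "2024-01"), ("b", "2024-01"), ("a", "2024-02"), ("a", "2024-03"),
    ("c", "2024-02"), ("d", "2024-02"), ("c", "2024-03"), ("d", "2024-03"),
]
table = cohort_table(signups, activity)
for cohort, row in table.items():
    print(cohort, row)
```

Counting distinct users per cell, rather than events, keeps a heavy user from inflating the percentage.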

Reading the retention curve: flatten or decay

When you plot any single row of that table as a line chart, with time elapsed on the x-axis and retention percentage on the y-axis, you get a retention curve. Its shape is one of the most information-dense signals in product analytics.

Every retention curve starts at 100% and drops steeply after period zero. That initial drop is not a catastrophe; it is normal. Most users who sign up for something do not immediately find their way into habitual use. The critical question is what happens next.

There are two archetypes:

The flattening curve. After the initial drop, the curve bends and levels off at some non-zero floor: 15%, 25%, 40%, whatever it is for your product category. (This is distinct from a “smile curve”, where retention later rises again as lapsed users return.) The plateau means a stable core of users has found recurring value. They are not churning. The absolute level of the floor matters less than its existence. A product that retains 20% of users indefinitely is a fundamentally different business from one that retains 40% for three months and then falls to zero.

The decaying curve. The line continues downward and asymptotically approaches zero. Every cohort eventually churns entirely. This is the leaky bucket: no matter how much you pour in the top, the bucket empties. You can grow a leaky-bucket business with aggressive acquisition, but unit economics are brutal because you never accumulate a durable user base.

Figure: Two retention curves for the same initial cohort size. The solid curve flattens around 29–30%, leaving a durable retained core. The dashed curve approaches zero, meaning the product has not found a repeating use case.
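One rough way to tell the two archetypes apart in code is to look at how much the curve is still dropping over its most recent periods. The sketch below is a heuristic under stated assumptions, not an established test: the function name classify_curve, the three-period tail, the one-point tolerance, and the 5% minimum floor are all arbitrary illustrative choices, and the two example curves are made up.

```python
def classify_curve(retention, tail=3, tolerance=1.0, min_floor=5.0):
    """Label a retention curve 'flattening' or 'decaying'.

    retention: percentages by elapsed period, starting at period 0.
    tail: number of most recent period-to-period drops to inspect.
    tolerance: largest average drop (points per period) still treated as flat.
    min_floor: lowest final value that still counts as a real plateau.
    All three defaults are illustrative assumptions.
    """
    if len(retention) < tail + 1:
        raise ValueError("not enough periods to judge the tail of the curve")
    recent = retention[-(tail + 1):]
    drops = [earlier - later for earlier, later in zip(recent, recent[1:])]
    avg_drop = sum(drops) / len(drops)
    if avg_drop <= tolerance and retention[-1] >= min_floor:
        return "flattening"
    return "decaying"


# Hypothetical curves, in percent.
solid = [100, 55, 40, 34, 31, 30, 30, 29]
dashed = [100, 50, 35, 25, 18, 12, 8, 5]
print(classify_curve(solid))
print(classify_curve(dashed))
```

The floor check matters because a curve that has already collapsed to one or two percent also shows tiny period-to-period drops, and it should not be mistaken for a plateau.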

Why a flattening curve is the single best PMF signal

The flattening curve means something concrete: a subset of users has found a reason to come back that is independent of novelty. They are not returning because the product is new and they are curious. They are returning because it solved something for them on a recurring basis.

This is why cohort retention is often described as the most honest metric you have. Revenue can be manipulated by discounting. Download counts are inflated by paid campaigns. Monthly actives blend old and new users. The retention curve of a mature cohort — say, users 12 months after acquisition — shows you exactly what fraction of people you acquired are genuinely using your product today.

The mechanism matters too. A flattening curve implies that the users who remain are not a random sample; they are the users for whom the product fits a real job-to-be-done. They are also your most likely source of word-of-mouth referrals, expanded usage, and long-term revenue. See averages that lie for more on how aggregate metrics can obscure exactly this kind of signal.

N-day vs bracket vs rolling retention, and where each misleads

The way you define “active” changes your retention number, sometimes dramatically.

N-day retention (also called classic or exact-day retention) asks: of the users who signed up on day zero, what fraction were active on exactly day N? Day-7 and day-30 retention are the most common benchmarks. The problem is that it misses users who were active on day 6 or day 8. A user who uses your product every week but on a slightly irregular schedule might look like a churned user in N-day retention.

Bracket retention (also called range retention) asks: of the users who signed up in period zero, what fraction were active at any point during a defined window (say, weeks 4–5)? This is more forgiving of irregular cadences and tends to be the right measure for products with natural usage rhythms that are not strictly daily. It can flatter a product when the window is wide enough that a single visit counts the same as daily use.

Rolling retention (also called unbounded retention) asks: of the users who signed up on day zero, what fraction were active on day N or on any later day? Because a late return counts retroactively, the figure for a given N can keep rising as more data arrives. It is often used in mobile gaming. It overstates health for products with a long tail of casual users who occasionally re-engage.

None of these is wrong; they are measuring different things. The key is to be consistent within a product and to understand which definition your benchmarks were computed against before you compare.
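To see how far the definitions can diverge on identical data, here is a small sketch that scores one toy cohort three ways: active on exactly day N, active at any point inside a window, and active on day N or later. The function names and the four users are hypothetical, chosen only to make the differences visible.

```python
def n_day_retention(active_days, n):
    """Share of users active on exactly day n after signup."""
    hits = sum(1 for days in active_days.values() if n in days)
    return hits / len(active_days)


def window_retention(active_days, start, end):
    """Share of users active on at least one day in [start, end]."""
    hits = sum(
        1 for days in active_days.values()
        if any(start <= d <= end for d in days)
    )
    return hits / len(active_days)


def on_or_after_retention(active_days, n):
    """Share of users active on day n or on any later day."""
    hits = sum(
        1 for days in active_days.values()
        if any(d >= n for d in days)
    )
    return hits / len(active_days)


# Hypothetical cohort: user -> days since signup on which they were active.
cohort = {
    "u1": {0, 7, 14, 21},  # strict weekly user
    "u2": {0, 6, 13, 22},  # weekly, slightly irregular
    "u3": {0, 1, 2},       # gone after day 2
    "u4": {0, 30},         # a single late return
}
print(n_day_retention(cohort, 7))        # exact day 7
print(window_retention(cohort, 5, 9))    # any day from 5 to 9
print(on_or_after_retention(cohort, 7))  # day 7 or later
```

The irregular weekly user is invisible to the exact-day measure, and the single late return only shows up in the on-or-after measure, which is why benchmarks computed under one definition should not be compared against numbers computed under another.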

Behavioral cohorts and segmentation: finding what drives retention

Acquisition cohorts (grouped by signup date) are the default, but they are not always the most useful cut.

Behavioral cohorts group users by an action they took, not by when they signed up. Examples: users who completed onboarding versus those who did not; users who connected an integration on day one; users who invited a teammate within the first week. When you plot retention curves for each behavioral cohort separately, you often find a dramatic divergence — the users who performed a specific action retain at two or three times the rate of those who did not. That action is a candidate activation moment: the thing your onboarding should drive every new user toward.
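A behavioral split of this kind might be sketched as follows. The record layout (a did_action flag plus a set of active periods) and the function name retention_by_behavior are assumptions for illustration, and the seven users are invented.

```python
def retention_by_behavior(users, period):
    """Compare retention at `period` for users who did vs did not take an action.

    users: list of dicts with keys 'did_action' (bool) and 'active_periods'
    (set of elapsed periods in which the user was active).
    Returns {True: rate, False: rate}; a rate is None if the group is empty.
    """
    groups = {True: [], False: []}
    for user in users:
        groups[user["did_action"]].append(user)

    result = {}
    for did_action, members in groups.items():
        if not members:
            result[did_action] = None
            continue
        retained = sum(1 for u in members if period in u["active_periods"])
        result[did_action] = retained / len(members)
    return result


# Hypothetical users; 'did_action' might mean "invited a teammate in week one".
users = [
    {"did_action": True, "active_periods": {0, 1, 2, 3}},
    {"did_action": True, "active_periods": {0, 1, 3}},
    {"did_action": True, "active_periods": {0}},
    {"did_action": False, "active_periods": {0, 1}},
    {"did_action": False, "active_periods": {0}},
    {"did_action": False, "active_periods": {0, 3}},
    {"did_action": False, "active_periods": {0}},
]
split = retention_by_behavior(users, period=3)
print(split)
```

A gap between the two groups is a correlation, not proof that the action causes retention: users who take the action may simply have been more motivated to begin with, which is why the text calls it a candidate activation moment.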

Channel cohorts split users by acquisition source. Retention by channel often varies more than retention by any other dimension. Paid social users frequently churn faster than organic search users. Referral users often retain best of all. If you are spending acquisition budget uniformly across channels, channel-level retention will tell you where you are buying retention and where you are buying churn.

Feature-adoption cohorts segment by which features users adopted in their first period. This is particularly useful in multi-feature products where different user types might come for different reasons and retain at different rates.

Explore business analytics for more frameworks on combining these segmentation approaches with funnel analysis.

Churn and compounding: why small retention gains matter disproportionately

Churn is simply 1 − retention. If your monthly retention is 85%, your monthly churn is 15%.

The reason retention improvements matter so much to business outcomes is compounding. Suppose you have two products: one retains 70% of users each month, the other retains 80%. After 12 months, the first product has retained roughly 1.4% of its original cohort (0.70^12); the second has retained roughly 6.9% (0.80^12), nearly five times as many. The gap in customer lifetime value is smaller but still substantial: at a constant retention rate r, the expected customer lifetime is 1 / (1 − r) periods, so about 3.3 months at 70% and 5 months at 80%. Assuming flat revenue per period, that is 50% more revenue per user, and you paid the same customer acquisition cost to acquire them.

More precisely, if average revenue per user per period is R and monthly retention rate is r, the LTV of an acquired customer approximates R / (1 − r). Raising r from 0.70 to 0.80 lifts LTV by 50%, from about 3.3R to 5R. Raising it from 0.80 to 0.90 doubles it again, from 5R to 10R. Because LTV scales with 1 / (1 − r), each additional point of retention is worth more than the last, so improvements in the 70–90% range deliver larger LTV gains than the percentage-point difference suggests.
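These figures are easy to check numerically. The sketch below assumes a constant per-period retention rate r and flat revenue R per period, which is the same simplification the formula makes; a real curve that flattens would not have a constant r. The function names are illustrative.

```python
def surviving_share(r, periods):
    """Fraction of a cohort left after `periods` at constant retention r."""
    return r ** periods


def ltv(revenue_per_period, r):
    """Geometric-series LTV: R + R*r + R*r**2 + ... = R / (1 - r), for 0 <= r < 1."""
    if not 0 <= r < 1:
        raise ValueError("r must be in [0, 1)")
    return revenue_per_period / (1 - r)


for r in (0.70, 0.80, 0.90):
    share = surviving_share(r, 12)
    print(f"r={r:.2f}  left after 12 months={share:.1%}  LTV={ltv(1.0, r):.2f} x R")
```

Setting R to 1.0 expresses LTV as a multiple of per-period revenue, so the output reads directly as expected lifetime in periods.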

This is why the flattening retention curve is such a valuable signal. A product with a stable 30% retained core at month six does not just have “good retention.” It has a fundamentally different LTV equation, a fundamentally different payback period for acquisition spend, and a fundamentally different ceiling for the business.

Visit the glossary for definitions of related terms like LTV, churn rate, and activation.

Putting it together: a diagnostic workflow

When you open a cohort table for the first time, follow this sequence:

First, read down the columns. In a triangle table, each column holds every cohort at the same elapsed period. If each newer cohort is lower than the previous one at the same elapsed period, your retention is deteriorating. If they are stable or improving, you have consistency. This is the first health check. (The diagonals, by contrast, line up cells from the same calendar period, which helps you spot one-off events such as an outage or a seasonal dip.)
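The first health check, comparing cohorts at the same elapsed period, could be automated roughly like this. The table layout (cohort label mapped to a list of percentages by elapsed period) and both function names are assumptions of this sketch, and the numbers are invented.

```python
def same_period_trend(table, period):
    """Return [(cohort, retention)] for every cohort that has reached `period`.

    table: {cohort_label: [pct at elapsed period 0, 1, 2, ...]}, with labels
    that sort chronologically (an assumption of this sketch).
    """
    return [
        (cohort, row[period])
        for cohort, row in sorted(table.items())
        if len(row) > period
    ]


def is_deteriorating(values):
    """True if each newer cohort is strictly lower than the one before it."""
    return len(values) > 1 and all(b < a for a, b in zip(values, values[1:]))


# Hypothetical triangle table, in percent.
table = {
    "2024-01": [100, 52, 41, 35],
    "2024-02": [100, 49, 37],
    "2024-03": [100, 45],
    "2024-04": [100],
}
month_one = same_period_trend(table, 1)
print(month_one)
print(is_deteriorating([pct for _, pct in month_one]))
```

A strict cohort-over-cohort decline is a deliberately blunt rule; with small cohorts, differences of a point or two may be noise rather than a trend.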

Second, plot the curves. Pick two or three cohorts and plot them as overlaid lines. Look for the shape: flattening or decaying. If flattening, note the floor level and where it stabilises.

Third, segment. Once you have the baseline, split by channel and by key activation actions. Look for the cohort segments with the steepest improvement in floor level — these are your highest-leverage interventions.

Fourth, link to business outcomes. Calculate LTV by cohort using the retention floor and average revenue per user. Compare that to acquisition cost by channel. This is the bridge from the retention curve to the actual economics of your business.

Frequently asked questions

What is the difference between cohort analysis and segment analysis?

A cohort is defined by a shared experience in time — typically when users first joined or first performed an action. A segment is defined by a static attribute — users in a given geography, on a given plan tier, using a certain device. Cohort analysis tracks how a group evolves over time; segment analysis compares groups at a snapshot in time. Both are useful, but cohort analysis is better for understanding whether a product’s core engagement is improving or deteriorating.

How many users do I need in a cohort before the retention numbers are reliable?

There is no universal threshold, but cohorts smaller than a few hundred users produce retention percentages with wide confidence intervals. A cohort of 50 where 15 users return at month one gives you 30% ± roughly 13 percentage points at a 95% confidence level (using the normal approximation). The practical implication is that weekly cohorts are often too small for early-stage products; monthly cohorts are more stable. For precise analysis, compute the confidence interval on your percentage before reading meaning into small differences between cohorts.
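A confidence interval for a retention percentage can be computed with the standard library alone. The sketch below shows two common choices, the normal-approximation (Wald) interval and the Wilson score interval; the function names are illustrative, and z = 1.96 is the usual value for roughly 95% confidence.

```python
import math


def normal_ci(successes, n, z=1.96):
    """Normal-approximation (Wald) interval for a proportion."""
    p = successes / n
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin


def wilson_ci(successes, n, z=1.96):
    """Wilson score interval, which behaves better for small n or extreme p."""
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    margin = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - margin, centre + margin


# The example from the text: 15 of 50 users return at month one.
low, high = normal_ci(15, 50)
print(f"normal: {low:.3f} to {high:.3f}")
low_w, high_w = wilson_ci(15, 50)
print(f"wilson: {low_w:.3f} to {high_w:.3f}")
```

For small cohorts the Wilson interval is generally the safer of the two, since the normal approximation can produce bounds below 0% or above 100% when the observed rate is near either extreme.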

Should I use calendar periods or elapsed periods for my cohort columns?

Elapsed periods (months 0, 1, 2, … since signup) are almost always the right choice for retention analysis. Calendar-period columns (January, February, March, …) are useful for understanding absolute business volume per period but not for comparing how different cohorts behave. Mixing the two is a common source of confusion in cohort dashboards built by non-specialists.

If my retention curve is decaying to zero, does that mean the product is broken?

Not necessarily — it depends on the product category. Some products are genuinely single-use or episodic: a tax-filing app, a wedding planning tool, a one-time event platform. For these, retention curves that approach zero are expected, and the right metric is something else entirely (NPS, referral rate, return in the following year). For products with a repeating use case — messaging, project management, habit-forming apps — a decaying curve is a meaningful signal that the core loop is not delivering recurring value.