Statistics & Probability | Hard | Asked at Meta, Google, Amazon, Booking

What is CUPED and how does it reduce variance in A/B tests?

For: Data Scientist, Data Analyst, ML Engineer

The short answer

CUPED (Controlled-experiment Using Pre-Experiment Data) removes the part of the outcome metric's variance that is explained by a pre-experiment covariate, typically the same metric measured before the experiment. The residual variance is smaller, which is equivalent to running a more powerful test on the same sample, or reaching the same power with fewer users.

How to think about it

The core idea

A user’s post-experiment revenue is typically highly correlated with their pre-experiment revenue. That pre-experiment signal is available before the experiment starts and is unaffected by treatment assignment (it is in the past). If you regress this pre-experiment covariate out of the post-experiment metric, the residual variance is substantially smaller, often by 30–70% in practice.

The CUPED estimator replaces the raw outcome Y with an adjusted outcome:

Y_cuped = Y - theta * (X - E[X])

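Here X is the pre-experiment covariate, theta is the OLS regression slope (theta = Cov(Y, X) / Var(X)), estimated on the pooled data from both arms, and E[X] is the grand mean of the covariate. Because X is independent of treatment assignment, the correction term theta * (X - E[X]) has the same expectation (zero) in treatment and control, so it cancels in the difference of means. The adjustment therefore does not bias the treatment effect estimate; it only reduces variance.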
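As an illustration of the adjustment above, here is a minimal sketch in Python with NumPy. The data is simulated and every number in it (sample size, the gamma-distributed pre-period revenue, the true effect of 1.0) is an assumption made up for the example, as is the helper name `cuped_adjust`; it is not any platform's implementation.

```python
import numpy as np

def cuped_adjust(y, x):
    """Return CUPED-adjusted outcomes and the estimated theta.

    y: post-experiment metric, x: pre-experiment covariate (same users).
    theta is estimated on the pooled data, i.e. both arms together.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    theta = np.cov(y, x, ddof=1)[0, 1] / np.var(x, ddof=1)
    return y - theta * (x - x.mean()), theta

# Simulated experiment (all parameters are illustrative assumptions)
rng = np.random.default_rng(0)
n = 20_000
x = rng.gamma(shape=2.0, scale=10.0, size=n)       # pre-experiment revenue
treated = rng.integers(0, 2, size=n).astype(bool)  # assignment, independent of x
y = 0.8 * x + rng.normal(0.0, 8.0, size=n) + 1.0 * treated  # true effect = 1.0

y_cuped, theta = cuped_adjust(y, x)

# Difference in means, before and after adjustment
raw_effect = y[treated].mean() - y[~treated].mean()
cuped_effect = y_cuped[treated].mean() - y_cuped[~treated].mean()

# Share of the outcome variance removed by the adjustment
variance_reduction = 1.0 - y_cuped.var(ddof=1) / y.var(ddof=1)

print(f"theta={theta:.3f}, raw={raw_effect:.3f}, cuped={cuped_effect:.3f}, "
      f"variance reduction={variance_reduction:.1%}")
```

Both estimates target the same effect; the adjusted one simply has a smaller standard error, because the same difference in means is taken over a less noisy metric.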

What this means practically

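If CUPED cuts variance by 50%, you need only half the sample size to achieve the same power, because the required sample size is proportional to the metric's variance. Equivalently, for a fixed sample size, your effective MDE shrinks and you can detect smaller effects: the MDE scales with the standard deviation, so halving the variance reduces it by about 29% (a factor of sqrt(0.5) ≈ 0.71). Microsoft (where CUPED was developed and published in 2013) and Booking.com report routine variance reductions of 40–60% on revenue and engagement metrics.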
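The arithmetic behind these statements can be sketched in a few lines. This assumes the standard two-sample power calculation, in which the required sample size is proportional to the variance and the MDE is proportional to the standard deviation; the function names are made up for illustration.

```python
import math

def required_sample_ratio(variance_reduction):
    """Sample size needed with CUPED relative to without (same power, same MDE)."""
    return 1.0 - variance_reduction

def mde_ratio(variance_reduction):
    """MDE with CUPED relative to without (same power, same sample size)."""
    return math.sqrt(1.0 - variance_reduction)

for r in (0.3, 0.5, 0.7):
    print(f"variance reduction {r:.0%}: "
          f"sample size x{required_sample_ratio(r):.2f}, "
          f"MDE x{mde_ratio(r):.2f}")
```

Note the asymmetry: a 50% variance reduction halves the required sample size but shrinks the MDE by only about 29%.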

What covariate to use

The best covariate is usually the same metric measured in the pre-experiment period (e.g., 14-day revenue before experiment launch). In general, the longer and more stable the pre-period, the higher the correlation and the greater the variance reduction. New users have no pre-experiment history, so there is nothing to adjust with and their adjustment is zero; CUPED only helps for returning users with prior data.
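Why the correlation matters so much follows directly from the estimator. Writing rho for the correlation between Y and X (a symbol introduced here, not used elsewhere on this page) and substituting theta = Cov(Y, X) / Var(X):

```latex
\begin{align*}
\operatorname{Var}(Y_{\text{cuped}})
  &= \operatorname{Var}(Y) - 2\theta\,\operatorname{Cov}(Y, X) + \theta^{2}\operatorname{Var}(X) \\
  &= \operatorname{Var}(Y) - \frac{\operatorname{Cov}(Y, X)^{2}}{\operatorname{Var}(X)} \\
  &= \operatorname{Var}(Y)\,\bigl(1 - \rho^{2}\bigr)
\end{align*}
```

So the fraction of variance removed is rho squared: a covariate with correlation 0.7 removes 49% of the variance, while one with correlation 0.3 removes only 9%.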

CUPED vs. stratified sampling

Stratification at assignment time (e.g., block-randomize by user tenure decile) achieves a similar goal but must be planned before launch. CUPED is applied post-hoc and is therefore more flexible. Most modern experiment platforms support CUPED by default.

For continuous metrics with high between-user heterogeneity (spend, session duration), CUPED is almost always worth applying. For binary metrics with low baseline rates (rare conversion events), the variance reduction is modest because pre-experiment binary signals are noisier predictors.
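A small simulation can make this contrast concrete. Everything here is an assumption chosen for illustration: each user gets a persistent spend level (continuous case) or a persistent conversion propensity averaging about 2% (binary case), and the pre- and post-period metrics are noisy draws around it. The variance reduction is computed as the squared pre/post correlation, which is what the CUPED adjustment removes.

```python
import numpy as np

def cuped_variance_reduction(y, x):
    """Fraction of Var(Y) removed by CUPED: the squared correlation of Y and X."""
    return np.corrcoef(y, x)[0, 1] ** 2

rng = np.random.default_rng(1)
n = 200_000

# Continuous metric: persistent per-user spend level plus per-period noise
level = rng.gamma(2.0, 10.0, size=n)
spend_pre = level + rng.normal(0.0, 5.0, size=n)
spend_post = level + rng.normal(0.0, 5.0, size=n)

# Rare binary metric: persistent per-user conversion propensity (mean ~2%)
propensity = rng.beta(1.0, 49.0, size=n)
conv_pre = (rng.random(n) < propensity).astype(float)
conv_post = (rng.random(n) < propensity).astype(float)

continuous_reduction = cuped_variance_reduction(spend_post, spend_pre)
binary_reduction = cuped_variance_reduction(conv_post, conv_pre)

print(f"continuous: {continuous_reduction:.1%}, rare binary: {binary_reduction:.2%}")
```

Under these assumptions the single pre-period conversion flag carries very little information about a user's underlying propensity, so the correlation and the resulting variance reduction are small. Real metrics will differ; the point is only the direction of the gap.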
