How do network effects and interference violate A/B test assumptions, and what can you do about it?
Standard A/B tests rely on the Stable Unit Treatment Value Assumption (SUTVA), whose no-interference component says that one user's outcome is unaffected by any other user's assignment. Network products violate this: a user in the control group can be affected by their friends in the treatment group, which biases naive estimates.
How to think about it
Why standard randomization fails
On a social network, marketplace, or two-sided platform, users interact with each other. If you randomly assign half of users to a new messaging feature, the treatment-group users start sending more messages — which land in the inboxes of control-group users. The control group’s behavior changes as a result. Your measured effect is the difference between a treated treatment group and a contaminated control group, not a clean causal effect.
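This is called interference or spillover, and it biases naive A/B estimates. The direction depends on the spillover: when the treatment also lifts control users (as with the messages above), the naive difference understates the full effect; when treated users compete with control users for a shared resource, such as marketplace inventory, it overstates the effect.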
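A small simulation can make the bias concrete. The sketch below is illustrative only: the random friend graph, the linear outcome model, and the parameter names `direct` and `spill` are assumptions made for the example, not something taken from a real product. It compares the naive treatment-minus-control difference with the effect of treating everyone versus no one.

```python
import numpy as np

def simulate_naive_estimate(n_users=20000, n_friends=10,
                            direct=1.0, spill=0.5, seed=0):
    """Return (naive_estimate, total_effect) under a toy spillover model."""
    rng = np.random.default_rng(seed)
    # Hypothetical graph: each user gets n_friends uniformly random friends.
    # Duplicates and self-links are possible but negligible at this size.
    friends = rng.integers(0, n_users, size=(n_users, n_friends))
    # Standard user-level 50/50 randomization.
    treated = rng.integers(0, 2, size=n_users)
    # Exposure = fraction of a user's friends who are treated.
    exposure = treated[friends].mean(axis=1)
    # Assumed outcome model: own treatment plus spillover from friends.
    outcome = direct * treated + spill * exposure + rng.normal(0.0, 1.0, n_users)
    naive = outcome[treated == 1].mean() - outcome[treated == 0].mean()
    # Everyone treated (exposure 1) vs. no one treated (exposure 0).
    total = direct + spill
    return naive, total

naive, total = simulate_naive_estimate()
print(f"naive estimate: {naive:.2f}, total effect: {total:.2f}")
```

Under these assumptions both arms see roughly half of their friends treated, so the spillover term cancels out of the naive difference, which recovers only the direct effect and misses the spillover component.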
Design-level solutions
- Cluster randomization (graph clustering). Assign entire clusters of tightly connected users (friend groups, geographic markets, seller-buyer pairs) to the same condition. The treatment boundary falls between clusters rather than between individuals. LinkedIn, Facebook, and Airbnb use this for social features and marketplace experiments. The downside is far fewer independent units, which means lower statistical power.
- Geo-based experiments. Randomize at the city or region level rather than the user level. Used heavily in marketplace and advertising experiments (Lyft, DoorDash). Clean from interference if spillover across regions is low, but with only a few regions per arm, seasonal and local differences between regions can be mistaken for treatment effects.
- Ego network randomization. Assign the ego (focal user) and their entire first-degree network together. Clean for direct neighborhood effects but expensive in sample size.
- Switchback experiments (time-based randomization). Used in two-sided marketplaces and ride-sharing (Uber, Lyft). Alternate treatment and control across time windows in the same market rather than across users. Assumes the market recovers quickly between windows, so that effects from one window do not carry over into the next.
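Two of the designs above differ from a standard test mainly in the unit that gets randomized. The sketch below shows one possible way to assign arms by cluster and by time window. The function names, the `salt` parameter, and the hash-based bucketing are hypothetical choices for illustration. It assumes cluster IDs have already been computed elsewhere, and it randomizes each window independently rather than strictly alternating arms.

```python
import hashlib
from datetime import datetime, timezone

def _bucket(key: str, salt: str) -> int:
    """Deterministically map a key to 0 or 1 for a given experiment salt."""
    digest = hashlib.sha256(f"{salt}:{key}".encode()).hexdigest()
    return int(digest, 16) % 2

def assign_by_cluster(user_to_cluster: dict, salt: str = "exp-1") -> dict:
    """Cluster randomization: every user inherits the arm of their cluster."""
    return {
        user: "treatment" if _bucket(str(cluster), salt) else "control"
        for user, cluster in user_to_cluster.items()
    }

def switchback_arm(market: str, ts: datetime,
                   window_minutes: int = 60, salt: str = "exp-1") -> str:
    """Switchback: the unit is a (market, time window) pair, not a user."""
    window_index = int(ts.timestamp()) // (window_minutes * 60)
    return "treatment" if _bucket(f"{market}:{window_index}", salt) else "control"

# Users sharing a cluster always land in the same arm.
print(assign_by_cluster({"a": 1, "b": 1, "c": 2, "d": 2}))
# Requests in the same market and window always land in the same arm.
print(switchback_arm("sf", datetime(2024, 1, 1, 10, 5, tzinfo=timezone.utc)))
```

In both cases the analysis should treat the cluster or the market-window as the unit of observation, which is where the loss of statistical power comes from.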
Measurement-level corrections
When you cannot change the design, use network exposure models: compute each unit’s exposure to treated neighbors (for example, the fraction of its neighbors that are treated), include it as a covariate, and model the outcome as a function of both direct treatment and network exposure. This requires knowing the network graph and is more assumption-dependent than design-level solutions.
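As a minimal sketch of an exposure model, the example below regresses the outcome on own treatment and on the fraction of treated friends using ordinary least squares. Everything here is an assumption made for illustration: the synthetic data, the linear functional form, and the names `simulate` and `fit_exposure_model`. A real analysis would also need standard errors that account for dependence between connected users.

```python
import numpy as np

def simulate(n_users=20000, n_friends=10, direct=1.0, spill=0.5, seed=1):
    """Generate toy data with a known direct effect and spillover effect."""
    rng = np.random.default_rng(seed)
    # Hypothetical graph: each user gets n_friends uniformly random friends.
    friends = rng.integers(0, n_users, size=(n_users, n_friends))
    treated = rng.integers(0, 2, size=n_users)
    # Exposure = fraction of a user's friends who are treated.
    exposure = treated[friends].mean(axis=1)
    outcome = direct * treated + spill * exposure + rng.normal(0.0, 1.0, n_users)
    return treated, exposure, outcome

def fit_exposure_model(treated, exposure, outcome):
    """OLS fit of outcome ~ intercept + treated + exposure."""
    X = np.column_stack([np.ones(len(treated)), treated, exposure])
    coef, *_ = np.linalg.lstsq(X, outcome, rcond=None)
    return {"intercept": float(coef[0]),
            "direct": float(coef[1]),
            "spillover": float(coef[2])}

treated, exposure, outcome = simulate()
fit = fit_exposure_model(treated, exposure, outcome)
# Under the assumed linear model, the effect of treating everyone
# versus no one is the sum of the two coefficients.
print(fit, "total:", fit["direct"] + fit["spillover"])
```

The estimate is only as good as the assumed exposure mapping: if spillover actually depends on something other than the treated fraction of first-degree neighbors, the coefficients will not capture it.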
When interference is a genuine concern, the experiment design conversation must involve the network structure, not just traffic volume.