datarekha

Stationarity, ADF & differencing

ARIMA keeps demanding a 'stationary' series. What does that mean, how do I check it, and how do I make my data stationary?

9 min read Intermediate Time Series Lesson 3 of 14

What you'll learn

  • Stationarity: why ARIMA needs constant mean, variance, and autocovariance over time
  • How to check stationarity — rolling statistics and the Augmented Dickey-Fuller test
  • Differencing and log transforms to make a non-stationary series stationary


What stationarity means

A time series is stationary when its statistical properties don’t change as time moves forward. Formally, three things must hold across all time windows of the same length:

  • Mean — the average level stays roughly constant.
  • Variance — the spread of values stays roughly constant.
  • Autocovariance — the relationship between a value and its own past (its correlation structure) depends only on the lag, not on the absolute position in time.
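In symbols, these three conditions are what textbooks call weak (or covariance) stationarity. The notation below is introduced here for illustration and does not appear elsewhere in this lesson: μ is the constant mean, σ² the constant variance, and γ(h) the autocovariance at lag h.

```latex
\begin{aligned}
\mathbb{E}[y_t] &= \mu && \text{for all } t \\
\operatorname{Var}(y_t) &= \sigma^2 < \infty && \text{for all } t \\
\operatorname{Cov}(y_t,\, y_{t+h}) &= \gamma(h) && \text{for all } t \text{ and each lag } h
\end{aligned}
```

Note that none of the right-hand sides depends on t. That is the whole point: only the lag h matters, not where in time you look.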

A sales series that grows every year violates the first condition because its mean drifts upward. A price series during a volatile crisis violates the second: variance explodes, then calms. ARIMA and most classical forecasting models assume the series they learn from is stationary, because they try to model a stable, repeatable dynamic. If the mean keeps shifting, there is no stable dynamic to learn: the coefficient estimates change depending on which time window you fit on, and forecasts quickly become unreliable.

A visual

The figure below shows the contrast directly. The left panel has a drifting mean, the classic sign of non-stationarity. The right panel has a flat mean, constant spread, and no trend.

[Figure: two panels plotted against time. Left, "Non-stationary": rising mean. Right, "Stationary": flat mean.]

Left: a trending (non-stationary) series — mean drifts upward over time. Right: a stationary series — mean and spread are stable across the window.

How to check: rolling statistics and the ADF test

There are two complementary approaches. Start visual, then confirm statistically.

Visual check — rolling mean and standard deviation

Compute a rolling window mean and standard deviation and plot them alongside the original series. If the rolling mean is flat and the rolling std is roughly constant, the series looks stationary. If either drifts systematically, you have a problem. This is a quick sanity check, not a formal test.
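As a rough sketch of this check, the snippet below builds a synthetic series whose level and spread both grow, then compares the first and last rolling windows. The series, the seed, and the 30-step window are illustrative assumptions, not values from this lesson.

```python
import numpy as np
import pandas as pd

# Synthetic example data (assumed for illustration): the level rises and
# the noise scales with the level, so both mean and spread should drift.
rng = np.random.default_rng(0)
t = np.arange(200)
level = 10 + 0.5 * t
series = pd.Series(level * (1 + 0.1 * rng.normal(size=t.size)))

window = 30  # illustrative choice; pick a window that fits your data's frequency
rolling = pd.DataFrame({
    "mean": series.rolling(window).mean(),
    "std": series.rolling(window).std(),
}).dropna()  # the first window-1 rows are NaN

first, last = rolling.iloc[0], rolling.iloc[-1]
print(f"rolling mean: {first['mean']:.1f} -> {last['mean']:.1f}")
print(f"rolling std : {first['std']:.1f} -> {last['std']:.1f}")
```

Both numbers climb from the first window to the last, which is the visual signature of a non-stationary series. With matplotlib installed, plotting `series` and the two `rolling` columns on one chart gives the picture described above.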

The Augmented Dickey-Fuller test

The Augmented Dickey-Fuller (ADF) test is the standard statistical test for a unit root, the formal term for the kind of non-stationarity caused by a stochastic trend (one that accumulates random shocks rather than following a fixed line). The terminology needs unpacking:

  • Unit root — a particular structure in the time series model that causes the series to wander indefinitely rather than mean-revert. A random walk (y_t = y_{t-1} + noise) has a unit root.
  • Null hypothesis of ADF — the series has a unit root (i.e., it is non-stationary). This is the opposite of what you want.
  • Alternative hypothesis — the series does not have a unit root (i.e., it is stationary).

So a small p-value (say, below 0.05) means you reject the null and treat the series as stationary. A large p-value means you fail to reject: the test found no evidence against a unit root, so the series likely needs treatment.
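To make the null hypothesis concrete, here is a minimal NumPy sketch of the plain (non-augmented) Dickey-Fuller regression: regress the change in y on a constant and the previous level, then look at the t-statistic on the level term. This is a simplification for intuition only. The real ADF test also includes lagged differences and derives p-values from special tables. The function name `dickey_fuller_stat` is hypothetical, and -2.86 is the approximate large-sample 5% critical value for the constant-only case.

```python
import numpy as np

def dickey_fuller_stat(y):
    """t-statistic on gamma in: (y_t - y_{t-1}) = alpha + gamma * y_{t-1} + e_t."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)                                   # left-hand side: the changes
    X = np.column_stack([np.ones(dy.size), y[:-1]])   # constant and lagged level
    beta, *_ = np.linalg.lstsq(X, dy, rcond=None)     # ordinary least squares
    resid = dy - X @ beta
    sigma2 = resid @ resid / (dy.size - X.shape[1])   # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)             # coefficient covariance
    return float(beta[1] / np.sqrt(cov[1, 1]))

rng = np.random.default_rng(0)
noise = rng.normal(size=500)
random_walk = np.cumsum(noise)   # y_t = y_{t-1} + noise: has a unit root
white_noise = noise              # mean-reverting: no unit root

stat_rw = dickey_fuller_stat(random_walk)
stat_wn = dickey_fuller_stat(white_noise)
print(f"random walk: {stat_rw:.2f}")
print(f"white noise: {stat_wn:.2f}")
```

The white-noise statistic lands far below -2.86, so the unit-root null is rejected. The random-walk statistic typically stays above that threshold, so the null stands. Under a unit root, gamma is zero: the previous level tells you nothing about the next change.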

The test is available in statsmodels. Because statsmodels is not available in the browser playground, here it is as a static block:

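import pandas as pd
from statsmodels.tsa.stattools import adfuller

# `series` is assumed to be a pandas Series of observations, defined elsewhere
result = adfuller(series.dropna())  # drop missing values before testing
print(f"ADF statistic : {result[0]:.4f}")  # more negative = stronger evidence against a unit root
print(f"p-value       : {result[1]:.4f}")
print(f"Critical vals : {result[4]}")  # dict keyed by '1%', '5%', '10%'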

The output includes the ADF statistic (more negative = stronger evidence of stationarity), the p-value, and critical values at 1%, 5%, and 10% significance levels. A well-differenced series typically returns a small p-value, well below 0.05, and an ADF statistic more negative than the 5% critical value.

Making a series stationary: differencing

If the series has a trend (non-constant mean), the standard fix is differencing. The first difference of a series replaces each value with the change since the previous step:

diff1 = series.diff().dropna()

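This removes a linear trend. If the trend is quadratic or the first difference still drifts, apply a second difference, but no more than you need: differencing a series that is already stationary adds artificial negative autocorrelation and inflates its variance.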
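A small worked example, using made-up noise-free series so the arithmetic is easy to follow: one difference flattens a straight line, while a curved (quadratic) trend needs two.

```python
import numpy as np
import pandas as pd

t = np.arange(10)
linear = pd.Series(3.0 + 2.0 * t)     # straight-line trend with slope 2
quadratic = pd.Series(1.0 * t**2)     # curved trend

lin_d1 = linear.diff().dropna()               # every step is 2.0: trend gone
quad_d1 = quadratic.diff().dropna()           # 1, 3, 5, ...: still trending
quad_d2 = quadratic.diff().diff().dropna()    # every step is 2.0: trend gone

print(lin_d1.tolist())
print(quad_d1.tolist())
print(quad_d2.tolist())
```

Each call to `.diff()` costs one observation at the start of the series, which is why `.dropna()` follows it.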

For seasonality, seasonal differencing subtracts the value from the same period in the prior cycle. For monthly data with yearly seasonality:

diff_seasonal = series.diff(12).dropna()
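To see why a lag of 12 works, consider this sketch with an invented monthly series: a steady upward trend plus a yearly shape that repeats exactly. The `pattern` values are arbitrary illustration data.

```python
import numpy as np
import pandas as pd

months = np.arange(48)  # four years of monthly observations
pattern = np.array([5, 3, 0, -2, -4, -5, -4, -2, 0, 2, 4, 6], dtype=float)
series = pd.Series(0.5 * months + pattern[months % 12])  # trend + yearly shape

# Each value minus the value from the same month one year earlier
diff_seasonal = series.diff(12).dropna()
print(diff_seasonal.unique())
```

The repeating shape cancels because each month is compared with itself a year earlier. What remains is a constant 6.0, the year-over-year increase from the trend (0.5 per month times 12). Real data will leave noise around that constant rather than an exact value.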

Stabilizing variance with a log transform

If the series shows growing variance — the swings get larger as the level rises, common in revenue or stock prices — a log transform compresses that before you difference:

import numpy as np
log_series = np.log(series)
log_diff = log_series.diff().dropna()

The log transform converts multiplicative growth patterns into additive ones, which differencing can then remove.
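A quick illustration with a made-up series that grows exactly 5% per step: the raw differences keep getting larger, but the log differences are constant.

```python
import numpy as np
import pandas as pd

t = np.arange(24)
series = pd.Series(100.0 * 1.05 ** t)   # multiplicative growth: 5% per step

raw_diff = series.diff().dropna()            # steps grow with the level
log_diff = np.log(series).diff().dropna()    # constant, equal to log(1.05)

print(f"raw diff: first {raw_diff.iloc[0]:.2f}, last {raw_diff.iloc[-1]:.2f}")
print(f"log diff: first {log_diff.iloc[0]:.4f}, last {log_diff.iloc[-1]:.4f}")
```

A log difference is approximately the percentage change per step, which is why this combination suits series like revenue or prices. The log is only defined for strictly positive values, so this approach assumes the series never touches zero or goes negative.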

The d in ARIMA

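ARIMA(p, d, q) has three components. The middle one, d, is the order of differencing: how many times the series is differenced before the AR and MA terms are fitted. d=0 means the series is already stationary. d=1 is the most common value and handles most linear trends. d=2 is occasionally needed; anything higher is rare. Most implementations, including statsmodels, apply this differencing internally when you pass d, so give the model the original series rather than one you have already run through .diff().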
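Choosing d too high has a cost. The sketch below differences a series that is already stationary (simulated white noise, an assumption made for illustration) and measures two side effects: the lag-1 autocorrelation and the variance. The helper `lag1_autocorr` is a hypothetical name.

```python
import numpy as np

rng = np.random.default_rng(42)
stationary = rng.normal(size=5000)   # already stationary: d=0 is correct
over_diffed = np.diff(stationary)    # one difference too many

def lag1_autocorr(x):
    """Sample correlation between consecutive values."""
    x = x - x.mean()
    return float((x[:-1] @ x[1:]) / (x @ x))

print(f"lag-1 autocorr before: {lag1_autocorr(stationary):.2f}")
print(f"lag-1 autocorr after : {lag1_autocorr(over_diffed):.2f}")
print(f"variance ratio       : {over_diffed.var() / stationary.var():.2f}")
```

For white noise, the unnecessary difference pushes the lag-1 autocorrelation to about -0.5 and roughly doubles the variance. A strongly negative lag-1 autocorrelation after differencing is a common hint that d is one too high.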

Playground: watch the trend vanish

The playground below builds a trending series (constant drift plus noise), plots the original and the first difference side by side, and prints their rolling means so you can compare. The trend should disappear after differencing.

The differenced series should hover around a constant level (the average step size, which is close to zero when the drift is small) with a roughly constant spread and no long upward march. That is what the ADF test confirms statistically.
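If the interactive playground is not available to you, the following is a static sketch of the same experiment. The drift of 0.5, the seed, and the 30-step window are assumed values, not necessarily those the playground uses.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
drift = 0.5                                # assumed average step size
steps = drift + rng.normal(size=200)
series = pd.Series(np.cumsum(steps))       # each step adds drift plus noise
diff1 = series.diff().dropna()             # first difference recovers the steps

window = 30
roll_orig = series.rolling(window).mean().dropna()
roll_diff = diff1.rolling(window).mean().dropna()

print(f"original rolling mean   : {roll_orig.iloc[0]:.1f} -> {roll_orig.iloc[-1]:.1f}")
print(f"differenced rolling mean: {roll_diff.min():.2f} to {roll_diff.max():.2f}")
print(f"differenced overall mean: {diff1.mean():.2f}")
```

The rolling mean of the original climbs steadily, while the rolling mean of the differenced series stays in a narrow band around the drift.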

Putting it all together: a workflow

  1. Plot the raw series and its rolling mean/std. Does the mean drift? Does variance grow?
  2. If yes, apply a log transform first if variance is growing, then difference.
  3. Run adfuller on the result. If the p-value is above 0.05, difference once more.
  4. Stop as soon as the p-value is small. Note how many times you differenced — that is d in ARIMA.
  5. Before modelling, look at the ACF and PACF of the stationary series to choose p and q.

Quick check

Q1. The ADF test returns a p-value of 0.42 on your sales series. What does this tell you?
Q2. Your revenue series grows exponentially and the swings double every year. Which pre-processing order is correct before running ARIMA?
Q3. A junior analyst differences a daily temperature series (already flat and mean-reverting) twice 'just to be safe.' What is the likely outcome?

Practice this in an interview

What is stationarity in a time series, and how do you test for it?

A stationary series has a constant mean, constant variance, and autocovariance that depends only on lag — not on when you look. Most classical models (ARIMA, VAR) require it. The Augmented Dickey-Fuller (ADF) test is the standard check; a p-value below 0.05 lets you reject the unit-root null and conclude the series is stationary.

How do you choose p, d, and q for an ARIMA model?

Choose d by differencing until the ADF test confirms stationarity; choose p from the PACF cutoff and q from the ACF cutoff on the differenced series; then confirm with AIC or BIC to guard against over-fitting. In practice, an automated grid search over a small range of candidates with information criteria is more reliable than visual inspection alone.

What is the difference between ARIMA and SARIMA, and when do you use each?

ARIMA(p,d,q) models non-seasonal series by combining autoregression, differencing, and a moving average of errors. SARIMA extends it with a second set of seasonal parameters (P,D,Q,s) that operate at the seasonal lag s, handling periodic patterns that ARIMA alone cannot capture.

What is a VAR model, and when would you use it instead of a univariate ARIMA?

A Vector Autoregression (VAR) model extends autoregression to multiple time series simultaneously: each variable is regressed on its own past values and the past values of all other variables in the system. Use VAR when the series have mutual predictive relationships (Granger causality) and you want to model those interactions; ARIMA is sufficient when one series can be forecast in isolation.
