ACF & PACF
How ACF and PACF plots reveal the AR and MA order fingerprints that guide ARIMA model selection.
What you'll learn
- ACF measures correlation between a series and its own lagged values; PACF isolates that correlation after removing shorter-lag effects.
- An AR(p) signature: PACF cuts off after lag p, ACF tails off; an MA(q) signature: ACF cuts off after lag q, PACF tails off.
- Significance bands at roughly plus-or-minus 2 divided by the square root of n mark which lags are statistically meaningful.
What is autocorrelation?
A lag is a time offset. Lag 1 means “one step back in time,” lag 2 means “two steps back,” and so on.
Autocorrelation at lag k is the ordinary Pearson correlation between the original series and a copy of itself shifted k steps into the past. If today’s value tends to resemble yesterday’s value, the lag-1 autocorrelation is high. If the series has a weekly cycle, lag-7 autocorrelation will spike.
The full set of autocorrelations across lags 0, 1, 2, … forms the Autocorrelation Function (ACF). By definition, lag-0 autocorrelation is always 1.
What is partial autocorrelation?
Partial autocorrelation at lag k asks a tighter question: what is the correlation between the series and its lag-k version after removing the influence of all the lags in between (lags 1 through k−1)?
Think of it this way. If lag-1 autocorrelation is strong, it will automatically create apparent correlation at lag 2 simply because today relates to yesterday and yesterday relates to the day before. The PACF strips that indirect path away, leaving only the direct association at each lag.
The Partial Autocorrelation Function (PACF) collects these cleaned-up correlations across lags.
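One standard way to obtain the PACF from the ACF is the Durbin-Levinson recursion. The sketch below is a minimal, dependency-free illustration; the function name `pacf_from_acf` is hypothetical, and the input is the theoretical ACF of an AR(1) with coefficient 0.6 (an assumed example value). The lag-2 autocorrelation is 0.36, yet the lag-2 partial autocorrelation is zero, because all of that correlation travels through lag 1.

```python
def pacf_from_acf(rho):
    """Durbin-Levinson recursion (hypothetical helper name).

    rho: autocorrelations at lags 0..K, with rho[0] == 1.
    Returns partial autocorrelations at lags 0..K (lag 0 is 1 by convention).
    """
    max_lag = len(rho) - 1
    pacf = [1.0]
    prev = []  # prev[j - 1] holds the order-(k-1) AR coefficient on lag j
    for k in range(1, max_lag + 1):
        num = rho[k] - sum(prev[j - 1] * rho[k - j] for j in range(1, k))
        den = 1.0 - sum(prev[j - 1] * rho[j] for j in range(1, k))
        phi_kk = num / den  # direct association at lag k
        # Update the lower-order coefficients for the next iteration.
        cur = [prev[j - 1] - phi_kk * prev[k - j - 1] for j in range(1, k)]
        cur.append(phi_kk)
        pacf.append(phi_kk)
        prev = cur
    return pacf


# Theoretical ACF of an AR(1) with phi = 0.6 (assumed example value).
rho = [0.6 ** k for k in range(6)]
pacf = pacf_from_acf(rho)
for k in range(1, 6):
    print(f"lag {k}: ACF = {rho[k]:.3f}, PACF = {pacf[k]:.3f}")
```

The ACF column decays gradually while the PACF column is non-zero only at lag 1, which is the indirect-path effect described above.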
Reading the fingerprints
The practical value of ACF and PACF comes from two classic patterns that distinguish AR and MA processes.
AR(p) signature
An autoregressive process of order p uses the last p observations directly. Its fingerprint:
- PACF cuts off sharply after lag p: beyond that, partial correlations are near zero.
- ACF tails off: it decays gradually (exponentially, or in a damped sinusoidal fashion) without a clean cutoff.
The PACF cutoff tells you p almost directly. If PACF is significant at lags 1 and 2 but essentially zero from lag 3 onward, try AR(2).
MA(q) signature
A moving-average process of order q is built from the last q error terms. Its fingerprint is the mirror image:
- ACF cuts off sharply after lag q.
- PACF tails off gradually.
If ACF is significant only at lag 1 and is near zero from lag 2 onward, try MA(1).
Mixed ARMA
When both ACF and PACF tail off (neither cuts off cleanly), the series likely has both AR and MA components. That is the signal to try ARMA(p, q) combinations, usually starting small.
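To see the two ACF patterns side by side, the following sketch simulates an AR(1) and an MA(1) series and prints their sample autocorrelations. It uses only the standard library; the coefficients (0.7 and 0.8), the sample size, the seed, and all function names are assumptions chosen for illustration.

```python
import math
import random


def sample_acf(x, max_lag):
    """Sample autocorrelations at lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    c = [v - mean for v in x]
    c0 = sum(v * v for v in c)
    return [sum(c[t] * c[t - k] for t in range(k, n)) / c0
            for k in range(max_lag + 1)]


def simulate_ar1(phi, n, rng):
    """x_t = phi * x_{t-1} + e_t, started at 0 (no burn-in, fine for large n)."""
    x, prev = [], 0.0
    for _ in range(n):
        prev = phi * prev + rng.gauss(0.0, 1.0)
        x.append(prev)
    return x


def simulate_ma1(theta, n, rng):
    """x_t = e_t + theta * e_{t-1}."""
    eps = [rng.gauss(0.0, 1.0) for _ in range(n + 1)]
    return [eps[t] + theta * eps[t - 1] for t in range(1, n + 1)]


rng = random.Random(0)  # fixed seed so reruns give the same series
n = 5000
ar_acf = sample_acf(simulate_ar1(0.7, n, rng), 5)
ma_acf = sample_acf(simulate_ma1(0.8, n, rng), 5)
band = 2 / math.sqrt(n)

print(f"band = +/-{band:.3f}")
for k in range(1, 6):
    print(f"lag {k}: AR(1) ACF = {ar_acf[k]: .3f}   MA(1) ACF = {ma_acf[k]: .3f}")
```

The AR(1) column should shrink gradually across lags, while the MA(1) column should be clearly non-zero at lag 1 only, close to theta / (1 + theta^2), and small from lag 2 onward.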
Significance bands
Not every non-zero bar in an ACF or PACF plot is meaningful. Under the null hypothesis of no autocorrelation, sample autocorrelations are approximately normally distributed with standard error of roughly 1 divided by the square root of n, where n is the number of observations.
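The conventional significance band is plus-or-minus 2 divided by the square root of n (an approximate 95% threshold). Bars that stay inside the band are consistent with noise. Only bars that poke outside the band warrant attention. Because the threshold is 95%, about 1 lag in 20 will cross the band by chance even for pure noise, so an isolated, barely significant bar at a high lag is weak evidence.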
Most plotting libraries draw these bands for you as dashed horizontal lines.
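The band check is simple enough to apply by hand. A minimal sketch, assuming a list of already computed autocorrelations (the values below are made up for illustration) and a hypothetical helper named `significant_lags`:

```python
import math


def significant_lags(acf_values, n_obs, z=2.0):
    """Return the lags k >= 1 whose autocorrelation lies outside +/- z / sqrt(n_obs)."""
    band = z / math.sqrt(n_obs)
    return [k for k, r in enumerate(acf_values) if k > 0 and abs(r) > band]


# Made-up autocorrelations for lags 0..4 from a series of 100 observations.
acf_values = [1.0, 0.5, 0.25, 0.1, -0.3]
print(significant_lags(acf_values, n_obs=100))  # band is 0.2 -> [1, 2, 4]
```

Lag 0 is skipped because it is always 1 and carries no information.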
The AR(1) signature in pictures
The diagram below shows what ACF and PACF look like for a simulated AR(1) process. The ACF decays gradually across lags (tailing off), while the PACF drops to near zero immediately after lag 1 (cutting off). The dashed lines mark the approximate significance band.
AR(1) fingerprint: ACF decays exponentially across lags (tails off); PACF is significant only at lag 1 and is near zero thereafter (cuts off). Dashed lines mark the approximate significance band.
Computing ACF by hand
The sample autocorrelation at lag k can be computed in three steps:
- Subtract the series mean from every value to get mean-centered values.
- Compute the dot product of the mean-centered series with a copy of itself shifted by k steps.
- Divide by the dot product at lag 0 (the sum of squared deviations, that is, n times the variance).
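The three steps above can be sketched in a few lines of plain Python. The function name `acf_by_hand` and the tiny input series are illustrative assumptions, chosen so the arithmetic is easy to verify on paper.

```python
def acf_by_hand(x, max_lag):
    """Sample autocorrelations at lags 0..max_lag, following the three steps."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]            # step 1: mean-center
    c0 = sum(v * v for v in centered)           # dot product at lag 0
    result = []
    for k in range(max_lag + 1):
        # step 2: dot product of the series with itself shifted by k steps
        ck = sum(centered[t] * centered[t - k] for t in range(k, n))
        result.append(ck / c0)                  # step 3: normalize by lag 0
    return result


r = acf_by_hand([1, 2, 3, 4, 5], max_lag=2)
print([round(v, 3) for v in r])  # [1.0, 0.4, -0.1]
```

For this series the centered values are -2, -1, 0, 1, 2, so the lag-0 dot product is 10, the lag-1 dot product is 4, and the lag-2 dot product is -1.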
The playground below walks through this computation so you can watch the decay unfold.
With a positive phi, you should see a stem plot where every bar is positive and the heights decrease steadily: the hallmark “tailing off” pattern. The bars shrink by approximately a factor of phi at each step, matching the theoretical AR(1) autocorrelation of phi^k at lag k.
Change phi to a negative value (for example -0.7) and rerun. The bars will alternate in sign but still decay in magnitude — still tailing off, just oscillating.
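For reference, the theoretical ACF of a stationary AR(1) is phi raised to the power k. The short sketch below tabulates it for a positive and a negative coefficient (0.7 and -0.7, matching the example values in the text); the function name is a hypothetical one.

```python
def ar1_theoretical_acf(phi, max_lag):
    """Theoretical AR(1) autocorrelations phi**k for lags 0..max_lag (needs |phi| < 1)."""
    return [phi ** k for k in range(max_lag + 1)]


pos = ar1_theoretical_acf(0.7, 4)
neg = ar1_theoretical_acf(-0.7, 4)
print("phi =  0.7:", [round(r, 3) for r in pos])
print("phi = -0.7:", [round(r, 3) for r in neg])
```

Both rows have the same magnitudes; only the signs differ, flipping at every odd lag when phi is negative. A sample ACF from simulated data will scatter around these values rather than match them exactly.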
Using statsmodels in practice
Hand-computing ACF is instructive, but in a real workflow you will use statsmodels:
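import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

# x is assumed to be a stationary 1-D array or pandas Series defined earlier.
fig, axes = plt.subplots(1, 2, figsize=(10, 4))
plot_acf(x, lags=20, ax=axes[0])                 # ACF with significance band
plot_pacf(x, lags=20, ax=axes[1], method="ywm")  # PACF via Yule-Walker
plt.tight_layout()
plt.show()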
plot_acf and plot_pacf draw the significance shading automatically. method="ywm" selects the Yule-Walker estimate without the sample-size adjustment; it keeps partial autocorrelations inside the [-1, 1] range and is the plot_pacf default in recent statsmodels releases. The plots are the same diagnostic you saw in the diagram; here you are reading them rather than computing them.
Putting it together: the Box-Jenkins identification step
The classic workflow for choosing ARIMA orders is:
- Make the series stationary (difference as needed — see the stationarity lesson).
- Plot ACF and PACF on the stationary series.
- If PACF cuts off at lag p and ACF tails off, start with AR(p).
- If ACF cuts off at lag q and PACF tails off, start with MA(q).
- If both tail off, try small ARMA(p, q) combinations.
- Fit, check residuals, and use AIC or BIC to compare candidates.
ACF and PACF give you a shortlist, not a definitive answer. Always validate the chosen model by checking that its residuals look like white noise.
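The identification rules can be turned into a rough rule of thumb in code. The sketch below is a deliberately crude heuristic, not a substitute for reading the plots: the helper names, the `max_order` threshold used to decide what counts as a "cutoff", and the example values are all assumptions made for illustration.

```python
import math


def last_significant_lag(values, band):
    """Largest lag k >= 1 with |value| > band, or 0 if none is significant."""
    sig = [k for k, v in enumerate(values) if k > 0 and abs(v) > band]
    return max(sig) if sig else 0


def suggest_start(acf_vals, pacf_vals, n_obs, max_order=3):
    """Hypothetical helper: map the cut-off / tail-off pattern to a starting model.

    A function is treated as "cutting off" when its last significant lag
    is at most max_order (an arbitrary threshold assumed for this sketch).
    """
    band = 2 / math.sqrt(n_obs)
    q = last_significant_lag(acf_vals, band)
    p = last_significant_lag(pacf_vals, band)
    if p == 0 and q == 0:
        return "white noise"
    acf_cuts, pacf_cuts = q <= max_order, p <= max_order
    if pacf_cuts and not acf_cuts:
        return f"AR({p})"
    if acf_cuts and not pacf_cuts:
        return f"MA({q})"
    return "try small ARMA(p, q)"


# Made-up values for lags 0..6 with n = 400 observations (band = 0.1).
ar_like = suggest_start([1, 0.8, 0.6, 0.45, 0.34, 0.25, 0.19],
                        [1, 0.8, -0.3, 0.02, 0.01, 0.0, 0.0], n_obs=400)
ma_like = suggest_start([1, 0.45, 0.03, -0.02, 0.01, 0.0, 0.02],
                        [1, 0.45, -0.25, 0.15, -0.12, 0.11, -0.05], n_obs=400)
print(ar_like, ma_like)  # AR(2) MA(1)
```

Whatever the heuristic suggests is only a starting candidate; the fit, residual check, and AIC or BIC comparison still decide the final model.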
Practice this in an interview
The ACF measures correlation between a series and its own lags including indirect effects; the PACF strips out those indirect effects to show direct correlation at each lag. A cut-off in the PACF after lag p signals an AR(p) process; a cut-off in the ACF after lag q signals an MA(q) process.
Choose d by differencing until the ADF test confirms stationarity; choose p from the PACF cutoff and q from the ACF cutoff on the differenced series; then confirm with AIC or BIC to guard against over-fitting. In practice, an automated grid search over a small range of candidates with information criteria is more reliable than visual inspection alone.
Prophet is a curve-fitting model that decomposes the series into trend, seasonality, and holidays; it handles missing data, multiple seasonalities, and non-uniform time grids with minimal tuning and is accessible to non-statisticians. ARIMA is a statistical model based on autocorrelation structure; it is more appropriate when the series is short, noise is small, and you need principled uncertainty intervals from an explicit stochastic process.
ARIMA(p,d,q) models non-seasonal series by combining autoregression, differencing, and a moving average of errors. SARIMA extends it with a second set of seasonal parameters (P,D,Q,s) that operate at the seasonal lag s, handling periodic patterns that ARIMA alone cannot capture.