What is the difference between ARIMA and SARIMA, and when do you use each?
The short answer
ARIMA(p,d,q) models non-seasonal series by combining autoregression, differencing, and a moving average of errors. SARIMA extends it with a second set of seasonal parameters (P,D,Q,s) that operate at the seasonal lag s, handling periodic patterns that ARIMA alone cannot capture.
How to think about it
Explain each component letter by letter, then the seasonal extension, then a concrete use case. Interviewers often probe whether you know what the “MA” term actually models.
ARIMA(p, d, q) unpacked
- p: number of autoregressive (AR) lags. $Y_t$ is regressed on its own past values $Y_{t-1}, \dots, Y_{t-p}$.
- d: degree of differencing applied to make the series stationary. $d=1$ means we model $Y_t - Y_{t-1}$.
- q: number of moving-average (MA) terms. $Y_t$ depends on past forecast errors $\varepsilon_{t-1}, \dots, \varepsilon_{t-q}$, not on a rolling mean of past values.
The model for ARIMA(1,1,1), written on the once-differenced series $\Delta Y_t = Y_t - Y_{t-1}$:
$$\Delta Y_t = c + \phi_1 \Delta Y_{t-1} + \varepsilon_t + \theta_1 \varepsilon_{t-1}$$
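To make the MA term concrete, here is a minimal simulation sketch of the ARIMA(1,1,1) recursion, using only the Python standard library. The function name `simulate_arima_111` and the parameter values (φ1 = 0.5, θ1 = 0.3, starting level 100) are illustrative assumptions, not estimates from any real dataset.

```python
import random

def simulate_arima_111(n, c=0.0, phi1=0.5, theta1=0.3, y0=100.0, seed=0):
    """Simulate Y where dY_t = c + phi1*dY_{t-1} + eps_t + theta1*eps_{t-1}."""
    rng = random.Random(seed)
    eps = [rng.gauss(0.0, 1.0) for _ in range(n)]  # white-noise errors
    dy = [0.0] * n
    for t in range(n):
        prev_dy = dy[t - 1] if t > 0 else 0.0
        prev_eps = eps[t - 1] if t > 0 else 0.0
        # AR term: previous difference. MA term: previous error, not a rolling mean.
        dy[t] = c + phi1 * prev_dy + eps[t] + theta1 * prev_eps
    # Undo the differencing (the "I" in ARIMA): Y_t = Y_{t-1} + dY_t.
    y = [y0]
    for step in dy:
        y.append(y[-1] + step)
    return y, dy, eps

y, dy, eps = simulate_arima_111(200)
# Differencing the simulated level series recovers dY.
recovered = [y[t] - y[t - 1] for t in range(1, len(y))]
```

Note that the MA part reuses `eps[t - 1]`, the previous random shock, which is exactly the distinction interviewers tend to probe.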
SARIMA(p,d,q)(P,D,Q,s)
SARIMA adds a seasonal layer with the same AR/I/MA structure applied at multiples of the seasonal period s (e.g., s=12 for monthly data):
- P: number of seasonal AR lags, at s, 2s, …
- D: number of seasonal differences. $D=1$ means we model $Y_t - Y_{t-s}$.
- Q: number of seasonal MA error terms, at lags s, 2s, …
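A small worked example may help show what the seasonal difference does. The sketch below builds a toy monthly series (a made-up linear trend plus a made-up repeating 12-month pattern, both assumptions for illustration) and applies D=1 followed by d=1. The helper name `difference` is hypothetical.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t where both exist."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

s = 12  # seasonal period for monthly data
seasonal_pattern = [5, 3, 0, -2, -4, -6, -5, -3, 0, 2, 4, 6]  # toy yearly shape

# Four years of monthly values: trend of +2 per month plus the repeating pattern.
y = [2 * t + seasonal_pattern[t % s] for t in range(48)]

# D=1: subtract the value from the same month one year earlier.
# The repeating pattern cancels and only the yearly trend gain (2 * 12 = 24) remains.
seasonal_diff = difference(y, lag=s)

# d=1 on top: subtract the previous value, which removes the remaining constant.
both = difference(seasonal_diff, lag=1)
```

In this noise-free toy case the result is exactly constant after the seasonal difference and exactly zero after both; real data would leave residual structure for the AR and MA terms to model.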
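```python
from statsmodels.tsa.statespace.sarimax import SARIMAX

# `train` is assumed to be the training portion of a monthly series.
model = SARIMAX(
    train,
    order=(1, 1, 1),               # (p, d, q)
    seasonal_order=(1, 1, 1, 12),  # (P, D, Q, s)
    enforce_stationarity=False,    # do not constrain AR parameters to the stationary region
    enforce_invertibility=False,   # do not constrain MA parameters to the invertible region
)
result = model.fit(disp=False)        # disp=False silences optimizer output
forecast = result.forecast(steps=12)  # forecast one full seasonal cycle ahead
```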
When to use which
| Scenario | Choice |
|---|---|
| Non-seasonal series (e.g., stock prices, random walk) | ARIMA |
| Monthly retail sales with a yearly cycle | SARIMA(p,d,q)(P,D,Q,12) |
| Weekly data with day-of-week pattern | SARIMA with s=7 |
| Multiple seasonalities (daily + weekly) | Prophet or TBATS |
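A rough way to check which row of the table a series falls into is to look at the autocorrelation at the candidate seasonal lag. The sketch below is a standard-library-only heuristic, not a formal test: the function names `autocorr` and `suggests_seasonality` and the 0.5 threshold are assumptions made for illustration, and both input series are synthetic.

```python
import math
import random

def autocorr(series, lag):
    """Sample autocorrelation of `series` at the given lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    cov = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return cov / var

def suggests_seasonality(series, s, threshold=0.5):
    """Heuristic: difference once to remove trend, then check the ACF at lag s."""
    diffed = [series[t] - series[t - 1] for t in range(1, len(series))]
    return autocorr(diffed, s) > threshold

# Synthetic monthly series: linear trend plus a yearly sine cycle.
monthly = [0.5 * t + 10 * math.sin(2 * math.pi * t / 12) for t in range(120)]

# Synthetic random walk: cumulative sum of Gaussian steps, no seasonal cycle.
rng = random.Random(1)
walk = [0.0]
for _ in range(119):
    walk.append(walk[-1] + rng.gauss(0.0, 1.0))

monthly_is_seasonal = suggests_seasonality(monthly, 12)  # strong lag-12 correlation
walk_is_seasonal = suggests_seasonality(walk, 12)        # lag-12 correlation near zero
```

A clear spike at lag s points toward SARIMA with that s, while the absence of one points toward plain ARIMA. In practice an ACF plot over many lags is more informative than a single thresholded value.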