datarekha

VAR (multivariate)

Model several interdependent time series at once — interest rates, inflation, and unemployment — so that every variable's forecast draws on the past of every other variable in the system.

9 min read Advanced Time Series Lesson 10 of 14

What you'll learn

  • Vector AutoRegression: each series is regressed on its own lags AND the lags of every other series in the system
  • Lag-order selection with AIC/BIC, and why all series must be stationary before fitting
  • Granger causality and impulse response functions: testing and visualising cross-variable effects


Why separate ARIMAs are not enough

Suppose you fit an ARIMA on the unemployment rate using only its own past values. That model is blind to last quarter’s inflation reading, even if that reading reliably predicts where unemployment is heading. You lose signal every single period.

Vector AutoRegression (VAR) solves this by treating all K series as a joint system. At each time step, every variable is explained by p lags of itself and p lags of every other variable. The word vector signals that the outcome is not a scalar — it is the full column of values at time t.

The VAR(p) model

For a system with K variables and lag order p, the model at time t is:

y_t = c + A_1 y_{t-1} + A_2 y_{t-2} + ... + A_p y_{t-p} + e_t

where y_t is a column vector of length K (e.g. [interest_rate, inflation, unemployment]), each A_i is a K x K matrix of coefficients, c is a constant vector, and e_t is a vector of white-noise errors. The key insight is in the matrices: the off-diagonal entries let inflation’s past feed into the interest-rate equation, and vice versa. A separate ARIMA forces every off-diagonal to zero.
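To make the role of the off-diagonal entries concrete, here is the same model written out for an illustrative two-variable system (K = 2, p = 1), using r_t for the interest rate and π_t for inflation. The coefficient names a_ij are notation introduced only for this sketch.

```latex
\begin{aligned}
\begin{pmatrix} r_t \\ \pi_t \end{pmatrix}
&= \begin{pmatrix} c_1 \\ c_2 \end{pmatrix}
 + \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}
   \begin{pmatrix} r_{t-1} \\ \pi_{t-1} \end{pmatrix}
 + \begin{pmatrix} e_{1,t} \\ e_{2,t} \end{pmatrix} \\[4pt]
r_t   &= c_1 + a_{11}\, r_{t-1} + a_{12}\, \pi_{t-1} + e_{1,t} \\
\pi_t &= c_2 + a_{21}\, r_{t-1} + a_{22}\, \pi_{t-1} + e_{2,t}
\end{aligned}
```

Setting a_12 = a_21 = 0 leaves two unrelated AR(1) equations, which is the restriction that separate univariate models impose.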

How many parameters are involved?

Each of the p lag matrices holds K x K coefficients, giving p * K^2 slope parameters plus K intercepts. With K = 3 variables and p = 4 lags that is already 36 slope parameters. Parameter count grows quadratically in K — keep the system small.
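The arithmetic above can be checked with a small helper. This is a minimal sketch; the function name var_param_count is made up for illustration and is not part of any library.

```python
def var_param_count(k: int, p: int, include_intercepts: bool = True) -> int:
    """Number of mean-equation coefficients in a VAR(p) with k variables."""
    slopes = p * k * k                      # p lag matrices, each k x k
    intercepts = k if include_intercepts else 0
    return slopes + intercepts


# Slope counts at p = 4 for a few system sizes: growth is quadratic in k.
for k in (2, 3, 5, 10):
    print(k, var_param_count(k, p=4, include_intercepts=False))
```

Going from 3 to 10 variables at the same lag order raises the slope count from 36 to 400, which is why small systems are easier to estimate on typical macro sample sizes.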

Interaction diagram

[Diagram: three nodes, Interest Rate (r_t), Inflation (π_t) and Unemployment (u_t), each with a self-lag loop and arrows to and from the other two nodes.]

Each node receives arrows from every other node (cross-variable lags) and from itself (own lags). VAR estimates all of these paths simultaneously.

Step 1: Make every series stationary

A standard VAR assumes every series is stationary, just as the ARMA part of ARIMA does. If a series is trended or otherwise non-stationary, difference it before fitting. Run an Augmented Dickey-Fuller (ADF) test on each column independently, then difference the columns that fail it.

Step 2: Choose the lag order p

You do not need to guess p. Fit models at several candidate orders and compare AIC (Akaike Information Criterion) or BIC (Bayesian Information Criterion). BIC penalises complexity more heavily and tends to favour parsimonious models, which is usually preferable when K is not tiny.

statsmodels can scan candidates automatically when you pass both maxlags and ic to .fit(). With maxlags alone, it fits exactly that many lags.

Step 3: Fit and forecast

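import pandas as pd
from statsmodels.tsa.vector_ar.var_model import VAR

# Load the levels: one "date" column plus one column per series.
df = pd.read_csv("macro.csv", index_col="date", parse_dates=True)
# First-difference every series and drop the leading NaN row.
df_diff = df.diff().dropna()

model = VAR(df_diff)
# Try lag orders up to 8 and keep the one with the lowest BIC.
result = model.fit(maxlags=8, ic="bic")

print(result.summary())

# forecast() starts from the last k_ar observations
# (this assumes the selected order is at least 1).
lag_order = result.k_ar
forecast_input = df_diff.values[-lag_order:]
forecast = result.forecast(y=forecast_input, steps=4)
# The forecasts are on the differenced scale, one row per step ahead.
forecast_df = pd.DataFrame(forecast, columns=df_diff.columns)
print(forecast_df)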
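Because the model is fitted on first differences, the forecast values are predicted changes, not levels. One way to map them back is to cumulatively sum the changes and add the last observed level. The sketch below assumes a single first difference was applied; the helper name undifference and the numbers are hypothetical.

```python
import numpy as np


def undifference(last_levels: np.ndarray, diff_forecast: np.ndarray) -> np.ndarray:
    """Turn forecasts of first differences into forecasts of levels.

    last_levels:   shape (K,), the final observed level of each series.
    diff_forecast: shape (steps, K), forecast change per step and series.
    """
    return last_levels + np.cumsum(diff_forecast, axis=0)


# Hypothetical numbers purely for illustration.
last_levels = np.array([5.0, 2.0])
diff_forecast = np.array([[0.1, -0.2],
                          [0.3, 0.0],
                          [-0.1, 0.1]])
level_forecast = undifference(last_levels, diff_forecast)
print(level_forecast)
```

With the variables from the snippet above, the inputs would be df.iloc[-1].to_numpy() and forecast.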

Granger causality

After fitting you can ask a formal question: does knowing the past of variable X make forecasts of variable Y significantly more accurate, beyond what Y’s own past already tells us? This is called Granger causality — named after Clive Granger who formalised it.

Granger causality is not philosophical causality. It is a predictive test: X Granger-causes Y if the lags of X are jointly statistically significant in the Y equation, after controlling for Y's own lags.

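from statsmodels.tsa.stattools import grangercausalitytests

# Column order matters: this tests whether the second column (inflation)
# Granger-causes the first column (unemployment), at every lag from 1 to 4.
results = grangercausalitytests(df_diff[["unemployment", "inflation"]], maxlag=4)

A low p-value on the F-test at a given lag means that lagged inflation adds predictive power for unemployment beyond unemployment's own lags. To test the reverse direction, swap the column order.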

Impulse response functions

After a shock to one variable — say, a sudden spike in the interest rate — how do the other variables respond over the following quarters? Impulse response functions (IRFs) trace that ripple through the system period by period.

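# Trace responses over 12 periods after the shock.
irf = result.irf(periods=12)
# orth=False plots responses to a unit shock in each variable.
irf.plot(orth=False)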

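IRFs are one of the most interpretable outputs of a VAR: they let you say “a one-unit shock to inflation is associated with unemployment rising for about three quarters before returning to baseline.” With orth=False the shock is one unit of the variable as modelled (here, its first difference). Pass orth=True for orthogonalised one-standard-deviation shocks, whose results depend on the column ordering.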
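The mechanics behind a non-orthogonalised IRF are simple in the VAR(1) case: the response at horizon h is the coefficient matrix raised to the power h. The sketch below computes this with NumPy; the function name and the coefficient values are hypothetical and unrelated to any fitted model.

```python
import numpy as np


def var1_impulse_responses(A1: np.ndarray, periods: int) -> np.ndarray:
    """Non-orthogonalised impulse responses of a VAR(1).

    Returns an array of shape (periods + 1, K, K) where entry [h, i, j] is the
    response of variable i at horizon h to a unit shock in variable j at h = 0.
    """
    k = A1.shape[0]
    responses = np.empty((periods + 1, k, k))
    responses[0] = np.eye(k)                    # the shock itself
    for h in range(1, periods + 1):
        responses[h] = A1 @ responses[h - 1]    # equals A1 raised to the power h
    return responses


# Hypothetical coefficients: variable 0 feeds into variable 1 but not the reverse.
A1 = np.array([[0.5, 0.0],
               [0.4, 0.3]])
irfs = var1_impulse_responses(A1, periods=4)
print(irfs[:, 1, 0])   # response of variable 1 to a unit shock in variable 0
```

Because the upper-right entry of A1 is zero, a shock to variable 1 never reaches variable 0, and the responses decay toward zero because the system is stationary.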

VAR versus separate ARIMAs — summary

| Property | Separate ARIMAs | VAR |
|---|---|---|
| Cross-variable effects | Ignored | Captured via off-diagonal lags |
| Parameters | K independent sets | p * K^2 shared slopes |
| Granger tests | Not possible | Built-in |
| Impulse responses | Not available | Available |
| Stationarity required | Per series | All series must be stationary |

Quick check

Q1. In a VAR(2) model with K = 3 variables, how many slope coefficients (excluding intercepts) are estimated?
Q2. Why must every series be stationary before fitting a VAR?
Q3. A researcher fits separate ARIMAs to GDP growth and consumer confidence, then fits a VAR on the same data. The VAR's forecast for GDP growth is noticeably better. What is the most likely explanation?

Practice this in an interview

What is a VAR model, and when would you use it instead of a univariate ARIMA?

A Vector Autoregression (VAR) model extends univariate autoregression (the AR part of ARIMA) to multiple time series simultaneously: each variable is regressed on its own past values and the past values of all other variables in the system. Use VAR when the series have mutual predictive relationships (Granger causality) and you want to model those interactions; ARIMA is sufficient when one series can be forecast in isolation.

When would you choose Prophet over ARIMA for a forecasting problem?

Prophet is a curve-fitting model that decomposes the series into trend, seasonality, and holidays; it handles missing data, multiple seasonalities, and non-uniform time grids with minimal tuning and is accessible to non-statisticians. ARIMA is a statistical model based on autocorrelation structure; it is more appropriate when the series is short, noise is small, and you need principled uncertainty intervals from an explicit stochastic process.

What is multicollinearity, how does it harm regression, and how do you detect and fix it?

Multicollinearity occurs when two or more predictors are highly linearly correlated, inflating the variance of coefficient estimates and making them numerically unstable and uninterpretable. The Variance Inflation Factor (VIF) quantifies how much each coefficient's variance is inflated relative to an orthogonal design.

What is the difference between ARIMA and SARIMA, and when do you use each?

ARIMA(p,d,q) models non-seasonal series by combining autoregression, differencing, and a moving average of errors. SARIMA extends it with a second set of seasonal parameters (P,D,Q,s) that operate at the seasonal lag s, handling periodic patterns that ARIMA alone cannot capture.
