Time Series · Easy · Asked at Amazon, Microsoft, Airbnb

Why can't you shuffle a time series before splitting into train and test sets?

For: Data Scientist, ML Engineer, Data Analyst

The short answer

Shuffling destroys temporal order, so the model trains on observations from the future and is evaluated on the past, which is a direct information leak. Time series observations are serially correlated: past values help predict future ones, so a random split puts the close neighbours of each test point into the training set and makes the test look easier than real forecasting.

How to think about it

Keep the answer tight: shuffling breaks causality and leaks the future. Interviewers want to hear you name the exact mechanism, not just say “order matters.”

What goes wrong

A time series is a sequence in which the observation at time t depends on the observations at t-1, t-2, and so on; that dependence is the signal you are trying to learn. When you shuffle:

Training rows include timestamps that come after test rows. The model has effectively seen the future during training.
Rolling statistics, lags, and any feature derived from prior rows are computed on the shuffled order, producing meaningless or inflated values.
Evaluation on the shuffled test set looks great, but the score is inflated: the model underperforms in production, where time flows forward and only the past is available.
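The first failure mode can be checked directly. The following is a minimal sketch on a hypothetical toy series of 100 integer time steps (the data, the 80/20 ratio, the seed, and the helper name leaked_rows are all assumptions for illustration): it counts how many training rows come from after the earliest test row under each kind of split.

```python
import random

# Hypothetical toy series: 100 consecutive time steps, t = 0..99.
timestamps = list(range(100))

# Shuffled split: permute first, then cut 80/20.
rng = random.Random(0)  # fixed seed so the run is repeatable
shuffled = timestamps[:]
rng.shuffle(shuffled)
shuf_train, shuf_test = shuffled[:80], shuffled[80:]

# Chronological split: cut at a single point in time.
chron_train, chron_test = timestamps[:80], timestamps[80:]


def leaked_rows(train, test):
    """Count training timestamps that fall after the earliest test timestamp."""
    first_test = min(test)
    return sum(1 for t in train if t > first_test)


shuffled_leak = leaked_rows(shuf_train, shuf_test)
chron_leak = leaked_rows(chron_train, chron_test)
print("shuffled split, training rows from the test period:", shuffled_leak)
print("chronological split, training rows from the test period:", chron_leak)
```

The chronological split reports zero such rows by construction, while the shuffled split reports many, each one a row the model could not have seen at prediction time.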

The correct split

Place the split at a single cutoff point. Everything before it is train; everything after is test.

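import pandas as pd

# Load and sort chronologically so row order matches time order.
df = pd.read_csv("sales.csv", parse_dates=["date"], index_col="date").sort_index()

# .loc label slices include both endpoints, so df.loc[:cutoff] and df.loc[cutoff:]
# would both contain the cutoff date. Boolean masks keep the two sets disjoint.
cutoff = pd.Timestamp("2023-12-31")   # midnight at the start of this date
train = df[df.index <= cutoff]        # up to and including the cutoff
test  = df[df.index > cutoff]         # strictly after the cutoff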

For hyperparameter tuning, extend this to walk-forward (expanding-window) cross-validation so each fold’s validation set is always in the future relative to its training set.
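A rough sketch of expanding-window fold generation follows, assuming the rows are already sorted by time and addressed by integer position. The function name walk_forward_splits and its parameters are hypothetical, chosen for this illustration; scikit-learn's TimeSeriesSplit is built on the same expanding-window idea and is usually the practical choice.

```python
def walk_forward_splits(n_samples, n_folds, val_size):
    """Yield (train_idx, val_idx) pairs for expanding-window validation.

    Rows are assumed to be sorted by time already. Each fold trains on
    every row before its validation block, and the blocks move forward
    in time from one fold to the next.
    """
    first_val_start = n_samples - n_folds * val_size
    if first_val_start <= 0:
        raise ValueError("not enough rows for the requested folds")
    for k in range(n_folds):
        val_start = first_val_start + k * val_size
        train_idx = list(range(0, val_start))
        val_idx = list(range(val_start, val_start + val_size))
        yield train_idx, val_idx


folds = list(walk_forward_splits(n_samples=12, n_folds=3, val_size=2))
for train_idx, val_idx in folds:
    print("train", train_idx[0], "to", train_idx[-1], "| validate", val_idx)
```

Each fold's training window grows while its validation block always sits immediately after it, so no fold is scored on data older than what it trained on.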

Why random k-fold is doubly wrong

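Random k-fold is wrong in two separate ways. First, it shuffles the rows, so the immediate neighbours of each validation point land in the training split. Second, the k-fold scheme itself trains on the future: even with shuffling turned off, every fold except the last one trains on observations that come after its validation block. Both effects inflate apparent performance.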
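Shuffling aside, the fold layout alone is enough to leak. This small sketch builds a plain, unshuffled k-fold by hand on 10 time-ordered rows (the helper kfold_blocks is hypothetical and assumes n_samples divides evenly by n_folds) and counts the folds whose training split contains rows later than the validation block.

```python
def kfold_blocks(n_samples, n_folds):
    """Contiguous, unshuffled k-fold: each block serves as validation once.

    Assumes n_samples is divisible by n_folds, to keep the sketch short.
    """
    size = n_samples // n_folds
    for k in range(n_folds):
        val_idx = list(range(k * size, (k + 1) * size))
        held_out = set(val_idx)
        train_idx = [i for i in range(n_samples) if i not in held_out]
        yield train_idx, val_idx


folds_with_future = 0
for train_idx, val_idx in kfold_blocks(n_samples=10, n_folds=5):
    # A fold leaks if any training row comes after the validation block.
    trains_on_future = max(train_idx) > max(val_idx)
    folds_with_future += trains_on_future
    print("validate", val_idx, "| trains on later rows:", trains_on_future)

print("folds that train on the future:", folds_with_future, "of 5")
```

Only the fold that validates on the final block is free of future rows, which is why walk-forward validation is used instead of k-fold on ordered data.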
