When would you use MAPE versus MASE to evaluate a forecast, and what are the failure modes of each?
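MAPE (Mean Absolute Percentage Error) is intuitive and scale-free, but it breaks when actuals are at or near zero and it penalises over-forecasts more heavily than under-forecasts, which rewards models that forecast low. MASE (Mean Absolute Scaled Error) avoids both issues by scaling errors by the in-sample MAE of a naive (or seasonal naive) forecast, so it stays defined when actuals contain zeros and is comparable across series with different scales. Its own failure mode is a training series on which the naive forecast is always exactly right (for example, a constant series), which makes the denominator zero.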
How to think about it
Cover the formulas, the asymmetry problem in MAPE, the zero-value problem, and when MASE is preferable. Interviewers at retail/supply-chain companies ask this constantly.
MAE and RMSE — the baseline pair
MAE = mean(|yi - yi_hat|): less sensitive to outliers than RMSE, in the same units as the series, but not comparable across series with different scales.
RMSE = sqrt(mean((yi - yi_hat)²)): penalises large errors more heavily; useful when big misses are disproportionately costly. Also in the same units as the series.
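To make the difference concrete, here is a minimal sketch of both metrics. The helper names `mae` and `rmse` and the toy numbers are illustrative assumptions, not part of any library. Both forecasts below have the same total absolute error, but one concentrates it in a single large miss.

```python
import numpy as np

def mae(actual, forecast):
    """Mean absolute error, in the units of the series."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return np.mean(np.abs(actual - forecast))

def rmse(actual, forecast):
    """Root mean squared error, in the units of the series."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return np.sqrt(np.mean((actual - forecast) ** 2))

# Hypothetical data: same total absolute error, distributed differently.
actual = np.array([100.0, 100.0, 100.0, 100.0])
even_miss = np.array([110.0, 90.0, 110.0, 90.0])       # four errors of 10
one_big_miss = np.array([100.0, 100.0, 100.0, 140.0])  # one error of 40

print(mae(actual, even_miss), rmse(actual, even_miss))        # 10.0 10.0
print(mae(actual, one_big_miss), rmse(actual, one_big_miss))  # 10.0 20.0
```

MAE cannot tell the two forecasts apart, while RMSE doubles for the one with the single large miss.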
MAPE — convenient but fragile
MAPE = mean(|yi - yi_hat| / |yi|) × 100
Problems:
- Division by zero when yi = 0 (common in intermittent demand, new product launches), and very large percentage errors when yi is merely close to zero.
- Asymmetry: for non-negative forecasts, an under-forecast can contribute at most 100 % (predict 0), while an over-forecast is unbounded. Predicting 200 when the actual is 100 contributes 100 %, but predicting 100 when the actual is 300, a larger absolute miss, contributes only 67 %. MAPE-optimised models are therefore systematically biased toward lower forecasts.
- Hard to interpret on series that can be negative or have no natural zero (e.g., net revenue, temperatures in °C).
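The problems listed above can be checked numerically. This is a minimal sketch: the `mape` helper and the toy values are assumptions for illustration, and raising on zero actuals is one possible design choice rather than a standard behaviour.

```python
import numpy as np

def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    if np.any(actual == 0):
        # Assumption: fail loudly rather than return inf or nan.
        raise ValueError("MAPE is undefined when any actual is zero")
    return np.mean(np.abs(actual - forecast) / np.abs(actual)) * 100

over = mape([100], [200])        # over-forecast by 100 units: 100 %
under = mape([300], [100])       # under-forecast by 200 units: about 66.7 %
worst_under = mape([300], [0])   # lowest non-negative forecast: capped at 100 %
big_over = mape([100], [500])    # over-forecasts have no cap: 400 %

# A single near-zero actual dominates the average.
near_zero = mape([0.1, 100.0], [1.0, 100.0])  # about 450 %

print(over, round(under, 1), worst_under, big_over, round(near_zero, 1))
```

The larger absolute miss (200 units under) scores lower than the smaller one (100 units over), which is the bias toward low forecasts described above.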
MASE — the better default for business forecasting
MASE = MAE_model / MAE_naive_seasonal
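where MAE_naive_seasonal is the MAE of the seasonal naive forecast yt_hat = y(t-s), computed on the training data: each value is predicted by the value one seasonal cycle (s periods) earlier. MASE < 1 means the model's error is smaller than the naive method's in-sample error; MASE > 1 means it is larger.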
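import numpy as np

def mase(actual, forecast, train, period=1):
    """
    actual  : test set actuals (array)
    forecast: model predictions on test set (array)
    train   : training actuals used to compute the naive baseline
    period  : seasonal period (1 = non-seasonal naive)
    """
    # Accept lists as well as arrays
    actual, forecast, train = (np.asarray(a, dtype=float) for a in (actual, forecast, train))
    # Out-of-sample MAE of the model
    mae_model = np.mean(np.abs(actual - forecast))
    # In-sample MAE of the seasonal naive forecast y[t] = y[t - period]
    naive_errors = np.abs(train[period:] - train[:-period])
    mae_naive = np.mean(naive_errors)
    if mae_naive == 0:
        raise ValueError("naive baseline MAE is zero; MASE is undefined")
    return mae_model / mae_naive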
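A short usage sketch shows the scale-free property. The function is restated compactly so the snippet runs by itself, and the quarterly series is hypothetical data chosen for easy arithmetic.

```python
import numpy as np

def mase(actual, forecast, train, period=1):
    # Compact restatement of the function above so this snippet is self-contained.
    actual, forecast, train = (np.asarray(a, dtype=float) for a in (actual, forecast, train))
    mae_model = np.mean(np.abs(actual - forecast))
    mae_naive = np.mean(np.abs(train[period:] - train[:-period]))
    return mae_model / mae_naive

# Hypothetical quarterly series (period = 4): two years of training data.
train = np.array([10.0, 20.0, 30.0, 40.0, 12.0, 22.0, 32.0, 42.0])
actual = np.array([14.0, 24.0, 34.0, 44.0])
forecast = np.array([13.0, 25.0, 33.0, 45.0])

small = mase(actual, forecast, train, period=4)
# Same series expressed in units 1000 times larger.
large = mase(actual * 1000, forecast * 1000, train * 1000, period=4)

print(small, large)  # 0.5 0.5
```

The seasonal naive forecast misses by 2 on every training point and the model misses by 1 on every test point, so MASE is 0.5 regardless of the units, whereas MAE would differ by a factor of 1000.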
Choosing which metric to report
| Situation | Metric |
|---|---|
| No zeros, single series, stakeholders want % | MAPE |
| Zeros or near-zeros in actuals | MASE or RMSSE |
| Comparing across series with different scales | MASE |
| Penalising large errors more | RMSE |
| All errors equally important | MAE |
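RMSSE appears in the table without a definition. A minimal sketch, assuming the usual definition (the squared-error analogue of MASE: model MSE divided by the in-sample MSE of the naive forecast, then square-rooted). The `period` argument and the intermittent-demand numbers are illustrative assumptions.

```python
import numpy as np

def rmsse(actual, forecast, train, period=1):
    """Root mean squared scaled error; stays defined when actuals contain zeros."""
    actual, forecast, train = (np.asarray(a, dtype=float) for a in (actual, forecast, train))
    # Out-of-sample mean squared error of the model
    mse_model = np.mean((actual - forecast) ** 2)
    # In-sample mean squared error of the naive forecast y[t] = y[t - period]
    mse_naive = np.mean((train[period:] - train[:-period]) ** 2)
    if mse_naive == 0:
        raise ValueError("naive baseline MSE is zero; RMSSE is undefined")
    return np.sqrt(mse_model / mse_naive)

# Hypothetical intermittent demand: many zeros, so MAPE cannot be computed.
train = np.array([0.0, 2.0, 0.0, 0.0, 3.0, 0.0, 1.0, 0.0])
actual = np.array([0.0, 2.0, 0.0, 1.0])
forecast = np.array([0.5, 0.5, 0.5, 0.5])

score = rmsse(actual, forecast, train)
print(round(score, 3))  # 0.433
```

Like RMSE, this variant weights large misses more heavily, so it suits cases where both zeros and costly big errors matter.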