When would you use MAPE versus MASE to evaluate a forecast, and what are the failure modes of each?

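MAPE (Mean Absolute Percentage Error) is intuitive and scale-free, but it is undefined when an actual is zero, explodes when actuals are near zero, and penalises over-forecasts more than under-forecasts: with non-negative forecasts an under-forecast can never cost more than 100%, while an over-forecast has no upper bound. MASE (Mean Absolute Scaled Error) avoids both issues by scaling errors by the in-sample mean absolute error of a naive (or seasonal naive) forecast, so it stays valid when some actuals are zero and is comparable across series with different scales. Its own failure mode is a degenerate scale: if the history never changes at the chosen lag, the naive benchmark's error is zero and MASE is undefined. Use MAPE to communicate with stakeholders on series that stay well above zero; use MASE for intermittent or low-volume demand and for comparisons across series.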
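To make the zero-actual problem concrete, here is a minimal sketch in plain Python. The function names `mape` and `mase`, the `season` parameter, and the demand numbers are illustrative assumptions, not part of the original answer.

```python
def mape(actuals, forecasts):
    """Mean absolute percentage error in percent; fails if any actual is 0."""
    total = sum(abs(a - f) / abs(a) for a, f in zip(actuals, forecasts))
    return 100.0 * total / len(actuals)


def mase(actuals, forecasts, history, season=1):
    """Forecast MAE divided by the in-sample MAE of a (seasonal) naive forecast.

    season=1 is the plain "repeat the last value" benchmark.
    """
    naive_errors = [abs(history[t] - history[t - season])
                    for t in range(season, len(history))]
    scale = sum(naive_errors) / len(naive_errors)
    if scale == 0:
        raise ValueError("history is constant at this lag; MASE is undefined")
    mae = sum(abs(a - f) for a, f in zip(actuals, forecasts)) / len(actuals)
    return mae / scale


# Made-up intermittent demand, oldest first (illustrative numbers only).
history = [0, 3, 0, 5, 0, 4]
actuals = [0, 4]
forecasts = [2, 2]

score = mase(actuals, forecasts, history)  # forecast MAE 2 / naive MAE 4

try:
    mape(actuals, forecasts)
    mape_failed = False
except ZeroDivisionError:
    mape_failed = True  # the zero actual breaks MAPE
```

A MASE below 1 means the forecast's average miss is smaller than the naive benchmark's average in-sample miss.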

When would you choose Prophet over ARIMA for a forecasting problem?

Prophet is a curve-fitting model that decomposes the series into trend, seasonality, and holidays; it handles missing data, multiple seasonalities, and non-uniform time grids with minimal tuning and is accessible to non-statisticians. ARIMA is a statistical model based on autocorrelation structure; it is more appropriate when the series is short, noise is small, and you need principled uncertainty intervals from an explicit stochastic process.

What is exponential smoothing, and how does Holt-Winters extend it to handle trend and seasonality?

Simple exponential smoothing computes a weighted average of all past observations in which the weights decay geometrically with age, controlled by a single smoothing parameter alpha. Holt's method adds a trend component with a second parameter beta; Holt-Winters (triple exponential smoothing, part of the broader ETS family) adds a seasonal component with a third parameter gamma, making it a strong baseline for series with both trend and seasonality.
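A short sketch of the first two steps may help: simple exponential smoothing (level only) next to Holt's method (level plus trend). The function names, the seeding choices, and the toy series are assumptions made for illustration, and the seasonal component of Holt-Winters is left out to keep the example small.

```python
def ses_forecast(series, alpha):
    """Simple exponential smoothing: track a level only."""
    level = series[0]  # assumption: seed the level with the first value
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level


def holt_forecast(series, alpha, beta):
    """Holt's linear method: level plus trend, forecast one step ahead."""
    level = series[0]
    trend = series[1] - series[0]  # assumption: seed trend with the first difference
    for y in series[1:]:
        previous_level = level
        level = alpha * y + (1 - alpha) * (previous_level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
    return level + trend


# A toy series that rises by 10 every period (illustrative numbers only).
trending = [10, 20, 30, 40]

ses_next = ses_forecast(trending, alpha=0.5)          # lags behind the trend
holt_next = holt_forecast(trending, alpha=0.5, beta=0.5)  # follows the trend
```

On this perfectly linear series the level-only forecast stays below the last observed value, while the trend term lets Holt's method continue the line.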

How do you choose between batch and real-time inference for a model?

Decide based on how fresh the prediction must be versus the cost and complexity of serving live. Use batch when results are needed every few hours or days, like daily churn lists, because it is cheap, simple, and can use spot or scheduled compute. Use real-time when a late or stale decision causes immediate loss, like fraud or ad auctions needing sub-100ms responses, accepting higher cost and complexity. Most production systems are hybrid: precompute heavy signals offline and do lightweight re-ranking online.

Smoothing and Forecast Accuracy — Business Analytics

The last lesson named the enemy — noise, the random wiggle no forecast can predict — but naming it doesn’t write a number on the purchase order. Those four bouncing values, 980, 1040, 995, 1100, are exactly the jagged line we were left staring at. This lesson is how you average that noise out into one defensible level, and how you measure whether you got it right.

Every inventory planner faces the same problem. Last month you sold 980 units. The month before, 1040. Before that, 995, then 1100. None of those numbers is obviously “the right level.” Order too little and you stock out; order too much and you tie up cash in a warehouse. You need a single, reasonable estimate for next month — and you need to know how reliable that estimate is.

The answer is smoothing (averaging out the random month-to-month noise to reveal the underlying demand level, so you can forecast it) paired with a forecast error measure that keeps you honest about how wrong you tend to be.

Why raw demand is hard to plan with

Every demand observation is:

Observed demand = True underlying level + Random noise + (maybe) Trend + Seasonality

The random noise is real — it comes from weather, promotions running a day late, a competitor’s stock-out, a quirk in how orders were batched. If you plan directly off last month’s number, you are planning off the noise too. Smoothing strips most of the noise away before you forecast.

Two everyday smoothing methods

Moving average

A moving average (abbreviated MA) sets the forecast for next month equal to the average of the last N months — where N is the window size you choose.

Forecast (next month) = (Month-1 + Month-2 + ... + Month-N) / N

Simple to explain to any stakeholder. But it has two weaknesses:

It lags: the forecast starts moving as soon as demand shifts, but it only fully reflects the new level after N months of new data have accumulated.
It weights all N months equally, even the ones that are six or twelve months old — which may no longer reflect current conditions.

A larger window smooths more (less noise) but also lags more. A window of 1 is no smoothing at all — the forecast is just last month’s actual, perfectly copying every random bump.
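As a rough sketch of the formula above, the snippet below applies a moving average to the four months from the running example (980, 1040, 995, 1100). The function name `moving_average_forecast` is an illustrative assumption.

```python
def moving_average_forecast(history, window):
    """Forecast next period as the mean of the last `window` observations."""
    if not 1 <= window <= len(history):
        raise ValueError("window must be between 1 and len(history)")
    recent = history[-window:]
    return sum(recent) / window


# Oldest to newest: last month was 980, the month before 1040, then 995, then 1100.
demand = [1100, 995, 1040, 980]

forecast_4 = moving_average_forecast(demand, 4)  # averages all four months
forecast_1 = moving_average_forecast(demand, 1)  # just copies last month
```

With a window of 4 the forecast sits between the highs and lows of the four months; with a window of 1 it is simply last month's 980.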

Exponential smoothing

Exponential smoothing (ES) fixes the equal-weights problem by giving the most recent observation the most weight, and fading older observations geometrically — so two months ago counts less than last month, three months ago counts even less, and so on.

The single knob is alpha (a number between 0 and 1, called the smoothing factor, that controls how fast older observations fade):

Forecast (next month) = alpha × (Last month actual) + (1 - alpha) × (Last month forecast)

Alpha near 1: almost all the weight lands on last month’s actual — the forecast reacts fast, but it also chases every random bump.
Alpha near 0: almost all the weight stays on the previous forecast — the series is very smooth, but it lags real movements.
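The update rule can be sketched in a few lines. The function name and the choice to seed the first forecast with the first actual are assumptions; the lesson does not say how the very first forecast is set.

```python
def exp_smoothing_forecast(history, alpha):
    """Apply forecast = alpha * actual + (1 - alpha) * previous forecast."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    forecast = history[0]  # assumption: seed with the first actual
    for actual in history[1:]:
        forecast = alpha * actual + (1 - alpha) * forecast
    return forecast


# Oldest to newest, using the four months from the running example.
demand = [1100, 995, 1040, 980]

smooth = exp_smoothing_forecast(demand, alpha=0.3)    # leans on the older level
reactive = exp_smoothing_forecast(demand, alpha=1.0)  # copies last month's actual
```

With alpha = 1.0 the previous forecast gets zero weight, so the result is exactly last month's actual; lower alphas pull the forecast toward the older months.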

Measuring how good a forecast is

A forecast is only useful if you know how wrong it tends to be. Two standard measures:

MAE — Mean Absolute Error: the average size of the miss, in the original units (units of product, dollars, etc.).

MAE = average of |Actual - Forecast| across all months

Easy to interpret: “on average, we miss by about 60 units.” But hard to compare across products with very different scales (missing by 60 on a product that sells 100 is terrible; missing by 60 on one that sells 10,000 is fine).

MAPE — Mean Absolute Percentage Error: the average miss expressed as a percentage of the actual, so scale disappears.

MAPE = average of (|Actual - Forecast| / Actual) × 100%

A MAPE of 8% means you are typically off by about 8% of whatever demand turns out to be. The interactive widget below reports MAPE so you can compare methods on equal footing.
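The two formulas can be compared on the lesson's own example of a 60-unit miss. This is a minimal sketch; the function names are assumptions, and the miss is assumed to be an over-forecast in both cases.

```python
def mae(actuals, forecasts):
    """Average absolute miss, in the original units."""
    return sum(abs(a - f) for a, f in zip(actuals, forecasts)) / len(actuals)


def mape(actuals, forecasts):
    """Average absolute miss as a percentage of the actual (actuals must be non-zero)."""
    return 100.0 * sum(abs(a - f) / a for a, f in zip(actuals, forecasts)) / len(actuals)


# The same 60-unit miss on a small product and on a large one.
small_mae = mae([100], [160])
large_mae = mae([10_000], [10_060])

small_mape = mape([100], [160])        # 60 units out of 100
large_mape = mape([10_000], [10_060])  # 60 units out of 10,000
```

MAE reports the same 60 units for both products, while MAPE separates a 60% miss from a 0.6% miss.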

The core trade-off

Here is the central insight of this lesson, and it shows up in almost every modelling context you will ever encounter:

Too little smoothing (window = 1, or alpha near 1) copies last month and chases every random bump — high error on noisy data.
Too much smoothing (huge window, or alpha near 0) is so slow to update that it misses real shifts in demand — also high error.
The best parameter sits in between, at a sweet spot that minimises the forecast error on your data.
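One way to look for that sweet spot is to score one-step-ahead forecasts for a grid of alphas and keep the one with the lowest MAPE. The sketch below does this under stated assumptions: the helper names, the alpha grid, and the demand series are all made up for illustration.

```python
def one_step_forecasts(series, alpha):
    """Forecasts for series[1:], each made using only earlier observations."""
    forecast = series[0]  # assumption: seed with the first actual
    forecasts = []
    for actual in series[1:]:
        forecasts.append(forecast)  # recorded before `actual` is seen
        forecast = alpha * actual + (1 - alpha) * forecast
    return forecasts


def mape(actuals, forecasts):
    """Average absolute miss as a percentage of the actual."""
    return 100.0 * sum(abs(a - f) / a for a, f in zip(actuals, forecasts)) / len(actuals)


def sweep_alpha(series, alphas):
    """Map each alpha to the MAPE of its one-step-ahead forecasts."""
    return {a: mape(series[1:], one_step_forecasts(series, a)) for a in alphas}


# Made-up monthly demand, oldest first (illustrative numbers only).
demand = [1000, 1060, 980, 1045, 1010, 1090, 1150, 1230, 1190, 1260, 1215, 1280]
alphas = [0.05, 0.30, 0.50, 0.80, 0.95]

errors = sweep_alpha(demand, alphas)
best_alpha = min(errors, key=errors.get)

for alpha in alphas:
    print(f"alpha={alpha:.2f}  MAPE={errors[alpha]:.1f}%")
print("lowest MAPE at alpha =", best_alpha)
```

This tunes and scores alpha on the same history, which is the simplest version; a more careful setup would pick alpha on earlier months and report the error on later months that were held back.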

Try it yourself

Use the widget below. It shows a realistic monthly demand series with trend, seasonality, and noise layered in.

Exponential smoothing mode: drag alpha slowly from 0.05 up to 0.95 and watch the MAPE curve. Notice the sweet spot somewhere in the middle — not too reactive, not too sluggish.

Moving average mode: try window = 1 (instant reaction, terrible MAPE), then 12 (very smooth, but also a poor MAPE), then find the window that minimises MAPE. Compare that best MAPE to exponential smoothing’s best.

Smooth too little and you chase noise; too much and you lag

[Interactive widget: forecast smoothing. The faint line is the raw demand; the bold line is the one-step-ahead forecast. Find the parameter with the lowest error. Example reading: α (weight on the newest point) = 0.30, forecast error (MAPE) = 9.1%.]

Reading a MAPE in plain language

A few benchmarks to build intuition:

MAPE	Plain interpretation
Under 5%	Excellent — rare outside very stable commodities
5 – 15%	Good for most business forecasting
15 – 30%	Acceptable for volatile consumer goods or promotions
Above 30%	The forecast is struggling — check for trend, seasonality, or data quality issues

When you present a forecast to a manager, lead with the MAPE: “Our model is typically off by about 9%. If we plan inventory at the forecast, we should hold a safety buffer sized to that uncertainty.”

Putting it together: choosing a method

Moving average is easier to explain and audit. Use it when you want full transparency or when your audience distrusts black boxes.

Exponential smoothing adapts faster and usually achieves a lower MAPE on typical business demand data. Use it when accuracy is the priority and you are comfortable tuning alpha on historical data.

Neither method handles a strong, sustained trend well on its own — both will lag. There are extensions (Holt’s method adds a trend term; Holt-Winters adds seasonality) that you will meet in a forecasting course. The instinct you are building here — quantify the error, find the parameter that minimises it, and be honest about lag — carries into every one of those methods.

In one breath

Smoothing averages out the random noise in a demand series so you can plan off the underlying level instead of the latest bump. A moving average averages the last N months (simple, but lags and weights stale months equally); exponential smoothing weights recent months more via a single knob alpha (near 1 = reactive but jumpy, near 0 = smooth but sluggish). Judge any forecast by its error: MAE (average miss in real units, easy to read but scale-bound) or MAPE (average miss as a percent, comparable across products: “typically off by 8%”). The central trade-off recurs everywhere in modelling: too little smoothing chases noise, too much lags real shifts in demand, and the best parameter sits in between. Find it by minimising one-step-ahead error, ideally on months the model was not tuned on, because a parameter tuned and scored on the same history can look better than it will perform next month.

Practice

Quick check

Q1. A product has monthly demand that is noisy but essentially flat, with no trend and no seasonality. You test exponential smoothing with alpha = 0.05, 0.30, and 0.80. Which alpha is most likely to give the lowest MAPE?

Q2. Your moving-average model has a MAPE of 4% on the last 12 months of historical data. Your manager says 'Great, we are essentially perfect.' What is the most important caveat?

Q3. TRANSFER: A fashion retailer sells a trendy item whose demand has risen 20% every month for the past four months. An operations analyst proposes using a 6-month moving average to forecast next month. What is the most likely problem?

A question to carry forward

Step back and notice what forecasting actually gave you: a number. “Next month’s demand will be about 1,020 units, give or take 8%.” That’s genuinely useful — but it’s still just a prediction, not a decision. Knowing demand will be ~1,020 doesn’t tell you how many to produce when your factory caps at 900, your budget covers materials for only 850, and unsold units spoil. The forecast is an input; the action is still unchosen.

So the question to carry forward is: once you can predict what’s coming, how do you choose the best action given real limits — capacity, budget, time? That closes the forecasting chapter and opens the last analytical one, optimization: turning goals and constraints into a precise “produce exactly this much” answer, instead of a guess.
