Model Interpretability — SHAP vs LIME
Your model said no. Now explain why. The map of post-hoc interpretability — global vs local, LIME's local surrogate vs SHAP's additive Shapley values, plus permutation importance and PDP/ICE — and exactly when to reach for each.
What you'll learn
- The GLOBAL (whole-model) vs LOCAL (one-prediction) split that organizes every method
- LIME — perturb, weight by proximity, fit a sparse linear surrogate that is only locally faithful
- SHAP — Shapley attributions with the additivity guarantee; TreeSHAP fast, KernelSHAP model-agnostic
- Permutation importance, PDP, and ICE — and how to actually READ each output
- When to reach for which, and the gotchas (correlation isn't causation, split credit, LIME instability)
Before you start
The SHAP lesson covered the mechanics of one method in depth. This lesson zooms out: it places SHAP next to LIME — its most common rival for single-prediction explanations — and next to the rest of the toolkit you’ll actually use. The goal isn’t to crown a winner. It’s to know which question each method answers, and to stop misreading the output.
The one distinction that organizes everything: global vs local
Every post-hoc method answers one of two questions:
- Global — How does the model behave on average, across the whole dataset? Which features does it rely on most? What’s the average shape of a feature’s effect? Examples: permutation importance, partial dependence (PDP).
- Local — Why did the model produce this prediction for this one instance? Which features pushed this specific output up or down? Examples: LIME, a single SHAP explanation, ICE curves.
The neat part: SHAP lives in both columns. It is fundamentally a local method — one Shapley attribution per feature, per prediction — but those per-instance values stack into global views (mean absolute SHAP for importance, dependence/beeswarm plots for effect shape). That dual nature is a big part of why it’s so widely used.
LIME — fit a simple model that’s right near here
LIME (Local Interpretable Model-agnostic Explanations; Ribeiro, Singh, and Guestrin, “Why Should I Trust You?”, KDD 2016) has a gloriously simple idea. You can’t make a complex model simple everywhere — but you can approximate it with a straight line in the tiny neighborhood around one prediction. The recipe is four steps:
- Perturb — sample synthetic points around the instance you care about.
- Predict — ask the black-box model for its output on each perturbed point.
- Weight — weight those samples by proximity to the original instance (a kernel πₓ; closer points matter more).
- Fit — train a simple, sparse linear surrogate on this weighted local dataset. Its signed coefficients are the explanation.
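The four steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the reference `lime` package: `black_box`, `lime_sketch`, and every parameter (`scale`, `kernel_width`, `ridge`) are hypothetical names chosen for this example, the perturbation is plain Gaussian noise, and a small ridge penalty stands in for the sparse feature selection used in the paper.

```python
import numpy as np

def black_box(X):
    """Hypothetical nonlinear model standing in for the black box f."""
    logit = 2.0 * X[:, 0] - X[:, 1] ** 2 + 0.5 * X[:, 0] * X[:, 2]
    return 1.0 / (1.0 + np.exp(-logit))

def lime_sketch(f, x, n_samples=2000, scale=0.5, kernel_width=0.75,
                ridge=1e-3, seed=0):
    """Return (intercept, weights) of a weighted linear surrogate around x."""
    rng = np.random.default_rng(seed)
    # 1. Perturb: Gaussian samples around the instance (an assumption).
    Z = x + rng.normal(0.0, scale, size=(n_samples, x.size))
    # 2. Predict: query the black box; its internals are never inspected.
    y = f(Z)
    # 3. Weight: exponential kernel on Euclidean distance to x.
    dist = np.linalg.norm(Z - x, axis=1)
    w = np.exp(-(dist ** 2) / kernel_width ** 2)
    # 4. Fit: weighted ridge regression on offsets from x.
    A = np.hstack([np.ones((n_samples, 1)), Z - x])
    penalty = ridge * np.eye(A.shape[1])
    penalty[0, 0] = 0.0  # leave the intercept unpenalized
    coef = np.linalg.solve(A.T @ (w[:, None] * A) + penalty, A.T @ (w * y))
    return coef[0], coef[1:]

x0 = np.array([0.2, -0.4, 1.0])
intercept, weights = lime_sketch(black_box, x0)
print("surrogate intercept:", round(float(intercept), 3))
print("signed local weights:", np.round(weights, 3))
```

The signed entries of `weights` play the role of the explanation. Because the surrogate is fit only on the weighted neighborhood, its intercept approximates the model near `x0` rather than reproducing `f(x0)` exactly.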
Formally, LIME solves an objective with two pieces — a proximity-weighted fidelity loss and a complexity penalty:
ξ(x) = argmin_{g ∈ G} L(f, g, πₓ) + Ω(g)
f is the black box, g the sparse linear surrogate (drawn from a class G of interpretable models), πₓ the proximity kernel, L the proximity-weighted fidelity loss, and Ω the complexity penalty. Minimize local error while keeping the explanation simple.
LIME’s strengths are real: it is model-agnostic (it only queries predictions — it never looks inside f) and fast. Its weaknesses are equally real:
- It aims only for local fidelity. The surrogate is a decent approximation right next to your instance and says nothing about the model elsewhere — the paper explicitly trades global faithfulness for local faithfulness.
- It is unstable. Because the perturbation step draws random samples, re-running with a different seed can produce a different explanation for the same prediction, and two nearby points can get very different explanations. It’s also sensitive to the neighborhood kernel width, for which there’s no principled default. (This instability has spawned a stream of stabilization variants: S-LIME, DLIME, SLICE.)
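One way to check the instability point is to refit the local surrogate under several seeds and look at how much the coefficients move. The sketch below is a hand-rolled stand-in for LIME, not the `lime` package; `black_box`, `local_weights`, `spread`, and the sample sizes are all hypothetical choices for illustration.

```python
import numpy as np

def black_box(X):
    """Hypothetical nonlinear model."""
    return np.sin(3.0 * X[:, 0]) + X[:, 1] ** 2

def local_weights(f, x, n_samples, seed, scale=0.5, kernel_width=0.75):
    """Proximity-weighted least-squares surrogate; returns the slopes only."""
    rng = np.random.default_rng(seed)
    Z = x + rng.normal(0.0, scale, size=(n_samples, x.size))
    w = np.exp(-np.sum((Z - x) ** 2, axis=1) / kernel_width ** 2)
    A = np.hstack([np.ones((n_samples, 1)), Z - x])
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], f(Z) * sw, rcond=None)
    return coef[1:]

x0 = np.array([0.5, 0.1])

def spread(n_samples, seeds=range(20)):
    """Standard deviation of each surrogate weight across seeds."""
    runs = np.array([local_weights(black_box, x0, n_samples, s) for s in seeds])
    return runs.std(axis=0)

spread_small = spread(50)    # few perturbed samples
spread_large = spread(5000)  # many perturbed samples
print("weight std across seeds, 50 samples:  ", np.round(spread_small, 3))
print("weight std across seeds, 5000 samples:", np.round(spread_large, 3))
```

If the spread is comparable in size to the weights themselves, the explanation is mostly sampling noise. Raising the sample count reduces seed-to-seed variation, but it does not remove the sensitivity to `kernel_width`.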
SHAP — distribute the exact prediction, fairly
SHAP (SHapley Additive exPlanations; Lundberg and Lee, “A Unified Approach to Interpreting Model Predictions”, NIPS 2017) starts from a different place: cooperative game theory. Treat each feature as a player in a game whose payout is the prediction, and ask for each feature’s Shapley value — its fair average marginal contribution across all coalitions of features. The defining property is additivity (also called local accuracy):
f(x) = base_value + Σ_j shap_value_j
where base_value is E[f(x)], the model’s average output. The
contributions sum exactly to (prediction − base value). That
sum-to-the-prediction guarantee is the thing LIME’s surrogate does not
give you. SHAP is pinned down by three axioms — local accuracy
(sums to the output), missingness (an absent feature gets zero), and
consistency (if a feature’s marginal contribution never decreases,
its SHAP value can’t decrease) — and consistency is exactly what LIME
and naive gain-based tree importance lack.
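The additivity guarantee can be checked by brute force on a tiny model. The sketch below enumerates every coalition and applies the classic Shapley weighting; it is only feasible for a handful of features. `model`, `coalition_value`, and `exact_shapley` are hypothetical names, and the value of a coalition is defined here by fixing its features to the instance and averaging the model over background rows for the rest, which assumes features can be varied independently.

```python
from itertools import combinations
from math import factorial

import numpy as np

def model(X):
    """Hypothetical black box with an interaction between features 0 and 2."""
    return 0.3 + 0.2 * X[:, 0] - 0.1 * X[:, 1] + 0.05 * X[:, 0] * X[:, 2]

def coalition_value(f, x, background, subset):
    """Mean output when features in `subset` are fixed to x (assumption:
    the remaining features are drawn independently from the background)."""
    idx = np.array(subset, dtype=int)
    Z = background.copy()
    Z[:, idx] = x[idx]
    return f(Z).mean()

def exact_shapley(f, x, background):
    m = x.size
    phi = np.zeros(m)
    for j in range(m):
        others = [k for k in range(m) if k != j]
        for size in range(m):
            # Shapley weight for coalitions of this size that exclude j.
            weight = factorial(size) * factorial(m - size - 1) / factorial(m)
            for S in combinations(others, size):
                gain = (coalition_value(f, x, background, S + (j,))
                        - coalition_value(f, x, background, S))
                phi[j] += weight * gain
    return phi

rng = np.random.default_rng(0)
background = rng.normal(size=(200, 3))
x = np.array([1.5, -0.5, 2.0])

base_value = model(background).mean()
shap_values = exact_shapley(model, x, background)
prediction = model(x[None, :])[0]

print("base value + sum of attributions:", base_value + shap_values.sum())
print("model prediction:               ", prediction)
```

The two printed numbers agree up to floating-point error, which is the local accuracy property in action. The cost grows exponentially with the number of features, which is why practical estimators exist.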
There are two SHAP estimators you’ll meet:
| Estimator | Scope | Speed | When |
|---|---|---|---|
| TreeSHAP (TreeExplainer) | tree ensembles only | Fast, exact — polynomial time | XGBoost, LightGBM, CatBoost, random forests, other scikit-learn tree models |
| KernelSHAP (KernelExplainer) | any model (agnostic) | Slow — samples many coalitions | Neural nets, SVMs, anything non-tree |
KernelSHAP is the bridge between the two worlds: it estimates Shapley values by solving a specially-weighted linear regression over sampled feature coalitions — essentially LIME’s local-surrogate framing, but with the SHAP kernel that makes the result Shapley values. TreeSHAP is a different, tree-specific algorithm that is exact, not a sampler.
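The "weighted linear regression" idea behind KernelSHAP can be illustrated on a toy problem small enough to enumerate every coalition. This is a sketch under simplifying assumptions, not the `shap` library's implementation: absent features are replaced by a single `baseline` row instead of a background sample, all coalitions are used rather than a random subset, and a large finite weight (`big_weight`) approximates the infinite weight on the empty and full coalitions. `model` and `kernel_shap_all_coalitions` are hypothetical names.

```python
from itertools import product
from math import comb

import numpy as np

def model(X):
    """Hypothetical black box with an interaction term."""
    return X[:, 0] * X[:, 1] + 2.0 * X[:, 2]

def kernel_shap_all_coalitions(f, x, baseline, big_weight=1e6):
    m = x.size
    # Every on/off mask over the m features (1 = feature present).
    masks = np.array(list(product([0, 1], repeat=m)), dtype=float)
    # Present features take the instance's value, absent ones the baseline's.
    inputs = masks * x + (1.0 - masks) * baseline
    values = f(inputs)
    # Shapley kernel weight per coalition size.
    weights = np.empty(len(masks))
    for i, s in enumerate(masks.sum(axis=1).astype(int)):
        s = int(s)
        if s == 0 or s == m:
            weights[i] = big_weight  # stands in for an infinite weight
        else:
            weights[i] = (m - 1) / (comb(m, s) * s * (m - s))
    # Weighted least squares: values ~ base_value + masks @ shap_values.
    design = np.hstack([np.ones((len(masks), 1)), masks])
    sw = np.sqrt(weights)
    coef, *_ = np.linalg.lstsq(design * sw[:, None], values * sw, rcond=None)
    return coef[0], coef[1:]

x = np.array([2.0, 3.0, 1.0])
baseline = np.zeros(3)
base_value, shap_values = kernel_shap_all_coalitions(model, x, baseline)
print("base value:", round(float(base_value), 4))
print("attributions:", np.round(shap_values, 4))
```

For this toy model the regression returns approximately 3, 3, and 2: the product term's contribution of 6 is split evenly between the two interacting features, and the additive term's 2 goes entirely to the third. The same surrogate-fitting machinery as LIME is at work; only the weighting kernel differs.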
See it: the same prediction, two explanations
Here is the whole comparison in one place. One loan applicant, one black-box output of 0.74 default risk. Toggle between SHAP (a waterfall that starts at the base value and sums exactly to the prediction) and LIME (signed surrogate weights that approximate the model near this point). Then flip the scope to global to swap the per-instance attribution for permutation-importance bars — a different question entirely.
The aha the widget is built around: LIME approximates the model locally with a surrogate; SHAP fairly distributes the exact prediction. Same question (“why this prediction?”), different guarantees. Permutation importance answers a third question — not “why this row?” but “how much does the model lean on each feature overall?” — which is why its ranking can legitimately differ from a single SHAP explanation.
The rest of the global toolkit
Three more staples round out the map: two global methods, plus ICE as the per-instance counterpart of the PDP. All three are available through scikit-learn’s inspection module.
- Permutation importance (Breiman, 2001). Measure a baseline score, randomly shuffle one feature’s column to break its link to the target, re-score, and take the drop in performance as that feature’s importance. Bigger drop = more reliance. Crucially it’s computed on held-out data and — unlike default impurity/gain importance — is not biased toward high-cardinality features.
- Partial Dependence Plot (PDP) (Friedman, 2001). Vary one feature across its range, average the prediction over all other features, and plot the result. It shows the average shape of that feature’s effect — is risk rising, falling, flat, or U-shaped as the feature grows?
- Individual Conditional Expectation (ICE) (Goldstein et al., 2015) disaggregates the PDP: one line per instance instead of the average. The PDP is literally the mean of all ICE lines. ICE exposes heterogeneity — if half your population rises and half falls, the PDP shows a flat line that’s true of nobody, and only ICE reveals it.
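The three tools above can be tried with `sklearn.inspection`. The sketch below uses synthetic data built so that one feature's effect flips sign depending on another, which is the heterogeneity case described for ICE. The dataset, the random forest, and all variable names are assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import partial_dependence, permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(600, 3))
# Feature 0's effect flips sign with feature 1; feature 2 is pure noise.
y = X[:, 0] * np.sign(X[:, 1]) + 0.05 * rng.normal(size=600)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Permutation importance: drop in held-out score after shuffling each column.
perm = permutation_importance(model, X_test, y_test,
                              n_repeats=10, random_state=0)
print("permutation importance:", np.round(perm.importances_mean, 3))

# PDP and ICE for feature 0 in one call (kind="both" returns both).
pd_result = partial_dependence(model, X_test, features=[0], kind="both",
                               method="brute", grid_resolution=20)
pdp_curve = pd_result["average"][0]      # shape: (grid points,)
ice_curves = pd_result["individual"][0]  # shape: (instances, grid points)

pdp_range = pdp_curve.max() - pdp_curve.min()
ice_range = (ice_curves.max(axis=1) - ice_curves.min(axis=1)).mean()
print("PDP range:", round(float(pdp_range), 3))
print("mean ICE range:", round(float(ice_range), 3))
```

On data like this the PDP stays close to flat while individual ICE lines swing widely, and averaging the ICE lines reproduces the PDP. Permutation importance still ranks feature 0 well above the noise feature, even though the PDP alone would suggest it does little.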
How to actually READ each output
This is where people go wrong. Each method’s output answers a specific question — read it as that, nothing more:
| Output | Reads as |
|---|---|
| LIME weights / one SHAP explanation | Signed per-feature push on this prediction — what moved it up vs down. (SHAP additionally sums exactly to prediction − base value.) |
| mean \|SHAP\| / permutation importance | A ranking of how much the model relies on each feature overall — magnitude only, no direction. |
| PDP | The average shape/direction of one feature’s effect across the dataset. |
| ICE | Per-instance effect shapes — exposes interactions and heterogeneity a PDP hides. |
The single most common misread: treating an importance number
(permutation or mean |SHAP|) as if it showed direction. It doesn’t.
Importance only ranks reliance. For direction and shape you need signed
SHAP values, a PDP, or ICE.
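A small numeric illustration of that misread, using a hypothetical linear model with coefficients +2 and -2. For a linear model with independent features, the SHAP value of feature j on a row is the coefficient times the feature's deviation from its mean, so no library is needed here; the data and names are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))
coef = np.array([2.0, -2.0])  # hypothetical model: f(x) = 2*x0 - 2*x1

# Closed-form SHAP values for a linear model with independent features.
shap_values = coef * (X - X.mean(axis=0))

# Global importance: magnitude only.
mean_abs = np.abs(shap_values).mean(axis=0)

# Direction: how each feature's value relates to its signed SHAP value.
direction = [np.corrcoef(X[:, j], shap_values[:, j])[0, 1] for j in range(2)]

print("mean |SHAP| per feature:", np.round(mean_abs, 3))
print("value-vs-SHAP correlation:", np.round(direction, 3))
```

Both features receive nearly the same mean |SHAP|, yet one pushes predictions up as it grows and the other pushes them down. Only the signed values reveal that.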
The gotchas — all from the methods’ own literature
The rest, briefly:
- Correlated features split credit. When two features are correlated, the attribution (and PDP, and permutation importance) gets divided between them — or evaluated at unrealistic feature combinations — so a single feature’s number can mislead. The SHAP-in-production field guide walks through a real case where credit went to the wrong correlated cousin.
- PDP and ICE assume independence. They vary one feature while holding others fixed; under strong correlation they evaluate points that never occur in reality.
- LIME is unstable and kernel-width-sensitive (above) — always check whether the explanation holds across seeds before you trust it.
- Default tree feature_importances_ (gain/impurity) is biased and inconsistent — it favors high-cardinality features and was a core motivation for both permutation importance and TreeSHAP.
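The "points that never occur in reality" problem is easy to see numerically. The sketch below builds two strongly correlated synthetic features, performs the substitution a PDP or ICE computation makes for one grid value, and counts how many of the resulting rows fall outside anything seen in the data. The data, the gap-based check, and all names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
x0 = rng.normal(size=n)
x1 = x0 + 0.1 * rng.normal(size=n)  # x1 tracks x0 closely
X = np.column_stack([x0, x1])

# Largest disagreement between the two features in the real data.
observed_gap = np.abs(X[:, 0] - X[:, 1]).max()

# What PDP/ICE does for one grid value: overwrite x0 for every row,
# leaving x1 untouched.
grid_value = np.quantile(x0, 0.95)
X_synth = X.copy()
X_synth[:, 0] = grid_value

# Share of synthetic rows whose feature gap exceeds anything observed.
off_manifold = np.mean(np.abs(X_synth[:, 0] - X_synth[:, 1]) > observed_gap)
print("largest observed gap:", round(float(observed_gap), 3))
print("share of unrealistic synthetic rows:", round(float(off_manifold), 3))
```

Most synthetic rows pair a high `x0` with an `x1` that never accompanies it in the data, so the model is being queried where it was never trained. The same concern applies to permutation importance, which shuffles a column independently of its correlated partner.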
When to reach for which
The practical default: tree ensemble? Use TreeSHAP — it’s fast, exact, consistent, and does local and global at once. Arbitrary black box and you just need a fast single-prediction sketch? LIME — but verify it across seeds. Want principled Shapley values regardless of model type and can eat the runtime? KernelSHAP. For a cheap global ranking, permutation importance; to see a feature’s effect shape, PDP, escalating to ICE when you suspect interactions. For the deep mechanics of computing and plotting SHAP values, return to the SHAP lesson; to judge whether an explanation is even trustworthy on your task, the metrics lesson is the companion.
Quick check
Practice this in an interview
SHAP assigns each feature a contribution value based on Shapley values from cooperative game theory, providing globally consistent and locally accurate explanations with a solid theoretical foundation. LIME approximates the model locally around a single prediction using a simpler interpretable model, which is fast but can produce inconsistent explanations across similar inputs.
A model card documents a model's intended use, training data, evaluation results broken down by relevant subgroups, known limitations, and ethical considerations, so stakeholders can judge whether and where it should be used. Explainability is provided through methods like SHAP or LIME for feature attributions, plus logging the inputs and reasons behind each decision so it can be audited or contested. Together they support transparency, oversight, and regulatory requirements for high-risk systems.
Impurity-based importance (mean decrease in impurity) is systematically biased toward high-cardinality and continuous features because they offer more candidate splits. Permutation importance and SHAP values are less biased alternatives that measure actual predictive contribution on held-out data.
LLMOps extends classical MLOps to handle foundation model scale, prompt-based configuration, non-deterministic outputs, and evaluation without a scalar ground truth. Key new concerns include prompt versioning, output quality evaluation via LLM judges or human review, hallucination monitoring, cost management, and RAG pipeline observability.