What does NTILE do, and how would you use it to label customers into quartiles by spend?
NTILE(n) divides rows within a partition into n roughly equal buckets and assigns each row a bucket number from 1 to n. Rows are distributed as evenly as possible; when the row count is not evenly divisible, earlier buckets receive one extra row.
How to think about it
What the question is really testing
NTILE is the SQL equivalent of pd.qcut with equal-count bins. The question tests whether you know the difference between equal-count bucketing (NTILE) and equal-width bucketing (CASE with fixed thresholds) — and when each is appropriate.
How NTILE works
NTILE(n) orders the rows (within a partition if one is defined) and assigns each row a bucket number from 1 to n. Buckets are filled as evenly as possible — if the row count does not divide evenly, the first buckets get one extra row.
For example, 10 rows into NTILE(3) gives buckets of 4, 3, 3 — not 3, 3, 4.
Quartile labeling by customer spend
SELECT
customer_id,
total_spend,
NTILE(4) OVER (ORDER BY total_spend ASC) AS spend_quartile
FROM (
SELECT customer_id, SUM(order_total) AS total_spend
FROM orders
GROUP BY customer_id
) cust_totals;
Quartile 1 = lowest spenders, quartile 4 = highest spenders. Most analysts then add a CASE to attach tier labels.
Try it live — 9 customers into quartiles
9 rows cannot divide evenly into 4 buckets, so the first bucket gets 3 rows and the rest get 2. Change NTILE(4) to NTILE(3) or NTILE(10) to see how the distribution shifts.
Other common uses
Churn decile scoring:
SELECT
customer_id,
churn_score,
NTILE(10) OVER (ORDER BY churn_score DESC) AS churn_decile
FROM churn_model_output;
-- Decile 1 = top 10% most likely to churn
Per-segment quartiles with PARTITION BY:
NTILE(4) OVER (PARTITION BY region ORDER BY total_spend ASC) AS regional_quartile
This ranks customers within their region — so a mid-spend customer in a low-spend region might be Platinum while the same spend in a high-spend region is Bronze.