datarekha
Machine Learning · Easy · Asked at Amazon, Google, Flipkart

When should you use one-hot encoding versus label encoding for categorical features?

The short answer

Label encoding assigns each category an integer, which implies an order and spacing that many algorithms treat as meaningful. One-hot encoding creates a binary column per category and is the right choice for nominal data fed to linear or distance-based models. Use label encoding only when the categories genuinely have an order, or when feeding tree-based models, which tolerate arbitrary integer codes.

How to think about it

The distinction matters because models read the encoded numbers literally: an integer label of 3 looks “larger” than 1 to any algorithm that computes distances or multiplies the feature by a learned weight.
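To make this concrete, here is a minimal sketch (NumPy only) that compares pairwise Euclidean distances under the two encodings. The cat/dog/fish categories and their integer codes are an arbitrary assignment chosen for illustration.

```python
import numpy as np

categories = ["cat", "dog", "fish"]

# Label encoding: one integer per category (the assignment is arbitrary).
label = {c: np.array([float(i)]) for i, c in enumerate(categories)}

# One-hot encoding: one indicator column per category.
one_hot = {c: row for c, row in zip(categories, np.eye(len(categories)))}

def distance(encoding, a, b):
    """Euclidean distance between two categories under the given encoding."""
    return float(np.linalg.norm(encoding[a] - encoding[b]))

# Label encoding makes "fish" look twice as far from "cat" as "dog" is.
label_cat_dog = distance(label, "cat", "dog")    # 1.0
label_cat_fish = distance(label, "cat", "fish")  # 2.0

# One-hot keeps every pair of distinct categories equally far apart.
onehot_cat_dog = distance(one_hot, "cat", "dog")    # sqrt(2)
onehot_cat_fish = distance(one_hot, "cat", "fish")  # sqrt(2)
```

A KNN model fed the label-encoded column would therefore treat "dog" as a closer neighbour of "cat" than "fish" is, purely because of how the integers were assigned.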

Label encoding

Assigns each unique category a consecutive integer: {'cat': 0, 'dog': 1, 'fish': 2}. This is correct for ordinal features (e.g., low → 0, medium → 1, high → 2) where the rank genuinely carries information. For nominal categories, the implicit ordering is spurious and can mislead linear models, KNN, and neural networks.
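A minimal sketch of encoding an ordinal feature with scikit-learn's OrdinalEncoder, assuming a hypothetical "priority" column. The category order is passed explicitly because the encoder otherwise sorts categories alphabetically ("high", "low", "medium"), which would scramble the rank. Note that scikit-learn's LabelEncoder is intended for target labels rather than input features.

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical ordinal feature; the column name is illustrative.
df = pd.DataFrame({"priority": ["low", "high", "medium", "low"]})

# Spell out the order so that low < medium < high maps to 0 < 1 < 2.
enc = OrdinalEncoder(categories=[["low", "medium", "high"]])
codes = enc.fit_transform(df[["priority"]])

print(codes.ravel().tolist())  # [0.0, 2.0, 1.0, 0.0]
```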

Tree-based models (decision trees, Random Forest, XGBoost) can still work with label-encoded nominals because repeated threshold splits can separate any group of categories. The cost is depth: isolating one category from the middle of the integer range takes two splits, where a one-hot column would need only one.
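The cost of label encoding for trees shows up in a toy case, sketched below under the assumption that only the middle category (coded 1) is positive. The data is synthetic and chosen purely for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Synthetic data: three categories coded 0, 1, 2; only category 1 is positive.
codes = np.array([0, 1, 2] * 10)
y = (codes == 1).astype(int)

X_label = codes.reshape(-1, 1)   # single integer column
X_onehot = np.eye(3)[codes]      # three indicator columns

tree_label = DecisionTreeClassifier(random_state=0).fit(X_label, y)
tree_onehot = DecisionTreeClassifier(random_state=0).fit(X_onehot, y)

# Label-encoded: two threshold splits are needed to fence off the middle code.
# One-hot: a single split on the indicator for category 1 is enough.
depth_label = tree_label.get_depth()
depth_onehot = tree_onehot.get_depth()
```

Both trees fit this data perfectly; the difference is only in how many splits it takes to get there.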

One-hot encoding

Creates one binary column per category, so no ordering is implied. It is the right default for nominal data in linear and logistic regression, SVMs, and neural networks.

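import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Toy training frame so the snippet runs; the values are illustrative.
X_train = pd.DataFrame({
    "color": ["red", "green", "blue"],
    "city": ["Pune", "Delhi", "Pune"],
    "category": ["a", "b", "a"],
})

# sparse_output=False returns a dense array (scikit-learn 1.2 or later);
# handle_unknown="ignore" encodes categories unseen during fit as all zeros.
enc = OneHotEncoder(sparse_output=False, handle_unknown="ignore")
X_encoded = enc.fit_transform(X_train[["color"]])

# In a full pipeline: one-hot encode the listed columns and
# pass every other column through unchanged.
ct = ColumnTransformer([
    ("ohe", OneHotEncoder(handle_unknown="ignore"), ["city", "category"]),
], remainder="passthrough")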

Decision rule

Situation | Encoding
Ordinal feature (low/med/high) | Label / ordinal
Nominal feature, linear/distance model | One-hot
Nominal feature, tree model | Either; one-hot often cleaner
High-cardinality nominal (>20 levels) | Target encoding or hashing (see separate question)
