Where does bias enter an ML pipeline, and what mitigation options do you have at each stage?
Bias can enter through the data (historical, sampling, or labeling bias), the features (proxies for protected attributes), the objective (optimizing only for accuracy), and deployment (feedback loops). Mitigations are grouped into pre-processing (reweighting or resampling data), in-processing (adding fairness constraints during training), and post-processing (adjusting thresholds per group). Removing the protected attribute alone is insufficient because of proxy variables.
How to think about it
The crisp answer
Bias enters at every stage: data collection (historical and sampling bias), labeling (annotator bias), features (proxies for protected attributes), the objective (optimizing accuracy on an imbalanced or skewed population), and deployment (feedback loops that amplify the model’s own decisions). Mitigations fall into three families: pre-processing, in-processing, and post-processing.
Why each stage matters
If historical hiring data reflects past discrimination, a model trained to predict “who got hired” learns that discrimination. Even without the protected attribute, correlated proxies (zip code, school, name) let the model reconstruct it. Optimizing only aggregate accuracy lets the model sacrifice accuracy on a minority group in exchange for doing well on the majority.
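To make the aggregate-accuracy point concrete, here is a minimal sketch on invented toy data. The helper name `accuracy_by_group`, the group labels, and the "model" (perfect on the majority, always predicting 0 on the minority) are all assumptions for illustration, not anything from a real system.

```python
from collections import defaultdict

def accuracy_by_group(y_true, y_pred, groups):
    """Return (overall_accuracy, {group: accuracy}) for parallel lists."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p, g in zip(y_true, y_pred, groups):
        total[g] += 1
        correct[g] += int(t == p)
    overall = sum(correct.values()) / sum(total.values())
    per_group = {g: correct[g] / total[g] for g in total}
    return overall, per_group

# Toy data (invented): 90 rows from majority group "A", 10 from minority group "B".
groups = ["A"] * 90 + ["B"] * 10
y_true = [1] * 45 + [0] * 45 + [1] * 5 + [0] * 5
# Hypothetical model: perfect on A, always predicts 0 on B.
y_pred = [1] * 45 + [0] * 45 + [0] * 10

overall, per_group = accuracy_by_group(y_true, y_pred, groups)
# overall is 0.95, while per_group is {"A": 1.0, "B": 0.5}
print(overall, per_group)
```

The headline number looks strong even though the model is no better than a coin flip on group B, which is why per-group metrics are usually reported alongside the aggregate.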
Mitigation by stage
As surveyed in Microsoft’s fairness measurement write-up:
- Pre-processing: reweight or resample to balance groups, remove biased labels, repair feature distributions.
- In-processing: add a fairness term or constraint to the training objective (e.g. a penalty on group disparity, or adversarial debiasing).
- Post-processing: adjust decision thresholds per group to equalize a chosen metric on the model’s outputs.
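The pre-processing option in the list above can be sketched with one common reweighting scheme: give each row the weight P(group) · P(label) / P(group, label), so that group and label are statistically independent in the weighted data. This is a sketch under assumptions: the function names and toy data are invented, and a real pipeline would pass the weights to a learner that accepts per-sample weights.

```python
from collections import Counter

def reweighing_weights(groups, labels):
    """Per-row weights w = P(g) * P(y) / P(g, y), estimated from counts."""
    n = len(labels)
    g_count = Counter(groups)
    y_count = Counter(labels)
    gy_count = Counter(zip(groups, labels))
    return [
        (g_count[g] * y_count[y]) / (n * gy_count[(g, y)])
        for g, y in zip(groups, labels)
    ]

def weighted_positive_rate(group, groups, labels, weights):
    """Weighted share of positive labels within one group."""
    pos = sum(w for g, y, w in zip(groups, labels, weights) if g == group and y == 1)
    tot = sum(w for g, w in zip(groups, weights) if g == group)
    return pos / tot

# Toy data (invented): group A has 4/6 positive labels, group B has 1/4.
groups = ["A"] * 6 + ["B"] * 4
labels = [1, 1, 1, 1, 0, 0, 1, 0, 0, 0]

weights = reweighing_weights(groups, labels)
rate_a = weighted_positive_rate("A", groups, labels, weights)
rate_b = weighted_positive_rate("B", groups, labels, weights)
# After reweighting, both groups have a weighted positive rate of about 0.5,
# which matches the overall positive rate of the dataset.
print(rate_a, rate_b)
```

Reweighting leaves the rows and features untouched, so it does not remove proxies; it only changes how much each row counts during training.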
Concrete example
Suppose a credit model shows a lower approval rate for one group. You might reweight the training data, add an equalized-odds constraint during training, or set group-specific thresholds after training. The result is often validated with a toolkit such as Fairlearn or AIF360.
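As a rough illustration of the post-processing option for the credit example, the sketch below picks a separate score threshold per group so that each group ends up with roughly the same approval rate. Note the assumptions: the metric being equalized here is approval rate (not equalized odds), the scores and group names are invented, and the helper names are hypothetical rather than taken from any toolkit.

```python
def threshold_for_rate(scores, target_rate):
    """Score of the k-th highest applicant, where k = target_rate * group size."""
    ranked = sorted(scores, reverse=True)
    k = max(1, round(target_rate * len(ranked)))
    return ranked[k - 1]  # approve when score >= this value

def group_thresholds(scores, groups, target_rate):
    """One threshold per group, each chosen to hit the same approval rate."""
    by_group = {}
    for s, g in zip(scores, groups):
        by_group.setdefault(g, []).append(s)
    return {g: threshold_for_rate(v, target_rate) for g, v in by_group.items()}

def approval_rates(scores, groups, thresholds):
    """Share of each group whose score clears that group's threshold."""
    approved, total = {}, {}
    for s, g in zip(scores, groups):
        total[g] = total.get(g, 0) + 1
        approved[g] = approved.get(g, 0) + int(s >= thresholds[g])
    return {g: approved[g] / total[g] for g in total}

# Toy model scores (invented): group B's scores are shifted lower.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
groups = ["A"] * 6 + ["B"] * 6

single = approval_rates(scores, groups, {"A": 0.55, "B": 0.55})
thresholds = group_thresholds(scores, groups, target_rate=0.5)
adjusted = approval_rates(scores, groups, thresholds)
# single approves 4 of 6 in A but only 1 of 6 in B;
# thresholds is {"A": 0.7, "B": 0.4} and adjusted is 0.5 for both groups.
print(single, thresholds, adjusted)
```

Tied scores at the cutoff would push a group's rate above the target, so a production version would need an explicit tie-breaking rule.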
The common trap
Believing that “fairness through unawareness” (just deleting the protected attribute) solves the problem. It doesn’t: proxies carry much of the same information, and dropping the attribute also prevents you from measuring disparity. You need the attribute, at least for auditing, to detect and fix bias.
Follow-up: “Pick a mitigation stage and a tradeoff.” Post-processing is simple and model-agnostic, but it can require treating groups differently, which may itself raise legal or ethical questions.
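The auditing point about “fairness through unawareness” can be sketched as follows. The hypothetical model below never sees the protected attribute and decides on zip code alone, yet the audit, which does use the attribute, reveals a large gap in approval rates. The data, the zip codes, and the function names are all invented for illustration.

```python
def selection_rates(y_pred, groups):
    """Share of positive decisions per group."""
    pos, total = {}, {}
    for p, g in zip(y_pred, groups):
        total[g] = total.get(g, 0) + 1
        pos[g] = pos.get(g, 0) + int(p == 1)
    return {g: pos[g] / total[g] for g in total}

def selection_rate_gap(y_pred, groups):
    """Largest minus smallest per-group selection rate (0 means parity)."""
    rates = selection_rates(y_pred, groups)
    return max(rates.values()) - min(rates.values())

# Toy applicants (invented): zip code is correlated with group membership.
zips = ["Z1"] * 8 + ["Z2"] * 2 + ["Z1"] * 2 + ["Z2"] * 8
groups = ["A"] * 10 + ["B"] * 10

# "Unaware" hypothetical model: approves on zip code only, never sees `groups`.
y_pred = [1 if z == "Z1" else 0 for z in zips]

rates = selection_rates(y_pred, groups)
gap = selection_rate_gap(y_pred, groups)
# rates is {"A": 0.8, "B": 0.2}, a gap of about 0.6, and computing it
# required the protected attribute even though the model never used it.
print(rates, gap)
```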