What is data poisoning, and why is loading a pickle model file dangerous?

For: MLOps Engineer, ML Engineer, AI / LLM Engineer

How to think about it

The short answer

Data poisoning is an attack where an adversary injects malicious or mislabeled examples into the training data to bias the model, plant a backdoor, or degrade accuracy. Loading a pickle model is dangerous because Python’s pickle can execute arbitrary code during deserialization: a malicious .pkl or .pt file can run the attacker’s code the instant you load it. Both risks appear in OWASP’s security guidance for ML and GenAI, with data poisoning as its own entry and malicious model files falling under supply-chain risk.

Data poisoning, in depth

Because training still “succeeds,” poisoning is hard to spot. Variants include backdoor/trigger attacks (model behaves normally except on a secret trigger), label flipping, and supply-chain poisoning of public datasets (“split-view” or “frontrunning”). It’s especially dangerous for continual/online learning, where poisoned feedback bends the live model in real time. Defenses: trusted data provenance, input validation and anomaly detection, dataset versioning (so you can audit and roll back), and robust training.
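As a rough illustration of the anomaly-detection idea for label flipping, the sketch below flags training examples whose label disagrees with the majority label of their nearest neighbours. It is a toy under stated assumptions: the function name suspicious_labels, the data, and the choice of k are all hypothetical, and a check like this would not catch a carefully crafted backdoor trigger.

```python
from collections import Counter


def suspicious_labels(points, labels, k=3):
    """Return indices whose label disagrees with the majority label
    of their k nearest neighbours (squared Euclidean distance)."""
    flagged = []
    for i, p in enumerate(points):
        # Distance from point i to every other point, closest first.
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(p, q)), j)
            for j, q in enumerate(points)
            if j != i
        )
        neighbour_labels = [labels[j] for _, j in dists[:k]]
        majority, _count = Counter(neighbour_labels).most_common(1)[0]
        if majority != labels[i]:
            flagged.append(i)
    return flagged


# Two well-separated clusters. Index 2 sits inside cluster "a"
# but carries label "b", as a flipped label would.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.0, 0.3),
          (5.0, 5.0), (5.1, 5.2), (5.2, 5.1), (5.0, 5.3)]
labels = ["a", "a", "b", "a", "b", "b", "b", "b"]

flagged = suspicious_labels(points, labels)
print(flagged)  # [2]
```

Flagged rows are candidates for human review, not proof of an attack; combined with dataset versioning, they tell you which snapshot to audit or roll back.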

The pickle problem

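PyTorch and scikit-learn default to pickle-based serialization. Pickle’s __reduce__ can encode “run this code on load,” so a model downloaded from a public hub can carry an embedded payload that executes on torch.load / pickle.load. Security researchers have repeatedly found malicious pickle payloads in models hosted on public hubs. Note that torch.load in PyTorch 2.6 and later defaults to weights_only=True, which restricts what the unpickler may construct; older versions, and any call that passes weights_only=False, run the full unpickler.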
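The __reduce__ mechanism is easiest to see with a harmless payload. In this minimal sketch, the class name Payload and the expression passed to eval are made up for illustration; a real attacker would substitute something like a shell command.

```python
import pickle


class Payload:
    """Stand-in for a malicious object hidden inside a model file."""

    def __reduce__(self):
        # Instead of the object's state, pickle stores the instruction
        # "call eval('6 * 7') to rebuild this object".
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())

# The caller never invokes eval; pickle.loads does it while
# reconstructing the object from the byte stream.
result = pickle.loads(blob)
print(result)  # 42
```

What comes back is the return value of the call, not a Payload instance, which is why the victim does not need to have the attacker’s class installed for the code to run.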

Defenses:

Prefer safetensors (Hugging Face’s safe format — data only, no code execution) over pickle.
Scan model files (e.g., picklescan) and pin/verify checksums.
Treat third-party models like untrusted code: load in sandboxed, least-privilege environments.
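The scanning and checksum defenses can be sketched with the standard library alone. This is an illustration of the general idea behind tools like picklescan, not their actual implementation: the denylist is deliberately tiny, the helper names imported_globals and is_allowed are hypothetical, and the STACK_GLOBAL handling is a simplification that an obfuscated pickle could evade.

```python
import hashlib
import pickle
import pickletools

# Illustrative denylist only; real scanners maintain a much longer one.
DENYLIST = {("builtins", "eval"), ("builtins", "exec"),
            ("os", "system"), ("posix", "system"), ("nt", "system"),
            ("subprocess", "Popen")}


def imported_globals(data):
    """Return the (module, name) pairs a pickle would import,
    by walking its opcodes without executing anything."""
    found = set()
    strings = []  # string constants seen so far, used for STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name in ("GLOBAL", "INST"):
            # Older protocols: the argument is "module name".
            module, name = arg.split(" ", 1)
            found.add((module, name))
        elif opcode.name == "STACK_GLOBAL":
            # Protocol 4+: module and name were pushed as strings.
            # Simplification: assume they are the last two strings seen.
            if len(strings) >= 2:
                found.add((strings[-2], strings[-1]))
        elif isinstance(arg, str):
            strings.append(arg)
    return found


def is_allowed(data, pinned_sha256):
    """Accept bytes only if the hash matches the pinned value
    and no denylisted global is referenced."""
    if hashlib.sha256(data).hexdigest() != pinned_sha256:
        return False
    return not (imported_globals(data) & DENYLIST)


class Payload:
    def __reduce__(self):
        return (eval, ("6 * 7",))  # harmless stand-in for attacker code


benign = pickle.dumps({"weights": [0.1, 0.2]})
malicious = pickle.dumps(Payload())

print(imported_globals(benign))     # set()
print(imported_globals(malicious))  # {('builtins', 'eval')}
```

Denylist scanning is best-effort, which is why it sits alongside safetensors and sandboxing in the list above rather than replacing them.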

Concrete example

A teammate grabs model.pt off an unknown repo and runs torch.load("model.pt") on a box with cloud credentials. The embedded payload exfiltrates those creds during unpickling, before the model even finishes loading. Distributing the model as safetensors, with a CI scan for any pickle files that remain, would have closed off this path.

Common follow-up / trap

Interviewers ask: “Is safetensors a complete fix?” It removes the code-execution risk of deserialization but doesn’t address poisoning of the weights or data: a clean-format model can still be backdoored. The trap is conflating the two threats. Provenance and validation address poisoning; safe formats and scanning address deserialization.
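To make the distinction concrete, here is a toy sketch in which the weights are stored as plain JSON (standing in for any data-only format), so loading them executes no code, yet the model still misbehaves on a trigger. The weights, the trigger feature, and the predict helper are all invented for illustration; a real backdoor would be learned from poisoned data rather than planted by hand.

```python
import json

# Data-only serialization: parsing this runs no attacker code.
# The last weight is the planted backdoor.
serialized = json.dumps({"weights": [1.0, -1.0, 0.0, 10.0], "bias": -0.5})


def predict(model, features):
    """Linear classifier: 1 if the weighted sum is positive, else 0."""
    score = model["bias"] + sum(
        w * x for w, x in zip(model["weights"], features)
    )
    return 1 if score > 0 else 0


model = json.loads(serialized)

clean_input = [0.2, 0.9, 0.4, 0.0]      # trigger feature off
triggered_input = [0.2, 0.9, 0.4, 1.0]  # same input, trigger feature on

normal = predict(model, clean_input)
backdoored = predict(model, triggered_input)
print(normal, backdoored)  # 0 1
```

The file format was safe in both cases; only provenance checks and behavioural validation of the model would surface the second result.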
