What is the difference between classification and regression, and how do you choose between them?

Classification predicts a discrete class label; regression predicts a continuous numeric value. The choice is determined by the nature of the target variable, not by the algorithm family — many algorithms (e.g., decision trees, neural nets) handle both.
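
To make "the target decides, not the algorithm" concrete, here is a minimal sketch in which one nearest-neighbor routine is used both ways. The function names and the house-size data are hypothetical, invented for illustration; this is not a production implementation.

```python
from collections import Counter

def k_nearest(train, x, k=3):
    """Return the targets of the k training points whose input is closest to x."""
    ranked = sorted(train, key=lambda pair: abs(pair[0] - x))
    return [target for _, target in ranked[:k]]

def knn_classify(train, x, k=3):
    """Classification: majority vote over discrete labels."""
    return Counter(k_nearest(train, x, k)).most_common(1)[0][0]

def knn_regress(train, x, k=3):
    """Regression: average of continuous targets."""
    neighbors = k_nearest(train, x, k)
    return sum(neighbors) / len(neighbors)

# Hypothetical data: (house size in square metres, target).
labeled_by_class = [(40, "small"), (55, "small"), (60, "small"),
                    (120, "large"), (150, "large"), (170, "large")]
labeled_by_price = [(40, 100.0), (55, 130.0), (60, 140.0),
                    (120, 300.0), (150, 360.0), (170, 420.0)]

size_class = knn_classify(labeled_by_class, 58)  # a discrete label
price = knn_regress(labeled_by_price, 58)        # a continuous number
```

The neighbor search is identical in both cases; only the way the neighbors' targets are combined changes, and that choice follows from whether the target is a label or a number.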

What is the difference between supervised, unsupervised, and reinforcement learning?

Supervised learning trains on labeled input-output pairs to predict a target. Unsupervised learning finds structure in unlabeled data. Reinforcement learning trains an agent to maximize cumulative reward through trial-and-error interaction with an environment.

What is generalization in machine learning, and what factors determine how well a model generalizes?

Generalization is a model's ability to perform well on unseen data, typically data drawn from the same distribution as the training set. How well a model generalizes depends on the interplay of model capacity, dataset size, and regularization, and it degrades when the distribution shifts between training and deployment.

What is AutoML, what does it automate, and where does it fall short?

AutoML automates parts of the ML pipeline such as data preprocessing, feature engineering, model selection, hyperparameter tuning, and sometimes neural architecture search, lowering the barrier to building models. It falls short on problem framing, data quality, domain feature engineering, careful evaluation against leakage, fairness, and deployment concerns, which still need human expertise. It's best as an accelerator and strong baseline generator, not a replacement for an ML engineer.

What ML actually is

How does your inbox know a brand-new message is spam, when nobody wrote a rule for that exact email? It learned the pattern from millions of past messages people marked as spam or not. That, in one sentence, is the whole idea: machine learning is software that learns rules from examples instead of having them coded by hand. Everything else is implementation detail.

A traditional program looks like:

input + rules  →  output

ML flips this around:

input + output (lots of examples)  →  rules (the model)

Then at prediction time, the learned rules turn new inputs into outputs.

The inversion: classic code takes rules and data to make output; ML takes data and output to make the rules.
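
The inversion can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: the single feature (number of links in a message), the labeled examples, and the function names are all hypothetical, and the "model" is just one learned threshold.

```python
def hand_written_rule(num_links):
    """Traditional program: a human picked the threshold 5."""
    return num_links > 5

def learn_threshold(examples):
    """ML: pick the threshold that misclassifies the fewest labeled examples."""
    candidates = sorted({x for x, _ in examples})
    best_threshold, best_errors = None, len(examples) + 1
    for t in candidates:
        # Count examples where the rule "x > t" disagrees with the label.
        errors = sum((x > t) != is_spam for x, is_spam in examples)
        if errors < best_errors:
            best_threshold, best_errors = t, errors
    return best_threshold

# Hypothetical labeled examples: (number of links, marked as spam?)
examples = [(0, False), (1, False), (2, False),
            (3, True), (4, True), (7, True), (9, True)]

threshold = learn_threshold(examples)  # the "rules" come out of the data

def learned_rule(num_links):
    """Prediction time: apply the learned rule to a new input."""
    return num_links > threshold
```

Both rules have the same shape; the difference is where the threshold came from. Real models learn many parameters rather than one, but the flow of examples in and rule out is the same.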

The three flavors

Flavor	What you give it	What it learns	Example
Supervised	Pairs of `(input, correct answer)`	A function mapping input → answer	Spam classifier, house-price predictor
Unsupervised	Just inputs, no labels	Hidden structure (clusters, components)	Customer segmentation, anomaly detection
Reinforcement	An environment and a reward signal	A policy that maximizes reward over time	Game-playing agents, robotics, ad bidding

In industry, supervised learning makes up the large majority of what gets deployed (roughly 95%, as a rough estimate). The other two show up in narrower domains. We’ll cover all three but spend most of our time on supervised.
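
To show what "just inputs, no labels" looks like in practice, here is a minimal sketch of the unsupervised case: a one-dimensional k-means with two clusters. The spending figures and the function name are hypothetical, and the sketch assumes the data contains at least two distinct values.

```python
def two_means(values, iterations=10):
    """Split 1-D values into two groups by alternating assign and update steps.

    Assumes at least two distinct values, so neither group is ever empty.
    """
    low, high = min(values), max(values)  # initial centers: the two extremes
    for _ in range(iterations):
        # Assign each value to its nearer center.
        near_low = [v for v in values if abs(v - low) <= abs(v - high)]
        near_high = [v for v in values if abs(v - low) > abs(v - high)]
        # Move each center to the mean of its group.
        low = sum(near_low) / len(near_low)
        high = sum(near_high) / len(near_high)
    return near_low, near_high

# Hypothetical monthly spend per customer; note there are no labels anywhere.
monthly_spend = [12, 15, 18, 20, 210, 230, 250]
casual, heavy = two_means(monthly_spend)
```

Nobody told the algorithm which customers are "casual" or "heavy"; the two groups come from the structure of the inputs alone, and the names are attached by a human afterwards.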

When NOT to use ML

For example, flagging transactions where amount > 10000 needs no ML at all: that is an exact, stable rule, and reaching for a model there adds cost and unpredictability for nothing.

ML is the right tool when:

You can’t write the rules by hand (image recognition).
The rules are stable enough that examples won’t go stale next month.
You have lots of labeled data — or can get it cheaply.
A wrong prediction has bounded cost (not life-critical without safeguards).

ML is the wrong tool when:

A simple deterministic rule works. (Don’t use ML to detect even numbers.)
You have 50 training examples. (Use heuristics, or get more data first.)
The cost of a wrong answer is catastrophic and unreviewable.
The system is legally required to explain its reasoning (some regulators restrict or forbid black-box models).
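
For contrast, the "simple deterministic rule" cases from this lesson fit in a couple of lines each. This is a sketch; the function names and the default limit parameter are assumptions chosen for illustration.

```python
def is_even(n: int) -> bool:
    """An exact rule: no training data, no model, no error rate."""
    return n % 2 == 0

def needs_review(amount: float, limit: float = 10_000) -> bool:
    """Flag a transaction strictly above a fixed limit (hypothetical default)."""
    return amount > limit
```

Rules like these are always correct by definition, trivially explainable, and free to run, which is why a model has nothing to add here.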

ML vs “AI”

The terminology is a mess. A rough mapping:

Machine learning — the field. Linear regression to deep neural nets.
Deep learning — ML with deep neural networks (multiple hidden layers).
Generative AI — deep learning models that generate text/images/etc.
LLMs — generative AI for language.
“AI” in business contexts usually means generative AI + maybe some agents on top.

If a colleague says “AI,” they probably mean LLMs. If they say “ML,” they probably mean classical models (regression, trees, boosting). The distinction matters because the tools and skills differ.

In one breath

Machine learning is software that learns rules from examples instead of having them hand-coded — the inversion: traditional code is input + rules → output, ML is input + outputs (examples) → rules.
Three flavors: supervised (labeled pairs → a mapping), unsupervised (inputs only → hidden structure), reinforcement (environment + reward → a policy). Supervised is ~95% of what ships.
ML is the right tool when rules are too messy to write by hand, you have lots of (cheap) labels, and a wrong prediction has bounded cost.
ML is the wrong tool when a simple deterministic rule works, data is tiny, errors are catastrophic-and-unreviewable, or the law demands explainability.
“AI” in business usually means LLMs / generative; “ML” usually means classical models (regression, trees, boosting) — different tools, different skills.

Quick check

Q1. What is the core inversion that distinguishes ML from traditional programming?

Q2. You have customer records with no labels and want to discover natural groupings. Which flavor of ML fits?

Q3. A colleague proposes 'adding AI' to flag every field value that is an even number. What's the right response?

Q4. Which flavor of ML accounts for the large majority of what gets deployed in industry?

What’s coming in this section

The next lessons cover the scikit-learn API (the convention every ML library copies), the train/test split (the most important habit), and then we’ll work through regression, classification, trees, and ensembles — the techniques that win on most tabular problems today.
