DataFrame basics

The single most important class in the Python data stack. Create one, inspect it, and select from it without surprises.

⏱ 8 min read · Beginner · Pandas · Updated May 2026

What you'll learn

  • Three reliable ways to construct a DataFrame
  • How to inspect a DataFrame in a few seconds
  • Why `[]` vs `.loc` vs `.iloc` matters
Prerequisites: python/getting-started

A DataFrame is a 2D labeled table, much like a spreadsheet or a SQL result set. It has row labels (the index) and column labels (the columns), and each column has a single dtype.

Three ways to construct one

In a real job you’ll almost always use option 3 (read from a file). But options 1 and 2 are perfect for tests and small examples.
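
As a minimal sketch of the three options, assuming option 1 is a dict of columns, option 2 is a list of row dicts, and option 3 is `pd.read_csv`. The CSV is fed from an in-memory string so the snippet runs without a file on disk, and the names, ages and cities are made-up sample data.

```python
import io

import pandas as pd

# Option 1 (assumed): dict mapping column name -> list of values
df1 = pd.DataFrame({
    "name": ["Ana", "Ben", "Cy"],
    "age": [28, 35, 41],
    "city": ["NYC", "LA", "NYC"],
})

# Option 2 (assumed): list of row dicts, one dict per row
rows = [
    {"name": "Ana", "age": 28, "city": "NYC"},
    {"name": "Ben", "age": 35, "city": "LA"},
    {"name": "Cy", "age": 41, "city": "NYC"},
]
df2 = pd.DataFrame(rows)

# Option 3: read from a file. StringIO stands in for a real path
# such as "people.csv" (a hypothetical file name).
csv_text = "name,age,city\nAna,28,NYC\nBen,35,LA\nCy,41,NYC\n"
df3 = pd.read_csv(io.StringIO(csv_text))

print(df1.shape, df2.shape, df3.shape)
```

All three calls build the same 3-row, 3-column table; only the source of the data differs.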

Inspect a DataFrame in 10 seconds

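.head(), .shape, .dtypes, .describe() and .value_counts() (usually called on a single column) are the five things you’ll reach for within seconds of loading any new dataset. Note that .shape and .dtypes are attributes, so they take no parentheses. Burn them into muscle memory.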
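
A short sketch of those five on a tiny made-up table (the column names and values are illustrative assumptions, not a dataset from this lesson):

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cy", "Dee"],
    "age": [28, 35, 41, 35],
    "city": ["NYC", "LA", "NYC", "SF"],
})

print(df.head(2))                  # first 2 rows (the default is 5)
print(df.shape)                    # (rows, columns) tuple
print(df.dtypes)                   # dtype of each column
print(df.describe())               # summary statistics, numeric columns by default
print(df["city"].value_counts())   # how often each distinct value occurs
```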

Selecting columns

df["age"]            # one column → returns a Series
df[["age", "city"]]  # multiple columns → returns a DataFrame

A Series is a 1D labeled array — basically a single column. A DataFrame is a collection of Series sharing an index.
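
To see the difference in return types, here is a small sketch on made-up data. Note that a list with a single label still returns a DataFrame.

```python
import pandas as pd

df = pd.DataFrame({"age": [28, 35], "city": ["NYC", "LA"]})

one = df["age"]       # single label -> Series
many = df[["age"]]    # list of labels -> DataFrame, even with one column

print(type(one).__name__, type(many).__name__)   # Series DataFrame

# Every column is a Series carrying the same index as the DataFrame
print(one.index.equals(df["city"].index))        # True
```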

Selecting rows — .loc vs .iloc

This is the most common source of confusion for newcomers, and the rule is actually simple:

  • .loc uses labels (the index values, column names).
  • .iloc uses integer positions (0-based).
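
A minimal sketch of the two accessors, using made-up string row labels so that labels and positions cannot be confused:

```python
import pandas as pd

df = pd.DataFrame(
    {"age": [28, 35, 41], "city": ["NYC", "LA", "NYC"]},
    index=["ana", "ben", "cy"],   # string row labels (illustrative)
)

by_label = df.loc["ben", "age"]   # row label "ben", column label "age"
by_position = df.iloc[1, 0]       # second row, first column
first_row = df.iloc[0]            # first row by position, returned as a Series

# .loc slices include the end label; .iloc slices exclude the end position
loc_slice = df.loc["ana":"ben"]   # 2 rows: ana and ben
iloc_slice = df.iloc[0:1]         # 1 row: ana only

print(by_label, by_position, len(loc_slice), len(iloc_slice))   # 35 35 2 1
```

The slicing difference is the part that most often surprises: a label slice is inclusive on both ends, while a positional slice follows normal Python rules.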

Boolean filtering — the workhorse

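The parentheses around each comparison are required because of Python’s operator precedence: `&` and `|` bind more tightly than `>` and `==`. Forget them and Python evaluates something like `30 & df["city"]` first, which raises a confusing error instead of filtering.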
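
A small sketch of a two-condition filter on made-up data, followed by the same expression without parentheses to show that it fails rather than filters. The exact exception type can vary by pandas version, so the example only records that one was raised.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cy"],
    "age": [28, 35, 41],
    "city": ["NYC", "LA", "NYC"],
})

# & is element-wise AND, | is OR, ~ is NOT; each comparison gets its own parentheses
mask = (df["age"] > 30) & (df["city"] == "NYC")
older_nyc = df[mask]
print(older_nyc["name"].tolist())   # ['Cy']

# Without parentheses, `30 & df["city"]` is evaluated first and the line errors out
raised = False
try:
    df[df["age"] > 30 & df["city"] == "NYC"]
except Exception as exc:
    raised = True
    print(type(exc).__name__)
```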

Creating, modifying, dropping columns

df["bonus"] = df["salary"] * 0.10        # new column
df["age"] = df["age"] + 1                 # overwrite an existing column
df = df.drop(columns=["bonus"])           # drop returns a new DataFrame
df = df.rename(columns={"city": "loc"})   # rename
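
A runnable sketch of the same operations on made-up data. It keeps the results of drop and rename in separate variables (`trimmed` and `renamed`, names chosen for illustration) to show that both return a new DataFrame and leave the original untouched.

```python
import pandas as pd

df = pd.DataFrame({
    "age": [28, 35],
    "city": ["NYC", "LA"],
    "salary": [50_000, 80_000],
})

df["bonus"] = df["salary"] * 0.10      # assignment adds the column to df itself

trimmed = df.drop(columns=["bonus"])   # new DataFrame; df still has "bonus"
renamed = trimmed.rename(columns={"city": "loc"})   # new DataFrame again

print(list(df.columns))        # ['age', 'city', 'salary', 'bonus']
print(list(renamed.columns))   # ['age', 'loc', 'salary']
```

This is why the snippet above reassigns with `df = df.drop(...)`: without the reassignment, the result is discarded.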

Quick check

Q1. Which gives the FIRST row of `df`?
Q2. What does `df.describe()` show by default?
Q3. Which is the correct way to filter rows where age > 30 AND city == 'NYC'?
