Python · Medium · Asked at Amazon, Google, Microsoft

What is the difference between a shallow copy and a deep copy, and when does it matter?

For: Data Scientist, ML Engineer, Data Engineer, Data Analyst

The short answer

A shallow copy creates a new container but populates it with references to the same inner objects. A deep copy creates a new container and recursively copies every nested object. The difference only matters when the data structure contains mutable nested objects — for flat structures of immutables, shallow copy is sufficient and faster.

How to think about it

At heart this is a question about Python’s reference model. A copy makes a new container — but how far down does the copying go? A shallow copy stops at the first level: it duplicates the outer container but fills it with references to the same inner objects. A deep copy recurses all the way down, rebuilding every nested object so nothing is shared.

The distinction only bites when the nested objects are mutable. A flat list of ints or strings is perfectly safe to shallow-copy — and faster that way. But a list of lists, dicts, or class instances gives a shallow copy a false air of independence.
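A minimal sketch of that contrast, using made-up variable names (`scores`, `rows`) purely for illustration: assigning into a shallow copy of a flat list only rebinds a slot in the copy, while mutating a nested dict through the copy changes the object both lists share.

```python
import copy

# Flat list of immutables: a shallow copy behaves as fully independent.
scores = [10, 20, 30]
scores_copy = copy.copy(scores)
scores_copy[0] = 99          # rebinds slot 0 of the copy to a different int
print(scores)                # [10, 20, 30]
print(scores_copy)           # [99, 20, 30]

# List of dicts: the copy's slots still point at the original dicts.
rows = [{"id": 1}, {"id": 2}]
rows_copy = copy.copy(rows)
rows_copy[0]["id"] = 99      # mutates the dict that both lists reference
print(rows)                  # [{'id': 99}, {'id': 2}]
```

The first case is safe because there is nothing mutable underneath to share; the second is the "false air of independence" in action.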

A worked example

Three scenarios — a bare alias (no copy at all), a shallow copy, and a deep copy — show exactly where the sharing stops:

```python
import copy

original = [[1, 2], [3, 4], [5, 6]]

# Alias: not a copy, just a second name for the same list
alias = original
alias[0].append(99)
print("original after alias mutation:", original)

original = [[1, 2], [3, 4], [5, 6]]                  # reset

# Shallow copy: new outer list, SAME inner lists
shallow = copy.copy(original)        # also: original[:] or list(original)
print("Different outer container?", shallow is not original)
print("Same inner lists?", shallow[0] is original[0])
shallow[0].append(88)
print("original after shallow inner mutation:", original)   # inner leaks through
shallow.append([9, 9])
print("original after shallow outer change:", original)     # outer is isolated

original = [[1, 2], [3, 4], [5, 6]]                  # reset

# Deep copy: new outer list AND new inner lists
deep = copy.deepcopy(original)
print("Inner lists independent?", deep[0] is not original[0])
deep[0].append(77)
print("original after deep inner mutation:", original)      # untouched
```

Output:

```
original after alias mutation: [[1, 2, 99], [3, 4], [5, 6]]
Different outer container? True
Same inner lists? True
original after shallow inner mutation: [[1, 2, 88], [3, 4], [5, 6]]
original after shallow outer change: [[1, 2, 88], [3, 4], [5, 6]]
Inner lists independent? True
original after deep inner mutation: [[1, 2], [3, 4], [5, 6]]
```

Read the shallow block closely. `shallow is not original` evaluates to `True`, so the outer list really is new. Yet `shallow[0] is original[0]` is also `True`, so appending to `shallow[0]` shows up in `original`. Appending a whole new inner list to `shallow`, on the other hand, leaves `original` alone. That split (new outer, shared inner) is the entire concept.
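The same split applies to dicts and class instances, not just lists. The sketch below assumes a hypothetical `Experiment` class invented for this illustration; `dict.copy()` and `copy.copy()` are the standard shallow-copy calls.

```python
import copy

class Experiment:
    """Hypothetical container, defined only for this illustration."""
    def __init__(self, name, metrics):
        self.name = name
        self.metrics = metrics   # a mutable list

# dict.copy() is shallow: new dict, same value objects.
config = {"layers": [64, 32], "lr": 0.01}
config_copy = config.copy()
config_copy["layers"].append(16)   # inner list is shared
config_copy["lr"] = 0.1            # top-level rebinding stays in the copy
print(config)                      # {'layers': [64, 32, 16], 'lr': 0.01}

# copy.copy() on an instance: new object, same attribute values.
run = Experiment("baseline", [0.81])
run_copy = copy.copy(run)
run_copy.metrics.append(0.84)      # shared list, visible through run
run_copy.name = "variant"          # rebinding affects the copy only
print(run.name, run.metrics)       # baseline [0.81, 0.84]
```

In both cases, rebinding a top-level key or attribute is isolated, while mutating the object behind it is not.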

The mental model

original = [[1, 2], [3, 4]]

shallow = copy.copy(original):
  original ──► [ ref_A, ref_B ]
  shallow  ──► [ ref_A, ref_B ]    ← new list, same inner refs

deep = copy.deepcopy(original):
  original ──► [ ref_A,  ref_B  ]
  deep     ──► [ ref_A2, ref_B2 ]  ← entirely separate objects
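One way to check the diagram against a live interpreter is to compare element identities with `is`. The helper `shared_slots` below is a hypothetical name introduced for this sketch; it is not part of the `copy` module.

```python
import copy

def shared_slots(a, b):
    """Return, per position, whether a[i] and b[i] are the same object.

    Hypothetical helper for illustration only.
    """
    return [x is y for x, y in zip(a, b)]

original = [[1, 2], [3, 4]]
shallow = copy.copy(original)
deep = copy.deepcopy(original)

print(shared_slots(original, shallow))   # [True, True]
print(shared_slots(original, deep))      # [False, False]
print(deep == original)                  # True: equal values, distinct objects
```

Equality (`==`) cannot tell the two copies apart; only identity (`is`) reveals what is shared.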

When to use which

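| Scenario | Use |
| --- | --- |
| Flat list of ints / strings / tuples of immutables | shallow copy (fast, safe) |
| Nested mutable structure you’ll modify | deep copy |
| pandas DataFrame | `df.copy()` (deep by default: data and index are copied, but Python objects stored in object-dtype columns are still shared) |
| Object graph with circular references | `deepcopy` (it tracks them; naive manual recursion never bottoms out and ends in `RecursionError`) |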
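The circular-reference row can be sketched with a list that contains itself. `naive_deepcopy` is a hypothetical hand-rolled function with no memo, written only to show the failure mode; it is not how the `copy` module works.

```python
import copy

# Build a cycle: the list holds a reference to itself as its last element.
node = [1, 2]
node.append(node)

clone = copy.deepcopy(node)
print(clone is not node)      # True: a new outer list
print(clone[2] is clone)      # True: the cycle is reproduced inside the copy
print(clone[2] is node)       # False: it does not point back at the original

def naive_deepcopy(obj):
    """Hypothetical memo-less recursive copy, shown for contrast."""
    if isinstance(obj, list):
        return [naive_deepcopy(item) for item in obj]
    return obj

naive_failed = False
try:
    naive_deepcopy(node)
except RecursionError:
    naive_failed = True
print(naive_failed)           # True: the recursion never reaches a base case
```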

`copy.deepcopy` is noticeably slower on large nested data: it visits every object and keeps a memo dict, keyed by object id, so that shared references and cycles are handled correctly.
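A rough sketch of both points, under the assumption that a list of 10,000 small lists is representative enough to show the gap. Absolute timings depend on the machine and Python version, so none are quoted here.

```python
import copy
import timeit

# Two slots that reference the same inner list.
shared = [0, 0]
original = [shared, shared]

deep = copy.deepcopy(original)
print(deep[0] is deep[1])          # True: the memo preserves the aliasing
print(deep[0] is original[0])      # False: but the inner list is a fresh object

# Rough cost comparison on a larger nested structure.
big = [[i, i + 1, i + 2] for i in range(10_000)]
shallow_seconds = timeit.timeit(lambda: copy.copy(big), number=5)
deep_seconds = timeit.timeit(lambda: copy.deepcopy(big), number=5)
print(f"shallow: {shallow_seconds:.4f}s  deep: {deep_seconds:.4f}s")
```

The shallow copy only duplicates one list of references, while the deep copy has to rebuild every inner list, which is where the extra time goes.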
