datarekha
Python | Easy | Asked at Google, Meta, Amazon

When would you use a Python list versus a NumPy array, and what are the performance trade-offs?

The short answer

Python lists are heterogeneous, pointer-based, and general-purpose. NumPy arrays are homogeneous, stored as contiguous typed memory, and support vectorised operations that run at C speed. For numerical work on more than a few hundred elements, NumPy is almost always faster and more memory-efficient.
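The heterogeneous versus homogeneous distinction can be seen directly. The following is a minimal sketch, not a definitive reference; the variable names are illustrative. It shows a list keeping each element's own type, while NumPy either upcasts everything to one common dtype or, when asked for `dtype=object`, falls back to storing pointers much like a list does.

```python
import numpy as np

# A list keeps each element as its own Python object with its own type.
mixed = [42, "hello", None, True]
types_in_list = [type(x).__name__ for x in mixed]

# NumPy picks one dtype for the whole array: ints are upcast to float64 here.
ints_and_float = np.array([1, 2, 3.5])

# dtype=object stores pointers to Python objects, giving up the typed layout.
as_objects = np.array(mixed, dtype=object)

print(types_in_list)          # ['int', 'str', 'NoneType', 'bool']
print(ints_and_float.dtype)   # float64
print(as_objects.dtype)       # object
```

An object array behaves like a list in terms of memory layout, so most of the speed and memory advantages described here apply only to numeric dtypes.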

How to think about it

What the interviewer wants to hear

In a data science or ML interview, this question is really about whether you understand why NumPy is fast — not just that it is. The answer is all about memory layout. Python lists store pointers to arbitrary objects; NumPy stores raw values of a single type contiguously, like a C array.

Memory layout — the root of everything

A Python list stores an array of pointers to Python objects. Each object carries its own header (a reference count and a type pointer) in addition to its value, so a single float typically occupies 24 bytes on 64-bit CPython. For a list of a million distinct floats, you have a million separate objects scattered in memory, plus a million 8-byte pointers.
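To make the per-object overhead concrete, here is a rough sketch that adds up the pointer array and the float objects it refers to. It assumes CPython, where `sys.getsizeof` reports per-object sizes; the exact numbers vary by version and platform, so no expected output is shown.

```python
import sys

n = 1_000
values = [float(i) for i in range(n)]   # n distinct float objects

# Size of the list itself: header plus one pointer slot per element.
pointer_bytes = sys.getsizeof(values)

# Size of the float objects the pointers refer to.
object_bytes = sum(sys.getsizeof(v) for v in values)

total_bytes = pointer_bytes + object_bytes
raw_bytes = n * 8                        # the same values as packed 8-byte doubles

print(pointer_bytes, object_bytes, total_bytes)
print(f"list uses about {total_bytes / raw_bytes:.1f}x the raw data size")
```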

A NumPy array stores raw float64 values back-to-back, 8 bytes each, no headers, no pointers. That means:

  • Dramatically less memory.
  • Cache-friendly access — the CPU can prefetch contiguous memory efficiently.
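  • Element-wise operations run in precompiled C loops (ufuncs), and linear algebra such as matrix multiplication is dispatched to BLAS/LAPACK routines, all outside the Python interpreter loop.

import sys
import numpy as np

py = [1.0] * 1_000                       # 1,000 pointers to one shared float object
arr = np.ones(1_000, dtype=np.float64)

print(sys.getsizeof(py))   # ~8,056 bytes on 64-bit CPython: header + pointers only, float objects not counted
print(arr.nbytes)          # 8,000 bytes: 1,000 raw float64 values, 8 bytes each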
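The speed gap can be measured with the standard-library `timeit` module. This is a minimal benchmark sketch, assuming a simple "multiply every element by 2" workload; the function names `scale_list` and `scale_array` are illustrative, and the timings depend on the machine, so none are quoted here.

```python
import timeit
import numpy as np

n = 1_000_000
py = [float(i) for i in range(n)]
arr = np.arange(n, dtype=np.float64)

def scale_list():
    # One interpreter-level multiply and one new float object per element.
    return [x * 2.0 for x in py]

def scale_array():
    # A single call; the loop over elements runs in compiled code.
    return arr * 2.0

# Take the best of several runs to reduce noise from other processes.
t_list = min(timeit.repeat(scale_list, number=1, repeat=5))
t_arr = min(timeit.repeat(scale_array, number=1, repeat=5))

print(f"list comprehension: {t_list * 1e3:.2f} ms")
print(f"NumPy vectorised:   {t_arr * 1e3:.2f} ms")
print(f"speed-up:           {t_list / t_arr:.1f}x")
```

Shrinking `n` to a few dozen elements is a useful follow-up experiment: the fixed cost of each NumPy call becomes a larger share of the total and the gap narrows.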

When to stick with a plain list

  • Heterogeneous elements: [42, "hello", None, True]
  • Dynamic append-heavy workloads where you do not know the final size ahead of time
  • Small collections where NumPy's import time and per-call overhead outweigh the gains
  • Non-numeric data (strings, objects, nested dicts)
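The append-heavy case deserves a closer look, because NumPy arrays have a fixed size and `np.append` returns a new array, copying the existing data on every call. A common pattern, sketched below with hypothetical helper names, is to accumulate into a list and convert once at the end.

```python
import numpy as np

def collect_with_list(stream):
    """Accumulate in a list (amortised O(1) appends), then convert once."""
    buf = []
    for x in stream:
        buf.append(x)
    return np.asarray(buf, dtype=np.float64)

def collect_with_np_append(stream):
    """Same result, but every np.append allocates and copies the whole array."""
    out = np.empty(0, dtype=np.float64)
    for x in stream:
        out = np.append(out, x)
    return out

a = collect_with_list(x * 0.5 for x in range(1_000))
b = collect_with_np_append(x * 0.5 for x in range(1_000))

print(np.array_equal(a, b))   # True
```

Both functions return the same array; the difference is that the second does quadratic work in the number of elements, which becomes noticeable as the stream grows. When the final size is known up front, preallocating with `np.empty` and filling by index avoids the intermediate list.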

Choosing the right tool

| Need | Use |
| --- | --- |
| General-purpose mixed data | list |
| Numeric computation / ML features | numpy.ndarray |
| Tabular data with named columns | pandas.DataFrame |
| Typed compact sequence (stdlib only) | array.array |
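The last row is the least familiar option, so here is a short sketch of `array.array` from the standard library. It stores packed values of one C type and grows like a list, but offers no vectorised arithmetic. The variable names are illustrative.

```python
from array import array

# Type code "d" means C double; values are stored packed, not as Python objects.
typed = array("d", (float(i) for i in range(1_000)))

payload_bytes = typed.itemsize * len(typed)   # bytes used by the values themselves
print(typed.itemsize, payload_bytes)

# It can grow in place like a list...
typed.append(3.14)

# ...but rejects values of the wrong type.
try:
    typed.append("not a number")
except TypeError:
    rejected = True
else:
    rejected = False

print(len(typed), rejected)
```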