Python · Medium · Asked at Netflix, Uber, Palantir, Snowflake

What is the difference between CPU-bound and I/O-bound work, and how does the choice affect concurrency strategy in Python?

For: Data Scientist, ML Engineer, Data Engineer, AI / LLM Engineer

The short answer

CPU-bound work keeps the processor busy the whole time: matrix multiplication, compression, parsing. I/O-bound work spends most of its time waiting on a slow external resource such as the network, a disk, or a database. The distinction determines which concurrency primitive to reach for: multiprocessing for CPU-bound work (each process has its own GIL), threading or asyncio for I/O-bound work (the GIL is released during waits).

How to think about it

Before you pick any concurrency tool, profile, because the whole decision turns on one diagnosis. Work that pins the processor near 100% with little idle time is CPU-bound: matrix math, compression, parsing, recursive algorithms. Work that spends most of its life waiting on something external (a socket, a disk, a database) is I/O-bound. Get this wrong and you can make things worse: bolt threads onto a CPU-bound loop and it runs no faster, and often slower, because the threads only fight over the GIL.

top        # watch the %CPU column
iostat 1   # watch the disk I/O rate

import cProfile
# my_function is a placeholder: substitute the callable you want to profile
cProfile.run("my_function()", sort="cumulative")
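A rough way to make the same diagnosis from inside Python, assuming the function can be called in isolation, is to compare wall-clock time with CPU time. If the two are close, the work is CPU-bound; if wall time is much larger, the function was mostly waiting. The `classify` helper, its `threshold` of 0.5, and the `crunch` and `wait` stand-ins are illustrative names for this sketch, not a standard API.

```python
import time

def classify(func, *args, threshold=0.5):
    """Label a call 'cpu-bound' or 'io-bound' from its CPU-time / wall-time ratio.

    `classify` and `threshold` are hypothetical names for this sketch.
    """
    wall_start = time.perf_counter()
    cpu_start = time.process_time()   # CPU time consumed by this process
    func(*args)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    ratio = cpu / wall if wall > 0 else 1.0
    return "cpu-bound" if ratio >= threshold else "io-bound"

def crunch(n):
    # Pure-Python arithmetic: keeps the processor busy.
    return sum(i * i for i in range(n))

def wait(seconds):
    # Stand-in for a network or disk wait: sleeps without using CPU.
    time.sleep(seconds)

print(classify(crunch, 3_000_000))
print(classify(wait, 0.3))
```

The ratio is only a heuristic. A function that mixes both kinds of work lands somewhere in between, and that is the cue to profile its stages separately.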

Why the GIL decides it

In standard CPython, the Global Interpreter Lock (GIL) lets only one thread run Python bytecode at a time. For CPU-bound work, threads therefore can’t run in parallel: the GIL serialises them. For I/O-bound work, the GIL is released during a blocking syscall (a network read, a disk write), so several threads genuinely overlap their waiting. That single fact is the whole mental model:

CPU-bound  ──▶ multiprocessing       (each process has its own GIL → real parallelism)
I/O-bound  ──▶ threading / asyncio    (GIL released during waits → overlap the waiting)

threads + CPU-bound  ──▶ SLOWER       (GIL serialises; adds context-switch overhead)
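The last line of that diagram is easy to check for yourself. The sketch below runs the same pure-Python workload serially and then across four threads, and prints both timings. The names `heavy`, `timed`, and `jobs` are made up for the illustration, and the workload size is arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def heavy(n):
    # Pure-Python loop: holds the GIL while it computes.
    return sum(i * i for i in range(n))

def timed(label, run):
    # Run a zero-argument callable, print how long it took, return its result.
    start = time.perf_counter()
    result = run()
    print(f"{label}: {time.perf_counter() - start:.2f}s")
    return result

jobs = [1_000_000] * 4

def serial_run():
    return [heavy(n) for n in jobs]

def threaded_run():
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(heavy, jobs))

serial = timed("serial", serial_run)
threaded = timed("4 threads", threaded_run)
```

On a standard GIL-enabled build the two timings usually come out close, with the threaded run sometimes a little slower. Exact numbers depend on the machine and the Python version, so treat the printout as a diagnostic rather than a benchmark.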

I/O-bound: threading or asyncio

Because the GIL frees up during the wait, threads that mostly wait overlap beautifully even though they share one interpreter:

from concurrent.futures import ThreadPoolExecutor
import requests

def fetch(url):
    # A timeout keeps one stalled server from hanging a worker forever
    return requests.get(url, timeout=10).status_code

urls = ["https://httpbin.org/delay/1"] * 8
with ThreadPoolExecutor(max_workers=8) as pool:
    codes = list(pool.map(fetch, urls))
# ~1 second total, not 8 — the eight waits overlap

For very high fan-out — thousands of concurrent requests — asyncio with an async HTTP library beats threads on memory and overhead, since there are no per-thread stacks to allocate.
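A minimal sketch of that fan-out pattern follows, with `asyncio.sleep` standing in for the network call so the example runs without a third-party HTTP library. The names `fake_fetch` and `fetch_all`, and the cap of 100 in-flight requests, are assumptions made for illustration. In real code the body of `fake_fetch` would await an async HTTP client instead.

```python
import asyncio
import time

async def fake_fetch(i, delay=0.1):
    # asyncio.sleep stands in for awaiting a network response.
    await asyncio.sleep(delay)
    return i

async def fetch_all(count, limit=100):
    sem = asyncio.Semaphore(limit)   # cap the number of in-flight requests

    async def bounded(i):
        async with sem:
            return await fake_fetch(i)

    # gather returns results in the order the coroutines were passed in.
    return await asyncio.gather(*(bounded(i) for i in range(count)))

start = time.perf_counter()
results = asyncio.run(fetch_all(1000))
elapsed = time.perf_counter() - start
print(f"{len(results)} requests in {elapsed:.2f}s")
```

All 1,000 coroutines share a single thread. The semaphore is a design choice rather than a requirement: it keeps the client from opening every connection at once, which matters more against a real server than against `asyncio.sleep`.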

CPU-bound: multiprocessing

Each Process gets its own interpreter and its own GIL, so every core works at once:

from concurrent.futures import ProcessPoolExecutor

def heavy(n):
    return sum(i ** 2 for i in range(n))

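if __name__ == "__main__":  # guard needed where workers are spawned (Windows, macOS)
    with ProcessPoolExecutor() as pool:
        results = list(pool.map(heavy, [5_000_000] * 4))
    # up to ~4x faster on a 4-core machine, minus process start-up overhead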

The exception worth knowing: NumPy releases the GIL inside many of its C routines, so np.dot on big arrays can run in parallel even from threads, because the heavy work happens below the interpreter.

Decision table

Workload	Tool	Why
network / DB / file I/O	`threading`, `asyncio`	GIL released during waits
numerical computation	`multiprocessing`, NumPy	true parallel; GIL bypassed
mixed pipeline	`ProcessPoolExecutor` for heavy stages, threads within	separate by bottleneck
shell commands	`subprocess`	OS-managed, no GIL concern
