datarekha
Python | Medium | Asked at: Netflix, Uber, Palantir, Snowflake

What is the difference between CPU-bound and I/O-bound work, and how does the choice affect concurrency strategy in Python?

The short answer

CPU-bound work keeps the processor busy the whole time: matrix multiplication, compression, parsing. I/O-bound work spends most of its time waiting for a slow external resource: network, disk, database. The distinction directly determines which concurrency primitive to reach for. Use multiprocessing for CPU-bound work, because each process has its own interpreter and its own GIL. Use threading or asyncio for I/O-bound work: a thread releases the GIL while it waits, and asyncio lets a single thread juggle many waits at once.

How to think about it

Diagnosing the bottleneck first

Before picking a tool, profile. Sustained CPU usage near 100% of a core with little idle time points to CPU-bound work. A process that spends most of its time sleeping, waiting on sockets or disk, is I/O-bound. Getting this diagnosis wrong is expensive: adding threads to a CPU-bound loop gives no speedup in CPython and often makes it slower, because the threads contend for the GIL.

top        # look at %CPU column
iostat 1   # disk I/O rate

In Python:

import cProfile
cProfile.run("my_function()", sort="cumulative")
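A quick complement to the profiler is to compare wall-clock time with CPU time for a single call. The sketch below is only a rough heuristic for single-threaded code: `classify`, `busy`, `waiting`, and the 0.5 threshold are hypothetical names and values chosen for illustration, and `time.sleep` stands in for a real network or disk wait.

```python
import time

def classify(fn, *args, threshold=0.5):
    """Label a call by the share of wall-clock time spent on the CPU."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()   # CPU time of this process; excludes sleeping
    fn(*args)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    ratio = cpu / wall if wall > 0 else 0.0
    label = "CPU-bound" if ratio >= threshold else "I/O-bound"
    return label, ratio

def busy(n):
    # Pure computation: the processor is working the whole time.
    return sum(i * i for i in range(n))

def waiting(seconds):
    # Stand-in for a blocking network or disk wait.
    time.sleep(seconds)

print(classify(busy, 2_000_000))
print(classify(waiting, 0.3))
```

A ratio close to 1 suggests the call is limited by computation; a ratio close to 0 suggests it is mostly waiting. Real workloads often land in between, which is a hint that the stages should be measured separately.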

The GIL is the key

CPython's Global Interpreter Lock (GIL) means only one thread executes Python bytecode at a time. For CPU-bound work, threads cannot run in parallel: the GIL serialises them. For I/O-bound work, the GIL is released during blocking system calls (network read, disk write), so threads genuinely overlap their waiting.
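The serialising effect is easy to observe. The following is a minimal sketch, assuming a standard GIL-enabled CPython build; `count_down`, `timed`, `N`, and `TASKS` are hypothetical names and sizes picked for illustration. It runs the same pure-Python loop sequentially and then across threads and prints both timings, which are machine-dependent.

```python
import time
from concurrent.futures import ThreadPoolExecutor

N = 1_000_000   # iterations per task (illustrative size)
TASKS = 4

def count_down(n):
    # Pure-Python loop: the running thread needs the GIL for every iteration.
    while n > 0:
        n -= 1
    return n

def timed(label, fn):
    start = time.perf_counter()
    result = fn()
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.2f}s")
    return result, elapsed

def run_sequential():
    return [count_down(N) for _ in range(TASKS)]

def run_threaded():
    with ThreadPoolExecutor(max_workers=TASKS) as pool:
        return list(pool.map(count_down, [N] * TASKS))

seq_result, seq_time = timed("sequential", run_sequential)
thr_result, thr_time = timed("threaded", run_threaded)
```

On a GIL-enabled interpreter the threaded run is expected to take about as long as the sequential one, sometimes longer, even on a multi-core machine. Both runs return the same results; only the scheduling differs.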

I/O-bound: threading or asyncio

The GIL is released during blocking syscalls. Multiple threads can overlap their wait times even though they share one interpreter:

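from concurrent.futures import ThreadPoolExecutor
import requests

def fetch(url):
    # Set a timeout so one stalled server cannot hang a worker thread.
    return requests.get(url, timeout=10).status_code

urls = ["https://httpbin.org/delay/1"] * 8

# Each thread blocks on its own socket, so the waits overlap.
with ThreadPoolExecutor(max_workers=8) as pool:
    codes = list(pool.map(fetch, urls))
# roughly 1 second total instead of 8, network permitting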

For very high fan-out (thousands of concurrent requests), asyncio with an async HTTP library usually needs less memory and scheduling overhead than threads, because each pending request is a small coroutine object rather than an OS thread with its own stack.
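The shape of the asyncio approach can be sketched without any network access. In this illustration `fake_fetch` is a hypothetical stand-in for an async HTTP call and `asyncio.sleep` simulates the wait; a real client library would replace both.

```python
import asyncio
import time

async def fake_fetch(i, delay=0.2):
    # Stand-in for awaiting a network response; yields control while "waiting".
    await asyncio.sleep(delay)
    return i * 2

async def main(n=100):
    start = time.perf_counter()
    # Schedule all coroutines on one thread; gather returns results in input order.
    results = await asyncio.gather(*(fake_fetch(i) for i in range(n)))
    return results, time.perf_counter() - start

results, elapsed = asyncio.run(main())
print(len(results), f"{elapsed:.2f}s")
```

All 100 simulated requests wait concurrently on a single thread, so the total time should stay close to one delay rather than the 20 seconds a serial loop would need.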

CPU-bound: multiprocessing

Each Process has its own Python interpreter and its own GIL, so all cores can work simultaneously:

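from concurrent.futures import ProcessPoolExecutor

def heavy(n):
    # Pure-Python arithmetic: holds the GIL for the whole call.
    return sum(i ** 2 for i in range(n))

# The guard is required where workers are started with "spawn" (Windows, macOS):
# each worker re-imports this module and must not re-create the pool.
if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:  # worker count defaults to the CPU count
        results = list(pool.map(heavy, [5_000_000] * 4))
    # up to ~4x speedup on a 4-core machine, minus process startup and pickling cost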

For numerical work, NumPy releases the GIL inside many of its C routines, so operations such as np.dot on large arrays can run in parallel across threads. The BLAS library behind np.dot may also use several cores on its own.

Decision table

| Workload | Tool | Why |
|---|---|---|
| Network / DB / file I/O | threading, asyncio | GIL released during waits |
| Numerical computation | multiprocessing, NumPy | True parallel; GIL bypassed |
| Mixed pipeline | ProcessPoolExecutor for heavy stages, threads within | Separate by bottleneck |
| Shell commands | subprocess | OS-managed, no GIL concern |
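The subprocess row can be illustrated with a short sketch. Here `run_child` is a hypothetical helper, and the child command is just another Python interpreter so that the example stays portable; any external program could take its place. The parent's threads only wait on the children, which is I/O-bound from the parent's point of view, while each child computes in its own process.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

def run_child(n):
    # Each child is a separate OS process with its own interpreter and GIL.
    code = f"print(sum(i * i for i in range({n})))"
    done = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,      # raise CalledProcessError if the child exits non-zero
    )
    return int(done.stdout)

# Threads are enough here: they block in subprocess.run while the children work.
with ThreadPoolExecutor(max_workers=3) as pool:
    totals = list(pool.map(run_child, [10, 100, 1000]))

print(totals)  # [285, 328350, 332833500]
```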