datarekha
Python · Medium · Asked at Uber, Netflix, Airbnb, Amazon

When should you use threading versus multiprocessing in Python?

The short answer

Use threading for I/O-bound work — network calls, file reads, database queries — because threads release the GIL during blocking syscalls and share memory cheaply. Use multiprocessing for CPU-bound work — number crunching, image processing — because each process gets its own GIL and can run on a separate core.

How to think about it

The one question that settles it

Before reaching for either module, ask: is the bottleneck waiting for something external, or computing something hard?

  • Waiting (network, disk, database) → threading
  • Computing (matrix math, parsing, image processing) → multiprocessing

The reason comes down to the GIL. Threads share one GIL and take turns on bytecode — perfect for waiting in parallel, useless for true CPU parallelism. Processes each have their own GIL and run on separate cores.
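To make "waiting in parallel" concrete, here is a minimal sketch in which time.sleep stands in for a blocking network or disk call. The helper names (wait_for_io, run_in_threads) and the 0.2 s delay are illustrative assumptions, not part of any library. Because a sleeping thread does not hold the GIL, four waits should overlap rather than add up.

```python
import threading
import time

def wait_for_io(seconds):
    # time.sleep stands in for a blocking call; the GIL is released while it waits.
    time.sleep(seconds)

def run_in_threads(n_threads, seconds):
    """Start n_threads waiting threads and return the total wall-clock time."""
    threads = [
        threading.Thread(target=wait_for_io, args=(seconds,))
        for _ in range(n_threads)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

# Four threads each wait 0.2 s; run one after another that would be about 0.8 s.
elapsed = run_in_threads(4, 0.2)
print(f"4 threads x 0.2 s of waiting took {elapsed:.2f} s")
```

On a typical machine the printed time should be close to a single wait rather than four. Replacing the sleep with a pure-Python loop would be expected to remove that benefit, since the threads would then compete for the GIL.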

Threading — right for I/O-bound work

The GIL is released whenever a thread enters a blocking syscall (socket read, file write, time.sleep). While Thread A waits on a network response, Thread B runs Python bytecode. Threads share the same process memory, so passing data between them is zero-copy — but that means you need a threading.Lock or a queue when mutating shared state.

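from concurrent.futures import ThreadPoolExecutor
import urllib.request

urls = ["https://example.com"] * 10

def fetch(url):
    # The GIL is released while this thread blocks on the network.
    with urllib.request.urlopen(url, timeout=10) as r:
        return len(r.read())

# Ten workers, so all ten requests can be in flight at once.
with ThreadPoolExecutor(max_workers=10) as ex:
    results = list(ex.map(fetch, urls))  # results come back in input order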
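The point about guarding shared state can be sketched as follows. The Counter class is a hypothetical helper written for this illustration. An increment is a read-modify-write sequence, so two threads can interleave between the read and the write unless a threading.Lock serialises them.

```python
import threading

class Counter:
    """A counter that is safe to increment from several threads."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Hold the lock for the whole read-modify-write step.
        with self._lock:
            self.value += 1

def add_many(counter, times):
    for _ in range(times):
        counter.increment()

counter = Counter()
threads = [
    threading.Thread(target=add_many, args=(counter, 10_000))
    for _ in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter.value)  # 4 threads x 10_000 increments = 40000
```

For handing work items between threads, queue.Queue is often the simpler choice because it does its own locking internally.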

Multiprocessing — right for CPU-bound work

Each Process is a separate OS process with its own interpreter and GIL. Python pickles arguments and return values to move them between processes, so data transfer has overhead, but the processes can run in true parallel on separate cores. The ProcessPoolExecutor API mirrors ThreadPoolExecutor, making it easy to switch.

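from concurrent.futures import ProcessPoolExecutor

def crunch(n):
    # Pure-Python arithmetic: CPU-bound, so threads would serialise on the GIL.
    return sum(i * i for i in range(n))

# The guard is required where workers are started with "spawn" (Windows, macOS):
# each worker re-imports this module and must not create a pool of its own.
if __name__ == "__main__":
    with ProcessPoolExecutor() as ex:  # by default, about one worker per CPU
        results = list(ex.map(crunch, [10_000_000] * 4))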
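The serialisation cost mentioned above also changes semantics: a worker process receives a copy of each argument, not the original object. The sketch below imitates that hand-off with a plain pickle round trip and starts no processes, so it only approximates what the pool does internally. The names payload, received and square are illustrative.

```python
import pickle

payload = {"n": 1000, "tags": ["a", "b"]}

# Roughly what happens to an argument on its way to a worker process:
# it is serialised to bytes on one side and rebuilt on the other.
received = pickle.loads(pickle.dumps(payload))

# Mutating the rebuilt copy leaves the original untouched.
received["tags"].append("c")
print(payload["tags"])   # ['a', 'b']
print(received["tags"])  # ['a', 'b', 'c']

# Functions are pickled by name, so a lambda cannot be sent to a worker.
square = lambda x: x * x
try:
    pickle.dumps(square)
    lambda_picklable = True
except (pickle.PicklingError, AttributeError):
    lambda_picklable = False
print(lambda_picklable)  # False
```

Two practical consequences follow: functions submitted to a process pool should be defined at module level, and large arguments are worth keeping small or loading inside the worker, since every call pays for the copy.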

Quick reference

Dimension       | threading                         | multiprocessing
GIL             | Shared: serialises CPU            | Separate: true parallel
Memory          | Shared address space              | Copied / serialised
Overhead        | Low (OS threads)                  | Higher (process fork + pickle)
Best for        | I/O-bound                         | CPU-bound
Crash isolation | None: one thread can corrupt all  | Strong: process boundaries