What is Python's GIL and why does it exist?
The Global Interpreter Lock is a mutex inside CPython that allows only one thread to execute Python bytecode at a time. It exists because CPython's memory management — reference counting — is not thread-safe, so the GIL prevents race conditions on object reference counts without per-object locking overhead.
How to frame this answer
Start with why the GIL exists — it’s not an accident or oversight. Then explain what it does and doesn’t block. Finishing with the practical consequence (use multiprocessing for CPU-bound work, not threads) shows you understand the real impact.
Why the GIL exists
CPython tracks every object’s live references with an integer (ob_refcnt). When a thread decrements a count to zero, the object is freed. Without synchronisation, two threads decrementing simultaneously can corrupt that counter — causing use-after-free bugs or memory leaks.
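The counter described above can be observed from Python itself. The following is a minimal sketch that uses sys.getrefcount to show the count moving as names are bound and unbound; the helper name refcount_deltas is hypothetical, and the absolute numbers vary by CPython version, so only the differences are reported.

```python
import sys

def refcount_deltas():
    """Return how a fresh object's reference count changes as names bind to it."""
    obj = []                            # a new list, referenced by the local name `obj`
    base = sys.getrefcount(obj)         # baseline (may include a temporary reference from the call)
    alias = obj                         # bind a second name to the same list
    with_alias = sys.getrefcount(obj)
    del alias                           # drop the second reference again
    after_del = sys.getrefcount(obj)
    return with_alias - base, after_del - base

added, restored = refcount_deltas()
print(added, restored)
```

Each of those increments and decrements is a plain read-modify-write on ob_refcnt, which is the operation the GIL keeps two threads from interleaving.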
Rather than adding a lock to every single Python object (expensive and complex), CPython takes one process-wide lock: the GIL. It’s a pragmatic tradeoff: simpler implementation, safe reference counting, fast single-threaded performance.
The GIL is a CPython implementation detail, not a Python language requirement. Jython and PyPy-STM don’t have one. Python 3.13 introduced an experimental free-threaded build (compiled with the --disable-gil configure option) as the first step toward removing it.
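To check which kind of interpreter is running, a small sketch like the one below can help. It relies on the Py_GIL_DISABLED build variable and on sys._is_gil_enabled(), which is available on CPython 3.13 and later; the helper name gil_status and the fallback of assuming the GIL is enabled when that function is missing are assumptions made for illustration.

```python
import sys
import sysconfig

def gil_status():
    """Return (free_threaded_build, gil_enabled_now) for the running interpreter."""
    # Py_GIL_DISABLED is 1 on free-threaded builds, 0 or None otherwise.
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # sys._is_gil_enabled exists on CPython 3.13+; on older versions we
    # assume (for this sketch) that the GIL is present and enabled.
    check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = bool(check()) if check is not None else True
    return free_threaded_build, gil_enabled

free_threaded, gil_enabled = gil_status()
print(f"free-threaded build: {free_threaded}, GIL enabled: {gil_enabled}")
```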
What the GIL actually does
By default, a thread holding the GIL is asked to give it up after 5 ms (the switch interval, readable with sys.getswitchinterval()) if another thread is waiting, so other threads get a turn. The GIL is also released voluntarily around blocking calls such as file I/O, network I/O, and time.sleep, and by many C extensions while they do long-running work, including NumPy’s array operations.
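One way to see the voluntary release is to let several threads block at once. This is a rough sketch, with a hypothetical helper timed_sleepers and arbitrary values for the thread count and delay: if time.sleep held the GIL, four 0.2-second sleeps would take about 0.8 seconds in total, but because the GIL is released while sleeping they overlap.

```python
import threading
import time

def timed_sleepers(n_threads=4, delay=0.2):
    """Start n_threads threads that each sleep for `delay` seconds; return wall-clock time."""
    threads = [
        threading.Thread(target=time.sleep, args=(delay,))
        for _ in range(n_threads)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()                         # wait until every sleeper has finished
    return time.perf_counter() - start

elapsed = timed_sleepers()
print(f"4 threads x 0.2 s sleep took {elapsed:.2f} s")
```

The measured time should land close to a single delay rather than the sum of all four, though the exact figure depends on the machine and its load.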
What the GIL does NOT block
- C extensions that release the GIL: NumPy’s array operations, hashlib, most I/O libraries
- asyncio: single-threaded by design, no GIL contention
- Multiple processes: each has its own Python interpreter and its own GIL
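As an illustration of the first item, the sketch below hashes several large buffers from a thread pool. The hashlib documentation states that the GIL is released while hashing data larger than 2047 bytes, so these calls can overlap; the buffer size, worker count, and helper name sha256_hex are arbitrary choices for this example, and any speedup depends on the machine.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

def sha256_hex(data):
    """Hash one buffer; hashlib can drop the GIL while it processes large inputs."""
    return hashlib.sha256(data).hexdigest()

# Four 4 MiB buffers, each filled with a different byte value.
buffers = [bytes([i]) * (4 * 1024 * 1024) for i in range(4)]

with ThreadPoolExecutor(max_workers=4) as pool:
    threaded = list(pool.map(sha256_hex, buffers))   # results keep input order

serial = [sha256_hex(b) for b in buffers]
print(threaded == serial)
```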
import sys
print(sys.getswitchinterval()) # 0.005 — 5 ms default quantum
The practical takeaway
Threads are excellent for I/O-bound work — while one thread waits on a network call, the others run. For CPU-bound work (pure Python computation), the GIL means adding threads typically makes things slower, not faster. Reach for multiprocessing or a compiled extension (Cython, NumPy) when you need CPU parallelism.
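A quick way to check this on your own machine is to time the same pure-Python work serially and in a thread pool. The sketch below is only illustrative: count_primes and compare are hypothetical helpers, the workload size is arbitrary, and no particular timing is guaranteed.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def count_primes(limit):
    """Pure-Python CPU-bound work: count primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count

def compare(limits):
    """Run count_primes over `limits` serially and in threads; return results and timings."""
    start = time.perf_counter()
    serial = [count_primes(x) for x in limits]
    serial_time = time.perf_counter() - start

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(limits)) as pool:
        threaded = list(pool.map(count_primes, limits))
    threaded_time = time.perf_counter() - start
    return serial, threaded, serial_time, threaded_time

serial, threaded, serial_time, threaded_time = compare([20_000] * 4)
print(f"serial: {serial_time:.3f} s, threaded: {threaded_time:.3f} s")
```

On a standard CPython build with the GIL, the threaded run is usually no faster than the serial one. Swapping ThreadPoolExecutor for concurrent.futures.ProcessPoolExecutor, with the calling code placed under an if __name__ == "__main__": guard, is the usual way to spread this kind of work across cores.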