Python · Medium · Asked at Google, Meta, Amazon, Stripe

What is Python's GIL and why does it exist?

For: Data Scientist, ML Engineer, Data Engineer, AI / LLM Engineer

The short answer

The Global Interpreter Lock is a mutex inside CPython that allows only one thread to execute Python bytecode at a time. It exists because CPython's memory management — reference counting — is not thread-safe, so the GIL prevents race conditions on object reference counts without per-object locking overhead.

How to think about it

Open with why the GIL exists, because the most common mistake is treating it as an oversight rather than a deliberate trade-off. Then say what it does and doesn’t block, and close on the practical consequence — that arc shows you understand the real impact, not just the trivia.

Why it exists

CPython tracks every object’s live references with a plain integer counter (ob_refcnt); when a thread drops that count to zero, the object is freed. Let two threads decrement the same counter at once without synchronisation and it corrupts — you get use-after-free crashes or leaked memory. The fix could have been a lock on every single object, but that’s expensive and intricate. So CPython took one process-wide lock instead — the Global Interpreter Lock — which makes reference counting safe and keeps single-threaded code fast, at the cost of letting only one thread run Python bytecode at a time.
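The counter described above can be observed from Python itself. The following is a minimal sketch, assuming a CPython interpreter; the names obj and alias are only illustrative, and the absolute numbers vary by version, so only the differences matter.

```python
import sys

obj = object()
# getrefcount reports the current count; the call itself holds a
# temporary reference, so only compare values, not absolute numbers.
before = sys.getrefcount(obj)

alias = obj                      # binding a second name adds one reference
after_bind = sys.getrefcount(obj)

del alias                        # dropping that name removes it again
after_del = sys.getrefcount(obj)

print(before, after_bind, after_del)
```

Every one of those increments and decrements is a read-modify-write on the same integer, which is the operation the GIL protects from concurrent threads.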

Crucially, the GIL is a CPython implementation detail, not a rule of the Python language. Jython doesn’t have one, and CPython 3.13 shipped an experimental free-threaded build (compiled with the --disable-gil configure option) as the first real step toward removing it.
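To check which kind of interpreter is running, something like the sketch below may help. It assumes CPython: the Py_GIL_DISABLED build variable and sys._is_gil_enabled() only exist on 3.13 and later, so the helper (gil_status is a hypothetical name) falls back to "GIL enabled" when they are missing.

```python
import sys
import sysconfig

def gil_status():
    """Return (built_free_threaded, gil_enabled_now) for this interpreter."""
    # Py_GIL_DISABLED is 1 on free-threaded builds, 0 or None otherwise.
    built_free_threaded = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # sys._is_gil_enabled() was added in 3.13; assume a GIL on older versions.
    checker = getattr(sys, "_is_gil_enabled", None)
    gil_enabled_now = bool(checker()) if checker is not None else True
    return built_free_threaded, gil_enabled_now

print(gil_status())
```

The two values can differ, since a free-threaded build may still turn the GIL back on at runtime.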

What it actually does

When another thread is waiting, CPython asks the running thread to give up the GIL after about 5 milliseconds (the switch interval reported by sys.getswitchinterval()), so threads take turns. A thread also releases the GIL voluntarily when it enters a blocking call such as file or network I/O or time.sleep, and C extensions can release it around long-running native code, as NumPy does for many array operations.
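The release around blocking calls can be seen with a small timing sketch. It uses time.sleep as a stand-in for real I/O; blocking_task and run_in_threads are hypothetical helper names, and the exact timing depends on the machine.

```python
import threading
import time

def blocking_task(seconds):
    # time.sleep releases the GIL while the thread is blocked.
    time.sleep(seconds)

def run_in_threads(count, seconds):
    """Start `count` sleeping threads and return the total wall-clock time."""
    threads = [
        threading.Thread(target=blocking_task, args=(seconds,))
        for _ in range(count)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

elapsed = run_in_threads(4, 0.2)
print(f"4 threads x 0.2 s sleep took {elapsed:.2f} s")
```

If the GIL were held during the sleep, the four waits would run back to back and take about 0.8 s; because it is released, they overlap.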

Threads share one GIL, so only one of them runs Python bytecode at any moment. Processes each have their own interpreter and run on separate cores in true parallel.

So what it does not block is just as important:

C extensions that release it: NumPy array math, hashlib on large inputs, most I/O libraries;
asyncio: single-threaded by design, so there’s no GIL contention to begin with;
separate processes: each has its own interpreter and its own GIL.

import sys
print(sys.getswitchinterval())   # 0.005 — the 5 ms default quantum
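The asyncio point in the list above can be checked directly. This is a small sketch, assuming the default event loop; worker and main are illustrative names. Each coroutine reports the thread it ran on, and all of them report the same one.

```python
import asyncio
import threading

async def worker(delay):
    # Awaiting yields control to the event loop; no other thread is involved.
    await asyncio.sleep(delay)
    return threading.get_ident()

async def main():
    # Run three coroutines concurrently on the same event loop.
    return await asyncio.gather(*(worker(0.01) for _ in range(3)))

thread_ids = asyncio.run(main())
print(set(thread_ids) == {threading.get_ident()})
```

Concurrency here comes from cooperative switching at each await, not from multiple threads competing for the lock.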

The practical takeaway

Threads shine on I/O-bound work: while one waits on a network call, the GIL is free and another runs. On CPU-bound pure-Python work, though, adding threads gives no speedup and often makes things slower, because the threads contend for the one lock and pay context-switch overhead on top. For real CPU parallelism, reach for multiprocessing (a separate interpreter and GIL per process) or for compiled code that releases the GIL, such as NumPy routines or a Cython extension.
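A rough way to see the CPU-bound case for yourself is to time the same pure-Python loop sequentially and in threads. This is only a sketch: count_down, timed, run_sequential and run_threaded are hypothetical helpers, and the numbers depend on the machine and on whether the build has a GIL.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def count_down(n):
    # Pure-Python CPU work: the thread holds the GIL while it loops.
    while n > 0:
        n -= 1
    return n

def timed(fn):
    """Return the wall-clock seconds taken by calling fn()."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start

def run_sequential(jobs, n):
    for _ in range(jobs):
        count_down(n)

def run_threaded(jobs, n):
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        list(pool.map(count_down, [n] * jobs))

sequential = timed(lambda: run_sequential(4, 1_000_000))
threaded = timed(lambda: run_threaded(4, 1_000_000))
print(f"sequential: {sequential:.3f} s, threaded: {threaded:.3f} s")
```

On a GIL build the threaded time is typically no better than the sequential one. Swapping ThreadPoolExecutor for concurrent.futures.ProcessPoolExecutor (with the calls placed under an if __name__ == "__main__": guard) is the usual way to spread the same work across cores.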
