Python Concurrency: Threading, Multiprocessing, and Async Explained

A practical guide to Python's concurrency tools—threading for I/O, multiprocessing for CPU-bound tasks, and async for high-throughput—with a decision tree to choose the right approach.

June 2026 · 7 min read

Python and the Art of Doing Many Things at Once

For decades, Python developers had a dirty little secret: the Global Interpreter Lock. The GIL meant that no matter how many cores your shiny new CPU threw at you, your Python code would stubbornly refuse to run more than one thread at a time. But times have changed. Python's approach to concurrency and parallelism is richer, more nuanced, and frankly more powerful than most developers realize.

The GIL: Your Frenemy

Let's address the elephant in the room. The Global Interpreter Lock (GIL) is a mutex that protects access to Python objects, preventing multiple threads from executing Python bytecode at the same time. It exists because CPython's memory management, reference counting in particular, isn't thread-safe, and a single global lock was a simpler solution than rewriting the entire CPython runtime.

The good news: The GIL is an implementation detail of CPython (PyPy has one too), not part of the language. Implementations like Jython and IronPython don't have it. And for I/O-bound tasks, the GIL is barely an issue, because your threads release it while waiting for network responses or disk reads.

The bad news: CPU-bound tasks die here. Four threads calculating prime numbers will run at roughly the speed of one, with overhead from context switching.
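A rough way to check this claim on your own machine is sketched below. It counts primes in four ranges, first sequentially and then with four threads. The function names and range sizes are illustrative assumptions, not part of any library. On a standard (GIL-enabled) CPython build the two timings usually land close together, though the exact numbers depend on your hardware and interpreter.

```python
import threading
import time


def count_primes(lo, hi):
    """Count primes in [lo, hi) by trial division (deliberately CPU-heavy)."""
    count = 0
    for n in range(max(lo, 2), hi):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def run_sequential(ranges):
    return [count_primes(lo, hi) for lo, hi in ranges]


def run_threaded(ranges):
    results = [None] * len(ranges)

    def worker(index, lo, hi):
        # Each thread writes to its own slot, so no lock is needed here.
        results[index] = count_primes(lo, hi)

    threads = [
        threading.Thread(target=worker, args=(i, lo, hi))
        for i, (lo, hi) in enumerate(ranges)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    ranges = [(i * 25_000, (i + 1) * 25_000) for i in range(4)]
    for label, runner in [("sequential", run_sequential), ("4 threads", run_threaded)]:
        start = time.perf_counter()
        runner(ranges)
        print(f"{label}: {time.perf_counter() - start:.2f}s")
```

Both runners return the same counts; only the wall-clock time is of interest here.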

Threading: I/O's Best Friend

Python's threading module wraps around OS-level threads. They're real threads, but they share the Python interpreter—and that GIL.

import threading
import time

def download_url(url):
    print(f"Starting download: {url}")
    time.sleep(2)  # Simulated network I/O
    print(f"Finished: {url}")

threads = []
for url in ["https://site1.com", "https://site2.com", "https://site3.com"]:
    t = threading.Thread(target=download_url, args=(url,))
    threads.append(t)
    t.start()

for t in threads:
    t.join()

This runs in about 2 seconds total, not 6. Why? Because each thread releases the GIL during the sleep() call, allowing other threads to run in the meantime.
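To confirm the timing rather than take it on faith, you can wrap the same pattern in a small timer. This is a minimal sketch: fake_download and timed_threads are hypothetical helper names, and the sleep stands in for real network I/O.

```python
import threading
import time


def fake_download(delay):
    time.sleep(delay)  # stands in for blocking network I/O


def timed_threads(delays):
    """Start one thread per delay and return the wall-clock time for all of them."""
    start = time.perf_counter()
    threads = [threading.Thread(target=fake_download, args=(d,)) for d in delays]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    delays = [0.5, 0.5, 0.5]
    elapsed = timed_threads(delays)
    print(f"sum of delays: {sum(delays):.1f}s, elapsed: {elapsed:.2f}s")
```

The elapsed time should sit near the longest single delay, not the sum of all of them.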

Where threading shines:
- Web scraping multiple sites
- Handling many simultaneous database queries
- Serving web requests (frameworks like Flask use threads)

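Where it doesn't:
- Image processing written in pure Python
- Any heavy number crunching in pure Python loops

One exception: libraries such as NumPy release the GIL inside many of their compiled routines, so threads can still help when most of the time is spent in those calls.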

Multiprocessing: Bypassing the GIL Entirely

When you need real parallelism—multiple cores working simultaneously—multiprocessing is your answer. It spawns separate Python processes, each with its own GIL and memory space.

from multiprocessing import Pool

def calculate_pi(iterations):
    # Simulated heavy computation
    total = 0
    for i in range(iterations):
        total += (-1)**i / (2*i + 1)
    return 4 * total

# The guard is required on platforms that start workers by re-importing this module.
if __name__ == "__main__":
    with Pool(processes=4) as pool:
        results = pool.map(calculate_pi, [1_000_000] * 4)

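This spreads the work across four worker processes, so it can use up to four cores. No GIL fights. But there's a catch: communication between processes is expensive. You can't just share variables; arguments and results are pickled and copied between processes, and anything else needs queues, pipes, or special shared memory objects.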
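As a minimal sketch of what "you need queues" looks like in practice, the example below has each child process send its result back to the parent through a multiprocessing.Queue. The names partial_sum, worker, and parallel_sum are hypothetical, chosen only for illustration.

```python
from multiprocessing import Process, Queue


def partial_sum(lo, hi):
    return sum(range(lo, hi))


def worker(lo, hi, out_queue):
    # Runs in a child process with its own memory; the queue is the way back.
    out_queue.put((lo, hi, partial_sum(lo, hi)))


def parallel_sum(bounds):
    out_queue = Queue()
    procs = [Process(target=worker, args=(lo, hi, out_queue)) for lo, hi in bounds]
    for p in procs:
        p.start()
    # Drain the queue before joining so no child blocks on a full pipe.
    results = [out_queue.get() for _ in procs]
    for p in procs:
        p.join()
    return sum(total for _, _, total in results)


if __name__ == "__main__":
    bounds = [(0, 250_000), (250_000, 500_000), (500_000, 750_000), (750_000, 1_000_000)]
    print(parallel_sum(bounds))
```

Everything placed on the queue is pickled in the child and unpickled in the parent, which is where the communication cost comes from.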

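Process vs. thread overhead: Each process carries its own Python interpreter. Starting a process typically takes milliseconds to tens of milliseconds, depending on the platform and start method, versus well under a millisecond for a thread. For large, long-running tasks, this is negligible. For thousands of tiny tasks, it's a dealbreaker unless you reuse a pool of workers and hand them work in batches.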

Async IO: Concurrency in a Single Thread

asyncio takes a different approach: cooperative multitasking in a single thread. Your code runs until it says "I'm waiting, do something else," at which point the event loop switches to another task.

import asyncio

async def fetch_data(url):
    print(f"Requesting {url}")
    await asyncio.sleep(1)  # Pretend this is an HTTP request
    return f"Data from {url}"

async def main():
    tasks = [fetch_data(f"https://api{i}.com") for i in range(5)]
    results = await asyncio.gather(*tasks)
    return results

asyncio.run(main())

This completes in about 1 second. No threads. No processes. Pure event-driven magic.
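To make the cooperative switching visible, here is a small sketch in which two coroutines append to a shared log and yield to the event loop after every step. The worker and interleave names are hypothetical. On the default asyncio event loop the log alternates between the two workers, because each await hands control back and the other task runs next.

```python
import asyncio


async def worker(name, steps, log):
    for i in range(steps):
        log.append(f"{name}{i}")
        # Hand control back to the event loop so other tasks can run.
        await asyncio.sleep(0)


async def interleave():
    log = []
    await asyncio.gather(worker("A", 3, log), worker("B", 3, log))
    return log


if __name__ == "__main__":
    print(asyncio.run(interleave()))
```

No lock is needed around the list: only one coroutine runs at a time, and switches happen only at await points.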

The catch: Async code needs to be async all the way down. A synchronous time.sleep() inside a coroutine blocks the entire event loop, so nothing else runs until it returns; use await asyncio.sleep() instead. This "function coloring" problem means your codebase is either all-in on async or split into async and sync parts.
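When a blocking call can't be avoided, for example a synchronous library with no async equivalent, one common escape hatch is asyncio.to_thread (available in Python 3.9 and later), which runs the call in a worker thread. The sketch below assumes a hypothetical blocking_lookup function standing in for such a library call.

```python
import asyncio
import time


def blocking_lookup(key):
    time.sleep(0.2)  # stands in for a synchronous library call
    return key.upper()


async def lookup_all(keys):
    # Each call runs in a worker thread, so the event loop stays responsive.
    return await asyncio.gather(*(asyncio.to_thread(blocking_lookup, k) for k in keys))


if __name__ == "__main__":
    start = time.perf_counter()
    print(asyncio.run(lookup_all(["a", "b", "c"])))
    print(f"elapsed: {time.perf_counter() - start:.2f}s")
```

This keeps the sync code on one side of the boundary and the async code on the other, at the cost of using threads again for that portion of the work.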

Choosing Your Weapon

Here's a rough decision tree:

I/O-bound task, many simultaneous connections (web scraping, API calls) → asyncio for maximum throughput, or threading if you need simpler code
I/O-bound but simple (reading a few files) → Just use synchronous code. Don't overengineer.
CPU-bound, compute-intensive (image processing, machine learning) → multiprocessing for pure Python, or use libraries like NumPy that release the GIL inside their compiled code
Mixed workload (server that handles requests and does computation) → Use concurrent.futures with a ThreadPoolExecutor for I/O and ProcessPoolExecutor for CPU tasks
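The tree above can be written down as a small function, which is sometimes handy as a checklist in code review. This is only a sketch of the heuristic as stated here: choose_tool and its parameters are hypothetical, and real decisions should rest on measurement.

```python
def choose_tool(workload, many_connections=False, prefer_simple_code=False):
    """Suggest a starting point for a workload of type 'io', 'cpu', or 'mixed'."""
    if workload == "cpu":
        return "multiprocessing"
    if workload == "mixed":
        return "concurrent.futures"
    if workload == "io":
        if not many_connections:
            return "synchronous code"
        return "threading" if prefer_simple_code else "asyncio"
    raise ValueError(f"unknown workload type: {workload!r}")


if __name__ == "__main__":
    print(choose_tool("io", many_connections=True))
    print(choose_tool("cpu"))
```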

The Unsung Hero: concurrent.futures

This module provides a clean abstraction over both threads and processes. You write your function once, then decide whether to run it with threads or processes:

from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def work(item):
    # Could be I/O or CPU bound
    return item * 2

# Try with threads first
with ThreadPoolExecutor(max_workers=4) as executor:
    results = list(executor.map(work, range(10)))

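# If threads are slow (GIL contention), switch to processes.
# Guard the entry point so worker processes can import this module safely.
if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=4) as executor:
        results = list(executor.map(work, range(10)))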
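executor.map is convenient, but it re-raises the first exception when you reach the failing item, so results after that point are never retrieved. When individual items may fail, submit combined with as_completed lets you handle each outcome separately. A minimal sketch, with risky_work and run_all as hypothetical names:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def risky_work(item):
    if item == 3:
        raise ValueError(f"cannot process {item}")
    return item * 2


def run_all(items, max_workers=4):
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Map each future back to its input so failures can be attributed.
        future_to_item = {executor.submit(risky_work, item): item for item in items}
        for future in as_completed(future_to_item):
            item = future_to_item[future]
            try:
                results[item] = future.result()
            except ValueError as exc:
                errors[item] = str(exc)
    return results, errors


if __name__ == "__main__":
    ok, failed = run_all(range(5))
    print("results:", ok)
    print("errors:", failed)
```

The same pattern works with ProcessPoolExecutor, provided the function and its arguments can be pickled.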

What's Emerging

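Python 3.12 gave each subinterpreter its own GIL (PEP 684), which makes subinterpreters lighter than processes while avoiding contention on a shared lock. At first this was only reachable from the C API; a Python-level module, concurrent.interpreters, arrived in Python 3.14. There is also a GIL-free CPython (the "nogil" project, PEP 703), which has shipped since CPython 3.13 as an optional free-threaded build and was marked experimental in that release.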
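If you want to know which kind of interpreter your code is running on, the sketch below checks for a free-threaded ("nogil") build. It relies on the Py_GIL_DISABLED build variable and on sys._is_gil_enabled(), which exists on Python 3.13 and later; the gil_status helper itself is a hypothetical name, and on older versions it simply reports a standard build.

```python
import sys
import sysconfig


def gil_status():
    """Describe whether this interpreter is running with the GIL."""
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    if not free_threaded_build:
        return "standard build (GIL always on)"
    # sys._is_gil_enabled() is available on Python 3.13+.
    check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = check() if check is not None else True
    return "free-threaded build, GIL " + ("enabled" if gil_enabled else "disabled")


if __name__ == "__main__":
    print(sys.version)
    print(gil_status())
```

On a free-threaded build the GIL can still be switched back on at runtime, which is why the build flag and the runtime check are reported separately.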

The bottom line: Concurrency in Python isn't about fighting the GIL. It's about picking the right tool for the kind of work you're doing: threads for a moderate number of blocking I/O calls, processes for CPU-bound computation, and async for very large numbers of concurrent I/O waits. And always, always measure before optimizing.
