Python

Threading, Multiprocessing, and Async: When to Use What in Python

Understand Python's three concurrency tools — threading, multiprocessing, and async — and learn which one fits your I/O-bound, CPU-heavy, or high-scale server tasks. Includes a decision matrix and real-world gotchas for production code.

June 2026 · 8 min read · 1 views · 0 hearts

Try in editor Tutorial catalog

Threading, Multiprocessing, and Async: When to Use What in Python

Python gives you three tools to handle concurrency, and choosing the wrong one can tank your performance. Let's cut through the confusion.

The GIL: Your Invisible Gatekeeper

Before we dive in, you need to understand the Global Interpreter Lock (GIL). It's Python's safety mechanism that ensures only one thread executes Python bytecode at a time. This means no matter how many threads you spawn, your Python code is never truly parallel on multiple CPU cores.

But here's the thing — not all tasks are equal. Some wait, others compute, and that distinction is everything.

Threading: Best for I/O-Bound Tasks

Threading shines when your program spends most of its time waiting — waiting for web requests, database queries, file reads, or API calls.

import threading
import requests
import time

def fetch_url(url):
    response = requests.get(url)
    print(f"Got {len(response.content)} bytes from {url}")

urls = ["https://example.com" for _ in range(10)]

# Sequential
start = time.time()
for url in urls:
    fetch_url(url)
print(f"Sequential: {time.time() - start:.2f}s")

# Threaded
start = time.time()
threads = []
for url in urls:
    t = threading.Thread(target=fetch_url, args=(url,))
    threads.append(t)
    t.start()
for t in threads:
    t.join()
print(f"Threaded: {time.time() - start:.2f}s")

The threaded version will be significantly faster because while one thread waits for a network response, the GIL is released, and other threads can run.

When to use threads: - Web scraping and API calls - File I/O operations - Database queries - Any task where the CPU sits idle waiting

Multiprocessing: For CPU-Intensive Work

When you need to crunch numbers, process images, or run heavy computations, multiprocessing is your answer. It bypasses the GIL entirely by spawning separate Python processes, each with its own memory space and GIL.

from multiprocessing import Pool
import time

def compute_heavy(n):
    # Simulate heavy calculation
    result = 0
    for i in range(n * 10_000_000):
        result += i
    return result

# Sequential
start = time.time()
results = [compute_heavy(i) for i in range(1, 5)]
print(f"Sequential: {time.time() - start:.2f}s")

# Parallel
start = time.time()
with Pool(processes=4) as pool:
    results = pool.map(compute_heavy, range(1, 5))
print(f"Parallel: {time.time() - start:.2f}s")

On a quad-core machine, you'll see roughly a 4x speedup with multiprocessing.

Caveats: - Higher memory overhead (each process duplicates memory) - More complex data sharing (need special objects like Queue or Pipe) - Startup cost is higher than threads

Async: Lightweight Concurrency for I/O

Async (with asyncio) is like threading's cooler, more efficient cousin. Instead of OS threads, it uses cooperative multitasking — your code yields control at await points.

import asyncio
import aiohttp
import time

async def fetch_url(session, url):
    async with session.get(url) as response:
        content = await response.read()
        print(f"Got {len(content)} bytes")

async def main():
    urls = ["https://example.com" for _ in range(10)]
    async with aiohttp.ClientSession() as session:
        tasks = [fetch_url(session, url) for url in urls]
        await asyncio.gather(*tasks)

start = time.time()
asyncio.run(main())
print(f"Async: {time.time() - start:.2f}s")

The huge advantage? Async can handle thousands of concurrent connections with a single thread. Threads have overhead — each one consumes about 8KB of memory on Linux, and context switching is expensive. Async tasks are lightweight coroutines that cost almost nothing.

When to reach for async: - High-concurrency network servers (think web frameworks like FastAPI) - Many simultaneous I/O operations - Real-time applications like chat servers - When you need maximum scalability with minimal resources

Quick Decision Matrix

Task Type	Best Tool	Why
CPU-heavy (math, image processing)	Multiprocessing	Bypasses GIL, uses all cores
Many I/O operations (web, files)	Async or Threading	Both wait efficiently
Simple script, low concurrency	Threading	Easier to write
High-scale server	Async	Handles 10K+ connections per thread
Need shared state between tasks	Threading	Easier than multiprocessing

The Gotchas Nobody Talks About

Threading gotcha: Python's threading.Thread doesn't support cancellation. Once started, that thread runs until completion. Async lets you cancel tasks with task.cancel().

Multiprocessing gotcha: Debugging multiprocessing code is miserable. Stack traces from child processes don't always propagate cleanly, and print statements might output in random order.

Async gotcha: Async code is "contagious" — once you call an async function, everything calling it needs to be async too. You can't easily mix async and sync code without extra work (like asyncio.run() or loop.run_in_executor()).

Real Talk: What Actually Happens in Production

In practice, most Python services use a hybrid approach. FastAPI uses async for request handling, then delegates CPU-intensive work to a thread pool or process pool via run_in_executor. Database drivers often use threading under the hood even when you write async code.

The secret? Match the tool to the bottleneck. Profile your code. If you're waiting 90% of the time, threading or async will save you. If you're computing 90% of the time, multiprocessing or switching to a different language (looking at you, Rust) might be better.

Start simple, measure everything, and only reach for complexity when the numbers tell you to.

Comments

Questions, corrections, and tips stay visible for everyone reading this page.

0 in thread

Join the discussion

No comments yet

Be the first to leave a note — it helps the next reader.