Threading, Multiprocessing, and Async: When to Use What in Python
Understand Python's three concurrency tools — threading, multiprocessing, and async — and learn which one fits your I/O-bound, CPU-heavy, or high-scale server tasks. Includes a decision matrix and real-world gotchas for production code.
June 2026 · 8 min read
Python gives you three tools to handle concurrency, and choosing the wrong one can tank your performance. Let's cut through the confusion.
The GIL: Your Invisible Gatekeeper
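Before we dive in, you need to understand the Global Interpreter Lock (GIL). It's a lock in CPython, the standard Python interpreter, that ensures only one thread executes Python bytecode at a time. This means that no matter how many threads you spawn, pure-Python code in a single process does not run in parallel across CPU cores. (The optional free-threaded builds available since Python 3.13 lift this restriction, but they are not the default.)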
But here's the thing — not all tasks are equal. Some wait, others compute, and that distinction is everything.
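To see the GIL's effect for yourself, here is a minimal sketch that runs the same pure-Python loop sequentially and then across threads. The function names (`count_down`, `run_sequential`, `run_threaded`) are made up for this illustration, and the timings will vary by machine and interpreter build. On a standard CPython build the threaded run is usually no faster than the sequential one.

```python
import threading
import time


def count_down(n):
    # Pure-Python loop: CPU-bound, so the running thread holds the GIL
    while n > 0:
        n -= 1


def run_sequential(n, workers):
    start = time.perf_counter()
    for _ in range(workers):
        count_down(n)
    return time.perf_counter() - start


def run_threaded(n, workers):
    start = time.perf_counter()
    threads = [threading.Thread(target=count_down, args=(n,)) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 1_000_000
    print(f"Sequential: {run_sequential(n, 4):.2f}s")
    print(f"Threaded:   {run_threaded(n, 4):.2f}s")
```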
Threading: Best for I/O-Bound Tasks
Threading shines when your program spends most of its time waiting — waiting for web requests, database queries, file reads, or API calls.
```python
import threading
import time

import requests


def fetch_url(url):
    # A timeout keeps a stalled server from hanging the thread forever
    response = requests.get(url, timeout=10)
    print(f"Got {len(response.content)} bytes from {url}")


urls = ["https://example.com" for _ in range(10)]

# Sequential: each request waits for the previous one to finish
start = time.time()
for url in urls:
    fetch_url(url)
print(f"Sequential: {time.time() - start:.2f}s")

# Threaded: all requests are in flight at the same time
start = time.time()
threads = []
for url in urls:
    t = threading.Thread(target=fetch_url, args=(url,))
    threads.append(t)
    t.start()
for t in threads:
    t.join()
print(f"Threaded: {time.time() - start:.2f}s")
```
The threaded version will be significantly faster because while one thread waits for a network response, the GIL is released, and other threads can run.
When to use threads:
- Web scraping and API calls
- File I/O operations
- Database queries
- Any task where the CPU sits idle waiting
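For tasks like these, the standard library's `concurrent.futures.ThreadPoolExecutor` is often a tidier alternative to creating one thread per job by hand, because it caps the number of threads and collects return values for you. The sketch below is illustrative only: `fake_fetch` is a hypothetical stand-in that sleeps instead of making a real network call, and `max_workers=5` is an arbitrary choice.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def fake_fetch(url, delay=0.1):
    # Stand-in for a blocking network call; time.sleep releases the GIL
    time.sleep(delay)
    return f"fetched {url}"


def fetch_all(urls, max_workers=5):
    # executor.map returns results in the same order as the inputs
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(fake_fetch, urls))


if __name__ == "__main__":
    urls = [f"https://example.com/page/{i}" for i in range(10)]
    start = time.perf_counter()
    results = fetch_all(urls)
    print(f"{len(results)} results in {time.perf_counter() - start:.2f}s")
```

With ten jobs, five workers, and a 0.1 s delay, the work completes in roughly two batches rather than ten sequential waits.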
Multiprocessing: For CPU-Intensive Work
When you need to crunch numbers, process images, or run heavy computations, multiprocessing is your answer. It bypasses the GIL entirely by spawning separate Python processes, each with its own memory space and GIL.
```python
from multiprocessing import Pool
import time


def compute_heavy(n):
    # Simulate a heavy calculation
    result = 0
    for i in range(n * 10_000_000):
        result += i
    return result


# The guard is required on platforms that start workers by re-importing
# this module (Windows and macOS by default)
if __name__ == "__main__":
    # Sequential
    start = time.time()
    results = [compute_heavy(i) for i in range(1, 5)]
    print(f"Sequential: {time.time() - start:.2f}s")

    # Parallel
    start = time.time()
    with Pool(processes=4) as pool:
        results = pool.map(compute_heavy, range(1, 5))
    print(f"Parallel: {time.time() - start:.2f}s")
```
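On a machine with four or more cores, the parallel version should finish noticeably faster, but don't expect a full 4x speedup from this particular example. The four tasks are unequal (compute_heavy(4) does four times the work of compute_heavy(1)), so the pool can never finish sooner than its largest task: at best about 10/4 = 2.5x, minus process startup and data-transfer overhead. A near-4x speedup needs four equally sized tasks.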
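It helps to estimate the best possible speedup before measuring. In the example above, `compute_heavy(n)` loops `n * 10_000_000` times, so the four tasks have relative costs of 1, 2, 3, and 4. The helper below (`ideal_speedup` is a name invented for this sketch) computes an upper bound that ignores process startup and data-transfer overhead, so real numbers will be lower.

```python
def ideal_speedup(task_costs, workers):
    """Upper bound on speedup when each task runs whole on one worker."""
    total = sum(task_costs)
    # Parallel time can't be shorter than the largest single task,
    # nor shorter than the total work split evenly across the workers
    best_parallel = max(max(task_costs), total / workers)
    return total / best_parallel


print(ideal_speedup([1, 2, 3, 4], workers=4))  # 2.5
print(ideal_speedup([4, 4, 4, 4], workers=4))  # 4.0
```

The uneven workload tops out at 2.5x because the largest task alone takes 4 of the 10 units of work. Only evenly sized tasks can approach the worker count.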
Caveats:
- Higher memory overhead (each process runs its own interpreter with its own memory space)
- More complex data sharing (need special objects like Queue or Pipe)
- Startup cost is higher than threads
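The data-sharing caveat follows from how `multiprocessing` moves objects between processes: arguments and results are serialized with `pickle` and rebuilt on the other side, so a worker operates on a copy. The sketch below only simulates that round trip inside a single process (`send_to_worker` is a hypothetical helper, not part of `multiprocessing`), but it shows why a change made in a worker does not appear in the parent.

```python
import pickle


def send_to_worker(obj):
    # Roughly what happens to an argument passed to a worker process:
    # serialized in the parent, deserialized into a new object in the child
    payload = pickle.dumps(obj)
    return pickle.loads(payload), len(payload)


settings = {"threshold": 10, "tags": ["a", "b"]}
worker_copy, payload_size = send_to_worker(settings)

worker_copy["threshold"] = 99  # a change made "inside the worker"

print(settings["threshold"])    # 10: the parent's object is unchanged
print(worker_copy is settings)  # False: the worker has a separate object
print(f"{payload_size} bytes serialized")
```

The serialized size is also a rough indicator of per-call transfer cost, which is worth checking before sending large objects to a pool.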
Async: Lightweight Concurrency for I/O
Async (with asyncio) is like threading's cooler, more efficient cousin. Instead of OS threads, it uses cooperative multitasking — your code yields control at await points.
```python
import asyncio
import time

import aiohttp


async def fetch_url(session, url):
    async with session.get(url) as response:
        content = await response.read()
        print(f"Got {len(content)} bytes")


async def main():
    urls = ["https://example.com" for _ in range(10)]
    async with aiohttp.ClientSession() as session:
        tasks = [fetch_url(session, url) for url in urls]
        await asyncio.gather(*tasks)


start = time.time()
asyncio.run(main())
print(f"Async: {time.time() - start:.2f}s")
```
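The huge advantage? Async can handle thousands of concurrent connections with a single thread. Threads have overhead: each OS thread reserves its own stack (commonly 8 MB of virtual address space by default on Linux, although only the pages actually touched consume physical memory), and switching between threads goes through the kernel scheduler. Async tasks are lightweight coroutines that cost far less memory each.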
When to reach for async:
- High-concurrency network servers (think web frameworks like FastAPI)
- Many simultaneous I/O operations
- Real-time applications like chat servers
- When you need maximum scalability with minimal resources
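At high concurrency you usually want to cap how many operations are in flight at once, so that you don't overwhelm a remote service or run out of sockets. One common way is `asyncio.Semaphore`. In this sketch, `fake_request` is a hypothetical stand-in that awaits `asyncio.sleep` instead of doing real network I/O, and the limit of 100 is arbitrary.

```python
import asyncio


async def fake_request(i, limiter, delay=0.05):
    # The semaphore caps how many coroutines are inside this block at once
    async with limiter:
        await asyncio.sleep(delay)  # stand-in for awaiting a network response
        return i * 2


async def run_all(count, max_in_flight=100):
    limiter = asyncio.Semaphore(max_in_flight)
    tasks = [fake_request(i, limiter) for i in range(count)]
    # gather returns results in the same order as the awaitables passed in
    return await asyncio.gather(*tasks)


if __name__ == "__main__":
    results = asyncio.run(run_all(1000))
    print(len(results), results[:5])  # 1000 [0, 2, 4, 6, 8]
```

All 1,000 coroutines run on a single thread, and the semaphore simply makes the excess ones wait their turn.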
Quick Decision Matrix
| Task Type | Best Tool | Why |
|---|---|---|
| CPU-heavy (math, image processing) | Multiprocessing | Bypasses GIL, uses all cores |
| Many I/O operations (web, files) | Async or Threading | Both wait efficiently |
| Simple script, low concurrency | Threading | Easier to write |
| High-scale server | Async | Handles 10K+ connections per thread |
| Need shared state between tasks | Threading | Easier than multiprocessing |
The Gotchas Nobody Talks About
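Threading gotcha: Python's threading.Thread has no built-in cancellation. Once started, a thread runs until its target function returns, so the only safe way to stop it early is cooperative: have the thread periodically check a flag such as a threading.Event. Async lets you cancel tasks with task.cancel(), which raises CancelledError inside the task at the await where it is suspended.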
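The difference is easier to see side by side. The sketch below shows one common workaround for threads, a `threading.Event` that the worker checks on each loop iteration, next to `task.cancel()` for an asyncio task. The function names are invented for this example, and the thread only stops because its loop cooperates.

```python
import asyncio
import threading
import time


def worker(stop_event, log):
    # The thread must check the flag itself; nothing can force it to stop
    while not stop_event.is_set():
        log.append("tick")
        time.sleep(0.01)
    log.append("stopped")


def run_thread_demo():
    stop_event = threading.Event()
    log = []
    t = threading.Thread(target=worker, args=(stop_event, log))
    t.start()
    time.sleep(0.05)
    stop_event.set()  # request shutdown; the worker notices on its next check
    t.join()
    return log


async def run_task_demo():
    async def sleeper():
        await asyncio.sleep(60)

    task = asyncio.create_task(sleeper())
    await asyncio.sleep(0)  # yield once so the task starts running
    task.cancel()           # CancelledError is raised inside sleeper()
    try:
        await task
    except asyncio.CancelledError:
        return "cancelled"
    return "finished"


if __name__ == "__main__":
    print(run_thread_demo()[-1])        # stopped
    print(asyncio.run(run_task_demo())) # cancelled
```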
Multiprocessing gotcha: Debugging multiprocessing code is miserable. Stack traces from child processes don't always propagate cleanly, and print statements might output in random order.
Async gotcha: Async code is "contagious" — once you call an async function, everything calling it needs to be async too. You can't easily mix async and sync code without extra work (like asyncio.run() or loop.run_in_executor()).
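Here is a minimal sketch of that extra work in both directions. `asyncio.run()` lets synchronous code call into async code, and `asyncio.to_thread()` (Python 3.9+, a convenience wrapper in the same spirit as `loop.run_in_executor()`) lets async code call a blocking function without stalling the event loop. `blocking_lookup` is a hypothetical stand-in for a library that has no async API.

```python
import asyncio
import time


def blocking_lookup(key):
    # Ordinary synchronous function, e.g. a library with no async API
    time.sleep(0.05)
    return key.upper()


async def handler(keys):
    # asyncio.to_thread runs each blocking call in a worker thread,
    # so the event loop stays free while the calls wait
    return await asyncio.gather(*(asyncio.to_thread(blocking_lookup, k) for k in keys))


def sync_entry_point(keys):
    # asyncio.run is the bridge from synchronous code into async code
    return asyncio.run(handler(keys))


if __name__ == "__main__":
    print(sync_entry_point(["a", "b", "c"]))  # ['A', 'B', 'C']
```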
Real Talk: What Actually Happens in Production
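In practice, most Python services use a hybrid approach. FastAPI, for example, runs async def request handlers on the event loop and runs plain def handlers in a thread pool. CPU-intensive work is usually handed to a process pool (for instance with loop.run_in_executor and a ProcessPoolExecutor), because a thread pool would still be limited by the GIL. Some async database libraries also wrap a blocking driver in background threads even though the code you write is async.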
The secret? Match the tool to the bottleneck. Profile your code. If you're waiting 90% of the time, threading or async will save you. If you're computing 90% of the time, multiprocessing or switching to a different language (looking at you, Rust) might be better.
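A quick first check before reaching for a full profiler is to compare wall-clock time with CPU time for a piece of work. The sketch below uses `time.perf_counter()` and `time.process_time()` from the standard library. `cpu_fraction`, `waits`, and `computes` are names invented for this example, and `process_time()` covers every thread in the current process but not child processes, so treat the result as a rough signal only.

```python
import time


def cpu_fraction(func, *args):
    """Return (wall_seconds, cpu_seconds / wall_seconds) for one call."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    func(*args)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return wall, (cpu / wall if wall > 0 else 0.0)


def waits(seconds):
    time.sleep(seconds)  # stand-in for I/O: almost no CPU used while sleeping


def computes(n):
    return sum(i * i for i in range(n))  # pure computation


if __name__ == "__main__":
    for name, func, arg in [("waits", waits, 0.2), ("computes", computes, 2_000_000)]:
        wall, frac = cpu_fraction(func, arg)
        print(f"{name}: {wall:.2f}s wall, {frac:.0%} CPU")
```

A low CPU fraction suggests the work is mostly waiting, which points toward threading or async. A fraction near 100% suggests it is compute-bound, which points toward multiprocessing.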
Start simple, measure everything, and only reach for complexity when the numbers tell you to.