Beyond `timeit`: Performance Optimization Techniques That Actually Matter in Python
Move past basic advice like list comprehensions and learn real Python optimization techniques that reduce memory, speed up loops, and cut execution time by 10-60% — from `__slots__` and local variable binding to smart use of `map()` and profiling with cProfile.
June 2026 · 6 min read
Everyone knows Python is slow. Or rather, everyone thinks they know Python is slow. The reality? Python can be surprisingly fast — if you stop treating it like a scripting language and start thinking like an optimizer.
Let's skip the obvious "use list comprehensions" advice and dive into techniques that make a real difference when milliseconds matter.
The Hidden Cost of Attribute Access
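Here's something most tutorials don't tell you: name and attribute lookups are not free. Every time you write obj.method(), Python searches the instance and then the class hierarchy; every time you call a builtin like len(), it checks the module's globals before falling back to the builtins. In tight loops, this adds up.
def slow(data):
    # Slow - looks up len in globals, then builtins, on each iteration
    for i in range(1_000_000):
        result = len(data)

def fast(data):
    # Fast - resolve once into a local variable, reuse
    len_func = len
    for i in range(1_000_000):
        result = len_func(data)
The binding only pays off inside a function, where len_func becomes a fast local variable; at module level it is just another global. This is still a micro-optimization, but in a hot loop it can be worth noticing. Speedups of 10-15% on a million iterations have been reported, though newer CPython releases cache global and builtin lookups, so the gain is often smaller. Measure on your own interpreter.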
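A rough way to check the effect on your own interpreter is to time both variants with timeit. This is a minimal sketch, not a rigorous benchmark: the function names count_with_global and count_with_local, the sample data, and the iteration counts are illustrative assumptions, and the numbers will vary by Python version and machine.

```python
import timeit


def count_with_global(data, n=100_000):
    """Call len() through the normal globals -> builtins lookup."""
    total = 0
    for _ in range(n):
        total += len(data)
    return total


def count_with_local(data, n=100_000):
    """Bind len to a local name once, then call it through the local."""
    len_func = len
    total = 0
    for _ in range(n):
        total += len_func(data)
    return total


if __name__ == "__main__":
    data = list(range(10))
    for func in (count_with_global, count_with_local):
        # Take the best of several repeats to reduce timing noise
        best = min(timeit.repeat(lambda: func(data), number=10, repeat=5))
        print(f"{func.__name__}: {best:.4f}s")
```

If the two timings are within noise of each other, the binding is not worth the extra line on that interpreter.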
When __slots__ Beats Regular Classes
Python dictionaries are beautiful. They're also memory hogs. Every instance of a regular class carries a __dict__ attribute — a full hash table for storing attributes.
Enter __slots__:
class Point:
    __slots__ = ('x', 'y')

    def __init__(self, x, y):
        self.x = x
        self.y = y
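What you get: often 40-60% less memory per instance, depending on the Python version and the number of attributes. What you pay for: no dynamic attribute assignment, and no weak references unless you add '__weakref__' to the slots. But if you're creating millions of objects (geospatial data, game entities, physics simulations), that trade-off is a no-brainer.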
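The memory difference is easy to measure rather than take on faith. The sketch below uses tracemalloc from the standard library; the class names RegularPoint and SlottedPoint, the helper bytes_allocated, and the instance count are assumptions made for illustration, and the exact saving depends on the interpreter version.

```python
import tracemalloc


class RegularPoint:
    def __init__(self, x, y):
        self.x = x
        self.y = y


class SlottedPoint:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


def bytes_allocated(cls, count=100_000):
    """Return the bytes tracemalloc attributes to building `count` instances."""
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    points = [cls(i, i) for i in range(count)]  # kept alive until measured
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before


if __name__ == "__main__":
    regular = bytes_allocated(RegularPoint)
    slotted = bytes_allocated(SlottedPoint)
    print(f"regular: {regular / 1e6:.1f} MB")
    print(f"slotted: {slotted / 1e6:.1f} MB")
    print(f"saving:  {100 * (1 - slotted / regular):.0f}%")
```

Both measurements include the list that holds the instances and the integer objects themselves, so the per-instance saving is somewhat larger than the printed percentage suggests.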
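The __slots__ Bonus Most Developers Miss
Here's the kicker: __slots__ can also speed up attribute access. The gain is often quoted at about 20-30%, though recent CPython releases optimize ordinary attribute access too, which tends to narrow the gap. Not because of magic: each slot is a descriptor that reads from a fixed offset in the object instead of doing a dictionary lookup. Your hot loop that accesses point.x a million times? That's real savings.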
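To see how much attribute access actually differs on a given interpreter, the two layouts can be timed side by side. This is a hedged sketch: DictPoint, SlotPoint, and sum_x are hypothetical names, and the result depends heavily on the Python version.

```python
import timeit


class DictPoint:
    def __init__(self, x, y):
        self.x = x
        self.y = y


class SlotPoint:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


def sum_x(point, n=100_000):
    """Read point.x n times so attribute access dominates the loop."""
    total = 0
    for _ in range(n):
        total += point.x
    return total


if __name__ == "__main__":
    for cls in (DictPoint, SlotPoint):
        point = cls(1, 2)
        # Best of several repeats to reduce timing noise
        best = min(timeit.repeat(lambda: sum_x(point), number=10, repeat=5))
        print(f"{cls.__name__}: {best:.4f}s")
```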
Generator Pipeline: The Anti-Pattern Glue
Generators are great for memory. But building a pipeline stage by stage with list comprehensions creates a throwaway list at every step, and that kills performance.
# Bad - materializes intermediate lists
result = [process(x) for x in large_data]
result = [filter_func(x) for x in result]
result = [transform(x) for x in result]
# Good - single pass, no intermediates
result = [transform(filter_func(process(x))) for x in large_data]
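But here's the real trick: when the consumer only needs one pass (sum(), any(), a for loop), use a generator expression so that nothing is stored at all. If you do need a list, materialize only once, at the end; note that calling list() on a generator is no faster than the comprehension above:
gen = (transform(filter_func(process(x))) for x in large_data)
result = list(gen)  # Single pass, materialized once at the end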
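A generator pipeline earns its keep when the final list is never needed. The sketch below reuses the stage names process, filter_func, and transform from the snippets above, but their bodies are placeholder assumptions chosen only to make the example runnable.

```python
import sys


def process(x):
    return x + 1  # placeholder stage


def filter_func(x):
    return x * 2  # placeholder stage; it maps values like the others


def transform(x):
    return x - 3  # placeholder stage


large_data = range(100_000)

# Eager: builds the full list in memory before anything consumes it
eager = [transform(filter_func(process(x))) for x in large_data]

# Lazy: values are produced one at a time as sum() asks for them
lazy = (transform(filter_func(process(x))) for x in large_data)
lazy_size = sys.getsizeof(lazy)  # size of the generator object itself
total = sum(lazy)

# Short-circuit: any() stops pulling values at the first match
found = any(transform(filter_func(process(x))) > 100 for x in large_data)

if __name__ == "__main__":
    print(f"list object:      {sys.getsizeof(eager)} bytes")
    print(f"generator object: {lazy_size} bytes")
    print(f"total={total}, found={found}")
```

The generator object stays small regardless of input size, which is the point: the memory cost moves from the whole dataset to a single in-flight value.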
Local Variable Binding: The Silent Win
This is the optimization that feels like cheating, because it exploits Python's scoping rules. Start with a baseline:
def slow_loop(items):
    total = 0
    for item in items:
        # item.value and item.rate belong to a different object each time
        total += item.value * item.rate
    return total
There is nothing to bind here: total is already a local, and the two attribute reads change with every item, so they cannot be hoisted out of the loop. The technique only applies to things that stay the same across iterations, such as globals, builtins, and bound methods. When you're inside a function, local variable lookups are faster than global or attribute lookups, so move those frequently accessed objects into local variables:
def process_items(data):
    result = []
    append = result.append  # Local reference to method
    for item in data:
        if item.active:
            append(item.value * 2)
    return result
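That append = result.append trick has traditionally shaved 15-20% off such loops. Recent CPython releases optimize method calls, so the gain may be smaller; measure before relying on it. Not dramatic, but free.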
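Whether the bound-method trick still pays off is easy to check. The following sketch compares the plain loop, the local binding, and an equivalent list comprehension; the Item dataclass and the three function names are hypothetical stand-ins, not part of any real codebase.

```python
import timeit
from dataclasses import dataclass


@dataclass
class Item:  # hypothetical record type for the benchmark
    value: int
    active: bool


def with_attribute_lookup(data):
    result = []
    for item in data:
        if item.active:
            result.append(item.value * 2)  # looks up .append every time
    return result


def with_local_binding(data):
    result = []
    append = result.append  # looked up once
    for item in data:
        if item.active:
            append(item.value * 2)
    return result


def with_comprehension(data):
    return [item.value * 2 for item in data if item.active]


if __name__ == "__main__":
    data = [Item(i, i % 2 == 0) for i in range(100_000)]
    for func in (with_attribute_lookup, with_local_binding, with_comprehension):
        best = min(timeit.repeat(lambda: func(data), number=5, repeat=5))
        print(f"{func.__name__}: {best:.4f}s")
```

When the loop body is this simple, the comprehension is usually the cleanest option; the local binding is more useful in loops too complex to express as a comprehension.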
When map() Beats Comprehensions
Everyone says "list comprehensions are faster than map()". That's... sort of true. But it depends:
# Comprehension - fine for simple cases
result = [x * 2 for x in data]
# map() + lambda - usually slower
result = list(map(lambda x: x * 2, data))
# map() + built-in - often faster than the comprehension
result = list(map(str.upper, strings))
The rule: Use map() when you can pass a built-in function directly, because the loop and the calls then run entirely in C. Avoid map() with lambda: the lambda is created only once, but it adds a Python-level function call for every element, which negates the benefit.
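The three variants can be compared directly. This is a small sketch under assumed inputs: the strings list and the function names are made up for the benchmark, and the ranking may shift between Python versions.

```python
import timeit


def upper_comprehension(items):
    return [s.upper() for s in items]


def upper_map_lambda(items):
    # The lambda is built once, but it is called once per element
    return list(map(lambda s: s.upper(), items))


def upper_map_builtin(items):
    # str.upper is implemented in C, so no Python frame per element
    return list(map(str.upper, items))


if __name__ == "__main__":
    strings = ["alpha", "beta", "gamma"] * 10_000
    for func in (upper_comprehension, upper_map_lambda, upper_map_builtin):
        best = min(timeit.repeat(lambda: func(strings), number=10, repeat=5))
        print(f"{func.__name__}: {best:.4f}s")
```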
The __call__ Overhead Nobody Discusses
Function calls in Python are expensive. The __call__ protocol involves stack frame creation, argument packing, and cleanup. In tight loops, inline the logic:
# Expensive - function call overhead
def double(x):
    return x * 2

result = [double(x) for x in data]

# Cheap - inline
result = [x * 2 for x in data]
This seems obvious, but you'd be surprised how many codebases wrap trivial operations in function calls "for readability" inside performance-critical paths.
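The cost of the extra call can be measured with the same timeit pattern. In this sketch double comes from the snippet above, while with_call, inlined, and the input size are assumptions added for the comparison.

```python
import timeit


def double(x):
    return x * 2


def with_call(data):
    # One Python-level function call per element
    return [double(x) for x in data]


def inlined(data):
    # Same arithmetic, no call
    return [x * 2 for x in data]


if __name__ == "__main__":
    data = list(range(100_000))
    for func in (with_call, inlined):
        best = min(timeit.repeat(lambda: func(data), number=5, repeat=5))
        print(f"{func.__name__}: {best:.4f}s")
```

Inlining is only worth the readability cost on paths a profiler has already flagged as hot.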
Profiling: The One True Technique
Here's the uncomfortable truth: You can't optimize what you don't measure. Python's cProfile module is your best friend:
python -m cProfile -s cumulative my_script.py
Or better: use py-spy, a sampling profiler that adds very little overhead to the program it observes:
pip install py-spy
py-spy record -o profile.svg -- python my_script.py
I've seen developers spend hours optimizing loops that accounted for 2% of runtime, while ignoring a database query that took 90% of the time. Always profile first.
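cProfile can also be driven from inside a program, which helps when only one code path is of interest. The sketch below is illustrative: slow_query, cheap_loop, main, and profile_report are hypothetical names, with slow_query standing in for an expensive call such as a database query.

```python
import cProfile
import io
import pstats


def slow_query():
    """Stand-in for an expensive call such as a database query."""
    return sum(i * i for i in range(200_000))


def cheap_loop():
    """The kind of loop that looks tempting to optimize but costs little."""
    return [i * 2 for i in range(1_000)]


def main():
    slow_query()
    cheap_loop()


def profile_report(func, limit=5):
    """Run func under cProfile; return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(limit)
    return stream.getvalue()


if __name__ == "__main__":
    print(profile_report(main))
```

Sorting by cumulative time puts the functions whose whole call tree is expensive at the top, which is usually where to start looking.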
The Real Takeaway
Performance optimization in Python isn't about writing C extensions (though that works too). It's about understanding what Python actually does under the hood:
- Attribute lookups cost money: cache them.
- Object creation costs memory: use __slots__ for batches.
- Generator chains are elegant, but materialize once for speed.
- Local variables beat globals every time.
- Function calls aren't free: inline when it matters.
- Profile before you optimize, or you're just guessing.
The difference between a "fast enough" Python script and a genuinely slow one is rarely about the language. It's about how well you understand its hidden costs.