Build a Python Performance Profiler That Generates Readable Reports
Use cProfile and pstats to profile Python functions and print a sorted performance report showing the top time-consuming calls.
Python code
import cProfile
import io
import pstats


def slow_function():
    # Explicit loop: every iteration runs in the interpreter.
    total = 0
    for i in range(500_000):
        total += i ** 2
    return total


def fast_function():
    # Same result, with the loop driven by the built-in sum().
    total = sum(i * i for i in range(500_000))
    return total


def profile_functions():
    profiler = cProfile.Profile()
    profiler.enable()   # start recording
    slow_function()
    fast_function()
    profiler.disable()  # stop recording

    # Send the report to an in-memory buffer instead of stdout.
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats('cumtime')
    stats.print_stats(5)
    report = stream.getvalue()

    print("=== Performance Report (top 5 calls by cumulative time) ===")
    print(report)


if __name__ == "__main__":
    profile_functions()
Output
=== Performance Report (top 5 calls by cumulative time) ===
         500005 function calls in 0.XXX seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.XXX    0.XXX    0.XXX    0.XXX <file>:N(fast_function)
        1    0.XXX    0.XXX    0.XXX    0.XXX {built-in method builtins.sum}
   500001    0.XXX    0.000    0.XXX    0.000 <file>:N(<genexpr>)
        1    0.XXX    0.XXX    0.XXX    0.XXX <file>:N(slow_function)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

Timings, and therefore the row order, vary by machine and Python version. `<file>:N` stands for your script's path and line number. Only calls made between enable() and disable() are recorded, so profile_functions and print do not appear.
How it works
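cProfile.Profile() creates a deterministic profiler: while it is enabled, it records every function call, return, and exception instead of sampling. Calling enable() and disable() around the target code captures only the calls of interest. Passing a StringIO stream to pstats.Stats sends the formatted report to an in-memory string instead of printing it straight to stdout, so you can print it, log it, or save it. stats.sort_stats('cumtime') sorts by cumulative time (the time spent in a function plus everything it calls), so the most expensive call trees come first, and print_stats(5) limits the output to the top 5 entries for a concise report. Because the profiler adds overhead to every recorded call, fast_function's generator expression, which is resumed once per item, can look more expensive under the profiler than it is in an unprofiled run.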
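The steps described above can be folded into one reusable helper. This is a minimal sketch, not part of the original listing: `profile_to_text`, its `sort_key` and `limit` parameters, and the `sum_of_squares` workload are hypothetical names chosen for illustration.

```python
import cProfile
import io
import pstats


def profile_to_text(func, *args, sort_key="cumtime", limit=5, **kwargs):
    """Run func under cProfile and return (result, report_text)."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        result = func(*args, **kwargs)
    finally:
        # Stop recording even if func raises.
        profiler.disable()

    # Format the report into an in-memory buffer.
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats(sort_key).print_stats(limit)
    return result, stream.getvalue()


def sum_of_squares(n):
    # Hypothetical workload used only for this example.
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    value, report = profile_to_text(sum_of_squares, 10_000)
    print(value)
    print(report)
```

Returning the report as a string, rather than printing it inside the helper, leaves the caller free to log it or write it to a file.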
Common mistakes
- Forgetting to create a new `Profile()` instance for each profiling session; reusing one can accumulate old data.
- Profiling too large a scope (e.g., the whole script), which makes the report noisy; wrap only the code you care about.
- Not sorting the stats; unsorted output is listed in an arbitrary order (pstats labels it "Random listing order"), which is of little use for identifying bottlenecks.
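The first mistake in the list above is easy to demonstrate. The sketch below assumes Python 3.9 or later (for `Stats.get_stats_profile()`); `work` and `recorded_calls` are hypothetical names used only for illustration.

```python
import cProfile
import pstats


def work():
    # Hypothetical workload.
    return sum(range(1_000))


def recorded_calls(profiler, name):
    """Return the ncalls string pstats reports for the function `name`."""
    profile = pstats.Stats(profiler).get_stats_profile()
    return profile.func_profiles[name].ncalls


# Reused profiler: the second session adds to the first one's data.
shared = cProfile.Profile()
shared.runcall(work)
shared.runcall(work)
reused_count = recorded_calls(shared, "work")

# Fresh profiler per session: the report covers exactly one run.
fresh = cProfile.Profile()
fresh.runcall(work)
fresh_count = recorded_calls(fresh, "work")

print(reused_count, fresh_count)  # prints: 2 1
```

The reused profiler reports two calls to `work` even though each session ran it once, which is the kind of stale data a fresh `Profile()` avoids.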
Variations
- Use `profiler.runcall(slow_function)` to profile a single call without manually enabling/disabling.
- Save stats to a file with `profiler.dump_stats('profile_output.prof')`, then load them later with `pstats.Stats('profile_output.prof')` or explore them interactively with the third-party `snakeviz` viewer.
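Both variations can be combined: profile a single call with `runcall`, dump the raw stats to disk, and format them later. This is a hedged sketch; `profile_to_file` and `load_report` are hypothetical helper names, and a temporary directory stands in for wherever you would actually keep the `.prof` file.

```python
import cProfile
import io
import pstats
import tempfile
from pathlib import Path


def slow_function():
    total = 0
    for i in range(100_000):
        total += i ** 2
    return total


def profile_to_file(path):
    """Profile one call to slow_function and save the raw stats to path."""
    profiler = cProfile.Profile()
    result = profiler.runcall(slow_function)  # enables and disables for us
    profiler.dump_stats(path)
    return result


def load_report(path, limit=5):
    """Load saved stats from path and return a formatted report string."""
    stream = io.StringIO()
    stats = pstats.Stats(str(path), stream=stream)
    stats.strip_dirs().sort_stats("cumtime").print_stats(limit)
    return stream.getvalue()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp) / "profile_output.prof"
        profile_to_file(out)
        print(load_report(out))
```

The same `.prof` file can be opened with `snakeviz profile_output.prof` if that tool is installed.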
Real-world use cases
- Identifying which function in a data pipeline consumes the most CPU time before optimising loops or algorithms.
- Comparing two implementation variants (e.g., list comprehension vs. for-loop) to decide which to deploy in production.
- Profiling request handlers in a web app to pinpoint slow endpoints before scaling or caching.
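As an illustration of the second use case, two variants can each be profiled with their own profiler and compared by total recorded time. This is a sketch under assumptions: it needs Python 3.9 or later for `get_stats_profile()`, and `squares_loop`, `squares_builtin`, and `profile_variant` are hypothetical names.

```python
import cProfile
import pstats


def squares_loop(n):
    total = 0
    for i in range(n):
        total += i * i
    return total


def squares_builtin(n):
    return sum(i * i for i in range(n))


def profile_variant(func, n):
    """Profile one call with a fresh profiler; return (result, total seconds)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, n)
    summary = pstats.Stats(profiler).get_stats_profile()
    return result, summary.total_tt


if __name__ == "__main__":
    for variant in (squares_loop, squares_builtin):
        result, seconds = profile_variant(variant, 200_000)
        print(f"{variant.__name__}: result={result}, profiled time={seconds:.3f}s")
```

Profiled times include the profiler's per-call overhead, which penalises code that makes many small calls, so it is worth confirming any difference with `timeit` before choosing a variant.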