How to Generate Project Statistics Including Lines of Code and Complexity in Python
Walk through a Python script that scans a project directory for Python files, counts lines of code excluding blanks and comments, and estimates cyclomatic complexity by counting decision keywords.
Python code
import re
from pathlib import Path


def count_lines_of_code(filepath):
    """Counts lines of code in a Python file, excluding blank lines and comments."""
    try:
        with open(filepath, 'r', encoding='utf-8') as f:
            lines = f.readlines()
    except (OSError, UnicodeDecodeError):
        return 0  # Unreadable files count as zero lines
    code_lines = [line for line in lines if line.strip() and not line.strip().startswith('#')]
    return len(code_lines)


def compute_cyclomatic_complexity(code_text):
    """Estimates cyclomatic complexity by counting decision points."""
    # Whole-word matching, so the "or" in "for" and the "if" in "elif"
    # are not counted twice. `with` is counted although it does not branch.
    decision_keywords = ['if', 'elif', 'for', 'while', 'and', 'or', 'except', 'with']
    pattern = r'\b(?:' + '|'.join(decision_keywords) + r')\b'
    complexity = 1  # Base complexity
    complexity += len(re.findall(pattern, code_text))
    return complexity


def generate_project_stats(project_dir):
    """Generates and prints statistics for all Python files in a directory."""
    stats = []
    for filepath in Path(project_dir).rglob('*.py'):
        try:
            with open(filepath, 'r', encoding='utf-8') as f:
                content = f.read()
        except (OSError, UnicodeDecodeError):
            continue  # Skip unreadable files instead of aborting the whole run
        nloc = count_lines_of_code(filepath)
        complexity = compute_cyclomatic_complexity(content)
        stats.append((filepath, nloc, complexity))

    if not stats:
        print("No Python files found.")
        return

    print(f"{'File':<40} {'LOC':<6} {'Complexity':<10}")
    print("-" * 58)
    total_loc = 0
    total_complexity = 0
    for fpath, nloc, complexity in stats:
        print(f"{str(fpath):<40} {nloc:<6} {complexity:<10}")
        total_loc += nloc
        total_complexity += complexity
    print("-" * 58)
    print(f"{'Total':<40} {total_loc:<6} {total_complexity:<10}")


if __name__ == "__main__":
    generate_project_stats(".")
Output
File                                     LOC    Complexity
----------------------------------------------------------
src\main.py                              45     12
src\utils.py                             30     8
tests\test_main.py                       20     5
----------------------------------------------------------
Total                                    95     25
How it works
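The script uses pathlib.Path.rglob('*.py') to find every Python file under the given directory, including subdirectories. count_lines_of_code drops blank lines and lines whose first non-whitespace character is #; docstring lines and lines with a trailing comment still count as code. compute_cyclomatic_complexity starts from a base complexity of 1 and adds one for each occurrence of the keywords if, elif, for, while, and, or, except, and with. This is only an approximation. Cyclomatic complexity is normally computed per function from the actual branches in the code, whereas the keyword count produces a single number per file, also matches keywords that appear inside strings, comments, and docstrings, and includes with, which does not introduce a branch. The result is best read as a relative signal for comparing files rather than as an exact metric. The table is printed with f-string width specifiers (:<40, :<6, :<10), which left-align each column to a fixed width so the output is easy to scan.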
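To see what the line filter keeps and drops, here is a minimal sketch that applies the same rule to an in-memory string instead of a file. The helper name count_code_lines_in_text and the sample source are made up for illustration; they are not part of the script.

```python
def count_code_lines_in_text(source):
    """Apply the blank-line and '#' filter to a string (illustrative helper)."""
    return sum(
        1
        for line in source.splitlines()
        if line.strip() and not line.strip().startswith("#")
    )


# Hypothetical sample with 7 physical lines.
SAMPLE = '''# module comment

def greet(name):
    """Docstring lines are not comments, so they still count."""
    if name:  # a trailing comment does not exclude the line
        return "hi " + name
    return "hi"
'''

# Only the leading comment and the blank line are dropped.
print(count_code_lines_in_text(SAMPLE))  # 5
```

The docstring line is counted because it does not start with #, which is worth keeping in mind when comparing the LOC figure with other tools.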
Common mistakes
- Counting all lines including blank lines and comments instead of only code lines.
- Using a basic string count that may match keywords inside strings or comments, inflating complexity.
- Forgetting to handle exceptions when opening files, causing the script to fail on one bad file.
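One way to avoid matching keywords inside strings or comments is to count nodes in the parsed syntax tree rather than raw text. The sketch below uses the standard-library ast module. The function name estimate_complexity_ast and the choice of node types are assumptions made for this example, not an official definition of cyclomatic complexity, and the number is still per source string rather than per function.

```python
import ast

# Node types treated as decision points in this sketch (an assumption;
# dedicated tools apply their own, more complete rules).
DECISION_NODES = (
    ast.If, ast.For, ast.AsyncFor, ast.While, ast.ExceptHandler, ast.IfExp,
)


def estimate_complexity_ast(source):
    """Estimate complexity from the syntax tree instead of raw text."""
    tree = ast.parse(source)  # raises SyntaxError on invalid source
    complexity = 1  # base path
    for node in ast.walk(tree):
        if isinstance(node, DECISION_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" is one BoolOp with three values: two extra paths
            complexity += len(node.values) - 1
    return complexity


# Hypothetical sample: keywords in the comment and the string are ignored.
SAMPLE = '''
def check(x):
    # if this comment were counted, the result would be inflated
    label = "for or while"
    if x > 0 and x < 10:
        return label
    return None
'''

print(estimate_complexity_ast(SAMPLE))  # 3: base + if + and
```

An elif appears in the tree as a nested ast.If, so it is counted once without special handling.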
Variations
- Use the `radon` library (pip install radon) to get accurate cyclomatic complexity per function.
- Sort results by complexity descending to identify the most complex files first.
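The sorting variation can be sketched with sorted and a key on the complexity column. The rows below are hypothetical (path, loc, complexity) tuples in the same shape the script collects, and sort_by_complexity is an illustrative helper name.

```python
# Hypothetical (path, loc, complexity) rows, in the same shape as the
# tuples that generate_project_stats appends to its stats list.
stats = [
    ("tests/test_main.py", 20, 5),
    ("src/main.py", 45, 12),
    ("src/utils.py", 30, 8),
]


def sort_by_complexity(rows):
    """Return rows ordered from most to least complex."""
    return sorted(rows, key=lambda row: row[2], reverse=True)


for path, loc, complexity in sort_by_complexity(stats):
    print(f"{path:<40} {loc:<6} {complexity:<10}")
```

Because sorted returns a new list, the original collection order stays available if the totals or another view are needed afterwards.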
Real-world use cases
- Running inside a CI pipeline to track codebase growth and flag increasing complexity over time.
- Generating an automated report before a code review to highlight files that may need simplification.
- Integrating into a developer dashboard that visualizes code quality metrics across multiple repositories.
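For the CI use case, one possible approach is to turn the collected numbers into an exit code so the pipeline can fail when a file crosses a limit. This is a sketch under stated assumptions: the limit of 10, the helper names find_offenders and ci_exit_code, and the sample rows are all hypothetical.

```python
# Hypothetical limit; choose a value that suits the project.
MAX_COMPLEXITY = 10


def find_offenders(rows, limit=MAX_COMPLEXITY):
    """Return (path, complexity) pairs whose complexity exceeds the limit."""
    return [(path, complexity) for path, _loc, complexity in rows if complexity > limit]


def ci_exit_code(rows, limit=MAX_COMPLEXITY):
    """Print offending files and return 1 if any exist, otherwise 0."""
    offenders = find_offenders(rows, limit)
    for path, complexity in offenders:
        print(f"{path}: complexity {complexity} exceeds {limit}")
    return 1 if offenders else 0


# Hypothetical rows in the (path, loc, complexity) shape used by the script.
rows = [("src/main.py", 45, 12), ("src/utils.py", 30, 8)]
exit_code = ci_exit_code(rows)
print(exit_code)
# In a CI job, passing this value to sys.exit() would fail the build
# whenever it is 1.
```

Keeping the check in a function that returns a value, rather than exiting directly, makes it straightforward to test outside the pipeline.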