How to Generate Project Statistics Including Lines of Code and Complexity in Python

Walk through a Python script that scans a project directory for Python files, counts lines of code excluding blanks and comments, and estimates cyclomatic complexity by counting decision keywords.

Medium | Python 3.9+ | Jun 28, 2026 | Automation & scripting

Python code

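import re
from pathlib import Path

# Decision keywords counted by the heuristic. 'with' is kept from the original
# list even though it does not branch; drop it for a stricter estimate.
DECISION_KEYWORDS = ('if', 'elif', 'for', 'while', 'and', 'or', 'except', 'with')
# \b word boundaries stop 'or' matching inside 'for' and 'if' matching inside 'elif'.
DECISION_PATTERN = re.compile(r'\b(?:' + '|'.join(DECISION_KEYWORDS) + r')\b')

def count_lines_of_code(filepath):
    """Counts lines of code in a Python file, excluding blank lines and comments."""
    try:
        # Explicit encoding avoids platform-dependent defaults; bad bytes are replaced.
        with open(filepath, 'r', encoding='utf-8', errors='replace') as f:
            lines = f.readlines()
    except OSError:
        # Unreadable file: report zero rather than aborting the whole scan.
        return 0
    # Keep non-empty lines that do not start with '#'. Docstrings still count as code.
    return sum(1 for line in lines if line.strip() and not line.strip().startswith('#'))

def compute_cyclomatic_complexity(code_text):
    """Estimates cyclomatic complexity by counting decision keywords as whole words."""
    complexity = 1  # Base complexity
    complexity += len(DECISION_PATTERN.findall(code_text))
    return complexity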

def generate_project_stats(project_dir):
    """Generates and prints statistics for all Python files in a directory."""
    stats = []
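    for filepath in Path(project_dir).rglob('*.py'):
        nloc = count_lines_of_code(filepath)
        try:
            with open(filepath, 'r', encoding='utf-8', errors='replace') as f:
                content = f.read()
        except OSError:
            # Skip files that cannot be read so one bad file does not stop the scan.
            continue
        complexity = compute_cyclomatic_complexity(content)
        stats.append((filepath, nloc, complexity))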
    
    if not stats:
        print("No Python files found.")
        return
    
    print(f"{'File':<40} {'LOC':<6} {'Complexity':<10}")
    print("-" * 58)
    total_loc = 0
    total_complexity = 0
    for fpath, nloc, complexity in stats:
        print(f"{str(fpath):<40} {nloc:<6} {complexity:<10}")
        total_loc += nloc
        total_complexity += complexity
    print("-" * 58)
    print(f"{'Total':<40} {total_loc:<6} {total_complexity:<10}")

if __name__ == "__main__":
    generate_project_stats(".")

Output

stdout
File                                     LOC    Complexity
----------------------------------------------------------
src\main.py                              45     12
src\utils.py                             30     8
tests\test_main.py                       20     5
----------------------------------------------------------
Total                                    95     25

How it works

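The script uses pathlib.Path.rglob('*.py') to recursively find all Python files in the given directory. count_lines_of_code drops blank lines and lines whose first non-whitespace character is #, so the LOC figure reflects code rather than spacing or commentary; docstrings and other multi-line strings still count as code. compute_cyclomatic_complexity starts from a base complexity of 1 and adds one for each occurrence of the decision keywords if, elif, for, while, and, or, except, and with. This is an approximation. True cyclomatic complexity is measured per function from the number of independent paths through its control flow, whereas a keyword count works on the raw text of a whole file: it also picks up keywords inside strings and comments, and with is counted even though it does not introduce a branch. The result is best read as a rough relative indicator for comparing files, not an exact metric. The output is printed as a table using f-strings with fixed column widths, which keeps it easy to scan.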
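
For comparison, a stricter estimate can be obtained by parsing the source with the standard ast module and counting branching nodes instead of keywords. The sketch below is not part of the original script: ast_complexity is a hypothetical name, and the set of node types it counts is one simplified choice among several reasonable ones.

```python
import ast

# Node types treated as decision points in this sketch (an assumed, simplified set).
BRANCH_NODES = (ast.If, ast.For, ast.AsyncFor, ast.While, ast.ExceptHandler, ast.IfExp)

def ast_complexity(source):
    """Estimate complexity for a source string by walking its syntax tree."""
    tree = ast.parse(source)
    complexity = 1  # base path through the code
    for node in ast.walk(tree):
        if isinstance(node, BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # 'a and b and c' is one BoolOp with three values: two extra branches
            complexity += len(node.values) - 1
    return complexity

sample = '''
def classify(n):
    # if this comment were counted, the result would be inflated
    if n < 0 and n != -1:
        return "negative"
    for _ in range(n):
        pass
    return "ok or not"
'''
print(ast_complexity(sample))  # prints 4: base 1 + if + and + for
```

Because the parser only sees real syntax, the if in the comment and the or in the string literal are ignored. ast.parse raises SyntaxError on files that do not parse, so a caller scanning a whole project would likely want to catch that per file.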

Common mistakes

  • Counting all lines including blank lines and comments instead of only code lines.
  • Using a plain substring count, which also matches keywords inside strings, comments, and longer words (for example 'or ' inside 'for '), inflating complexity.
  • Forgetting to handle exceptions when opening files, causing the script to fail on one bad file.
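
One way to avoid matches inside strings and comments, while still counting keywords, is to run the source through the standard tokenize module and look only at NAME tokens. This is a sketch under that assumption; count_decision_tokens and DECISION_KEYWORDS are hypothetical names, and the keyword set here leaves out with because it does not branch.

```python
import io
import tokenize

# Keywords treated as decision points in this sketch.
DECISION_KEYWORDS = {"if", "elif", "for", "while", "and", "or", "except"}

def count_decision_tokens(source):
    """Count decision keywords that appear as real tokens, not inside strings or comments."""
    count = 0
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    for tok in tokens:
        # Keywords arrive as NAME tokens; strings and comments have their own token types.
        if tok.type == tokenize.NAME and tok.string in DECISION_KEYWORDS:
            count += 1
    return count

sample = '''message = "wait for it or not"  # if only
if ready and not done:
    pass
'''
print(count_decision_tokens(sample))  # prints 2: the real 'if' and 'and'
```

The for and or inside the string and the if inside the comment are skipped. Malformed source can make the tokenizer raise tokenize.TokenError or a SyntaxError subclass, so the call belongs inside the same kind of per-file error handling as the file reads.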

Variations

  1. Use the `radon` library (pip install radon) to get accurate cyclomatic complexity per function.
  2. Sort results by complexity descending to identify the most complex files first.
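
The sorting variation can be sketched as follows, assuming rows in the same (path, LOC, complexity) shape that generate_project_stats collects. The helper name sort_by_complexity is hypothetical, and the rows reuse the figures from the sample output above.

```python
from operator import itemgetter

# Rows in the (path, loc, complexity) shape the script builds; values from the sample output.
stats = [
    ("src/utils.py", 30, 8),
    ("tests/test_main.py", 20, 5),
    ("src/main.py", 45, 12),
]

def sort_by_complexity(rows):
    """Return rows ordered from most to least complex (index 2 is complexity)."""
    return sorted(rows, key=itemgetter(2), reverse=True)

for path, loc, complexity in sort_by_complexity(stats):
    print(f"{path:<40} {loc:<6} {complexity:<10}")
```

In the script itself, the same call could be applied to stats just before the table is printed.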

Real-world use cases

  • Running inside a CI pipeline to track codebase growth and flag increasing complexity over time.
  • Generating an automated report before a code review to highlight files that may need simplification.
  • Integrating into a developer dashboard that visualizes code quality metrics across multiple repositories.
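
For the CI use case, one possible shape is a small gate that reports files above a complexity limit and returns a non-zero status. This is only a sketch: files_over_limit, ci_exit_code, and the limit of 10 are illustrative assumptions, and the rows again follow the (path, LOC, complexity) shape used by the script.

```python
def files_over_limit(rows, max_complexity):
    """Return the (path, loc, complexity) rows whose complexity exceeds the limit."""
    return [row for row in rows if row[2] > max_complexity]

def ci_exit_code(rows, max_complexity):
    """Print each offending file and return 1 if there are any, otherwise 0."""
    offenders = files_over_limit(rows, max_complexity)
    for path, _loc, complexity in offenders:
        print(f"{path}: complexity {complexity} exceeds limit {max_complexity}")
    return 1 if offenders else 0

rows = [("src/main.py", 45, 12), ("src/utils.py", 30, 8)]
print(ci_exit_code(rows, max_complexity=10))  # reports src/main.py, then prints 1
```

In a pipeline, the returned value would typically be passed to sys.exit so the job fails when any file is over the limit.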
