Find Most Active Contributors in a Repository with Python

Filter recent commits by date and count the most active contributors using Counter and datetime.

Easy Python 3.9+ Jun 28, 2026 Lists & loops

collections datetime counter commits contributors sorting

Python code

from collections import Counter
from datetime import datetime, timedelta

# Simulated commit data
commits = [
    {"author": "Alice", "timestamp": datetime.now() - timedelta(days=1)},
    {"author": "Bob", "timestamp": datetime.now() - timedelta(days=2)},
    {"author": "Alice", "timestamp": datetime.now() - timedelta(days=3)},
    {"author": "Charlie", "timestamp": datetime.now() - timedelta(days=5)},
    {"author": "Bob", "timestamp": datetime.now() - timedelta(days=7)},
    {"author": "Alice", "timestamp": datetime.now() - timedelta(days=10)},
    {"author": "David", "timestamp": datetime.now() - timedelta(days=15)},
]

def find_most_active_contributors(commit_list, days=30, top_n=3):
    cutoff = datetime.now() - timedelta(days=days)
    recent_commits = [c for c in commit_list if c["timestamp"] >= cutoff]
    contributor_counts = Counter(c["author"] for c in recent_commits)
    return contributor_counts.most_common(top_n)

if __name__ == "__main__":
    active = find_most_active_contributors(commits)
    print("Most active contributors (last 30 days):")
    for contributor, count in active:
        print(f"  {contributor}: {count} commit{'s' if count != 1 else ''}")

Output

stdout

Most active contributors (last 30 days):
  Alice: 3 commits
  Bob: 2 commits
  Charlie: 1 commit

How it works

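The function takes a list of commit dictionaries, each with an 'author' and a 'timestamp' field. It computes a cutoff by subtracting the specified number of days from the current time, then uses a list comprehension to keep only the commits whose timestamp is at or after that cutoff. A Counter tallies the commits per author, and most_common(top_n) returns the top contributors as (author, count) pairs sorted by count in descending order. Authors with equal counts keep the order in which they were first seen, which is why Charlie appears in the output above and David, who also has one recent commit, does not. The approach stays short because Counter handles the grouping and most_common handles the sorting.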
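
The filtering and counting steps can be traced with a fixed reference time, which makes the result reproducible. This is a minimal sketch: the `now` parameter, the function name `most_active`, and the sample dates are assumptions added for illustration and are not part of the original function.

```python
from collections import Counter
from datetime import datetime, timedelta


def most_active(commit_list, now, days=30, top_n=3):
    """Same logic as the snippet, but `now` is passed in (an assumption for reproducibility)."""
    cutoff = now - timedelta(days=days)
    # Keep commits at or after the cutoff (the comparison is inclusive).
    recent = [c for c in commit_list if c["timestamp"] >= cutoff]
    # Counter tallies authors in one pass; most_common orders by count, descending.
    return Counter(c["author"] for c in recent).most_common(top_n)


reference = datetime(2026, 6, 28, 12, 0)  # hypothetical fixed "now"
sample = [
    {"author": "Alice", "timestamp": reference - timedelta(days=1)},
    {"author": "Bob", "timestamp": reference - timedelta(days=2)},
    {"author": "Alice", "timestamp": reference - timedelta(days=30)},  # exactly on the cutoff: kept
    {"author": "Bob", "timestamp": reference - timedelta(days=31)},    # older than the cutoff: dropped
]

result = most_active(sample, now=reference)
print(result)  # [('Alice', 2), ('Bob', 1)]
```

Passing the reference time in, rather than calling datetime.now() inside the function, also makes the boundary case easy to check in a unit test.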

Common mistakes

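Forgetting to import Counter from collections, or datetime and timedelta from datetime.
Comparing naive and timezone-aware datetimes. Ordering comparisons between the two raise TypeError, so the timestamps and the cutoff must both be naive or both be aware.
Assuming commits are already sorted by date and slicing the list instead of filtering by the cutoff.
Passing timestamps as strings instead of datetime objects, which raises TypeError at the comparison. An empty list is safe: the function simply returns an empty list.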
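
The timezone pitfall can be reproduced in a few lines. The sketch below is illustrative only: the sample date is made up, and the fix shown assumes the naive timestamp was recorded in UTC, which may not hold for real data.

```python
from datetime import datetime, timedelta, timezone

aware_now = datetime.now(timezone.utc)  # timezone-aware "now"
naive_stamp = datetime(2026, 6, 1)      # naive: no tzinfo attached (hypothetical value)

try:
    naive_stamp >= aware_now - timedelta(days=30)
    mixed_ok = True
except TypeError:
    # Python refuses to order a naive datetime against an aware one.
    mixed_ok = False

# One possible fix, ASSUMING the naive value was recorded in UTC:
aware_stamp = naive_stamp.replace(tzinfo=timezone.utc)
comparable = isinstance(aware_stamp >= aware_now - timedelta(days=30), bool)

print(mixed_ok, comparable)  # False True
```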

Variations

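Replace the simulated data with real Git history by using subprocess to run 'git log --format=%an'. That format prints author names only, so either add a date placeholder such as %aI or let Git do the date filtering with --since.
Use a pandas DataFrame to filter and group commits if you are already working in a data pipeline.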
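
The Git-based variation might look like the sketch below. It assumes git is installed and on the PATH, and it uses the format '%an%x09%aI' (author name, a tab, and the strict ISO 8601 author date) so that dates are available for filtering. The function names parse_git_log, read_commits, and top_contributors, and the sample log text, are hypothetical and chosen for illustration.

```python
import subprocess
from collections import Counter
from datetime import datetime, timedelta, timezone


def parse_git_log(text):
    """Turn 'author<TAB>ISO-8601 date' lines into commit dictionaries."""
    commits = []
    for line in text.splitlines():
        if not line.strip():
            continue
        author, stamp = line.split("\t", 1)
        stamp = stamp.strip()
        # Some git versions may write UTC as a trailing 'Z'; older Python
        # versions of fromisoformat do not accept it, so normalize it first.
        if stamp.endswith("Z"):
            stamp = stamp[:-1] + "+00:00"
        commits.append({"author": author, "timestamp": datetime.fromisoformat(stamp)})
    return commits


def read_commits(repo_path="."):
    """Run git log in repo_path (assumes the path is a Git repository)."""
    completed = subprocess.run(
        ["git", "log", "--format=%an%x09%aI"],
        cwd=repo_path, capture_output=True, text=True, check=True,
    )
    return parse_git_log(completed.stdout)


def top_contributors(commits, days=30, top_n=3, now=None):
    """Count authors of commits at or after the cutoff; timestamps here are aware."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    recent_authors = (c["author"] for c in commits if c["timestamp"] >= cutoff)
    return Counter(recent_authors).most_common(top_n)


# Hypothetical log text standing in for real git output.
sample_log = (
    "Alice\t2026-06-27T10:15:00+02:00\n"
    "Bob\t2026-06-20T08:00:00Z\n"
    "Alice\t2026-01-05T09:30:00+00:00\n"
)
parsed = parse_git_log(sample_log)
top = top_contributors(parsed, now=datetime(2026, 6, 28, tzinfo=timezone.utc))
print(top)  # [('Alice', 1), ('Bob', 1)]
```

In a real repository, top_contributors(read_commits("path/to/repo")) would replace the sample text. Because Git reports dates with a UTC offset, the cutoff is built from a timezone-aware datetime so the comparison does not raise TypeError.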

Real-world use cases

Generating weekly team reports by analyzing recent commit activity in a shared repository.
Identifying top contributors for recognition or sprint review dashboards.
Filtering stale contributors to handle repository maintenance or onboarding outreach.
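
As an illustration of the last use case, the sketch below lists authors whose most recent commit is older than the cutoff. The function name find_stale_contributors, the `now` parameter, and the sample history are assumptions made for this example. It uses the same commit dictionary shape as the snippet above.

```python
from datetime import datetime, timedelta


def find_stale_contributors(commit_list, now, days=30):
    """Return authors whose latest commit is older than `days` days, sorted by name."""
    cutoff = now - timedelta(days=days)
    latest = {}
    for c in commit_list:
        author, stamp = c["author"], c["timestamp"]
        # Track only the newest timestamp seen for each author.
        if author not in latest or stamp > latest[author]:
            latest[author] = stamp
    return sorted(a for a, stamp in latest.items() if stamp < cutoff)


reference = datetime(2026, 6, 28)  # hypothetical fixed "now"
history = [
    {"author": "Alice", "timestamp": reference - timedelta(days=3)},
    {"author": "Alice", "timestamp": reference - timedelta(days=90)},
    {"author": "David", "timestamp": reference - timedelta(days=45)},
    {"author": "Erin", "timestamp": reference - timedelta(days=200)},
]

stale = find_stale_contributors(history, now=reference)
print(stale)  # ['David', 'Erin']
```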
