How to Compare Two GitHub Repositories and Highlight Differences in Python

Fetch metadata from two GitHub repositories using the GitHub API and compare key attributes like stars, forks, license, and language, printing any differences.

Medium | Python 3.9+ | Jun 28, 2026 | Automation & scripting

Requires third-party packages — install first
pip install requests

Python code

import json

import requests

def fetch_repo_data(owner, repo_name):
    """Fetch repository metadata from GitHub API."""
    url = f"https://api.github.com/repos/{owner}/{repo_name}"
    # Set a timeout so a stalled connection cannot hang the script
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.json()

def compare_repos(repo1_data, repo2_data):
    """Compare repository data and return differences."""
    diff = {}
    keys = ('stargazers_count', 'forks_count', 'open_issues_count', 
            'language', 'description', 'size')
    for key in keys:
        val1, val2 = repo1_data.get(key), repo2_data.get(key)
        if val1 != val2:
            diff[key] = {'repo1': val1, 'repo2': val2}
    # Compare licenses by SPDX identifier; 'license' is null when GitHub detects none
    lic1 = (repo1_data.get('license') or {}).get('spdx_id')
    lic2 = (repo2_data.get('license') or {}).get('spdx_id')
    if lic1 != lic2:
        diff['license'] = {'repo1': lic1, 'repo2': lic2}
    return diff

def main():
    try:
        repo1 = fetch_repo_data("psf", "requests")
        repo2 = fetch_repo_data("requests", "requests")
        differences = compare_repos(repo1, repo2)
        if differences:
            print("Differences found:")
            print(json.dumps(differences, indent=2))
        else:
            print("No differences")
    except requests.exceptions.RequestException as e:
        print(f"Error fetching repository data: {e}")

if __name__ == "__main__":
    main()

Output

stdout
Differences found:
{
  "stargazers_count": {
    "repo1": 52768,
    "repo2": 5485
  },
  "forks_count": {
    "repo1": 8966,
    "repo2": 830
  },
  "open_issues_count": {
    "repo1": 103,
    "repo2": 91
  },
  "description": {
    "repo1": "A simple, yet elegant, HTTP library.",
    "repo2": "Requests is a simple, yet elegant, HTTP library."
  },
  "license": {
    "repo1": "Apache-2.0",
    "repo2": null
  }
}

How it works

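The script uses the requests library to call the GitHub REST API once per repository. Each call returns the repository's full JSON metadata, and compare_repos() checks a fixed set of keys (stargazers_count, forks_count, open_issues_count, language, description, size), recording a key only when the two values differ. The license is handled separately because the API returns either null or a nested object for it, so the script reports the spdx_id field rather than the whole object. The result is a dictionary of differences, printed as formatted JSON. raise_for_status() turns HTTP error responses (4xx and 5xx) into exceptions, and the try/except block in main() catches those together with connection failures, since both derive from requests.exceptions.RequestException.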
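
The comparison step can be tried without any network access. The following is a minimal sketch, assuming two hand-written dictionaries that mimic the shape of the API response. The sample values are made up for illustration, and compare_repos() here is a trimmed variant that checks fewer keys and compares licenses by SPDX identifier.

```python
def compare_repos(repo1_data, repo2_data):
    """Return the keys whose values differ between two repo metadata dicts."""
    diff = {}
    for key in ('stargazers_count', 'forks_count', 'language'):
        val1, val2 = repo1_data.get(key), repo2_data.get(key)
        if val1 != val2:
            diff[key] = {'repo1': val1, 'repo2': val2}
    # 'license' is either null or a nested object with an 'spdx_id' field
    lic1 = (repo1_data.get('license') or {}).get('spdx_id')
    lic2 = (repo2_data.get('license') or {}).get('spdx_id')
    if lic1 != lic2:
        diff['license'] = {'repo1': lic1, 'repo2': lic2}
    return diff


# Made-up sample data, not real API output
sample_a = {'stargazers_count': 120, 'forks_count': 8,
            'language': 'Python', 'license': {'spdx_id': 'MIT'}}
sample_b = {'stargazers_count': 95, 'forks_count': 8,
            'language': 'Python', 'license': None}

result = compare_repos(sample_a, sample_b)
print(result)
```

Only stargazers_count and license appear in the result, because forks_count and language are equal in both samples.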

Common mistakes

  • Using incorrect repository owner/name format (e.g., including 'https://').
  • Forgetting that the GitHub API limits unauthenticated requests to 60 per hour per IP address; once the limit is reached, calls fail with a 403 or 429 status.
  • Not handling the case where the 'license' key is null in one or both repos.
  • Comparing too many keys that often change (like 'updated_at') causing noisy output.
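
The first two mistakes can be guarded against with small helpers. This is a sketch under stated assumptions: parse_repo_slug() and is_rate_limited() are hypothetical names that are not part of the original script, and the rate-limit check relies on GitHub's X-RateLimit-Remaining response header being "0" on a 403 or 429 response.

```python
from urllib.parse import urlparse


def parse_repo_slug(text):
    """Return (owner, repo) from 'owner/repo' or a github.com URL (hypothetical helper)."""
    text = text.strip()
    if "://" in text:
        # Keep only the path part of a full URL, e.g. '/psf/requests'
        text = urlparse(text).path
    parts = [p for p in text.strip("/").split("/") if p]
    if parts and parts[0].lower() == "github.com":
        # Input such as 'github.com/psf/requests' without a scheme
        parts = parts[1:]
    if len(parts) < 2:
        raise ValueError(f"expected 'owner/repo', got {text!r}")
    owner, repo = parts[0], parts[1]
    return owner, repo.removesuffix(".git")  # str.removesuffix needs Python 3.9+


def is_rate_limited(status_code, headers):
    """Guess whether a failed response was caused by the rate limit (hypothetical helper)."""
    return status_code in (403, 429) and headers.get("X-RateLimit-Remaining") == "0"


print(parse_repo_slug("https://github.com/psf/requests"))
print(is_rate_limited(403, {"X-RateLimit-Remaining": "0"}))
```

With a real response object, is_rate_limited(response.status_code, response.headers) would be the call; requests exposes headers as a case-insensitive mapping, so the header lookup works regardless of capitalization.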

Variations

  1. Use `os.environ['GITHUB_TOKEN']` to authenticate and increase the API rate limit.
  2. Implement a CLI with `argparse` to accept repo pairs dynamically.
  3. Store results in a CSV file for historical tracking using the `csv` module.
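
The three variations could be combined roughly as follows. This is a sketch, not a finished tool: build_headers(), parse_args(), and write_diff_rows() are hypothetical helpers introduced here, the --csv flag is an assumed name, and the differences dictionary is assumed to have the same shape that compare_repos() returns. The headers would be passed to requests.get(url, headers=..., timeout=...).

```python
import argparse
import csv
import io
import os


def build_headers(token=None):
    """Build request headers, adding a bearer token when one is available."""
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers


def parse_args(argv=None):
    """Parse two repositories given as owner/name, plus an optional CSV path."""
    parser = argparse.ArgumentParser(description="Compare two GitHub repositories")
    parser.add_argument("repo1", help="first repository as owner/name")
    parser.add_argument("repo2", help="second repository as owner/name")
    parser.add_argument("--csv", dest="csv_path", default=None,
                        help="optional path of a CSV file to write")
    return parser.parse_args(argv)


def write_diff_rows(differences, fileobj):
    """Write one CSV row per differing key to an open text file object."""
    writer = csv.writer(fileobj)
    writer.writerow(["key", "repo1", "repo2"])
    for key, values in differences.items():
        writer.writerow([key, values["repo1"], values["repo2"]])


# Offline demonstration with made-up values
headers = build_headers(os.environ.get("GITHUB_TOKEN"))
args = parse_args(["psf/requests", "requests/requests"])
buffer = io.StringIO()
write_diff_rows({"forks_count": {"repo1": 1, "repo2": 2}}, buffer)
print(args.repo1, args.repo2)
print(buffer.getvalue())
```

Reading the token with os.environ.get() rather than os.environ[...] lets the script fall back to unauthenticated requests when GITHUB_TOKEN is not set. When writing to a real file, open it with newline="" as the csv module documentation recommends.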

Real-world use cases

  • Auditing forked repositories to see how they've diverged from the original upstream.
  • Automated monitoring of competitor open-source projects for significant changes in stars or forks.
  • Checking that two deployed microservices are pointing at the same version of a shared library repository.

Run locally

This sample depends on the third-party requests package. Copy the code above, install the package listed at the top, then run it in your own Python environment.
