How to Track GitHub Stars, Forks, and Watchers in Python
Automatically fetch and track stars, forks, and watchers for multiple GitHub repositories, saving snapshots locally as JSON files for historical analysis.
pip install requests
Python code
import time
import json
import requests
from pathlib import Path
from datetime import datetime, timezone

# Repositories to track, in "owner/name" form
REPOS = [
    "psf/requests",
    "python/cpython",
    "pallets/flask",
]
DATA_DIR = Path("github_metrics")


def fetch_repo_stats(repo):
    """Fetch current star, fork, and watcher counts for one repository."""
    url = f"https://api.github.com/repos/{repo}"
    resp = requests.get(
        url,
        headers={"Accept": "application/vnd.github.v3+json"},
        timeout=10,  # avoid hanging forever on a stalled connection
    )
    resp.raise_for_status()
    data = resp.json()
    return {
        "full_name": data["full_name"],
        "stars": data["stargazers_count"],
        "forks": data["forks_count"],
        "watchers": data["subscribers_count"],
        # Timezone-aware UTC timestamp (datetime.utcnow() is deprecated)
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


def save_stats(stats):
    """Append one snapshot to the repository's JSON history file."""
    DATA_DIR.mkdir(exist_ok=True)
    filepath = DATA_DIR / f"{stats['full_name'].replace('/', '_')}.json"
    history = []
    if filepath.exists():
        history = json.loads(filepath.read_text())
    history.append(stats)
    filepath.write_text(json.dumps(history, indent=2))
    print(f"Saved {len(history)} snapshots for {stats['full_name']}")


if __name__ == "__main__":
    for repo in REPOS:
        try:
            stats = fetch_repo_stats(repo)
            save_stats(stats)
            print(f"{repo}: ⭐ {stats['stars']} 🍴 {stats['forks']} 👁 {stats['watchers']}")
            time.sleep(1)  # Pause briefly between requests
        except Exception as e:
            print(f"Error fetching {repo}: {e}")
Output
Saved 1 snapshots for psf/requests
psf/requests: ⭐ 56400 🍴 9700 👁 560
Saved 1 snapshots for python/cpython
python/cpython: ⭐ 63000 🍴 12000 👁 620
Saved 1 snapshots for pallets/flask
pallets/flask: ⭐ 68000 🍴 14000 👁 650
How it works
This script uses the GitHub REST API to retrieve repository metadata, including stargazers_count, forks_count, and subscribers_count (watchers). The requests library simplifies HTTP calls and error handling. Each snapshot includes a UTC timestamp and is appended to a per-repo JSON file under github_metrics/. The 1-second delay between requests prevents a burst of calls, but it does not by itself keep you within GitHub's unauthenticated rate limit of 60 requests per hour; with three repositories per run, that budget allows roughly 20 runs per hour. Because the script reuses existing history, you can run it periodically and accumulate trends.
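A fixed delay is only a rough guard. As a minimal sketch, the rate-limit headers that GitHub returns with each response can tell the script how long to wait. The helper name seconds_until_reset is hypothetical and not part of the original script.

```python
import time


def seconds_until_reset(headers, now=None):
    """Return how many seconds to wait before the next request.

    Reads GitHub's X-RateLimit-Remaining and X-RateLimit-Reset headers.
    If requests remain (or the headers are missing), no wait is needed.
    """
    now = time.time() if now is None else now
    remaining = int(headers.get("X-RateLimit-Remaining", 1))
    if remaining > 0:
        return 0.0
    # X-RateLimit-Reset is the reset time in UTC epoch seconds
    reset_at = int(headers.get("X-RateLimit-Reset", now))
    return max(0.0, reset_at - now)
```

Inside fetch_repo_stats, this could be used as time.sleep(seconds_until_reset(resp.headers)) after each call, in place of the fixed 1-second pause.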
Common mistakes
- Hardcoding API tokens in plain text instead of environment variables.
- Not respecting rate limits: sending requests too fast without a delay.
- Assuming the API's watchers_count field is the number of watchers; it mirrors stargazers_count, and subscribers_count holds the actual repo watchers.
- Forgetting to handle network errors or unexpected API responses with try/except.
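To avoid the first mistake, one option is to read a token from the environment and add it to the request headers only when it is set. This is a sketch under assumptions: the variable name GITHUB_TOKEN and the helper build_headers are illustrative choices, not part of the original script.

```python
import os


def build_headers(env=os.environ):
    """Build request headers, adding authentication only if a token is set."""
    headers = {"Accept": "application/vnd.github.v3+json"}
    token = env.get("GITHUB_TOKEN")  # assumed variable name; use whatever you export
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers
```

The result can be passed as headers=build_headers() to requests.get. Authenticated requests get a much higher hourly rate limit than the unauthenticated 60.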
Variations
- Use the PyGithub wrapper for a more Pythonic API and automatic pagination.
- Store metrics in a database like SQLite or TimescaleDB for long-term graphing.
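For the database variation, a rough sketch with Python's built-in sqlite3 module might look like the following. The table name snapshots, the helper save_snapshot, and the sample values are assumptions for illustration; the row uses the same keys that fetch_repo_stats returns.

```python
import sqlite3


def save_snapshot(conn, stats):
    """Insert one snapshot dict into a 'snapshots' table, creating it if needed."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS snapshots ("
        "full_name TEXT, stars INTEGER, forks INTEGER, "
        "watchers INTEGER, timestamp TEXT)"
    )
    # Named placeholders are filled from the dict keys
    conn.execute(
        "INSERT INTO snapshots "
        "VALUES (:full_name, :stars, :forks, :watchers, :timestamp)",
        stats,
    )
    conn.commit()


# In-memory database for demonstration; pass a file path to persist data
conn = sqlite3.connect(":memory:")
save_snapshot(conn, {
    "full_name": "psf/requests",  # sample values, not real measurements
    "stars": 100,
    "forks": 10,
    "watchers": 5,
    "timestamp": "2024-01-01T00:00:00+00:00",
})
rows = conn.execute("SELECT full_name, stars FROM snapshots").fetchall()
```

Storing one row per snapshot makes it straightforward to query a date range with SQL instead of loading a whole JSON file.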
Real-world use cases
- Automating weekly reports for an open-source project's popularity trends.
- Monitoring competitor repositories to benchmark growth in developer mindshare.
- Scheduling a cron job to collect metrics for dashboards or data analytics pipelines.
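For a report like those above, the saved history can be reduced to a simple growth figure. The sketch below assumes the list-of-snapshots structure that save_stats writes; the function star_growth and the sample numbers are hypothetical.

```python
def star_growth(history):
    """Return the change in stars between the first and last snapshot."""
    if len(history) < 2:
        return 0  # not enough data points to measure a trend
    return history[-1]["stars"] - history[0]["stars"]


# Sample history in the same shape as the per-repo JSON files (made-up values)
sample_history = [
    {"full_name": "psf/requests", "stars": 100, "timestamp": "2024-01-01T00:00:00+00:00"},
    {"full_name": "psf/requests", "stars": 110, "timestamp": "2024-01-08T00:00:00+00:00"},
    {"full_name": "psf/requests", "stars": 125, "timestamp": "2024-01-15T00:00:00+00:00"},
]
growth = star_growth(sample_history)
```

In practice the history would come from json.loads(filepath.read_text()) on one of the files under github_metrics/.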