Create a Local File Versioning System Using Pure Python

Track file changes locally by copying versions with SHA-256 hashes and JSON metadata using only the Python standard library.

Medium | Python 3.9+ | Jun 28, 2026 | Files & data

Python code

import hashlib
import json
import shutil
import time
from pathlib import Path

class LocalFileVersioning:
    def __init__(self, target_dir="versioned_files", versions_dir="versions"):
        self.target_dir = Path(target_dir)
        self.versions_dir = Path(versions_dir)
        self.metadata_file = self.versions_dir / "metadata.json"
        # parents=True lets versions_dir be a nested path such as "backups/versions"
        self.versions_dir.mkdir(parents=True, exist_ok=True)
        self.metadata = self._load_metadata()
        
    def _load_metadata(self):
        if self.metadata_file.exists():
            with open(self.metadata_file) as f:
                return json.load(f)
        return {}
    
    def _save_metadata(self):
        with open(self.metadata_file, 'w') as f:
            json.dump(self.metadata, f, indent=2)
    
    def _file_hash(self, filepath):
        hasher = hashlib.sha256()
        with open(filepath, 'rb') as f:
            for chunk in iter(lambda: f.read(4096), b""):
                hasher.update(chunk)
        return hasher.hexdigest()
    
    def create_version(self, filename):
        source = self.target_dir / filename
        if not source.exists():
            print(f"File {filename} not found in target directory")
            return
        
        file_hash = self._file_hash(source)
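        timestamp = int(time.time())
        version_name = f"{filename}.v{timestamp}"
        version_path = self.versions_dir / version_name
        # Two saves within the same second would share a name and overwrite
        # each other, so bump the timestamp until the name is unused.
        while version_path.exists():
            timestamp += 1
            version_name = f"{filename}.v{timestamp}"
            version_path = self.versions_dir / version_name
        shutil.copy2(source, version_path)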
        
        # Append this version to the file's history, creating the entry if needed
        entry = {"hash": file_hash, "timestamp": timestamp, "file": version_name}
        self.metadata.setdefault(filename, {}).setdefault("versions", []).append(entry)
        self._save_metadata()
        print(f"Created version {version_name}")
    
    def list_versions(self, filename):
        if filename in self.metadata:
            for v in self.metadata[filename]["versions"]:
                print(f"  {v['file']} (hash: {v['hash'][:8]}...)")
        else:
            print(f"No versions for {filename}")

if __name__ == "__main__":
    target = Path("versioned_files")
    target.mkdir(exist_ok=True)
    test_file = target / "example.txt"
    test_file.write_text("Hello World v1")
    
    mgr = LocalFileVersioning()
    mgr.create_version("example.txt")
    mgr.list_versions("example.txt")
    
    test_file.write_text("Hello World v2")
    mgr.create_version("example.txt")
    mgr.list_versions("example.txt")

Output

stdout (illustrative values; real timestamps and hash prefixes will differ)
Created version example.txt.v1712345678
  example.txt.v1712345678 (hash: a1b2c3d4...)
Created version example.txt.v1712345679
  example.txt.v1712345678 (hash: a1b2c3d4...)
  example.txt.v1712345679 (hash: e5f6a7b8...)

How it works

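This system uses shutil.copy2 to copy each version while attempting to preserve file metadata such as modification times, and records a SHA-256 hash of the content alongside it. The hash makes content changes detectable by comparing digests, although the class as written only stores the hash and never compares it, so every call to create_version saves a new copy even when the file is unchanged. The Path class from pathlib provides cross-platform path handling without string manipulation. Metadata is stored in a JSON file, making it human-readable and easy to inspect or restore from. The design is minimal but extensible: you can add rollback or cleanup methods without breaking existing behaviour.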
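
The recorded hash could be used to skip files that have not changed since their last version. The following is a minimal sketch, not part of the original class: `file_sha256` and `has_changed` are hypothetical helper names, and the metadata layout is assumed to match the one written by `create_version`.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=4096):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    hasher = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def has_changed(path, metadata, filename):
    """Return True if the file differs from its latest recorded version.

    `metadata` is assumed to look like
    {"example.txt": {"versions": [{"hash": ..., "timestamp": ..., "file": ...}]}}.
    """
    versions = metadata.get(filename, {}).get("versions", [])
    if not versions:
        return True  # nothing recorded yet, so treat the file as changed
    return versions[-1]["hash"] != file_sha256(Path(path))
```

Calling `has_changed(source, self.metadata, filename)` at the top of `create_version` and returning early when it is false would be one way to avoid storing identical copies.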

Common mistakes

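  • Assuming `shutil.copy2` preserves all metadata. It only attempts to, and details such as file owner, ACLs, or extended attributes may be lost on some platforms and filesystems.
  • Storing the full hash in version filenames. A 64-character SHA-256 hex digest makes names unwieldy; keep the full hash in the metadata and use a timestamp or a short hash prefix in the filename.
  • Forgetting to create the target directory before calling `create_version`. The method then finds no source file, prints a "not found" message, and returns without saving a version.
  • Not using `exist_ok=True` in `mkdir`, which raises a `FileExistsError` if the directory already exists.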
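
The two `mkdir` failure modes are easy to reproduce. This is a small sketch that runs in a temporary directory; `mkdir_pitfalls` is a hypothetical name used only for illustration.

```python
import tempfile
from pathlib import Path


def mkdir_pitfalls(base):
    """Trigger both common mkdir errors, then show the safe call."""
    events = []
    nested = Path(base) / "backups" / "versions"

    # Plain mkdir() fails when the parent directory does not exist.
    try:
        nested.mkdir()
    except FileNotFoundError:
        events.append("parent missing")

    # After creating it once, a second plain mkdir() fails as well.
    nested.mkdir(parents=True)
    try:
        nested.mkdir()
    except FileExistsError:
        events.append("already exists")

    # parents=True with exist_ok=True is safe to call repeatedly.
    nested.mkdir(parents=True, exist_ok=True)
    events.append(nested.is_dir())
    return events


with tempfile.TemporaryDirectory() as tmp:
    print(mkdir_pitfalls(tmp))  # ['parent missing', 'already exists', True]
```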

Variations

  1. Use a SQLite database instead of JSON for metadata if you need concurrent write safety.
  2. Add a `restore_version` method that copies a version file back to the target directory by timestamp or hash.
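
A restore step along the lines of the second variation might look like the sketch below. It is written as a standalone function rather than a method so it can run on its own; the name `restore_version`, its parameters, and the choice to default to the latest version are assumptions, and it expects the `metadata.json` layout produced by the class above.

```python
import json
import shutil
from pathlib import Path


def restore_version(filename, target_dir, versions_dir, timestamp=None, hash_prefix=None):
    """Copy a stored version back to target_dir/filename.

    The version is selected by exact timestamp or by hash prefix. With
    neither given, the most recent version is restored. Returns the
    metadata entry that was restored.
    """
    target_dir, versions_dir = Path(target_dir), Path(versions_dir)
    metadata = json.loads((versions_dir / "metadata.json").read_text())
    versions = metadata.get(filename, {}).get("versions", [])
    if not versions:
        raise LookupError(f"No versions recorded for {filename}")

    if timestamp is not None:
        matches = [v for v in versions if v["timestamp"] == timestamp]
    elif hash_prefix is not None:
        matches = [v for v in versions if v["hash"].startswith(hash_prefix)]
    else:
        matches = versions
    if not matches:
        raise LookupError(f"No matching version for {filename}")

    entry = matches[-1]  # latest matching entry
    target_dir.mkdir(parents=True, exist_ok=True)
    # Overwrites the current file, so version it first if it matters.
    shutil.copy2(versions_dir / entry["file"], target_dir / filename)
    return entry
```

For example, `restore_version("example.txt", "versioned_files", "versions", hash_prefix="a1b2")` would restore the newest version whose hash starts with `a1b2`, if one exists.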

Real-world use cases

  • Saving incremental backups of configuration files before applying updates in a deployment script.
  • Tracking changes to project files during development to quickly revert to a known good state.
  • Auditing sensitive document edits by storing versioned copies with cryptographic hashes.
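
For the auditing use case, the stored hashes can be checked against the version files on disk. The sketch below assumes the `metadata.json` layout written by the class above; `verify_versions` and its status strings are hypothetical names chosen for illustration.

```python
import hashlib
import json
from pathlib import Path


def verify_versions(versions_dir):
    """Compare every stored version file with its recorded hash.

    Returns a list of (filename, version_file, status) tuples, where
    status is "ok", "missing", or "modified".
    """
    versions_dir = Path(versions_dir)
    metadata = json.loads((versions_dir / "metadata.json").read_text())
    report = []
    for filename, info in metadata.items():
        for entry in info["versions"]:
            path = versions_dir / entry["file"]
            if not path.exists():
                status = "missing"
            # read_bytes() loads the whole file; hash in chunks for large files
            elif hashlib.sha256(path.read_bytes()).hexdigest() != entry["hash"]:
                status = "modified"
            else:
                status = "ok"
            report.append((filename, entry["file"], status))
    return report
```

A matching hash only shows that a copy agrees with the metadata. Because both live in the same directory, anyone able to edit one can edit the other, so a stricter audit trail would need the hashes stored somewhere separate.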
