Maintenance

Site is under maintenance — quizzes are still available.

Go to quizzes
Sponsored Reserved space — layout preview until AdSense is connected

Build a Python Utility That Verifies Backup Integrity Automatically

Automatically compute and verify SHA-256 checksums of backup files using a JSON manifest to detect missing or corrupted data.

Medium Python 3.9+ Jun 28, 2026 Automation & scripting 2 views 0 copies

Python code

47 lines
Python 3.9+
import hashlib
import os
import json

def compute_checksum(filepath, algorithm='sha256'):
    """Compute checksum for the given file."""
    hash_func = hashlib.new(algorithm)
    with open(filepath, 'rb') as f:
        for chunk in iter(lambda: f.read(4096), b''):
            hash_func.update(chunk)
    return hash_func.hexdigest()

def verify_backup_integrity(backup_dir, manifest_file='integrity_check.json'):
    """Verify backup files against a stored manifest of checksums."""
    if not os.path.isfile(manifest_file):
        print(f"Manifest file {manifest_file} not found. Creating it now...")
        manifest = {}
        for root, _, files in os.walk(backup_dir):
            for file in files:
                filepath = os.path.join(root, file)
                manifest[filepath] = compute_checksum(filepath)
        with open(manifest_file, 'w') as f:
            json.dump(manifest, f, indent=2)
        print("Manifest created. Run again to verify.")
        return

    with open(manifest_file, 'r') as f:
        manifest = json.load(f)

    errors = []
    for filepath, expected_checksum in manifest.items():
        if not os.path.exists(filepath):
            errors.append(f"Missing: {filepath}")
            continue
        current_checksum = compute_checksum(filepath)
        if current_checksum != expected_checksum:
            errors.append(f"Corrupted: {filepath} (expected {expected_checksum}, got {current_checksum})")

    if errors:
        print("Integrity check failed:")
        for err in errors:
            print(f"  - {err}")
    else:
        print("Backup integrity verified successfully.")

if __name__ == "__main__":
    verify_backup_integrity('/path/to/backup')

Output

stdout
Manifest file integrity_check.json not found. Creating it now...
Manifest created. Run again to verify.

# Second run:
Backup integrity verified successfully.

How it works

The script first walks the backup directory tree with os.walk(), computing a SHA-256 checksum for every file using hashlib. Each checksum is stored in a JSON manifest file. On subsequent runs, it reloads this manifest and recomputes checksums for every listed file, flagging any mismatches as corruption and any missing paths as deletions. This two-phase approach lets you detect both file tampering and accidental deletions.

Common mistakes

  • Forgetting to update the manifest after adding new files — the script only creates the manifest once.
  • Using the wrong algorithm name (e.g., 'sha' instead of 'sha256') which raises an exception.
  • Storing the manifest inside the backup directory itself, causing a false recursive inclusion.

Variations

  1. Use `blake2b` instead of `sha256` for faster hashing on some systems.
  2. Add parallel hashing with `concurrent.futures.ThreadPoolExecutor` for large backup sets.

Real-world use cases

  • Verifying nightly database backups before archiving to cloud storage.
  • Monitoring a backup drive for silent data corruption between rotation cycles.
  • Validating that a migrated file server copy matches the original checksums.

Sponsored

Sponsored Reserved space — layout preview until AdSense is connected

Run this sample

Open the browser IDE to tweak the example and see results without installing anything.

Open editor

More from Automation & scripting

Related tutorials and quizzes for this topic.