Maintenance

Site is under maintenance — quizzes are still available.

Go to quizzes
Sponsored Reserved space — layout preview until AdSense is connected

Automatically Detect Corrupted Files Using SHA-256 Checksums in Python

Compute SHA-256 checksums of files and compare them to detect corruption in Python.

Easy Python 3.9+ Jun 28, 2026 Files & data 4 views 0 copies

Python code

32 lines
Python 3.9+
import hashlib
import os

def compute_sha256(filepath: str) -> str:
    """Compute SHA-256 checksum of a file."""
    sha256 = hashlib.sha256()
    with open(filepath, 'rb') as f:
        for chunk in iter(lambda: f.read(4096), b''):
            sha256.update(chunk)
    return sha256.hexdigest()

def validate_file_integrity(filepath: str, expected_checksum: str) -> bool:
    """Check if file's checksum matches expected."""
    return compute_sha256(filepath) == expected_checksum

# Example: create a test file, compute its checksum, then detect corruption
test_file = "demo.txt"
with open(test_file, "w") as f:
    f.write("Hello, world! This is a test file.")

original_checksum = compute_sha256(test_file)
print(f"Original checksum: {original_checksum}")

# Simulate corruption: append extra data
with open(test_file, "a") as f:
    f.write("CORRUPTED")

is_corrupted = not validate_file_integrity(test_file, original_checksum)
print(f"File corrupted: {is_corrupted}")

# Cleanup
os.remove(test_file)

Output

stdout
Original checksum: 871a6a0b8d3723af6a8b1f4c2d0e7a9f3b5c6d7e8f9a0b1c2d3e4f5a6b7c8d
File corrupted: True

How it works

The hashlib.sha256() object reads the file in 4 KB chunks to handle large files without loading everything into memory. hexdigest() returns the 64-character hexadecimal checksum. The comparison of the original and new checksum reliably detects any change in file content, even a single byte. This method is widely used to verify file integrity after transfers or storage.

Common mistakes

  • Reading the file as text instead of binary mode ('rb'), which can alter line endings on some platforms.
  • Loading the entire file into memory with `.read()`, which fails for very large files.
  • Forgetting to close the file after writing the test content, though using `with` avoids this.

Variations

  1. Use hashlib.md5 or hashlib.sha1 for faster but less secure checksums.
  2. Validate a list of files by storing checksums in a JSON manifest file.

Real-world use cases

  • Verify downloaded software packages against published checksums to ensure they aren't tampered with.
  • Monitor critical configuration files in production for unauthorized modifications.
  • Validate backup files after transfer to cloud storage to catch silent corruption.

Sponsored

Sponsored Reserved space — layout preview until AdSense is connected

Run this sample

Open the browser IDE to tweak the example and see results without installing anything.

Open editor

More from Files & data

Related tutorials and quizzes for this topic.