How to Detect Unused Images in a Project with Python

A Python script that scans a website project folder, identifies all image files, and checks HTML/CSS/JS files to find which images are never referenced.

Medium · Python 3.9+ · Jun 28, 2026 · Automation & scripting

Python code

import os
import re
from pathlib import Path

def find_unused_images(project_path):
    image_exts = {'.png', '.jpg', '.jpeg', '.gif', '.svg', '.webp'}
    used_images = set()
    all_images = set()
    
    # Find all image files
    for root, _, files in os.walk(project_path):
        for file in files:
            if Path(file).suffix.lower() in image_exts:
                # Normalize so these paths compare equal to the normalized reference paths below
                all_images.add(os.path.normpath(os.path.join(root, file)))
    
    # Find references in HTML/CSS/JS files
    for root, _, files in os.walk(project_path):
        for file in files:
            if file.endswith(('.html', '.css', '.js')):
                filepath = os.path.join(root, file)
                try:
                    with open(filepath, 'r', encoding='utf-8', errors='ignore') as f:
                        content = f.read()
                        # Find image references in src, url(), etc.
                        refs = re.findall(r'[\'"]?([\w./-]+\.(?:png|jpg|jpeg|gif|svg|webp))[\'"]?', content, re.IGNORECASE)
                        for ref in refs:
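                            # Root-relative refs ("/images/a.png") resolve against the project root;
                            # all other refs resolve against the referencing file's directory.
                            base = project_path if ref.startswith('/') else os.path.dirname(filepath)
                            abs_path = os.path.normpath(os.path.join(base, ref.lstrip('/')))
                            if os.path.exists(abs_path):
                                used_images.add(abs_path)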
                except (IOError, UnicodeDecodeError):
                    continue
    
    unused = all_images - used_images
    return sorted(unused)

if __name__ == "__main__":
    import tempfile
    import shutil
    
    # Create test project structure
    test_dir = tempfile.mkdtemp()
    os.makedirs(os.path.join(test_dir, 'images'))
    os.makedirs(os.path.join(test_dir, 'css'))
    
    # Create some test files
    Path(os.path.join(test_dir, 'images', 'logo.png')).touch()
    Path(os.path.join(test_dir, 'images', 'banner.jpg')).touch()
    Path(os.path.join(test_dir, 'images', 'old_bg.gif')).touch()
    Path(os.path.join(test_dir, 'images', 'icon.svg')).touch()
    
    with open(os.path.join(test_dir, 'index.html'), 'w') as f:
        f.write('<img src="images/logo.png" alt="Logo">')
    
    with open(os.path.join(test_dir, 'css', 'style.css'), 'w') as f:
        f.write('background: url("../images/banner.jpg");')
    
    unused = find_unused_images(test_dir)
    print("Unused images found:")
    for img in unused:
        print(f"  {os.path.relpath(img, test_dir)}")
    
    shutil.rmtree(test_dir)

Output

stdout
Unused images found:
  images/icon.svg
  images/old_bg.gif

How it works

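The script walks the project directory twice: first to collect every image path, then to scan HTML, CSS, and JavaScript files for strings that look like image paths, using a regular expression. Each match, whether it comes from an HTML src attribute, a CSS url() call, or a JavaScript string, is resolved to a normalized filesystem path (relative references are resolved against the directory of the file that contains them) and recorded only if a file exists at that path. The unused images are the full set minus the referenced set. The regex matches the listed extensions case-insensitively, and source files that cannot be read are skipped. Paths assembled at runtime, such as concatenated JavaScript strings, are invisible to the regex, so treat the result as a list of candidates rather than a definitive answer.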
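To make the matching and resolution steps concrete, here is a minimal sketch that applies the same pattern to three sample strings and resolves one match. The helper names `extract_refs` and `resolve_ref` are hypothetical and exist only for this illustration.

```python
import os
import re

# Same pattern the script uses to spot image-like paths.
IMAGE_REF = re.compile(
    r'[\'"]?([\w./-]+\.(?:png|jpg|jpeg|gif|svg|webp))[\'"]?',
    re.IGNORECASE,
)

def extract_refs(text):
    """Return every image-like path the pattern finds in text."""
    return IMAGE_REF.findall(text)

def resolve_ref(source_file, ref):
    """Resolve ref against the directory of the file that mentions it."""
    return os.path.normpath(os.path.join(os.path.dirname(source_file), ref))

html = '<img src="images/logo.png" alt="Logo">'
css = 'background: url("../images/banner.jpg");'
js = "const hero = 'img/Hero.WEBP';"

print(extract_refs(html))  # ['images/logo.png']
print(extract_refs(css))   # ['../images/banner.jpg']
print(extract_refs(js))    # ['img/Hero.WEBP']

# A reference found in site/css/style.css climbs out of css/ into images/.
stylesheet = os.path.join('site', 'css', 'style.css')
print(resolve_ref(stylesheet, extract_refs(css)[0]))
```

On a POSIX system the last line prints `site/images/banner.jpg`; on Windows the separators are backslashes.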

Common mistakes

  • Forgetting to account for relative paths in CSS, which may use `../images/` style references
  • Not including all common image extensions like `.webp` or `.svg` in the regex
  • Assuming all file names are lowercase; the script handles case-insensitive extensions but reference paths may still vary
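The last point deserves a closer look. On a case-sensitive filesystem, a reference to `images/logo.png` does not match a file stored as `images/Logo.PNG`, so the script would report that image as unused. The sketch below is one possible way to surface such near-misses; `case_insensitive_matches` is a hypothetical helper, not part of the script above.

```python
def case_insensitive_matches(image_paths, referenced_paths):
    """Map each reference that has no exact match to the on-disk image
    it would match if letter case were ignored."""
    exact = set(image_paths)
    by_folded = {p.casefold(): p for p in image_paths}
    mismatches = {}
    for ref in referenced_paths:
        if ref in exact:
            continue  # exact match, nothing to flag
        candidate = by_folded.get(ref.casefold())
        if candidate is not None:
            mismatches[ref] = candidate
    return mismatches

images = ['images/Logo.PNG', 'images/banner.jpg']
refs = ['images/logo.png', 'images/banner.jpg', 'images/missing.gif']

print(case_insensitive_matches(images, refs))
# {'images/logo.png': 'images/Logo.PNG'}
```

Reviewing these pairs before deleting anything helps avoid removing an image that is in use but referenced with different casing.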

Variations

  1. Use `Path.rglob('*')` from `pathlib` to collect image and source files in a single traversal instead of two `os.walk` passes.
  2. Add an option to delete the unused files, guarded by a dry-run mode that only reports what would be removed, or integrate with a codebase cleanup tool.
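Both variations can be sketched together. The version below is an assumption-laden rewrite, not the original script: `find_unused_images_rglob` and `remove_unused` are hypothetical names, paths are `Path` objects rather than strings, and deletion stays off unless `dry_run=False` is passed explicitly.

```python
import re
import tempfile
from pathlib import Path

IMAGE_EXTS = {'.png', '.jpg', '.jpeg', '.gif', '.svg', '.webp'}
SOURCE_EXTS = {'.html', '.css', '.js'}
IMAGE_REF = re.compile(
    r'[\'"]?([\w./-]+\.(?:png|jpg|jpeg|gif|svg|webp))[\'"]?', re.IGNORECASE
)

def find_unused_images_rglob(project_path):
    """Single-traversal variant built on Path.rglob."""
    root = Path(project_path).resolve()
    images, sources = set(), []
    # One pass sorts every file into images or source files.
    for path in root.rglob('*'):
        if not path.is_file():
            continue
        suffix = path.suffix.lower()
        if suffix in IMAGE_EXTS:
            images.add(path)
        elif suffix in SOURCE_EXTS:
            sources.append(path)

    used = set()
    for src in sources:
        try:
            text = src.read_text(encoding='utf-8', errors='ignore')
        except OSError:
            continue  # unreadable file: skip it, as the original script does
        for ref in IMAGE_REF.findall(text):
            candidate = (src.parent / ref).resolve()
            if candidate in images:
                used.add(candidate)
    return sorted(images - used)

def remove_unused(paths, dry_run=True):
    """Delete the given files unless dry_run is set; return what was deleted."""
    deleted = []
    for path in paths:
        if dry_run:
            print(f"would delete {path}")
        else:
            path.unlink()
            deleted.append(path)
    return deleted

with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp)
    (project / 'images').mkdir()
    (project / 'images' / 'logo.png').touch()
    (project / 'images' / 'old_bg.gif').touch()
    (project / 'index.html').write_text('<img src="images/logo.png">')

    unused = find_unused_images_rglob(project)
    print([p.name for p in unused])      # ['old_bg.gif']
    remove_unused(unused, dry_run=True)  # reports only; nothing is deleted
```

Defaulting `dry_run` to `True` means a forgotten argument produces a report rather than a deletion, which seems the safer failure mode for a cleanup tool.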

Real-world use cases

  • Cleaning up legacy website repos before deployment to reduce repository size and build time.
  • Auditing image assets in a CMS static export to remove orphaned files from the server.
  • Running in CI/CD pipelines to alert on unused assets and keep the project lean.
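For the CI/CD case, a pipeline step usually needs an exit code rather than a printed list. The following is a rough sketch under that assumption; `ci_report` and its `max_allowed` threshold are hypothetical and would wrap whatever list the detection function returns.

```python
import os
import sys

def ci_report(unused, project_root, max_allowed=0, stream=None):
    """Print each unused image and return an exit code: 0 passes, 1 fails."""
    stream = stream or sys.stderr
    for path in unused:
        print(f"unused image: {os.path.relpath(path, project_root)}", file=stream)
    return 1 if len(unused) > max_allowed else 0

# Hard-coded sample standing in for the detection function's result.
sample_unused = [os.path.join('site', 'images', 'old_bg.gif')]
exit_code = ci_report(sample_unused, 'site')
print(exit_code)  # 1
```

In a real pipeline the script would end with something like `sys.exit(ci_report(find_unused_images(root), root))`, so the job fails whenever the count exceeds the threshold.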
