
Build a Website Accessibility Scanner Using Python

Scans a webpage for common accessibility issues (images without alt text, no heading tags, form inputs without labels, and a missing <main> landmark) using requests and Python's built-in HTMLParser.

Medium · Python 3.9+ · Jun 28, 2026 · Automation & scripting

Requires third-party packages — install first
pip install requests

Python code

import requests
from html.parser import HTMLParser

class AccessibilityParser(HTMLParser):
    """Collects a few basic accessibility signals while parsing HTML."""

    def __init__(self):
        super().__init__()
        self.images_without_alt = []  # src of each <img> with no alt attribute
        self.missing_headings = True  # flips to False on the first h1-h6
        self.has_main_tag = False
        self.label_targets = set()    # ids referenced by <label for="...">
        self.inputs = []              # (id, name) of every <input> seen

    def handle_starttag(self, tag, attrs):
        attrs_dict = dict(attrs)
        if tag == 'img' and 'alt' not in attrs_dict:
            self.images_without_alt.append(attrs_dict.get('src', 'unknown'))
        if tag in ('h1', 'h2', 'h3', 'h4', 'h5', 'h6'):
            self.missing_headings = False
        if tag == 'main':
            self.has_main_tag = True
        if tag == 'label' and attrs_dict.get('for'):
            self.label_targets.add(attrs_dict['for'])
        if tag == 'input':
            self.inputs.append((attrs_dict.get('id', ''), attrs_dict.get('name', 'unknown')))

    @property
    def inputs_without_label(self):
        # Resolved after parsing, so a <label> that follows its <input> still counts.
        return [name for input_id, name in self.inputs if input_id not in self.label_targets]

def scan_url(url):
    """Fetch a page and return a plain-text summary of the issues found."""
    try:
        # requests has no default timeout, so always pass one explicitly.
        response = requests.get(url, timeout=10)
        response.raise_for_status()
        parser = AccessibilityParser()
        parser.feed(response.text)
        parser.close()  # process any markup still buffered by the parser

        issues = []
        if parser.images_without_alt:
            issues.append(f"Missing alt text on {len(parser.images_without_alt)} images")
        if parser.missing_headings:
            issues.append("No heading tags (h1-h6) found")
        if parser.inputs_without_label:
            issues.append(f"{len(parser.inputs_without_label)} inputs missing associated labels")
        if not parser.has_main_tag:
            issues.append("No <main> landmark element found")

        if not issues:
            return f"{url}: No accessibility issues found"
        return f"{url}: Found accessibility issues:\n" + "\n".join(issues)
    except requests.RequestException as e:
        # Covers connection errors, timeouts, and non-2xx responses.
        return f"{url}: Error scanning - {e}"

if __name__ == "__main__":
    test_url = "https://example.com"
    print(scan_url(test_url))

Output

stdout
https://example.com: Found accessibility issues:
Missing alt text on 1 images
No heading tags (h1-h6) found
No <main> landmark element found

The exact lines reported depend on the markup the page serves when the scan runs.

How it works

The HTMLParser class from html.parser parses HTML using only the standard library. Subclassing it and overriding handle_starttag lets you inspect every opening tag and its attributes as the document streams through. The scanner flags <img> tags with no alt attribute, notes whether any heading tag (h1-h6) or <main> landmark appears, and checks whether each <input> has an id referenced by a <label for="...">. It only sees the HTML the server returns, so content rendered by JavaScript is not checked. The approach is fast and lightweight, which suits quick audits of simple, mostly static pages.
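
To see the callback mechanism without touching the network, the sketch below feeds a small HTML string to a stripped-down parser. The class name TagCounter and the sample markup are illustrative assumptions, not part of the scanner above.

```python
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Hypothetical helper: records every start tag and images lacking alt."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.images_without_alt = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs with lowercased names.
        attrs_dict = dict(attrs)
        self.tags.append(tag)
        if tag == "img" and "alt" not in attrs_dict:
            self.images_without_alt.append(attrs_dict.get("src", "unknown"))


SAMPLE_HTML = """
<main>
  <h1>Title</h1>
  <img src="logo.png">
  <img src="divider.png" alt="">
</main>
"""

counter = TagCounter()
counter.feed(SAMPLE_HTML)
counter.close()

print(counter.tags)                # ['main', 'h1', 'img', 'img']
print(counter.images_without_alt)  # ['logo.png']
```

Only logo.png is reported: the second image carries an explicit empty alt, which marks it as decorative rather than undescribed.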

Common mistakes

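  • Calling requests.get without a timeout (requests sets none by default, so a stalled server can hang the scan), or forgetting that redirects are followed automatically, so the page scanned may not be the URL you passed in.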
  • Not accounting for inputs with an aria-label or aria-labelledby attribute as accessible alternatives.
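  • Assuming every image needs descriptive alt text. Decorative images should carry an empty alt="" so screen readers skip them; the scanner only flags images with no alt attribute at all.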
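
The last two points could be handled roughly as follows. This is a simplified sketch: the function names are hypothetical, the attribute dictionaries are assumed to come from dict(attrs) inside handle_starttag, and cases such as an input nested inside its <label> are not covered.

```python
def input_has_accessible_name(attrs, label_targets):
    """Return True if an <input> is named by a label or an ARIA attribute.

    attrs: dict of the input's attributes.
    label_targets: set of ids referenced by <label for="..."> on the page.
    """
    if (attrs.get("aria-label") or "").strip():
        return True
    if (attrs.get("aria-labelledby") or "").strip():
        return True
    return attrs.get("id") in label_targets


def classify_image(attrs):
    """Return 'missing', 'decorative', or 'described' for an <img>."""
    if "alt" not in attrs:
        return "missing"
    # A valueless or empty alt marks the image as decorative.
    return "described" if (attrs["alt"] or "").strip() else "decorative"


print(input_has_accessible_name({"id": "email"}, {"email"}))                    # True
print(input_has_accessible_name({"name": "q", "aria-label": "Search"}, set()))  # True
print(input_has_accessible_name({"name": "q"}, set()))                          # False
print(classify_image({"src": "line.png", "alt": ""}))                           # decorative
```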

Variations

  1. Use BeautifulSoup instead of HTMLParser for easier traversal.
  2. Add checks for color contrast by fetching computed styles via a headless browser.
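
For the color-contrast variation, once a headless browser has returned the computed foreground and background colors, the comparison itself is plain arithmetic. The sketch below follows the WCAG 2.x contrast-ratio definition; the function names are illustrative, and retrieving the colors from the browser is left out.

```python
def _linearize(channel):
    """Convert one 0-255 sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """Relative luminance of an (r, g, b) tuple with 0-255 channels."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground, background):
    """Return the contrast ratio, from 1.0 (identical) to 21.0 (black on white)."""
    l1 = relative_luminance(foreground)
    l2 = relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


ratio = contrast_ratio((0, 0, 0), (255, 255, 255))
print(round(ratio, 2))  # 21.0
print(ratio >= 4.5)     # True: 4.5:1 is the WCAG AA minimum for normal-size text
```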

Real-world use cases

  • CI pipeline hook that blocks deployment if a page has missing alt text or headings.
  • Scheduled nightly scan of company websites to generate accessibility compliance reports.
  • Quick audit of a static site before sending it for WCAG review.
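
The CI use case comes down to turning findings into a process exit code. A minimal sketch, assuming the pipeline has already built the site into local HTML files; find_issues and main are hypothetical names, and only the alt-text and heading checks are included to keep it short.

```python
from html.parser import HTMLParser
from pathlib import Path


class _BlockingChecks(HTMLParser):
    """Counts images without alt and notes whether any heading appears."""

    def __init__(self):
        super().__init__()
        self.missing_alt = 0
        self.has_heading = False

    def handle_starttag(self, tag, attrs):
        if tag == "img" and "alt" not in dict(attrs):
            self.missing_alt += 1
        if tag in ("h1", "h2", "h3", "h4", "h5", "h6"):
            self.has_heading = True


def find_issues(html):
    """Return a list of blocking issues found in an HTML string."""
    parser = _BlockingChecks()
    parser.feed(html)
    parser.close()
    issues = []
    if parser.missing_alt:
        issues.append(f"{parser.missing_alt} image(s) missing alt text")
    if not parser.has_heading:
        issues.append("no heading tags (h1-h6) found")
    return issues


def main(paths):
    """Check each file; return 1 if any has issues so the CI step fails."""
    failed = False
    for path in paths:
        issues = find_issues(Path(path).read_text(encoding="utf-8"))
        for issue in issues:
            print(f"{path}: {issue}")
        failed = failed or bool(issues)
    return 1 if failed else 0
```

In a pipeline script, the return value would typically be passed to sys.exit, for example sys.exit(main(sys.argv[1:])), so that a non-zero status stops the deployment step.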

Run locally

This sample needs third-party packages, so it cannot run in the browser IDE. Copy the code above, install the packages shown at the top, then run it in your own Python environment.
