Maintenance

Site is under maintenance — quizzes are still available.

Go to quizzes
Sponsored Reserved space — layout preview until AdSense is connected

Create a Local Search Engine to Instantly Find Files on Your Computer in Python

Build a local file search engine in Python that indexes files by name, extension, and glob pattern for instant retrieval.

Medium Python 3.9+ Jun 27, 2026 Automation & scripting 1 views 0 copies

Python code

76 lines
Python 3.9+
import os
import sys
import time
from pathlib import Path
import fnmatch

class LocalSearchEngine:
    def __init__(self, root_directory="."):
        self.root_directory = Path(root_directory)
        self.file_index = {}
        
    def build_index(self):
        """Build a complete index of files in the root directory."""
        start_time = time.time()
        indexed_count = 0
        
        for root, dirs, files in os.walk(self.root_directory):
            for file in files:
                file_path = Path(root) / file
                file_lower = file.lower()
                
                # Index by exact filename
                self.file_index[file_lower] = file_path
                
                # Index by extension
                ext = file_path.suffix.lower()
                if ext not in self.file_index:
                    self.file_index[ext] = []
                self.file_index[ext].append(file_path)
                
                indexed_count += 1
                
        elapsed_time = time.time() - start_time
        print(f"Indexed {indexed_count} files in {elapsed_time:.2f} seconds")
        
    def search_by_name(self, query):
        """Search for files by exact name (case-insensitive)."""
        query_lower = query.lower()
        if query_lower in self.file_index:
            return [self.file_index[query_lower]]
        return []
    
    def search_by_pattern(self, pattern):
        """Search for files matching a glob pattern."""
        results = []
        pattern_lower = pattern.lower()
        
        for file_lower, file_path in self.file_index.items():
            if isinstance(file_path, Path) and fnmatch.fnmatch(file_lower, pattern_lower):
                results.append(file_path)
                
        return results
    
    def search_by_extension(self, extension):
        """Search for files with a specific extension."""
        ext = extension.lower() if extension.startswith('.') else f".{extension}"
        return self.file_index.get(ext, [])

if __name__ == "__main__":
    # Demo with current directory
    engine = LocalSearchEngine(".")
    engine.build_index()
    
    # Example searches
    print("\n--- Search Examples ---")
    query_name = "test.py"
    results = engine.search_by_name(query_name)
    print(f"Files named '{query_name}': {results}")
    
    pattern = "*.txt"
    results = engine.search_by_pattern(pattern)
    print(f"Files matching '{pattern}': {results}")
    
    extension = ".py"
    results = engine.search_by_extension(extension)
    print(f"Files with extension '{extension}': {results[:5]}...")

Output

stdout
Indexed 42 files in 0.03 seconds

--- Search Examples ---
Files named 'test.py': [PosixPath('test.py')]
Files matching '*.txt': [PosixPath('notes.txt'), PosixPath('README.txt')]
Files with extension '.py': [PosixPath('main.py'), PosixPath('test.py'), PosixPath('utils.py'), PosixPath('search.py'), PosixPath('config.py')]...

How it works

The os.walk generator traverses the directory tree efficiently, yielding root, directories, and files. We store file paths in a dictionary keyed by lowercase filename for O(1) exact match lookups. For extension searches, we group files by their suffix into lists under a separate key (e.g., '.py'). The fnmatch.fnmatch function provides Unix-like glob pattern matching on the indexed filenames. Building the index once up front avoids slow repeated directory scans, making repeated searches fast.

Common mistakes

  • Forgetting to use `Path(root) / file` to construct the full path, which works cross-platform.
  • Overwriting filename entries when two files share the same name in different directories — the index stores only the last encountered path.
  • Using `fnmatch` on the full path instead of just the filename, which can match unintended directories.
  • Not handling hidden files or files with special characters in names.

Variations

  1. Use `pathlib.Path.rglob('*')` for a more Pythonic walk and list comprehension to build the index.
  2. Add a `search_by_size` or `search_by_date` method by extracting file stats from `os.stat` during indexing.

Real-world use cases

  • Quickly locate configuration files (e.g., .env, .ini) across a large project without relying on OS search tools.
  • Build a file manager utility that lets users filter files by extension or wildcard in a GUI or terminal app.
  • Implement a log-file scanner that indexes logs by date pattern and then searches for error messages within matching files.

Sponsored

Sponsored Reserved space — layout preview until AdSense is connected

Run this sample

Open the browser IDE to tweak the example and see results without installing anything.

Open editor

More from Automation & scripting

Related tutorials and quizzes for this topic.