Create a Local Search Engine to Instantly Find Files on Your Computer in Python
Build a local file search engine in Python that indexes files by name and extension, and supports glob-pattern searches over the index for fast retrieval.
Python code
import fnmatch
import os
import time
from pathlib import Path


class LocalSearchEngine:
    def __init__(self, root_directory="."):
        self.root_directory = Path(root_directory)
        # Lowercase filename -> path of the last file seen with that name
        self.file_index = {}
        # Lowercase extension (e.g. ".py") -> list of paths.
        # Kept separate so a file named ".py" cannot collide with the extension key.
        self.extension_index = {}

    def build_index(self):
        """Build a complete index of files in the root directory."""
        start_time = time.perf_counter()
        indexed_count = 0
        # Start from empty indexes so a rebuild does not duplicate entries
        self.file_index.clear()
        self.extension_index.clear()
        for root, dirs, files in os.walk(self.root_directory):
            for file in files:
                file_path = Path(root) / file
                file_lower = file.lower()
                # Index by exact filename (a later duplicate name overwrites an earlier one)
                self.file_index[file_lower] = file_path
                # Index by extension
                ext = file_path.suffix.lower()
                self.extension_index.setdefault(ext, []).append(file_path)
                indexed_count += 1
        elapsed_time = time.perf_counter() - start_time
        print(f"Indexed {indexed_count} files in {elapsed_time:.2f} seconds")

    def search_by_name(self, query):
        """Search for files by exact name (case-insensitive)."""
        query_lower = query.lower()
        if query_lower in self.file_index:
            return [self.file_index[query_lower]]
        return []

    def search_by_pattern(self, pattern):
        """Search for files matching a glob pattern (case-insensitive)."""
        results = []
        pattern_lower = pattern.lower()
        # Linear scan: every indexed filename is tested against the pattern
        for file_lower, file_path in self.file_index.items():
            if fnmatch.fnmatch(file_lower, pattern_lower):
                results.append(file_path)
        return results

    def search_by_extension(self, extension):
        """Search for files with a specific extension, with or without the leading dot."""
        ext = extension.lower()
        if not ext.startswith("."):
            ext = f".{ext}"
        # Return a copy so callers cannot mutate the index
        return list(self.extension_index.get(ext, []))


if __name__ == "__main__":
    # Demo with current directory
    engine = LocalSearchEngine(".")
    engine.build_index()

    # Example searches
    print("\n--- Search Examples ---")
    query_name = "test.py"
    results = engine.search_by_name(query_name)
    print(f"Files named '{query_name}': {results}")

    pattern = "*.txt"
    results = engine.search_by_pattern(pattern)
    print(f"Files matching '{pattern}': {results}")

    extension = ".py"
    results = engine.search_by_extension(extension)
    print(f"Files with extension '{extension}': {results[:5]}...")
Output
Indexed 42 files in 0.03 seconds
--- Search Examples ---
Files named 'test.py': [PosixPath('test.py')]
Files matching '*.txt': [PosixPath('notes.txt'), PosixPath('README.txt')]
Files with extension '.py': [PosixPath('main.py'), PosixPath('test.py'), PosixPath('utils.py'), PosixPath('search.py'), PosixPath('config.py')]...
How it works
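The os.walk generator traverses the directory tree top-down, yielding the current directory, its subdirectories, and its files at each step. File paths are stored in a dictionary keyed by lowercase filename, so an exact-name search is a single hash lookup (O(1) on average). For extension searches, files are grouped by suffix into lists keyed by the extension (e.g., '.py'). The fnmatch.fnmatch function applies Unix shell-style wildcard matching (*, ?, [seq]) to each indexed filename, so a pattern search is a linear scan over the index rather than a hash lookup. Building the index once up front replaces repeated directory scans with in-memory lookups, which keeps repeated searches fast. The index is a snapshot, so it must be rebuilt to pick up files created or deleted afterwards.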
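The difference between the two lookup paths can be sketched on a small in-memory index. This is a minimal illustration, not the class above: `name_index`, `exact_lookup`, and `pattern_lookup` are hypothetical names, the sample paths are made up, and `fnmatch.fnmatchcase` is used instead of `fnmatch.fnmatch` so the result does not depend on the operating system's case normalization.

```python
import fnmatch

# Hypothetical sample data standing in for the filename index.
name_index = {
    "readme.txt": "docs/README.txt",
    "notes.txt": "notes.txt",
    "main.py": "src/main.py",
    "data[1].csv": "data/data[1].csv",
}


def exact_lookup(name):
    """Exact-name search: one dictionary lookup, independent of index size."""
    path = name_index.get(name.lower())
    return [path] if path is not None else []


def pattern_lookup(pattern):
    """Pattern search: every key is tested, so cost grows with index size."""
    pattern_lower = pattern.lower()
    return [
        path
        for name, path in name_index.items()
        if fnmatch.fnmatchcase(name, pattern_lower)
    ]


print(exact_lookup("README.TXT"))
print(pattern_lookup("*.TXT"))
# "[" starts a character class, so a literal bracket is written as "[[]".
print(pattern_lookup("data[[]1].csv"))
```

Lowercasing both the keys and the pattern is what makes the match case-insensitive here; the pattern syntax itself is unchanged.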
Common mistakes
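- Joining `root` and `file` by hand instead of using `Path(root) / file`, which builds the full path with the correct separator on every platform.
- Overwriting filename entries when two files share the same name in different directories: the index stores only the last encountered path.
- Using `fnmatch` on the full path instead of just the filename, which can match unintended directories.
- Not handling hidden files or special characters in names: `Path('.env').suffix` is an empty string, so dotfiles are grouped under the empty extension, and `[` in a glob pattern starts a character class, so a literal bracket must be written as `[[]`.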
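One way to avoid the overwriting problem is to keep a list of paths per filename. The following is a minimal sketch under that assumption; `build_name_index` is a hypothetical helper, not part of the class above.

```python
import os
from collections import defaultdict
from pathlib import Path


def build_name_index(root_directory="."):
    """Map each lowercase filename to every path that carries that name."""
    name_index = defaultdict(list)
    for root, _dirs, files in os.walk(root_directory):
        for file in files:
            # Append instead of assign, so duplicates in other directories survive.
            name_index[file.lower()].append(Path(root) / file)
    return name_index


if __name__ == "__main__":
    index = build_name_index(".")
    duplicates = {name: paths for name, paths in index.items() if len(paths) > 1}
    print(f"{len(duplicates)} filenames appear in more than one directory")
```

The trade-off is that an exact-name search now returns a list that may hold several paths, so callers have to decide which one they want.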
Variations
- Use `pathlib.Path.rglob('*')` in place of `os.walk` for a more Pythonic traversal and build the index with a comprehension, filtering with `is_file()` because `rglob` also yields directories.
- Add a `search_by_size` or `search_by_date` method by extracting file stats from `os.stat` during indexing.
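Both variations can be combined in a short sketch. The names `build_stat_index` and `search_modified_since` are hypothetical, and `search_by_size` is written here as a standalone function rather than a method; the file size and modification time come from `Path.stat()`, which returns the same information as `os.stat`.

```python
import time
from pathlib import Path


def build_stat_index(root_directory="."):
    """Collect (path, size in bytes, modification time) for every regular file."""
    records = []
    for path in Path(root_directory).rglob("*"):
        if path.is_file():  # rglob also yields directories
            info = path.stat()
            records.append((path, info.st_size, info.st_mtime))
    return records


def search_by_size(records, min_bytes=0, max_bytes=None):
    """Return paths whose size lies within [min_bytes, max_bytes]."""
    return [
        path
        for path, size, _mtime in records
        if size >= min_bytes and (max_bytes is None or size <= max_bytes)
    ]


def search_modified_since(records, timestamp):
    """Return paths modified at or after the given Unix timestamp."""
    return [path for path, _size, mtime in records if mtime >= timestamp]


if __name__ == "__main__":
    records = build_stat_index(".")
    one_day_ago = time.time() - 24 * 60 * 60
    print(f"Files over 1 MiB: {len(search_by_size(records, min_bytes=1024 * 1024))}")
    print(f"Files changed in the last day: {len(search_modified_since(records, one_day_ago))}")
```

Storing the stats at indexing time keeps later searches in memory, at the cost of the values going stale until the index is rebuilt.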
Real-world use cases
- Quickly locate configuration files (e.g., .env, .ini) across a large project without relying on OS search tools.
- Build a file manager utility that lets users filter files by extension or wildcard in a GUI or terminal app.
- Implement a log-file scanner that indexes logs by date pattern and then searches for error messages within matching files.
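The log-scanner use case could be sketched as follows. This is an illustration under stated assumptions: `find_error_lines` is a hypothetical function, the `*.log` pattern and the `ERROR` marker are placeholders for whatever naming scheme and message format the logs actually use, and the scan walks the directory directly instead of reusing the index.

```python
import fnmatch
import os
from pathlib import Path


def find_error_lines(root_directory, name_pattern="*.log", needle="ERROR"):
    """Return (path, line number, line) for each matching line in matching files."""
    hits = []
    pattern_lower = name_pattern.lower()
    for root, _dirs, files in os.walk(root_directory):
        for file in files:
            # Match on the bare filename, not the full path.
            if not fnmatch.fnmatchcase(file.lower(), pattern_lower):
                continue
            path = Path(root) / file
            try:
                with open(path, encoding="utf-8", errors="replace") as handle:
                    for line_number, line in enumerate(handle, start=1):
                        if needle in line:
                            hits.append((path, line_number, line.rstrip("\n")))
            except OSError:
                # Skip files that cannot be opened (permissions, races, etc.).
                continue
    return hits


if __name__ == "__main__":
    for path, line_number, line in find_error_lines(".", "*.log"):
        print(f"{path}:{line_number}: {line}")
```

A date-based pattern such as `app-2024-01-*.log` narrows the scan to one month, assuming the date is part of the filename.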