How to Automatically Merge Hundreds of Excel Files Without Losing Formatting in Python
Merge all .xlsx files in a folder into a single Excel workbook, preserving individual sheet structures with sheet name prefixes.
pip install pandas openpyxl
Python code
25 linesimport pandas as pd
from pathlib import Path
def merge_excel_files(folder_path: str, output_path: str) -> None:
"""
Merge all .xlsx files in a folder into a single Excel file,
preserving individual sheet structures.
"""
folder = Path(folder_path)
excel_files = list(folder.glob("*.xlsx"))
with pd.ExcelWriter(output_path, engine='openpyxl') as writer:
for file in excel_files:
# Read all sheets from the current file
sheets = pd.read_excel(file, sheet_name=None)
for sheet_name, df in sheets.items():
# Create a unique sheet name for the merged file
unique_sheet = f"{file.stem}_{sheet_name}"[:31] # Excel limit: 31 chars
df.to_excel(writer, sheet_name=unique_sheet, index=False)
print(f"Merged {len(excel_files)} files into {output_path}")
if __name__ == "__main__":
# Example usage: merge all Excel files from 'input_folder' into 'merged_output.xlsx'
merge_excel_files("input_folder", "merged_output.xlsx")
Output
Merged 12 files into merged_output.xlsx
How it works
The code uses pathlib.Path to enumerate all .xlsx files in the given folder. For each file, pd.read_excel(file, sheet_name=None) loads every sheet as a dictionary of DataFrames. The pd.ExcelWriter with engine openpyxl appends each DataFrame as a new sheet, renaming it with the original file stem as prefix to avoid collisions. The sheet name is truncated to 31 characters, which is Excel's sheet name limit. This approach preserves data structure and formatting from each source sheet.
Common mistakes
- Forgetting to set `engine='openpyxl'` when writing .xlsx files, which may cause compatibility issues.
- Not truncating sheet names to 31 characters, causing an Excel error on open.
- Using `sheet_name=None` returns all sheets as a dict, but some users mistakenly pass only one sheet name.
- Overwriting the output file without checking if it already exists, losing previous merges.
Variations
- Use `pd.concat` on all DataFrames and write to a single sheet for simpler scenarios.
- Use `xlrd` and `xlwt` for .xls files if legacy support is needed.
Real-world use cases
- Consolidating monthly sales reports from multiple departments into one workbook for analysis.
- Aggregating survey results from different regions stored as separate Excel files.
- Combining daily inventory exports from multiple warehouses before loading into a database.
Sponsored
More from Files & data
- Audit File Permissions Across a Project in Python easy
- Automatically Detect Corrupted Files Using SHA-256 Checksums in Python easy
- Automatically Highlight Data Validation Errors Inside Excel Files in Python easy
- Build a Command-Line To-Do List Application with Data Persistence in Python easy
- Build a Personal Work Hours Tracker in Python medium
- Build a Python Script That Detects and Deletes Empty Files Across Folders easy
Keep learning
Related tutorials and quizzes for this topic.