Scrape HTML Tables and Convert Them to CSV Using Beautiful Soup in Python

Scrape a Wikipedia table with Beautiful Soup and write the data to a CSV file using the csv module.

Medium | Python 3.9+ | Jun 27, 2026 | Files & data

Tags: web scraping, beautiful soup, csv, tables, requests, data extraction

Requires third-party packages — install first

pip install requests beautifulsoup4

Python code

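import csv

import requests
from bs4 import BeautifulSoup

url = "https://en.wikipedia.org/wiki/List_of_countries_by_GDP_(nominal)"
# Position of the GDP table among the 'wikitable' tables; this may shift if the page is edited.
table_index = 2

# Fetch the page, giving up after 10 seconds and raising on HTTP error statuses.
response = requests.get(url, timeout=10)
response.raise_for_status()
soup = BeautifulSoup(response.text, 'html.parser')

# Collect every table that carries the 'wikitable' class.
tables = soup.find_all('table', {'class': 'wikitable'})

# Make sure the list is long enough before indexing into it.
if len(tables) > table_index:
    target_table = tables[table_index]
    rows = target_table.find_all('tr')

    # newline='' leaves line endings to the csv module.
    with open('countries_gdp.csv', 'w', newline='', encoding='utf-8') as f:
        writer = csv.writer(f)
        for row in rows:
            # Keep header cells (<th>) and data cells (<td>) in document order.
            cols = row.find_all(['th', 'td'])
            cols = [col.get_text(strip=True) for col in cols]
            writer.writerow(cols)

    print("CSV file 'countries_gdp.csv' created successfully.")
else:
    print(f"Expected at least {table_index + 1} tables with class 'wikitable', found {len(tables)}.")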

Output

stdout

CSV file 'countries_gdp.csv' created successfully.

How it works

The code sends an HTTP GET request to the target URL and parses the returned HTML with BeautifulSoup. It collects every <table> element that carries the CSS class 'wikitable' and takes the third one (index 2), which the snippet assumes is the GDP table; that position depends on the page's current layout. For each <tr> row, the header and data cells (<th> and <td>) are read with get_text(strip=True), which trims the whitespace around each piece of text in the cell and joins the pieces with no separator, and the resulting list is written as one CSV row. Passing newline='' to open() leaves line endings to the csv module; without it, Windows output gets a blank line after every row. The result is a CSV file that spreadsheet applications can open directly.
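The row-by-row extraction described above can be tried without any network access. The following is a minimal sketch, assuming a small hand-written table; `SAMPLE_HTML`, `table_to_rows`, and the country names are hypothetical and only stand in for a downloaded page.

```python
import csv
import io

from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for a downloaded page.
SAMPLE_HTML = """
<table class="wikitable">
  <tr><th>Country</th><th>GDP</th></tr>
  <tr><td> Freedonia </td><td>1,200</td></tr>
  <tr><td>Sylvania</td><td> 950 </td></tr>
</table>
"""


def table_to_rows(table):
    """Return the table as a list of rows, each a list of cell strings."""
    rows = []
    for tr in table.find_all("tr"):
        # Header and data cells are collected together, in document order.
        cells = tr.find_all(["th", "td"])
        rows.append([cell.get_text(strip=True) for cell in cells])
    return rows


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
rows = table_to_rows(soup.find("table", class_="wikitable"))

# Write to an in-memory buffer so the sketch leaves no file behind.
buffer = io.StringIO(newline="")
csv.writer(buffer).writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

Note that the value `1,200` contains a comma, so the csv module wraps it in quotes when writing; this is one reason to prefer `csv.writer` over joining cells by hand.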

Common mistakes

Hard-coding a table index; the target table can move to a different position whenever the page is edited.
Forgetting to install the required packages: `pip install requests beautifulsoup4`.
Not handling missing tables gracefully: an `if tables:` test only rules out an empty list, so check that `len(tables)` is greater than the index before accessing it.
Omitting `newline=''` in `open()`, which can cause extra blank lines in the CSV on Windows.
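The first and third mistakes above can be guarded against in a few lines. This is a sketch under stated assumptions rather than a definitive approach: `table_at` and `table_with_header` are hypothetical helpers, and matching on an exact header text such as "GDP" is an illustrative choice that would need adapting to the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup with two tables; only the second holds GDP data.
SAMPLE_HTML = """
<table class="wikitable"><tr><th>Year</th><th>Event</th></tr></table>
<table class="wikitable">
  <tr><th>Country</th><th>GDP</th></tr>
  <tr><td>Freedonia</td><td>1200</td></tr>
</table>
"""


def table_at(tables, index):
    """Return tables[index], or None when the list is too short."""
    return tables[index] if 0 <= index < len(tables) else None


def table_with_header(tables, header_text):
    """Return the first table that has a <th> with exactly this text, else None."""
    for table in tables:
        headers = [th.get_text(strip=True) for th in table.find_all("th")]
        if header_text in headers:
            return table
    return None


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
tables = soup.find_all("table", class_="wikitable")

missing = table_at(tables, 2)                  # None: only two tables exist here
gdp_table = table_with_header(tables, "GDP")   # found by content, not position
```

Selecting by header text tends to survive page edits that reorder tables, though it breaks in turn if the header wording changes, so either approach benefits from an explicit `None` check before writing the CSV.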

Variations

Use `pandas.read_html()` to parse HTML tables directly into DataFrames and then save them with `DataFrame.to_csv()`; this needs pandas plus an HTML parser such as lxml installed.
Loop through all wikitable tables and save each to a separate CSV file.
Filter rows based on a condition (e.g., countries above a certain GDP) before writing.
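The last two variations might look like the sketch below. It assumes hand-written sample markup and hypothetical helper names (`table_to_rows`, `keep_rows_above`, `save_all_tables`); the `table_<n>.csv` file naming and the rule for parsing numbers (strip thousands separators, then `float`) are illustrative choices, not part of the original snippet.

```python
import csv
import tempfile
from pathlib import Path

from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for a downloaded page.
SAMPLE_HTML = """
<table class="wikitable">
  <tr><th>Year</th><th>Event</th></tr>
  <tr><td>2020</td><td>Census</td></tr>
</table>
<table class="wikitable">
  <tr><th>Country</th><th>GDP</th></tr>
  <tr><td>Freedonia</td><td>1,200</td></tr>
  <tr><td>Sylvania</td><td>950</td></tr>
</table>
"""


def table_to_rows(table):
    """Return the table as a list of rows, each a list of cell strings."""
    return [
        [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
        for tr in table.find_all("tr")
    ]


def keep_rows_above(rows, column, threshold):
    """Keep the header row plus data rows whose numeric column exceeds threshold."""
    if not rows:
        return []
    header, *data = rows
    kept = [header]
    for row in data:
        try:
            value = float(row[column].replace(",", ""))
        except (IndexError, ValueError):
            continue  # skip rows with a missing or non-numeric cell
        if value > threshold:
            kept.append(row)
    return kept


def save_all_tables(soup, out_dir):
    """Write every wikitable to out_dir/table_<n>.csv and return the paths."""
    paths = []
    for n, table in enumerate(soup.find_all("table", class_="wikitable")):
        path = Path(out_dir) / f"table_{n}.csv"
        with open(path, "w", newline="", encoding="utf-8") as f:
            csv.writer(f).writerows(table_to_rows(table))
        paths.append(path)
    return paths


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Filter the second table: keep only rows whose GDP column is above 1000.
gdp_rows = table_to_rows(soup.find_all("table", class_="wikitable")[1])
large = keep_rows_above(gdp_rows, column=1, threshold=1000)

# Save every table to its own file (a temporary directory keeps the demo tidy).
with tempfile.TemporaryDirectory() as tmp:
    written = [p.name for p in save_all_tables(soup, tmp)]
```

In a real script the temporary directory would be replaced by a regular output folder, and the filtered rows would be passed to `csv.writer` in the same way as the unfiltered ones.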

Real-world use cases

Extracting a list of country statistics from Wikipedia for a data visualization project.
Automating collection of sports leaderboard tables from a website for a report.
Gathering product pricing tables from a comparison page to analyze market trends.
