How to Find HTML Elements by Tag, Class, ID, CSS Selector, and Attribute in BeautifulSoup
Parse an HTML string with BeautifulSoup and demonstrate five distinct ways to locate elements: by tag name, by class, by ID, by CSS selector, and by attribute.
pip install beautifulsoup4
Python code
from bs4 import BeautifulSoup
html_content = """
<html><body>
<h1 id="title" class="heading">Hello World</h1>
<p class="content">First paragraph</p>
<p class="content special">Second paragraph</p>
<a href="https://example.com" class="link">Click here</a>
<div id="footer">
<p>© 2024</p>
</div>
</body></html>
"""
soup = BeautifulSoup(html_content, 'html.parser')
# By tag: every <p> element in the document
for tag in soup.find_all('p'):
    print(f"Tag: {tag.name} -> {tag.text}")
# By class: class_ is used because class is a reserved word
print("\nBy class 'content':")
for elem in soup.find_all(class_='content'):
    print(elem.text)
# By ID: find() returns the first match
print("\nBy ID 'title':")
print(soup.find(id='title').text)
# By CSS selector: elements with class "link" that also have an href
print("\nBy CSS selector '.link[href]':")
for elem in soup.select('.link[href]'):
    print(elem['href'])
# By attribute: any element that has an href attribute
print("\nBy attribute href:")
for elem in soup.find_all(href=True):
    print(elem['href'])
Output
Tag: p -> First paragraph
Tag: p -> Second paragraph
Tag: p -> © 2024

By class 'content':
First paragraph
Second paragraph

By ID 'title':
Hello World

By CSS selector '.link[href]':
https://example.com

By attribute href:
https://example.com
How it works
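find_all() is the workhorse for locating multiple tags. It accepts a tag name plus keyword arguments that filter by attribute: class_='content' matches any element whose class list includes content (so class="content special" also matches), and href=True matches any element that has an href attribute, whatever its value. find(id='title') returns the first matching Tag, or None if nothing matches. The select() method accepts CSS selector syntax, so a single string such as .link[href] can combine a class test with an attribute test. All of these methods operate on the parsed tree rather than on the raw string; the parser repairs malformed markup on a best-effort basis, and different parsers can repair the same input differently.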
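The behaviours described above can be checked with a short sketch. The markup and variable names here are illustrative assumptions, not taken from a real page. It shows that `class_` matches an element with several classes, that `find()` returns `None` when nothing matches, and that `select()` returns a list of tags.

```python
from bs4 import BeautifulSoup

# Illustrative markup (an assumption for this sketch)
html = """
<h1 id="title" class="heading">Hello World</h1>
<p class="content">First paragraph</p>
<p class="content special">Second paragraph</p>
<a href="https://example.com" class="link">Click here</a>
"""
soup = BeautifulSoup(html, "html.parser")

# class_ is tested against each class on the element,
# so class="content special" is matched as well
content_texts = [p.get_text() for p in soup.find_all(class_="content")]

# find() returns one Tag, or None when nothing matches
title = soup.find(id="title")
missing = soup.find(id="does-not-exist")

# select() takes a CSS selector and returns a list of matching Tags
link_hrefs = [a["href"] for a in soup.select(".link[href]")]

print(content_texts)
print(title.get_text(), missing)
print(link_hrefs)
```

Because `find()` can return `None`, it is usually worth checking the result before reading `.text` from it.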
Common mistakes
- Using `class` instead of `class_` (since `class` is a reserved word in Python)
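- Forgetting that `find_all()` returns a list of Tag objects, not a single object, so reading `.text` directly from the result raises an `AttributeError`; iterate over it or index into it first
- Assuming the parsed tree mirrors malformed source exactly: the parser repairs mismatched tags before `find_all()` or `select()` runs, and different parsers (`html.parser`, `lxml`, `html5lib`) can repair the same markup differently
- Using `select('#...')` when the id contains characters such as `.` or `:` that have their own meaning in CSS selectors; `find(id=...)` compares the value as a plain string and avoids the problem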
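Two of these pitfalls are easy to reproduce. The sketch below uses hypothetical markup in which an id contains a dot, and compares `find(id=...)` with `select()`; it also shows one way to handle the list that `find_all()` returns.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the id "main.title" contains a dot
html = (
    '<h1 id="main.title">Report</h1>'
    '<p class="note">One</p>'
    '<p class="note">Two</p>'
)
soup = BeautifulSoup(html, "html.parser")

# find(id=...) compares the attribute value as a plain string
by_find = soup.find(id="main.title")

# In CSS, "#main.title" means id "main" AND class "title",
# so this selector matches nothing in the markup above
by_select = soup.select("#main.title")

# An attribute selector with a quoted value matches the literal id
by_quoted = soup.select_one('[id="main.title"]')

# find_all() returns a list-like result: iterate before reading text
notes = soup.find_all(class_="note")
note_texts = [n.get_text() for n in notes]

print(by_find.get_text(), len(by_select), by_quoted.get_text())
print(note_texts)
```

The quoted attribute selector is only one option here; sticking with `find(id=...)` is often the simpler choice when the id is known in advance.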
Variations
- Use `soup.select_one('.link')` to get only the first matching element instead of all matches
- Pass `limit=N` to `find_all()` to stop after N matches; `find()` behaves like `limit=1` but returns the element itself rather than a one-item list
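A minimal sketch of these variations, with `find()` included for comparison. The three links are made-up sample data.

```python
from bs4 import BeautifulSoup

# Made-up sample markup with three links
html = """
<a class="link" href="/a">A</a>
<a class="link" href="/b">B</a>
<a class="link" href="/c">C</a>
"""
soup = BeautifulSoup(html, "html.parser")

# select_one() returns the first match as a single Tag (or None)
first_link = soup.select_one(".link")

# limit caps how many matches find_all() collects; the result is still a list
first_two = soup.find_all("a", limit=2)

# find() returns the first match directly, not wrapped in a list
single = soup.find("a")

print(first_link["href"])
print([a["href"] for a in first_two])
print(single["href"])
```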
Real-world use cases
- Scraping product names from an e-commerce site by finding all 'span' tags of a certain class
- Extracting specific links from a webpage using CSS selectors with attribute filters
- Parsing an HTML email template to locate the footer div by its unique ID
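The three use cases above might look roughly like the following. Everything in the markup (the `product-name` class, the URLs, the `footer` id) is a hypothetical stand-in for whatever the real page uses, so treat this as a sketch rather than a ready-made scraper.

```python
from bs4 import BeautifulSoup

# Hypothetical page fragment; class names, ids and URLs are assumptions
html = """
<div class="product"><span class="product-name">Kettle</span></div>
<div class="product"><span class="product-name">Toaster</span></div>
<a href="https://example.com/docs">Docs</a>
<a href="/local">Local</a>
<div id="footer"><p>Contact us</p></div>
"""
soup = BeautifulSoup(html, "html.parser")

# Product names: all <span> tags carrying a given class
product_names = [
    span.get_text(strip=True)
    for span in soup.find_all("span", class_="product-name")
]

# Specific links: ^= matches href values that start with the prefix
external_links = [a["href"] for a in soup.select('a[href^="https://"]')]

# Footer: look the div up by its id and guard against a missing element
footer = soup.find("div", id="footer")
footer_text = footer.get_text(strip=True) if footer is not None else ""

print(product_names)
print(external_links)
print(footer_text)
```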