Introduction
This tutorial shows how to build a simple web scraper with Python and Beautiful Soup. If you want to extract data from web pages for research, monitoring, or lightweight automation, this is a practical place to start. The example is kept small and focused so you can pick up the core workflow quickly.
What You Need
Before writing any code, install the two required libraries: requests, which downloads a page's HTML, and Beautiful Soup, which parses it. This guide assumes you already have Python installed.
pip install requests beautifulsoup4
How a Simple Scraper Works
A basic scraper usually follows three steps:
1. Request the page
Send an HTTP request to the target URL and get its HTML response.
2. Parse the HTML
Use Beautiful Soup to read the markup and locate the elements you need.
3. Extract the data
Collect text, links, headings, prices, or other structured values from the page.
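The three steps above can be sketched as three small functions. This is a minimal illustration rather than a finished scraper: the function names request_page, parse_html, and extract_prices are hypothetical, the 10-second timeout is an arbitrary choice, and the sample HTML is made up so the parsing and extraction steps can be tried without a network connection.

```python
import requests
from bs4 import BeautifulSoup


def request_page(url):
    """Step 1: request the page and return its HTML as text."""
    response = requests.get(url, timeout=10)
    return response.text


def parse_html(html):
    """Step 2: parse the HTML into a searchable BeautifulSoup object."""
    return BeautifulSoup(html, "html.parser")


def extract_prices(soup):
    """Step 3: extract the text of every <span class="price"> element."""
    return [span.get_text(strip=True) for span in soup.find_all("span", class_="price")]


# Made-up sample markup, used here instead of a live page.
sample_html = (
    '<ul><li><span class="price">$10</span></li>'
    '<li><span class="price">$25</span></li></ul>'
)
prices = extract_prices(parse_html(sample_html))
print(prices)  # ['$10', '$25']
```

For a live page, the first step would supply the HTML instead, for example parse_html(request_page(url)).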
Example: Scrape Article Titles
The example below fetches a page and prints the text of every h2 element. This is a common pattern when you want to pull a few values from a page without building a full crawler.
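import requests
from bs4 import BeautifulSoup
url = "https://example.com"  # placeholder; replace with the page you want to scrape
response = requests.get(url)  # download the page HTML
soup = BeautifulSoup(response.text, "html.parser")  # parse it into a searchable object
# Print the text of every h2 element
for title in soup.find_all("h2"):
    print(title.get_text(strip=True))
Understanding the Code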
Using requests
requests.get() downloads the page content. In real projects, always check the response status before parsing.
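One way to check the response status is to call raise_for_status() before using the body. The sketch below assumes a hypothetical helper named fetch_html and an arbitrary 10-second timeout; adjust both to your project.

```python
import requests


def fetch_html(url, timeout=10):
    """Download a page and return its HTML, raising on HTTP error statuses."""
    response = requests.get(url, timeout=timeout)
    # Raises requests.HTTPError for 4xx and 5xx responses.
    response.raise_for_status()
    return response.text
```

Calling fetch_html("https://example.com") would return the page HTML, while a 4xx or 5xx response raises requests.HTTPError instead of passing an error page to the parser.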
Using Beautiful Soup
BeautifulSoup(response.text, "html.parser") converts raw HTML into a searchable object. You can then search it with find(), find_all(), or CSS selectors via select().
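To compare the three lookup styles side by side, here is a short sketch that runs against a made-up HTML snippet. The element names, ids, and classes are assumptions chosen for illustration only.

```python
from bs4 import BeautifulSoup

# Made-up markup for illustration.
html = """
<div id="posts">
  <h2 class="title">First post</h2>
  <h2 class="title">Second post</h2>
  <p class="summary">A short summary.</p>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

first_title = soup.find("h2")                   # first match, or None if absent
all_titles = soup.find_all("h2")                # list of every match
summaries = soup.select("div#posts p.summary")  # CSS selector, list of matches

print(first_title.get_text(strip=True))   # First post
print(len(all_titles))                    # 2
print(summaries[0].get_text(strip=True))  # A short summary.
```

find() is handy when you expect a single element, while find_all() and select() return every match, so you can loop over the results.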
Cleaning extracted text
get_text(strip=True) strips leading and trailing whitespace from each piece of text inside the element and joins the pieces together, which gives clean values for automation workflows and data pipelines.
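A small sketch of how strip=True behaves on nested markup. The HTML string is made up. Note that passing a separator as the first argument keeps text from different child elements from running together.

```python
from bs4 import BeautifulSoup

# Made-up heading with stray whitespace and a nested tag.
soup = BeautifulSoup("<h2>  Hello <b> World </b> </h2>", "html.parser")
heading = soup.find("h2")

stripped = heading.get_text(strip=True)     # pieces stripped, joined with no separator
spaced = heading.get_text(" ", strip=True)  # pieces stripped, joined with a space

print(stripped)  # HelloWorld
print(spaced)    # Hello World
```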
Extract Links from a Page
If you also want URLs, loop through the anchor tags and read each href attribute. Anchors without an href are skipped.
for link in soup.find_all("a"):
    href = link.get("href")
    if href:
        print(href)
Best Practices
Respect website rules
Check the site's terms of service and robots.txt before scraping. Not every page is meant to be scraped.
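Python's standard library can evaluate robots.txt rules for you. The sketch below is one possible approach: the helper name is_allowed, the user agent "my-scraper", and the sample rules are all made up, and the rules are parsed from a string so the example runs offline.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Made-up rules for illustration.
rules = """
User-agent: *
Disallow: /private/
"""

print(is_allowed(rules, "my-scraper", "https://example.com/articles"))      # True
print(is_allowed(rules, "my-scraper", "https://example.com/private/data"))  # False
```

For a live site, RobotFileParser can download the file itself: call set_url() with the robots.txt address, then read(), before calling can_fetch(). A robots.txt check does not replace reading the site's terms.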
Handle errors
Add basic checks for failed requests, missing elements, and timeouts so your scraper fails gracefully instead of crashing.
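As a rough sketch of these checks, the two hypothetical helpers below handle a failed request and a missing element. The names fetch_or_none and first_text, the 10-second timeout, and the choice to return None or a default value are assumptions. Some projects may prefer to log and re-raise instead.

```python
import requests
from bs4 import BeautifulSoup


def fetch_or_none(url, timeout=10):
    """Return the page HTML, or None if the request fails."""
    try:
        response = requests.get(url, timeout=timeout)
        response.raise_for_status()
    except requests.RequestException as error:
        # Covers timeouts, connection errors, invalid URLs, and HTTP error statuses.
        print(f"Request failed: {error}")
        return None
    return response.text


def first_text(soup, tag, default=""):
    """Return the text of the first matching tag, or default if it is missing."""
    element = soup.find(tag)
    if element is None:
        return default
    return element.get_text(strip=True)


# Made-up markup: it has an h1 but no h2.
soup = BeautifulSoup("<h1>Title</h1>", "html.parser")
print(first_text(soup, "h1"))                           # Title
print(first_text(soup, "h2", default="(no subtitle)"))  # (no subtitle)
```

Checking for None before calling get_text() matters because find() returns None when nothing matches, and calling a method on None raises an AttributeError.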
Scrape responsibly
Avoid sending too many requests too quickly. Pausing between requests keeps your scraper from overloading the server.
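A simple way to slow a scraper down is to sleep between requests. The sketch below assumes a hypothetical helper named fetch_politely that takes the download function as a parameter. The one-second default delay is an arbitrary starting point, not a recommended value for any particular site.

```python
import time


def fetch_politely(urls, fetch, delay_seconds=1.0):
    """Call fetch(url) for each URL, pausing between consecutive requests."""
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            time.sleep(delay_seconds)  # wait before every request except the first
        results.append(fetch(url))
    return results
```

Here fetch can be any function that downloads one page, such as lambda url: requests.get(url, timeout=10).text. Passing it in as an argument keeps the pacing logic separate from the download logic.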
Conclusion
This guide covered the essentials: requesting a page, parsing its HTML, and extracting useful values. You now have a simple foundation for extracting data from web pages, and you can build on it with tasks such as saving results to CSV, scraping multiple pages, or scheduling recurring jobs.