How to Scrape E-Commerce Websites With Python

Maryia Stsiopkina

2023-10-17 · 3 min read

Businesses scrape e-commerce websites and product pages to conduct market research, monitor prices, enhance product catalogs, generate leads, and aggregate content. In this tutorial, you’ll learn how to scrape an e-commerce website using Python and Oxylabs’ E-Commerce Scraper API. The API deals with anti-bot protection and CAPTCHAs for you, so you don’t have to write a complex script. Let’s get started.

1. Project setup

First, install Python if you don’t already have it. You can download it from the official website, python.org.

2. Install dependencies

Next, install a couple of libraries so that you can interact with the E-Commerce Scraper API and parse the HTML content. Run the following command:

pip install bs4 requests

This installs the Beautiful Soup and Requests libraries.

3. Import libraries

Now, you can import these libraries using the following code:

from bs4 import BeautifulSoup
import requests

4. Retrieve API credentials

Next, you’ll have to log in to your Oxylabs account to retrieve API credentials. If you don’t have an account yet, you can simply sign up for free and go to the dashboard. There, you’ll get the necessary credentials for the API.

Once you've retrieved your credentials, you can add them to your code.

username, password = 'USERNAME', 'PASSWORD'

Don’t forget to replace USERNAME and PASSWORD with your username and password.
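If you’d rather not hard-code credentials in the script, one option is to read them from environment variables. The following is a minimal sketch: the function `load_credentials` and the variable names `OXYLABS_USERNAME` and `OXYLABS_PASSWORD` are hypothetical names chosen for this example, not something defined by Oxylabs.

```python
import os


def load_credentials(env=os.environ):
    """Return (username, password) read from environment variables.

    OXYLABS_USERNAME and OXYLABS_PASSWORD are hypothetical names used
    only in this sketch; pick whatever names suit your setup.
    """
    username = env.get("OXYLABS_USERNAME")
    password = env.get("OXYLABS_PASSWORD")
    if not username or not password:
        raise RuntimeError("Set OXYLABS_USERNAME and OXYLABS_PASSWORD first.")
    return username, password
```

You could then write `username, password = load_credentials()` in place of the hard-coded assignment.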

5. Prepare payload

Oxylabs’ E-Commerce Scraper API expects a JSON payload in a POST request. You’ll have to prepare the payload before sending the POST request. The source must be set to universal_ecommerce.

url = "https://sandbox.oxylabs.io/products"
payload = {
    'source': 'universal_ecommerce',
    'render': 'html',
    'url': url,
}

`'render': 'html'` tells the API to execute JavaScript when loading the website content.

Note: For demonstration, we’ll use the sandbox.oxylabs.io store page.
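If you plan to scrape more than one page, it can help to wrap the payload in a small function. The sketch below uses only the three keys shown above; the helper name `build_payload` and its `render_js` flag are hypothetical, and leaving out `render` for pages that don’t need JavaScript is an assumption you should check against the API documentation.

```python
import json


def build_payload(url, render_js=True):
    """Build the request body for a single target URL."""
    payload = {
        "source": "universal_ecommerce",
        "url": url,
    }
    if render_js:
        # Ask the API to execute JavaScript before returning the HTML.
        payload["render"] = "html"
    return payload


payload = build_payload("https://sandbox.oxylabs.io/products")
# Pretty-print the payload to see what will be sent as JSON.
print(json.dumps(payload, indent=2))
```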

6. Send a POST request to the API

Let’s send the payload using the requests module’s `post()` method. You can pass the credentials using the `auth` parameter.

response = requests.post(
    'https://realtime.oxylabs.io/v1/queries',
    auth=(username, password),
    json=payload,
)
print(response.status_code)

Since the `payload` needs to be in JSON format, you can use the `json` parameter of the `post()` method. If you run this code now, it should print a `status_code` of `200`. Any other code indicates an error. If you get one, check that your credentials, payload, and URL are all correct.
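Rather than checking the printed status code by eye, you could fail fast when the request doesn’t succeed. Here is one possible sketch. The function `send_query` is a hypothetical helper, the 180-second timeout is an arbitrary assumption, and `post` is expected to be `requests.post` (it is passed in as an argument so the function can be tried out without network access).

```python
API_URL = "https://realtime.oxylabs.io/v1/queries"


def send_query(post, payload, username, password, timeout=180):
    """POST the payload and return the decoded JSON body.

    `post` is any callable with the same interface as `requests.post`.
    The default timeout is an arbitrary value chosen for this sketch.
    """
    response = post(
        API_URL,
        auth=(username, password),
        json=payload,
        timeout=timeout,
    )
    if response.status_code != 200:
        # Anything other than 200 is treated as a failure.
        raise RuntimeError(
            f"API returned {response.status_code}: "
            "check your credentials, payload, and URL"
        )
    return response.json()
```

With the Requests library imported, a call would look like `body = send_query(requests.post, payload, username, password)`.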

7. Parse data

You can extract the HTML content from the API’s JSON response and create a Beautiful Soup object named `soup`.

content = response.json()["results"][0]["content"]
soup = BeautifulSoup(content, "html.parser")
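The indexing above raises a `KeyError` or `IndexError` if the response doesn’t have the expected shape. A slightly more defensive version might look like the sketch below. It assumes only the `results[0]["content"]` structure used in this tutorial, and `extract_content` is a hypothetical helper name.

```python
def extract_content(body):
    """Return the HTML string of the first result, or raise ValueError."""
    results = body.get("results") or []
    if not results:
        raise ValueError("response has no 'results' entries")
    content = results[0].get("content")
    if not isinstance(content, str) or not content:
        raise ValueError("first result has no HTML 'content'")
    return content
```

You would call it as `content = extract_content(response.json())` before handing the result to Beautiful Soup.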

Using your web browser’s developer tools, you can inspect the elements of the website to find the necessary CSS selectors. To open the developer tools, browse to the target website, right-click, and select Inspect. Once you’ve gathered the CSS selectors, you can use the `soup` object to extract those elements. Let’s parse the title, price, and availability of all products.

Title

If you inspect the title, you’ll notice it’s inside an `<h4>` tag with the class `title`.

So, you can use the `soup` object to extract the title as below:

title = soup.find('h4', {"class": "title"}).get_text(strip=True)

Price

Similarly, inspect the price element.

It’s wrapped in a `<div>` with the class `price-wrapper`. Use the `find()` method, as shown below, to extract the price text.

price = soup.find('div', {"class": "price-wrapper"}).get_text(strip=True)

Availability

There are two types of availability on this website: In Stock and Out of Stock. If you inspect both elements, you’ll notice they have different classes.

Fortunately, Beautiful Soup’s `find()` method supports multiple class lookups: pass the classes as a list, and it returns the first element that matches any of them.

availability = soup.find('p', {"class": ["in-stock", "out-of-stock"]}).get_text(strip=True)
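Note that `find()` returns `None` when nothing matches, so calling `get_text()` directly on the result raises an `AttributeError` if an element is missing. The sketch below wraps the lookup in a hypothetical helper, `text_or_none`, and runs it on a small made-up HTML snippet (not copied from the sandbox site) to show both the multi-class lookup and the missing-element case.

```python
from bs4 import BeautifulSoup


def text_or_none(parent, name, classes):
    """Return the stripped text of the first match, or None if absent."""
    node = parent.find(name, {"class": classes})
    return node.get_text(strip=True) if node is not None else None


# Made-up markup for illustration only; it has no price element.
sample = """
<div class="product-card">
  <h4 class="title">Sample Game</h4>
  <p class="out-of-stock">Out of Stock</p>
</div>
"""
card = BeautifulSoup(sample, "html.parser")

print(text_or_none(card, "h4", "title"))                      # Sample Game
print(text_or_none(card, "p", ["in-stock", "out-of-stock"]))  # Out of Stock
print(text_or_none(card, "div", "price-wrapper"))             # None
```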

All products

To extract all product data, you’ll have to inspect the product elements and find the appropriate CSS selectors.

Since each product element is wrapped in a `<div>` with the class `product-card`, you can iterate over them with a `for` loop.

data = []
# Each product sits in its own <div class="product-card">
for elem in soup.find_all("div", {"class": "product-card"}):
    title = elem.find('h4', {"class": "title"}).get_text(strip=True)
    price = elem.find('div', {"class": "price-wrapper"}).get_text(strip=True)
    # Matches either the in-stock or the out-of-stock element
    availability = elem.find('p', {"class": ["in-stock", "out-of-stock"]}).get_text(strip=True)
    data.append({
        "title": title,
        "price": price,
        "availability": availability,
    })
print(data)

The `data` list now contains the title, price, and availability of every product on the page.
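If you want to keep the results rather than just print them, one simple option is to write the list to CSV with Python’s standard `csv` module. This is only a sketch: `write_products_csv` is a hypothetical helper, and the sample row is made up to stand in for the real `data` list.

```python
import csv
import io


def write_products_csv(rows, fileobj):
    """Write product dictionaries to an open text file object as CSV."""
    writer = csv.DictWriter(
        fileobj, fieldnames=["title", "price", "availability"]
    )
    writer.writeheader()
    writer.writerows(rows)


# Made-up row for illustration; real rows come from the `data` list.
rows = [{"title": "Sample Game", "price": "9.99", "availability": "In Stock"}]

# An in-memory buffer keeps the example free of side effects.
buffer = io.StringIO()
write_products_csv(rows, buffer)
print(buffer.getvalue())
```

To write to disk instead, open the file with `open("products.csv", "w", newline="", encoding="utf-8")` and pass the file object in place of the buffer.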

Full source code

The entire scraper is given below for your convenience. You can use it as a building block for your next scraper. You’ll only have to replace the URL and parsing logic with your own.

from bs4 import BeautifulSoup
import requests

# API credentials: replace with your own
username, password = 'USERNAME', 'PASSWORD'

# Target page and request payload
url = "https://sandbox.oxylabs.io/products"
payload = {
    'source': 'universal_ecommerce',
    'render': 'html',  # execute JavaScript when loading the page
    'url': url,
}

# Send the payload to the API
response = requests.post(
    'https://realtime.oxylabs.io/v1/queries',
    auth=(username, password),
    json=payload,
)
print(response.status_code)

# Extract the HTML from the JSON response and parse it
content = response.json()["results"][0]["content"]
soup = BeautifulSoup(content, "html.parser")

# Collect the title, price, and availability of every product card
data = []
for elem in soup.find_all("div", {"class": "product-card"}):
    title = elem.find('h4', {"class": "title"}).get_text(strip=True)
    price = elem.find('div', {"class": "price-wrapper"}).get_text(strip=True)
    availability = elem.find('p', {"class": ["in-stock", "out-of-stock"]}).get_text(strip=True)
    data.append({
        "title": title,
        "price": price,
        "availability": availability,
    })
print(data)

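Running the script prints the status code followed by the `data` list, which holds one dictionary per product with its title, price, and availability.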

Conclusion

In this tutorial, you’ve learned how to scrape e-commerce stores using Python and Oxylabs’ E-Commerce Scraper API. With the techniques described here, you can perform large-scale web scraping on websites that use bot protection and CAPTCHAs.

Frequently asked questions

How do I scrape an e-commerce website?

To scrape e-commerce and product data, first pick the programming language you’re most comfortable with. Python, Go, JavaScript, Ruby, and Elixir are popular languages with excellent support for large-scale e-commerce data scraping. Then find the tools and libraries that will help you extract data from the target website. It also helps to learn web scraping best practices before you start.

Is it ethical to web scrape a website?

Web scraping is ethical as long as the scrapers respect all the rules set by the target websites, don’t harm the website, don’t breach any laws, and use the scraped data with good intentions. It’s essential to respect the website’s terms of service (ToS) and obey the rules in its robots.txt file.
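As a rough illustration of checking robots.txt rules, Python’s standard library includes `urllib.robotparser`. The sketch below parses rules from a string so it runs without network access. The helper `is_allowed`, the user agent name, and the rules themselves are made up for this example.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, user_agent, url):
    """Check a URL against robots.txt rules supplied as a string."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Made-up rules for illustration only.
rules = """
User-agent: *
Disallow: /private/
"""

print(is_allowed(rules, "my-scraper", "https://example.com/products"))      # True
print(is_allowed(rules, "my-scraper", "https://example.com/private/data"))  # False
```

For a live site, `RobotFileParser` can also fetch the file itself via `set_url()` and `read()`.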

About the author

Maryia Stsiopkina

Senior Content Manager

Maryia Stsiopkina is a Senior Content Manager at Oxylabs. As her passion for writing was developing, she was writing either creepy detective stories or fairy tales at different points in time. Eventually, she found herself in the tech wonderland with numerous hidden corners to explore. At leisure, she does birdwatching with binoculars (some people mistake it for stalking), makes flower jewelry, and eats pickles.


All information on Oxylabs Blog is provided on an "as is" basis and for informational purposes only. We make no representation and disclaim all liability with respect to your use of any information contained on Oxylabs Blog or any third-party websites that may be linked therein. Before engaging in scraping activities of any kind you should consult your legal advisors and carefully read the particular website's terms of service or receive a scraping license.
