How to Scrape Google Shopping Results: A Step-by-Step Guide

Yelyzaveta Nechytailo

2023-03-16, 6 min read

In today’s competitive business environment, it’s hard to imagine an e-commerce company or a retailer staying in demand without turning to web scraping. In short, gathering accurate public data from thousands of targets worldwide is what lets them draw actionable insights and, eventually, present customers with the best deals.

This tutorial demonstrates how to scrape publicly available data from Google Shopping. In addition to the guide itself, we’ll briefly cover whether it’s legal to scrape Google Shopping and what difficulties you can encounter in the process.

What is Google Shopping?

Formerly known as Google Products Search, Google Products, and Froogle, Google Shopping is a service that allows users to browse, compare, and shop for products from different suppliers who have paid to be featured on the website.

While giving consumers an opportunity to choose the best offers among thousands of brands, Google Shopping is also beneficial for retailers. When a user clicks on a product link, they are redirected to the vendor’s website for purchasing; thus, Google Shopping acts as a solution for businesses to advertise their products online.

More information on how Google Shopping works can be found here.

Google Shopping results page structure overview

The data you get when browsing Google Shopping depends on which of three page types you are viewing: Search, Product, and Pricing. Let's briefly discuss each of them:

Search: A list of the items on Google Shopping with information about each item, such as its ID, title, description, price, and availability.
Product: Information on a single product's listing, details about other retailers selling it, and the prices at which it’s offered.
Pricing: A list of all the product's retailers along with the prices they offer and other details like delivery information, total costs, store name, etc.

Search page

The Google Shopping search results page lists all the relevant items available for the searched product. The screenshot below highlights the different elements of a results page for the query “levis.”

Search bar: Allows a user to search for any product on Google Shopping.
List of products: Lists all the products and the details of the searched product.
Filters: Allows you to apply any filter to your search, for example, price range, color, style, etc.
Sorting options: This drop-down list enables you to sort your search on multiple attributes, for example, increasing price, decreasing price, popularity, etc.

Each entry in the list of products shows the following attributes: product name, price, name of the retailer or store, and delivery information.

Products page

When you select a specific item from the search page, you are directed to the Products page. This page contains detailed information about that particular product, such as its pictures, key features, product details, product reviews, retailers and prices information, and much more.

Product name: Title of the product.
Product Highlights: Main features to have a quick product overview.
Product details: Detailed description of the product.
Prices: List of different retailers and their prices.
Product reviews: Product rating and customer reviews.
Min and max prices: Product’s minimum to maximum pricing range sold by different sellers.
General specifications: General information about the product.

Pricing page

This page lists the prices that different retailers offer for the product. It also shows whether a store or retailer is trusted and whether the retailer has a Google Guarantee.

Product name: Name of the searched product.
Rating: Overall rating of the product and number of reviews.
Prices from different stores: List of retailers, along with their offers, prices, and the link to visit their website to buy the product.
Filters: These filters can be applied to the retailers’ list.

Is it legal to scrape Google Shopping results?

In general, web scraping is legal as long as you strictly follow all the regulations surrounding the public data you wish to gather. However, we still recommend seeking professional legal advice to rule out any possible risks.

If you wish to dive deeper into the topic of web scraping legality, check out our extensive blog post.

The pain of scraping Google Shopping

Though doable, scraping Google Shopping might not be the easiest task to take on. Not only is Google Shopping good at detecting automated requests, but it also requires parsing JavaScript, which is an “expensive” operation that slows down the scraping process.

Therefore, to scrape and parse a variety of Google Shopping page types reliably, it’s best to rely on a high-quality scraping solution, such as Oxylabs’ Google Shopping API. This SERP API is specifically designed to deal with the challenges of the Google scraping process and lets you gather accurate real-time data globally. If you want to extract data from the Google search engine, check out our other tutorial on how to scrape Google search results.

Step-by-step guide for scraping Google Shopping results using Google Shopping API

Step 1: Set up Python and install required libraries

To get started, you must have Python 3.6+ installed on your system. Then, you need to install the following packages to code the scraper.

Requests - to send the request to the API.
Pandas - to populate the data in the DataFrame data structure.

To install the packages, use the following command:

pip install requests pandas

Step 2: Set up a payload

Search page

The first step is creating a structured payload containing different query parameters. Below is a list of the query parameters with a brief description of each.

Parameter	Description	Default Value
source	Sets the type of scraper to use.	google_shopping_search
domain	Domain name.	com
start_page	Starting page number.	1
pages	Number of pages to retrieve from the search results.	1
locale	Accept-Language header value, which changes the web interface language of the Google Shopping page.	-
results_language	Results language. Must be one of the languages supported by Google.	-
geo_location	The region for which the output should be adjusted. Setting this parameter correctly is important for getting accurate results.	-
user_agent_type	The type of device and browser.	desktop
render	Enables JavaScript rendering.	-
callback_url	The URL of your callback endpoint, to which the response is delivered in a POST request.	-
parse	If set to true, the API returns structured data.	-
context: nfpr	If set to true, turns off spelling auto-correction.	false
context: sort_by	Sorts the product list. r is the default sorting, rv sorts by review score, p by increasing price, and pd by decreasing price.	r
context: min_price	Filters results by a minimum price.	-
context: max_price	Filters results by a maximum price.	-

For more detailed information on the parameters, check out our documentation.

Using the parameters mentioned in the table, we can create a payload structure as follows:

payload = {
   'source': 'google_shopping_search',
   'domain': 'com',
   'query': 'levis',
   'pages': 1,
   'context': [
       {'key': 'sort_by', 'value': 'pd'},
       {'key': 'min_price', 'value': 30},
   ],
   'parse': 'true',
}
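The context filters from the table can also be assembled programmatically. The following is a minimal sketch rather than part of the original tutorial code: `build_search_payload` and `SORT_OPTIONS` are hypothetical names, and the helper uses only the parameters and sort values listed in the table above.

```python
# Sort values taken from the parameter table above.
SORT_OPTIONS = {'r', 'rv', 'p', 'pd'}


def build_search_payload(query, sort_by='r', min_price=None, max_price=None, pages=1):
    """Build a google_shopping_search payload (hypothetical helper)."""
    if sort_by not in SORT_OPTIONS:
        raise ValueError(f'sort_by must be one of {sorted(SORT_OPTIONS)}')
    context = [{'key': 'sort_by', 'value': sort_by}]
    # Only add the price filters that were actually requested.
    if min_price is not None:
        context.append({'key': 'min_price', 'value': min_price})
    if max_price is not None:
        context.append({'key': 'max_price', 'value': max_price})
    return {
        'source': 'google_shopping_search',
        'domain': 'com',
        'query': query,
        'pages': pages,
        'context': context,
        'parse': 'true',
    }


# Produces the same payload as the hand-written one above.
payload = build_search_payload('levis', sort_by='pd', min_price=30)
```

Validating `sort_by` locally catches a mistyped value before a request is spent on it.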

Step 3: Send a POST request

After the payload structure is ready, you can create the request by passing your API credentials (username and password).

response = requests.request(
   'POST',
   'https://realtime.oxylabs.io/v1/queries',
   auth=('username', 'password'),
   json=payload,
)
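In a longer-running scraper, it can help to fail early on HTTP errors and to avoid waiting forever on a stalled connection. Below is a hedged sketch of one way to wrap the request: `send_query` is a hypothetical helper, the 180-second timeout is an assumed value, and the optional `session` argument exists only so the function can be exercised without network access.

```python
API_URL = 'https://realtime.oxylabs.io/v1/queries'


def send_query(payload, username, password, session=None, timeout=180):
    """POST a payload and return the decoded JSON body (hypothetical helper)."""
    if session is None:
        # Imported lazily so the helper can be tested with a stand-in session.
        import requests
        session = requests.Session()
    response = session.post(
        API_URL,
        auth=(username, password),
        json=payload,
        timeout=timeout,  # assumed value; tune it to your workload
    )
    # Raise an HTTPError on 4xx/5xx responses instead of failing later
    # on a missing key in the JSON body.
    response.raise_for_status()
    return response.json()
```

With real credentials, `data = send_query(payload, 'username', 'password')` returns the same dictionary as `response.json()` in the snippet above.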

Step 4: Extract product data from a JSON response

We will be extracting the Product Title, Price, and Store name from the response. Since we set the payload parameter parse to true, the API returns structured JSON, and we can read all of this data directly from it.

The code below extracts the data from JSON format and stores it in DataFrame.

import pandas as pd

# Get the parsed content from the response
result = response.json()['results'][0]['content']
products = result['results']['organic']


# Collect one row per product, in the order returned by the API
rows = []
for p in products:
    title = p['title']
    price = p['price_str']
    store = p['merchant']['name']
    rows.append([title, price, store])


# Build the DataFrame once instead of concatenating inside the loop
df = pd.DataFrame(rows, columns=['Product Title', 'Price', 'Store'])

The script extracts relevant product information from the response and stores it in the df DataFrame.
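Direct indexing such as `p['merchant']['name']` raises a `KeyError` if a listing lacks one of these fields. The sketch below shows a more defensive variant. It assumes only the key names used in this tutorial; `extract_search_rows` is a hypothetical helper, and `sample` is made-up data that mirrors those keys rather than real API output.

```python
def extract_search_rows(response_json):
    """Return one dict per organic product, tolerating missing fields."""
    content = response_json['results'][0]['content']
    products = content.get('results', {}).get('organic', [])
    rows = []
    for p in products:
        rows.append({
            'Product Title': p.get('title'),
            'Price': p.get('price_str'),
            # 'merchant' may be absent; fall back to an empty dict.
            'Store': (p.get('merchant') or {}).get('name'),
        })
    return rows


# Made-up sample that mirrors only the keys used in this tutorial.
sample = {'results': [{'content': {'results': {'organic': [
    {'title': 'Jeans A', 'price_str': '$59.50', 'merchant': {'name': 'Store X'}},
    {'title': 'Jeans B', 'price_str': '$40.00'},
]}}}]}

rows = extract_search_rows(sample)
```

Passing `rows` to `pd.DataFrame(rows)` yields the same three columns, with missing values left empty.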

Step 5: Save extracted data to a CSV using Pandas

Using the following script, we can easily export the DataFrame to CSV or JSON files:

df.to_csv('google_shopping_search.csv', index=False)
df.to_json('google_shopping_search.json', orient='split', index=False)

Let’s put all the code together and see the output.

import pandas as pd
import requests


# Structure payload
payload = {
   'source': 'google_shopping_search',
   'domain': 'com',
   'query': 'levis',
   'pages': 1,
   'context': [
       {'key': 'sort_by', 'value': 'pd'},
       {'key': 'min_price', 'value': 30},
   ],
   'parse': 'true',
}


# Get response
response = requests.request(
    'POST',
    'https://realtime.oxylabs.io/v1/queries',
    auth=('username', 'password'),
    json=payload,
)


# Get the parsed content from the response
result = response.json()['results'][0]['content']
products = result['results']['organic']


# Collect one row per product, in the order returned by the API
rows = []
for p in products:
    title = p['title']
    price = p['price_str']
    store = p['merchant']['name']
    rows.append([title, price, store])


# Build the DataFrame once instead of concatenating inside the loop
df = pd.DataFrame(rows, columns=['Product Title', 'Price', 'Store'])


#Copy the DataFrame to CSV and JSON files
df.to_csv('google_shopping_search.csv', index=False)
df.to_json('google_shopping_search.json', orient='split', index=False)

The script doesn’t contain any print statements and writes everything to CSV and JSON files. Let’s look at a portion of the output CSV file.

As expected, the output CSV contains the Product Titles, Prices, and Store information for all the products listed on the search page.
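The Price column holds strings taken from `price_str`, so sorting or averaging it requires a numeric conversion first. A minimal sketch follows, assuming US-style formatting such as `$1,299.99` (comma as thousands separator, dot as decimal point). That format and the helper name `parse_price` are assumptions, not something the API guarantees.

```python
import re


def parse_price(price_str):
    """Convert a string such as '$1,299.99' to a float, or None if no number is found."""
    if not price_str:
        return None
    # Digits, optional thousands separators, optional decimal part.
    match = re.search(r'\d[\d,]*(?:\.\d+)?', price_str)
    if match is None:
        return None
    return float(match.group().replace(',', ''))


prices = [parse_price(s) for s in ['$59.50', '$1,299.99', 'N/A']]
# prices is [59.5, 1299.99, None]
```

Applied to the DataFrame, this could look like `df['Price'].map(parse_price)`.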

Now, let’s scrape a specific product page.

Product page

The product page uses a different set of payload parameters. Below is a list of the query parameters with a brief description of each.

Parameter	Description	Default Value
source	Sets the type of scraper to use.	google_shopping_product
domain	Domain name.	com
locale	Accept-Language header value, which changes the web interface language of the Google Shopping page.	-
results_language	Results language. Must be one of the languages supported by Google.	-
geo_location	The region for which the output should be adjusted. Setting this parameter correctly is important for getting accurate results.	-
user_agent_type	The type of device and browser.	desktop
render	Enables JavaScript rendering.	-
callback_url	The URL of your callback endpoint, to which the response is delivered in a POST request.	-
parse	If set to true, the API returns structured data.	-

Once again, for more detailed information on the parameters, check out our documentation.

We will be using product ID 4505166624001087642 for scraping. Using the parameters mentioned in the table, we can create a payload structure like this:

payload = {
  'source': 'google_shopping_product',
  'domain': 'com',
  'query': '4505166624001087642',
  'parse': 'true',
}

After the payload structure is ready, you can create the request by passing your API credentials (username and password).

response = requests.request(
  'POST',
  'https://realtime.oxylabs.io/v1/queries',
  auth=('username', 'password'),
  json=payload,
)

We’ll extract the Product Title, Product Details, Highlights, Rating, and Reviews Count from the response. As in the previous section, we’ll use the parsed JSON response to get the desired output. You can see the structure of JSON output here.

# Get the parsed content
product = response.json()['results'][0]['content']


# Get the elements from the response object
title = product['title']
details = product['description']
highlights = product['highlights']
rating = product['reviews']['rating']
reviews_count = product['reviews']['reviews_count']


# Store the single product as a one-row DataFrame
df = pd.DataFrame(
    [[title, details, highlights, rating, reviews_count]],
    columns=['Product Title', 'Product Details',
             'Highlights', 'Rating', 'Reviews Count'],
)

The code above stores all the extracted data in a DataFrame object. We can print this DataFrame or write it to CSV or JSON files.

# Write the data to CSV and JSON files
df.to_csv('google_shopping_product.csv', index=False)
df.to_json('google_shopping_product.json', orient='split', index=False)


# Print the data on screen (f-strings also handle missing, non-string values)
print(f'Product Name: {title}')
print(f'Product Details: {details}')
print(f'Product Highlights: {highlights}')
print(f'Product Rating: {rating}')
print(f'Reviews Count: {reviews_count}')
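When the product data goes into a CSV file, a flat record with the highlights joined into a single string is often easier to work with. The sketch below assumes `highlights` is a list, which the `str(highlights)` call above suggests but the tutorial does not confirm. `extract_product_record` is a hypothetical helper and `sample_content` is made-up data.

```python
def extract_product_record(content):
    """Flatten the product fields used in this tutorial into one dict."""
    reviews = content.get('reviews') or {}
    highlights = content.get('highlights') or []
    return {
        'Product Title': content.get('title'),
        'Product Details': content.get('description'),
        # Join the highlights so they fit in a single CSV cell.
        'Highlights': '; '.join(str(h) for h in highlights),
        'Rating': reviews.get('rating'),
        'Reviews Count': reviews.get('reviews_count'),
    }


# Made-up sample that mirrors only the keys used in this tutorial.
sample_content = {
    'title': 'Jeans A',
    'description': 'Slim fit',
    'highlights': ['Cotton', 'Stretch'],
    'reviews': {'rating': 4.5, 'reviews_count': 120},
}

record = extract_product_record(sample_content)
```

A one-row DataFrame can then be built with `pd.DataFrame([record])`.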

Let’s put all the code together and see the output.

import pandas as pd
import requests


# Structure payload.
payload = {
  'source': 'google_shopping_product',
  'domain': 'com',
  'query': '4505166624001087642',
  'parse': 'true',
}


# Get response.
response = requests.request(
  'POST',
  'https://realtime.oxylabs.io/v1/queries',
  auth=('username', 'password'),
  json=payload,
)


# Get the parsed content
product = response.json()['results'][0]['content']


# Get the elements from the response object
title = product['title']
details = product['description']
highlights = product['highlights']
rating = product['reviews']['rating']
reviews_count = product['reviews']['reviews_count']


# Store the single product as a one-row DataFrame
df = pd.DataFrame(
    [[title, details, highlights, rating, reviews_count]],
    columns=['Product Title', 'Product Details',
             'Highlights', 'Rating', 'Reviews Count'],
)


# Write the data to CSV and JSON files
df.to_csv('google_shopping_product.csv', index=False)
df.to_json('google_shopping_product.json', orient='split', index=False)


# Print the data on screen (f-strings also handle missing, non-string values)
print(f'Product Name: {title}')
print(f'Product Details: {details}')
print(f'Product Highlights: {highlights}')
print(f'Product Rating: {rating}')
print(f'Reviews Count: {reviews_count}')

We’ve just successfully scraped a product page on Google Shopping. Let’s move on to scraping the Pricing page.

Pricing page

The pricing page uses its own set of payload parameters. Below is a list of the query parameters with a brief description of each.

Parameter	Description	Default Value
source	Sets the type of scraper to use.	google_shopping_pricing
domain	Domain name.	com
start_page	Starting page number.	1
pages	Number of pages to retrieve from the search results.	1
locale	Accept-Language header value, which changes the web interface language of the Google Shopping page.	-
results_language	Results language. Must be one of the languages supported by Google.	-
geo_location	The region for which the output should be adjusted. Setting this parameter correctly is important for getting accurate results.	-
user_agent_type	The type of device and browser.	desktop
render	Enables JavaScript rendering.	-
callback_url	The URL of your callback endpoint, to which the response is delivered in a POST request.	-
parse	If set to true, the API returns structured data.	-

More information on the parameters can be found in our documentation.

We’ll be using product ID 4505166624001087642 for scraping. Using the parameters mentioned in the table, we can create a payload structure like this:

payload = {
   'source': 'google_shopping_pricing',
   'domain': 'com',
   'query': '4505166624001087642',
   'parse': 'true'
}
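The three payloads in this tutorial differ mainly in their `source` value, so a single helper can cover all of them. This is a sketch under that assumption: `SOURCES` and `build_payload` are hypothetical names, and only the three source values shown in this tutorial are included.

```python
# Source values used in this tutorial, keyed by page type.
SOURCES = {
    'search': 'google_shopping_search',
    'product': 'google_shopping_product',
    'pricing': 'google_shopping_pricing',
}


def build_payload(page_type, query, domain='com'):
    """Build a minimal parsed-output payload for one of the three page types."""
    if page_type not in SOURCES:
        raise ValueError(f'page_type must be one of {sorted(SOURCES)}')
    return {
        'source': SOURCES[page_type],
        'domain': domain,
        # A search term for 'search', a product ID for 'product' and 'pricing'.
        'query': query,
        'parse': 'true',
    }


# Produces the same payload as the hand-written one above.
payload = build_payload('pricing', '4505166624001087642')
```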

After the payload structure is ready, you can create the request by passing your API credentials (username and password).

response = requests.request(
   'POST',
   'https://realtime.oxylabs.io/v1/queries',
   auth=('username', 'password'),
   json=payload,
)

We’ll be extracting Product Name, Special Offer, Item Price, Total Price and Shipping charges from the JSON response received. You can find the structure of the JSON response here.

# Get the parsed content from the response
result = response.json()['results'][0]['content']
title = result['title']
pricing = result['pricing']


# Collect one row per retailer offer
rows = []
for p in pricing:
    offer = p['details']
    item_price = p['price']
    total_price = p['price_total']
    shipping = p['price_shipping']
    rows.append([title, offer, item_price, total_price, shipping])


# Build the DataFrame once instead of concatenating inside the loop
df = pd.DataFrame(rows, columns=['Product Name', 'Special Offer',
                                 'Item Price', 'Total Price', 'Shipping'])

The above script stores the extracted data in a DataFrame object, so saving it in CSV, JSON, or other formats is easy. Run the following code to save all the data to CSV and JSON files.

df.to_csv('google_shopping_pricing.csv', index=False)
df.to_json('google_shopping_pricing.json', orient='split', index=False)
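A common next step with pricing data is to find the cheapest offer. The sketch below assumes that `price_total` is numeric in the parsed response, which this tutorial does not confirm, and skips offers where it is not. `cheapest_offer` is a hypothetical helper and `sample_pricing` is made-up data.

```python
def cheapest_offer(pricing):
    """Return the offer with the lowest numeric 'price_total', or None."""
    # Keep only offers whose total is a number (assumption: totals are numeric).
    priced = [p for p in pricing
              if isinstance(p.get('price_total'), (int, float))]
    return min(priced, key=lambda p: p['price_total'], default=None)


# Made-up sample that mirrors only the keys used in this tutorial.
sample_pricing = [
    {'details': 'Free delivery', 'price': 59.5, 'price_shipping': 0, 'price_total': 59.5},
    {'details': '', 'price': 49.0, 'price_shipping': 12.0, 'price_total': 61.0},
    {'details': 'No total listed', 'price': 45.0, 'price_shipping': None, 'price_total': None},
]

best = cheapest_offer(sample_pricing)
```

Comparing on the total rather than the item price keeps shipping costs from hiding the real cheapest option.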

Let’s put all the code together and see the output.

import pandas as pd  # include the pandas library for DataFrame
import requests  # Include the requests library


# Structure payload.
payload = {
   'source': 'google_shopping_pricing',
   'domain': 'com',
   'query': '4505166624001087642',
   'parse': 'true'
}


# Get response.
response = requests.request(
   'POST',
   'https://realtime.oxylabs.io/v1/queries',
   auth=('username', 'password'),
   json=payload,
)


# Get the parsed content from the response
result = response.json()['results'][0]['content']
title = result['title']
pricing = result['pricing']


# Collect one row per retailer offer
rows = []
for p in pricing:
    offer = p['details']
    item_price = p['price']
    total_price = p['price_total']
    shipping = p['price_shipping']
    rows.append([title, offer, item_price, total_price, shipping])


# Build the DataFrame once instead of concatenating inside the loop
df = pd.DataFrame(rows, columns=['Product Name', 'Special Offer',
                                 'Item Price', 'Total Price', 'Shipping'])


# Copy the DataFrame to CSV and JSON files
df.to_csv('google_shopping_pricing.csv', index=False)
df.to_json('google_shopping_pricing.json', orient='split', index=False)

Conclusion

Scraping Google Shopping is essential if you’re looking to retrieve accurate data on your biggest competitors’ products and prices and make data-driven decisions to scale your business. We hope this tutorial was clear and will make your data gathering smoother. You can also find all the necessary code files on our GitHub. If you still have any questions, don’t hesitate to contact us. Oxylabs’ professional team is always ready to assist you.
