Augustas Pelakauskas
Google Trends is a platform that provides data on top Google Search queries. Using this tool, you can discover how much search interest a keyword receives over a chosen time frame and in a chosen region.
Having insight into what users are searching for can be critical for a wide range of businesses. However, covering vast amounts of data may be difficult without an automated data-gathering solution.
This article provides step-by-step instructions on scraping public Google Trends data with Python and SERP Scraper API.
Here are some of the uses for data from Google Trends.
Keyword research
Google Trends is widely used among SEO specialists and content marketers. Since it provides insights into the past and present popularity of search terms, these professionals can tailor their marketing strategies to gain more website traffic.
Market research
Google Trends data can be used for market research, helping businesses understand consumer interests and preferences over time. For example, e-commerce businesses can use Google Trends search insights for product development.
Societal research
Google Trends is a valuable resource for journalists and researchers, offering a glimpse into societal trends and public interest in specific topics.
These are just a few examples. Google Trends data can also help with investment decisions, brand reputation monitoring, and other cases.
Now, let’s get into gathering publicly available Google Trends data.
To start, install a couple of libraries:
pip install requests
pip install pandas
You’ll need the requests library to use SERP Scraper API and pandas to manipulate received data.
Let’s begin with building an initial request to the API:
import requests
USERNAME = "YourUsername"
PASSWORD = "YourPassword"
query = "persian cat"
print(f"Getting data from Google Trends for {query} keyword..")
url = "https://realtime.oxylabs.io/v1/queries"
auth = (USERNAME, PASSWORD)
payload = {
    "source": "google_trends_explore",
    "query": query,
}
The USERNAME and PASSWORD variables hold the credentials required by SERP Scraper API, while payload contains the configuration that tells the API how to process your request.
Here, source defines the type of scraper that should process the request – google_trends_explore is tailored for this specific use case.
Finally, define the query that you want to search for. For more information about possible parameters, check our documentation.
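Depending on your use case, the payload can take additional options. For example, assuming the google_trends_explore source accepts a geo_location parameter (verify this in the documentation before relying on it), a region-scoped request might look like this:

# Hypothetical example – confirm geo_location support for this source in the docs
payload = {
    "source": "google_trends_explore",
    "query": query,
    "geo_location": "United States",
}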
The configuration is done. You can now form and send the request:
try:
    response = requests.request("POST", url, auth=auth, json=payload)
except requests.exceptions.RequestException as e:
    print("Caught exception while getting trend data")
    raise e

data = response.json()
content = data["results"][0]["content"]
print(content)
If everything is in order, running the code should print the raw results of the query to the terminal window.
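The exact output depends on the keyword and the time of the request, but the parsed content follows a structure along these lines (an abridged, illustrative sketch – the values are placeholders, not real API output):

{
    "interest_over_time": [
        {"items": [{"time": "...", ...}, ...], "keyword": "persian cat"}
    ],
    "breakdown_by_region": [
        {"items": [{"geo_code": "...", ...}, ...], "keyword": "persian cat"}
    ],
    "related_topics": [
        {"items": [{"topic": {"mid": "...", "title": "...", "type": "..."}, ...}, ...], "keyword": "persian cat"}
    ],
    "related_queries": [
        {"items": [{"query": "...", ...}, ...], "keyword": "persian cat"}
    ]
}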
Now that you have the results, adjust the formatting and save them in CSV format – this way, the data will be easier to analyze. All of this can be done with the pandas Python library.
The response you get from the API provides you with four categories of information: interest_over_time, breakdown_by_region, related_topics, and related_queries. Let’s split each category into its own separate CSV file.
Begin by converting each into a pandas dataframe:
import json
import os
from typing import List

import pandas as pd


def flatten_topic_data(topics_data: List[dict]) -> List[dict]:
    """Flattens related_topics data"""
    topics_items = []
    for item in topics_data[0]["items"]:
        item_dict = {
            "mid": item["topic"]["mid"],
            "title": item["topic"]["title"],
            "type": item["topic"]["type"],
            "value": item["value"],
            "formatted_value": item["formatted_value"],
            "link": item["link"],
            "keyword": topics_data[0]["keyword"],
        }
        topics_items.append(item_dict)
    return topics_items
trend_data = json.loads(content)
print("Creating dataframes..")
# Interest over time
iot_df = pd.DataFrame(trend_data["interest_over_time"][0]["items"])
iot_df["keyword"] = trend_data["interest_over_time"][0]["keyword"]
# Breakdown by region
bbr_df = pd.DataFrame(trend_data["breakdown_by_region"][0]["items"])
bbr_df["keyword"] = trend_data["breakdown_by_region"][0]["keyword"]
# Related topics
rt_data = flatten_topic_data(trend_data["related_topics"])
rt_df = pd.DataFrame(rt_data)
# Related queries
rq_df = pd.DataFrame(trend_data["related_queries"][0]["items"])
rq_df["keyword"] = trend_data["related_queries"][0]["keyword"]
As the related_topics data is nested, it has to be flattened into a single level first – that is what the flatten_topic_data function above does.
The only thing left is to save the data to a file:
CSV_FILE_DIR = "./csv/"
keyword = trend_data["interest_over_time"][0]["keyword"]
keyword_path = os.path.join(CSV_FILE_DIR, keyword)
try:
    os.makedirs(keyword_path, exist_ok=True)
except OSError as e:
    print("Caught exception while creating directories")
    raise e
print("Dumping to csv..")
iot_df.to_csv(f"{keyword_path}/interest_over_time.csv", index=False)
bbr_df.to_csv(f"{keyword_path}/breakdown_by_region.csv", index=False)
rt_df.to_csv(f"{keyword_path}/related_topics.csv", index=False)
rq_df.to_csv(f"{keyword_path}/related_queries.csv", index=False)
You’ve now created a folder structure to hold all of your separate CSV files grouped by keyword.
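For a single keyword such as "persian cat", the resulting layout should look roughly like this:

csv/
└── persian cat/
    ├── interest_over_time.csv
    ├── breakdown_by_region.csv
    ├── related_topics.csv
    └── related_queries.csv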
With all the initial request information transformed into dataframes, you can now use the pandas library to create simple keyword comparisons.
This would require you to adjust your current code to handle multiple keywords and then add functionality to gather all the information in one place.
Let’s begin with multiple keyword handling. To make the code iterable, split it into reusable functions.
First, extract the code for the request to the API into a function that takes a query as an argument and returns the response:
def get_trend_data(query: str) -> dict:
    """Gets a dictionary of trends based on given query string from Google Trends via SERP Scraper API"""
    print(f"Getting data from Google Trends for {query} keyword..")
    url = "https://realtime.oxylabs.io/v1/queries"
    auth = (USERNAME, PASSWORD)
    payload = {
        "source": "google_trends_explore",
        "query": query,
    }
    try:
        response = requests.request("POST", url, auth=auth, json=payload)
    except requests.exceptions.RequestException as e:
        print("Caught exception while getting trend data")
        raise e

    data = response.json()
    content = data["results"][0]["content"]
    return json.loads(content)
Next, you need a function that would transform a raw response into pandas dataframes, save said dataframes as CSV files, and return them:
def dump_trend_data_to_csv(trend_data: dict) -> dict:
    """Dumps given trend data to generated CSV files"""
    print("Creating dataframes..")
    # Interest over time
    iot_df = pd.DataFrame(trend_data["interest_over_time"][0]["items"])
    iot_df["keyword"] = trend_data["interest_over_time"][0]["keyword"]
    # Breakdown by region
    bbr_df = pd.DataFrame(trend_data["breakdown_by_region"][0]["items"])
    bbr_df["keyword"] = trend_data["breakdown_by_region"][0]["keyword"]
    # Related topics
    rt_data = flatten_topic_data(trend_data["related_topics"])
    rt_df = pd.DataFrame(rt_data)
    # Related queries
    rq_df = pd.DataFrame(trend_data["related_queries"][0]["items"])
    rq_df["keyword"] = trend_data["related_queries"][0]["keyword"]

    keyword = trend_data["interest_over_time"][0]["keyword"]
    keyword_path = os.path.join(CSV_FILE_DIR, keyword)
    try:
        os.makedirs(keyword_path, exist_ok=True)
    except OSError as e:
        print("Caught exception while creating directories")
        raise e

    print("Dumping to csv..")
    iot_df.to_csv(f"{keyword_path}/interest_over_time.csv", index=False)
    bbr_df.to_csv(f"{keyword_path}/breakdown_by_region.csv", index=False)
    rt_df.to_csv(f"{keyword_path}/related_topics.csv", index=False)
    rq_df.to_csv(f"{keyword_path}/related_queries.csv", index=False)

    result_set = {}
    result_set["iot"] = iot_df
    result_set["bbr"] = bbr_df
    result_set["rt"] = rt_df
    result_set["rq"] = rq_df

    return result_set
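With both functions in place, a single keyword can be processed end to end – a quick usage example (assuming the imports, credentials, and CSV_FILE_DIR constant from earlier are defined):

trend_data = get_trend_data("persian cat")
dataframes = dump_trend_data_to_csv(trend_data)
print(dataframes["iot"].head())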
Now that the request and dataframe creation are covered, you can create comparisons:
def create_comparison(trend_dataframes: List[dict]) -> None:
    """Merges the dataframes of all queried keywords and saves the comparisons as CSV files"""
    comparison = trend_dataframes[0]
    i = 1
    for df in trend_dataframes[1:]:
        comparison["iot"] = pd.merge(comparison["iot"], df["iot"], on="time", suffixes=("", f"_{i}"))
        comparison["bbr"] = pd.merge(comparison["bbr"], df["bbr"], on="geo_code", suffixes=("", f"_{i}"))
        comparison["rt"] = pd.merge(comparison["rt"], df["rt"], on="title", how="inner", suffixes=("", f"_{i}"))
        comparison["rq"] = pd.merge(comparison["rq"], df["rq"], on="query", how="inner", suffixes=("", f"_{i}"))
        i = i + 1

    comparison["iot"].to_csv("comparison_interest_over_time.csv", index=False)
    comparison["bbr"].to_csv("comparison_breakdown_by_region.csv", index=False)
    comparison["rt"].to_csv("comparison_related_topics.csv", index=False)
    comparison["rq"].to_csv("comparison_related_queries.csv", index=False)
This function will accept the dataframes for all the queries you have created, go over them, and merge them for comparison on key metrics.
The last thing to do is to create the core logic of your application. Adding it all together, the final version of the code should look like this:
import json
import os
from typing import List

import pandas as pd
import requests

USERNAME = "YourUsername"
PASSWORD = "YourPassword"
CSV_FILE_DIR = "./csv/"


def get_trend_data(query: str) -> dict:
    """Gets a dictionary of trends based on given query string from Google Trends via SERP Scraper API"""
    print(f"Getting data from Google Trends for {query} keyword..")
    url = "https://realtime.oxylabs.io/v1/queries"
    auth = (USERNAME, PASSWORD)
    payload = {
        "source": "google_trends_explore",
        "query": query,
    }
    try:
        response = requests.request("POST", url, auth=auth, json=payload)
    except requests.exceptions.RequestException as e:
        print("Caught exception while getting trend data")
        raise e

    data = response.json()
    content = data["results"][0]["content"]
    return json.loads(content)


def flatten_topic_data(topics_data: List[dict]) -> List[dict]:
    """Flattens related_topics data"""
    topics_items = []
    for item in topics_data[0]["items"]:
        item_dict = {
            "mid": item["topic"]["mid"],
            "title": item["topic"]["title"],
            "type": item["topic"]["type"],
            "value": item["value"],
            "formatted_value": item["formatted_value"],
            "link": item["link"],
            "keyword": topics_data[0]["keyword"],
        }
        topics_items.append(item_dict)
    return topics_items


def dump_trend_data_to_csv(trend_data: dict) -> dict:
    """Dumps given trend data to generated CSV files"""
    print("Creating dataframes..")
    # Interest over time
    iot_df = pd.DataFrame(trend_data["interest_over_time"][0]["items"])
    iot_df["keyword"] = trend_data["interest_over_time"][0]["keyword"]
    # Breakdown by region
    bbr_df = pd.DataFrame(trend_data["breakdown_by_region"][0]["items"])
    bbr_df["keyword"] = trend_data["breakdown_by_region"][0]["keyword"]
    # Related topics
    rt_data = flatten_topic_data(trend_data["related_topics"])
    rt_df = pd.DataFrame(rt_data)
    # Related queries
    rq_df = pd.DataFrame(trend_data["related_queries"][0]["items"])
    rq_df["keyword"] = trend_data["related_queries"][0]["keyword"]

    keyword = trend_data["interest_over_time"][0]["keyword"]
    keyword_path = os.path.join(CSV_FILE_DIR, keyword)
    try:
        os.makedirs(keyword_path, exist_ok=True)
    except OSError as e:
        print("Caught exception while creating directories")
        raise e

    print("Dumping to csv..")
    iot_df.to_csv(f"{keyword_path}/interest_over_time.csv", index=False)
    bbr_df.to_csv(f"{keyword_path}/breakdown_by_region.csv", index=False)
    rt_df.to_csv(f"{keyword_path}/related_topics.csv", index=False)
    rq_df.to_csv(f"{keyword_path}/related_queries.csv", index=False)

    result_set = {}
    result_set["iot"] = iot_df
    result_set["bbr"] = bbr_df
    result_set["rt"] = rt_df
    result_set["rq"] = rq_df

    return result_set


def create_comparison(trend_dataframes: List[dict]) -> None:
    """Merges the dataframes of all queried keywords and saves the comparisons as CSV files"""
    comparison = trend_dataframes[0]
    i = 1
    for df in trend_dataframes[1:]:
        comparison["iot"] = pd.merge(comparison["iot"], df["iot"], on="time", suffixes=("", f"_{i}"))
        comparison["bbr"] = pd.merge(comparison["bbr"], df["bbr"], on="geo_code", suffixes=("", f"_{i}"))
        comparison["rt"] = pd.merge(comparison["rt"], df["rt"], on="title", how="inner", suffixes=("", f"_{i}"))
        comparison["rq"] = pd.merge(comparison["rq"], df["rq"], on="query", how="inner", suffixes=("", f"_{i}"))
        i = i + 1

    comparison["iot"].to_csv("comparison_interest_over_time.csv", index=False)
    comparison["bbr"].to_csv("comparison_breakdown_by_region.csv", index=False)
    comparison["rt"].to_csv("comparison_related_topics.csv", index=False)
    comparison["rq"].to_csv("comparison_related_queries.csv", index=False)


def main():
    keywords = ["cat", "cats"]
    results = []
    for keyword in keywords:
        trend_data = get_trend_data(keyword)
        df_set = dump_trend_data_to_csv(trend_data)
        results.append(df_set)
    create_comparison(results)


if __name__ == "__main__":
    main()
Running the code will create comparison CSV files that combine the information of the supplied keywords for each of these categories:
interest_over_time
breakdown_by_region
related_topics
related_queries
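To quickly sanity-check the result, you can load one of the comparison files back into pandas – a minimal sketch:

import pandas as pd

# Inspect the merged interest-over-time comparison produced by create_comparison()
comparison_iot = pd.read_csv("comparison_interest_over_time.csv")
print(comparison_iot.head())
print(comparison_iot.columns.tolist())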
Make sure to check our technical documentation for all the API parameters and variables mentioned in this tutorial.
Also, take a look at how to extract data from other popular targets, such as Wikipedia, Google News, Amazon, Wayfair, and many more on our blog.
We hope that you found this tutorial helpful. If you have any questions, reach us at support@oxylabs.io, and one of our professionals will give you a hand.
The legality of scraping Google Trends depends on the specific data you extract and how you intend to use it. It's essential to adhere to all regulations, including copyright and privacy laws. Before engaging in data extraction, we advise you to seek professional legal advice.
You can use an advanced all-in-one solution, such as the Oxylabs Google Trends API, use Pytrends – the unofficial Google Trends API – or build your own custom scraper from scratch.
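For instance, if you'd rather prototype with Pytrends, a minimal sketch looks like the following (Pytrends relies on an unofficial endpoint and its interface changes between versions, so treat this as illustrative):

from pytrends.request import TrendReq

# Unofficial Google Trends client; results may differ from the Explore page
pytrends = TrendReq(hl="en-US", tz=360)
pytrends.build_payload(["persian cat"], timeframe="today 12-m")
print(pytrends.interest_over_time().head())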
About the author
Augustas Pelakauskas
Senior Copywriter
Augustas Pelakauskas is a Senior Copywriter at Oxylabs. Coming from an artistic background, he is deeply invested in various creative ventures - the most recent one being writing. After testing his abilities in the field of freelance journalism, he transitioned to tech content creation. When at ease, he enjoys sunny outdoors and active recreation. As it turns out, his bicycle is his fourth best friend.