Is Web Scraping Legal?

Gabija Fatenaite

2021-07-06, 8 min read

The demand for big data has risen significantly in recent years. According to Statista, the revenue of the big data market grows every year. This is why the web scraping industry is also gaining popularity, as scraping is one of the most common data collection methods. The legality of web scraping is a much-debated topic among developers and others who work in the data gathering field.

In this article, we will cover the important questions about the legality of web scraping and the legal issues one can encounter when scraping certain websites.

Before delving into this complex topic, we want to note that this article is for informational purposes only and that any information contained herein does not constitute legal advice. Accordingly, before engaging in any scraping activities, you should get appropriate professional legal advice regarding your specific situation.

Is web scraping legal or illegal?

So, is web scraping legal or not? It is not illegal as such. After all, you can crawl and scrape your own website without any issue. Businesses use bots for their own benefit but, at the same time, do not want others to use web scrapers against them. If you wonder whether you can get in trouble for web scraping, you should be aware of the cases in which it can be illegal. Here are some specific examples:

1. Your web scraper should not log in to websites or web pages and then download data. By logging in to a website, users have to agree to its Terms of Service (ToS), which may forbid activities like automated data collection. Note that it may also be possible to accept the ToS just by browsing the website.

2. There is a misconception that you can do whatever you want with publicly available data. There may be fewer restrictions on scraping publicly available data than on scraping private information, but you still have to make sure that you are not breaching laws that may apply to such data, for example, by downloading copyrighted data. Copyrighted material usually includes designs, layouts, articles, videos, and anything else that can be considered creative work.

3. Even if the data is needed for personal use, the Terms of Service may forbid any kind of automated data collection. In this case, it is the scraping activity itself, not the use of the data, that may be illegal.

Why is web scraping sometimes viewed negatively?

Web scraping may be legal when you scrape without breaking any rules or applicable laws surrounding the targeted websites or the gathered data. However, malicious actors intentionally abuse web scraping. Here are some thoughts on why web scraping is sometimes considered a suspicious activity:

1. When you think about the advantages of web scraping and the importance of data for improving your business, gathering public data does not sound offensive or unethical. On the other hand, if you find out that someone is scraping your website for these same reasons, you may think differently.

2. There are situations in which individuals or companies abuse web scraping and violate Terms of Service (ToS), copyright, or other applicable laws. In these cases, web scraping looks like a malicious and unethical activity. This is why it can be hard to explain and prove that the main purpose of web crawling and web scraping for businesses is to make data-driven decisions based on publicly available information.

3. While web scraping is in progress, a scraper sends many requests to websites to get the required information. Because this is done automatically, web scraping tools can make far more requests than a regular user does. If this is done without regard for the website, it can cause a heavy load on its servers. This is one of the main reasons websites have security measures.
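One common way to reduce the load described in the last point is to enforce a minimum pause between consecutive requests. The snippet below is a minimal sketch of that idea, not a recommendation from the article: the `Throttle` class, its parameter names, and the 2-second interval are all hypothetical and chosen for illustration only. An appropriate delay depends on the target website.

```python
import time


class Throttle:
    """Hypothetical helper: enforce a minimum interval between requests."""

    def __init__(self, min_interval_s, clock=time.monotonic, sleep=time.sleep):
        self.min_interval_s = min_interval_s
        self._clock = clock    # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last = None      # time of the previous request, if any

    def wait(self):
        """Sleep if the previous request was too recent; return seconds slept."""
        slept = 0.0
        if self._last is not None:
            remaining = self.min_interval_s - (self._clock() - self._last)
            if remaining > 0:
                self._sleep(remaining)
                slept = remaining
        self._last = self._clock()
        return slept


# Illustrative usage: call wait() before every request your scraper sends.
throttle = Throttle(min_interval_s=2.0)  # assumed value, adjust per website
```

Calling `throttle.wait()` before each request keeps the scraper from sending requests faster than the chosen interval, regardless of how quickly the rest of the code runs.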

How do privacy laws affect web scraping?

Privacy laws are another aspect to consider when scraping publicly available data.

The GDPR and CCPA

The GDPR (General Data Protection Regulation) is a data privacy and security law passed by the EU (European Union) and put into effect on May 25, 2018. The main purpose of this regulation is to give EU citizens control over their personally identifiable information by limiting how organizations can target and collect this data. The GDPR does not state that web scraping is illegal; however, it restricts what businesses can do with the contact data they wish to extract. For example, in some cases, they have to receive explicit consent from the data subjects before they can gather personal data and use it for various purposes.

Similarly, the state of California passed the California Consumer Privacy Act (CCPA), which puts businesses that collect personal data under similarly strict requirements. For example, consumers can have their personal information deleted, can opt out of the sale of their data, and have a right to non-discrimination for exercising their CCPA rights.

General advice on web scraping best practices

As mentioned before, we advise you to seek legal consultation before engaging in scraping activities of any kind. With that being said, here are some basic practical tips that may help ensure compliance when web scraping:

1. Sometimes, websites provide their API for data collection. If it is possible, use it instead of scraping data. Of course, using a provided API is not the same as web scraping. You can learn more about the differences between web scraping vs. scraping API in our other blog posts.

2. It is essential to respect the Terms of Service (ToS) for each website.

3. Respect the rules of robots.txt. If you really need the data from a specific website, but its ToS or robots.txt forbids any automated data collection, you can try asking the site owner for permission.

4. Do not use scraped data without making sure that it is not copyrighted. If it is necessary to publish this data, you should ask the copyright holder for written permission.
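To illustrate the robots.txt tip above, here is a minimal sketch that uses Python's standard `urllib.robotparser` module to check whether a URL may be fetched. The robots.txt content, the `example-bot` user agent, and the `example.com` URLs are made up for illustration. In practice you would load the file from the target website, and passing this check does not replace reading the site's ToS.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used here instead of a real website's file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""


def build_parser(robots_txt):
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser, user_agent, url):
    """Return True if robots.txt permits this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)
print(is_allowed(parser, "example-bot", "https://example.com/public/page"))
print(is_allowed(parser, "example-bot", "https://example.com/private/data"))
print(parser.crawl_delay("example-bot"))
```

With this sample file, the first URL is allowed, the second is disallowed, and the requested crawl delay is 5 seconds. To read a live file instead, `RobotFileParser` also provides `set_url()` and `read()`.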

If you want to learn more about the best web scraping practices, we have covered this topic in detail from the ethical and technical side. Also, learn more about ethical data collection here.

Web scraping cases

Last year, at our two-day event OxyCon, our legal counsels Denas and Nerijus went over some of the legal issues around web scraping. Below is a summary of their presentation, focusing on the landmark scraping cases that set the tone for future legal claims related to scraping, such as copyright infringement or violations of the Computer Fraud and Abuse Act (CFAA).

Denas Grybauskas & Nerijus Sveistys, Oxylabs Legal Counsels

We concentrated on real scraping cases that set a precedent for future scraping legal claims. This may help you answer the question: “Is web scraping legal in the US?” However, do not forget that these cases are only examples that help to understand the situation around the legality of web scraping. You should get appropriate professional advice regarding your specific situation.

Ryanair v. PR Aviation (2018)

Ryanair’s dispute with PR Aviation, a flight price comparison company, provided a glimpse of how scraping could be interpreted in European courts. Ryanair’s website subjects its visitors to Terms of Use (ToU), which explicitly prohibit scraping. PR Aviation was scraping Ryanair’s website, and Ryanair took the company to court in the Netherlands for breach of contract.

Ryanair came out second best from the dispute, as the Dutch court said that no valid contract had been formed between the companies. The court made an interesting analogy, stating that anyone putting up a poster in a shop window visible from the public road, which reads: “Whoever reads further, must pay € 5,” cannot assume that the person reading it wants to commit to such a condition.

Still, this does not mean that ToU would not be applicable in a different scenario, as many of the circumstances here were unfavorable to Ryanair. Namely, at the time of the scraping, Ryanair was presenting its ToU as a browsewrap, which courts do not generally accept as legally binding, and the scraped data was free and accessible to everyone.

Ryanair v. Expedia (2019)

Expedia, a U.S. flight comparison company, was scraping Ryanair’s data and continued doing so after receiving a cease and desist (C&D) letter. Consequently, Ryanair sued it for breaching the CFAA. Expedia argued that because Ryanair is an Irish company, the CFAA, a U.S. statute, should not be applicable.

The courts established that the CFAA might indeed apply to U.S. companies acting internationally. After this, Ryanair and Expedia settled the case, with the details being confidential. With that being said, as of this day, there are no Ryanair flights being offered via Expedia’s website.

HiQ labs v. LinkedIn (2019)

HiQ labs is a company that scraped data from LinkedIn profiles to provide businesses with tools and insights on employees. After allowing HiQ to collect data for several years, LinkedIn issued a C&D letter to HiQ in 2017 and launched its own tool with functionality similar to HiQ’s. HiQ sought an injunction in court, which was granted, leading to LinkedIn being asked to withdraw the C&D letter and stop applying any blocking measures against HiQ.

LinkedIn appealed the decision, arguing that HiQ’s scraping was breaching the CFAA. The court decided that HiQ was not acting in breach of the CFAA, as the data scraped from LinkedIn was public (profiles containing user-generated content; not put behind a password wall). The court said that companies should not be able to revoke authorization where one is not needed in the first place, as well as that allowing companies like LinkedIn to decide who can collect and use publicly available data would be contrary to the public interest.

The decision in the HiQ labs v. LinkedIn case was favorable to scraping companies and reconsidered some of the much-criticized previous court practice regarding the applicability of the CFAA (e.g., Facebook v. Power Ventures, Craigslist v. 3Taps), narrowing the relevance of this act with regard to public data. With that being said, if not done with caution, scraping activities might still breach the CFAA (e.g., under the circumstances of a different case) and might also be challenged on other grounds such as trespass to chattels, copyright, or breach of contract, among others.

However, later in 2022, the Court stated that HiQ’s creation of fake accounts (“Turkers”) to scrape LinkedIn’s data violated LinkedIn’s User Agreement. Subsequently, in December 2022, LinkedIn and HiQ reached a settlement in which HiQ agreed to a permanent injunction requiring it to stop scraping LinkedIn.

If you are interested in the legal aspects of web scraping, watch a recording of our webinar Web Scraping from a Legal Perspective. During the webinar, an expert panel discusses web scraping laws, cease and desist letters, and ongoing court cases that are relevant to the web scraping community.

Now, let’s take a look at a few other recent legal web scraping cases that deserve your attention.

Wrapping up

There is no simple answer to the question “Is web scraping legal?”, as one must first determine whether the scraping in question breaches any laws surrounding the data being collected.

So please, take this article as informational and educational only. It does not replace independent professional advice and judgement. Statements of fact and opinions expressed are those of the presenters only, and unless expressly stated to the contrary, are not the opinion or position of Oxylabs.

If you wish to learn more about scraping, see our step-by-step guide on how to web scrape in Python and try our general-purpose web scraper for free. Also, find out more about ethical Residential Proxy sourcing.

About the author

Gabija Fatenaite

Lead Product Marketing Manager

Gabija Fatenaite is a Lead Product Marketing Manager at Oxylabs. Having grown up on video games and the internet, she grew to find the tech side of things more and more interesting over the years. So if you ever find yourself wanting to learn more about proxies (or video games), feel free to contact her - she’ll be more than happy to answer you.

All information on Oxylabs Blog is provided on an "as is" basis and for informational purposes only. We make no representation and disclaim all liability with respect to your use of any information contained on Oxylabs Blog or any third-party websites that may be linked therein. Before engaging in scraping activities of any kind you should consult your legal advisors and carefully read the particular website's terms of service or receive a scraping license.
