
Puppeteer vs Selenium: Which to Choose

Yelyzaveta Nechytailo

2023-02-07, 5 min read

Puppeteer and Selenium are two well-known open-source tools mainly used for browser automation and testing. First released in 2017, Puppeteer has won over developers with its useful features and strong performance. Selenium is a more mature framework dating back to 2004, and it remains an industry leader for web automation, supporting multiple programming languages and platforms.

In this article, let’s compare these two frameworks in detail so that you can make an informed decision on which fits your needs best.

Main features and use cases

Puppeteer

Fundamentally, Puppeteer is a Node.js library mostly used for creating an automated testing environment. It was developed by Google with the idea of providing a high-level API to control Chrome and Chromium over the DevTools Protocol.

Unlike Selenium, which supports various programming languages, Puppeteer doesn’t aim to provide a broad experience for developers. Instead, it focuses on offering a specific set of control structures, supporting solely JavaScript and serving as a remote control library for Chrome.

Developers largely use Puppeteer for such tasks as:

Testing Chrome extensions.
Taking screenshots and generating PDFs of pages for UI testing.
Performing tests on the latest versions of Chromium.
Automating a range of manual testing processes, such as form submissions, keyboard inputs, etc.
Web scraping (see a detailed tutorial on how to perform web scraping with Puppeteer in our extensive blog post).
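
The screenshot and PDF task from the list above can be sketched with Puppeteer’s page.screenshot() and page.pdf() methods. This is a minimal illustration rather than a complete test setup: the helper name capturePage and the output file names are assumptions made for this example, and page is expected to be a Puppeteer page that has already navigated to the target.

```javascript
// Hypothetical helper: saves the current page as a PNG screenshot and a PDF.
// `page` is assumed to be a Puppeteer Page that has already loaded the target URL.
async function capturePage(page, baseName) {
  const screenshotPath = `${baseName}.png`;
  const pdfPath = `${baseName}.pdf`;

  // fullPage captures the whole scrollable page, not just the visible viewport.
  await page.screenshot({ path: screenshotPath, fullPage: true });

  // format sets the paper size of the generated PDF.
  await page.pdf({ path: pdfPath, format: 'A4' });

  return { screenshotPath, pdfPath };
}
```

Calling capturePage(page, 'quotes') would write quotes.png and quotes.pdf to the current working directory.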

Selenium

In comparison to Puppeteer, Selenium is a testing library that supports not only Chrome and Chromium but also Firefox, Safari, Opera, and Microsoft Edge. Additionally, Selenium scripts can be written in JavaScript, Ruby, C#, Java, and Python. This lets developers run sophisticated tests in their preferred language and target different browsers with a single tool.

Selenium also includes three components, Selenium WebDriver, Selenium IDE, and Selenium Grid, which further extend the capabilities of the library and allow users to satisfy different testing needs.


The common Selenium use cases include:

Web application testing.
Automation testing.
Performance testing.
Data scraping (check out our in-depth Selenium tutorial to learn how to perform web scraping with it using Python).

Advantages and disadvantages

To find out which tool is the better choice for your specific activities, it’s essential to weigh the benefits and drawbacks of each.

In this section, let’s break down the main pros and cons of both Puppeteer and Selenium.

Puppeteer

Advantages:

Access to the DevTools Protocol and the ability to control Chrome, one of the world’s most popular browsers.
One browser, one language. While this may sound like a disadvantage, it’s exactly what helps Puppeteer run extremely fast, especially when compared to Selenium.
Fewer dependencies, as there are no separate browser drivers to maintain.
Useful performance management features, such as taking screenshots and recording load performance.
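
As a rough illustration of the load-performance feature in the list above, Puppeteer’s page.metrics() method returns runtime counters for the current page. The helper name summarizeMetrics and the choice of fields are assumptions for this sketch, and the unit conversions assume that durations are reported in seconds and heap sizes in bytes, as in the DevTools Protocol.

```javascript
// Hypothetical helper: reads runtime metrics from a loaded Puppeteer Page
// and converts a few of them into easier-to-read units.
async function summarizeMetrics(page) {
  const metrics = await page.metrics();
  return {
    domNodes: metrics.Nodes,
    // Durations are assumed to be in seconds; convert to milliseconds.
    scriptMs: Math.round(metrics.ScriptDuration * 1000),
    taskMs: Math.round(metrics.TaskDuration * 1000),
    // Heap size is assumed to be in bytes; convert to megabytes, one decimal.
    heapUsedMb: Math.round((metrics.JSHeapUsedSize / (1024 * 1024)) * 10) / 10,
  };
}
```

Logging the returned object after a page has loaded gives a quick snapshot that can be compared between runs.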

Disadvantages:

Supports only one programming language – JavaScript.
Currently supports only Chrome and Chromium.

Selenium

Advantages:

Seeks to support a wide variety of browsers, platforms, and programming languages.
Built-in tools (WebDriver, IDE, and Grid) which allow for the development of a comprehensive testing and automation framework.
Direct integration with CI/CD pipelines, so tests can run automatically as part of the build process.

What is CI/CD?

The acronym CI/CD stands for continuous integration and continuous delivery/deployment. This combined practice automates much of the manual work across an application’s lifecycle, from the integration and testing stages to delivery and deployment.

Disadvantages:

More complicated installation process due to its support for various platforms, languages, and browsers.
Unlike Puppeteer, offers no built-in performance management capabilities.
Steep learning curve.

Differences in setup and web scraping

In this section, we’ll compare these two tools based on fundamental differences in setting up the environment and web scraping efficiency using Node.js.

Installation

Installing either tool in a Node.js project is straightforward. The main distinction is in the prerequisite packages. While Puppeteer users only need a single npm command, Selenium users must follow language-specific instructions; for Node.js, that means installing both the WebDriver bindings and a browser driver.

Puppeteer

npm install puppeteer

Selenium

npm install selenium-webdriver
npm install chromedriver

Browser control and web scraping

Both tools allow programmatic control of a web browser. You can use this ability to scrape dynamic content from your target web page. Let’s look at the key code differences when launching a headless Chrome instance, navigating to a specific web page, waiting for specific dynamic content to load, and scraping the page.

Our target for scraping will be http://quotes.toscrape.com/js/. This is a dynamic web page where all the quotes are loaded through a JavaScript file. The script renders the quotes in <div> elements, all of which have the quote class.

1. Dependencies and setting the target

Puppeteer

const puppeteer = require('puppeteer');
const url = 'http://quotes.toscrape.com/js/';

Selenium

const { Builder, By, until } = require('selenium-webdriver');
const chrome = require('selenium-webdriver/chrome');
const url = 'http://quotes.toscrape.com/js/';

Selenium supports multiple browsers, so it requires importing the browser-specific module (the Chrome module in our case) along with the WebDriver bindings. Puppeteer needs no separate driver because it downloads a compatible browser build during installation and controls it directly.

2. Launching a headless Chrome instance and navigating to the target URL

Puppeteer

const headlessBrowser = await puppeteer.launch({ headless: true });

const newTab = await headlessBrowser.newPage();

await newTab.goto(url);

Selenium

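// Pass the headless flag as a Chrome argument; Options#headless() is deprecated in newer selenium-webdriver releases.
const options = new chrome.Options().addArguments('--headless=new');
let driver = await new Builder().forBrowser('chrome').setChromeOptions(options).build();

await driver.get(url);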

Puppeteer uses an awaitable launch() method to launch the browser instance, and the newPage() method creates a new browser tab. Now, the goto() method can navigate the tab to any given URL.

Selenium, on the other hand, uses the Builder() constructor to create a new Builder instance, which is then configured with the target browser and its options. The build() method at the end creates and returns a new WebDriver session.

Note: You must enclose awaitable calls inside an asynchronous function.
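
One way to satisfy the note above is to wrap the whole workflow in a single async function that also guarantees cleanup. The sketch below is an assumption-laden illustration, not part of either library: withBrowser is a hypothetical helper, and launch stands for any async function that returns an object with a close() method, such as a call to puppeteer.launch().

```javascript
// Hypothetical wrapper: `launch` is any async function that returns an object
// with a close() method, for example () => puppeteer.launch({ headless: true }).
async function withBrowser(launch, task) {
  const browser = await launch();
  try {
    // await is only valid here because withBrowser is declared async.
    return await task(browser);
  } finally {
    // Runs whether the task succeeded or threw, so the browser never lingers.
    await browser.close();
  }
}

// Usage sketch (assumes puppeteer is installed and `url` is defined):
// withBrowser(() => puppeteer.launch({ headless: true }), async (browser) => {
//   const page = await browser.newPage();
//   await page.goto(url);
// });
```

A Selenium variant of the same idea would call quit() on the driver in the finally block instead of close().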

3. Waiting for dynamic content to load

Let’s see how Puppeteer and Selenium differ when waiting for specific JavaScript content to load. The following code waits for JavaScript to load a <div> element with the quote class.

Puppeteer

await newTab.waitForSelector('.quote');

Selenium

await driver.wait(until.elementLocated(By.className('quote')));

Puppeteer uses the waitForSelector() method, while Selenium uses the wait() method together with a condition from the until module to wait for a specific element to load.
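
Both waits accept an explicit time limit. Puppeteer’s waitForSelector() takes a timeout option in milliseconds (30 seconds by default), and Selenium’s wait() takes a timeout in milliseconds as its second argument (without one, it waits indefinitely). The sketch below shows the Puppeteer side only; the helper name appearsWithin is hypothetical, and page is assumed to be an open Puppeteer page.

```javascript
// Hypothetical helper: waits for `selector` on a Puppeteer Page and reports
// whether it appeared within `timeoutMs` milliseconds.
async function appearsWithin(page, selector, timeoutMs) {
  try {
    await page.waitForSelector(selector, { timeout: timeoutMs });
    return true;
  } catch (error) {
    // Puppeteer rejects with an error named TimeoutError when time runs out.
    if (error.name === 'TimeoutError') {
      return false;
    }
    throw error; // Anything else is unexpected, so let it propagate.
  }
}
```

Returning a boolean lets the calling code decide whether a missing element is a failure or simply a reason to skip the page.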

4. Scraping the quotes

Puppeteer relies on the browser’s native querySelectorAll() method, run inside the page, to select and return a list of all the elements matching the given CSS selectors. Selenium, on the other hand, provides a findElements() method that returns the elements matching a By locator.

Puppeteer

// evaluate() runs this callback inside the page, where `document` is available.
let quotes = await newTab.evaluate(() => {
    let allQuoteDivs = document.querySelectorAll(".quote");

    let quotesString = "";
    allQuoteDivs.forEach((quote) => {
        let quoteText = quote.querySelector(".text").innerHTML;
        quotesString += `${quoteText} \n`;
    });
    return quotesString;
});

console.log(quotes);

Selenium

let quotes = await driver.findElements(By.className('quote'));

let quotesString = "";
for (let quote of quotes) {
    // Each lookup is a separate asynchronous call to the browser driver.
    let quoteText = await quote.findElement(By.className('text')).getText();
    quotesString += `${quoteText} \n`;
}

console.log(quotesString);

Notice the evaluate() method in the Puppeteer code. It allows executing a function in the current tab or page context. This means that you can access and manipulate the elements in the Document Object Model (DOM) of the current tab and then return a value as a result (which is the quotesString in our case).
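
As an alternative sketch of the same idea, Puppeteer’s page.$$eval() method selects all matching elements and passes them to a function that runs in the page context. The version below returns structured objects instead of one string. The function names are hypothetical, and the author class is an assumption about the target page’s markup that the steps above do not rely on.

```javascript
// Runs inside the browser when passed to page.$$eval(), so it must not rely
// on any variables from the surrounding Node.js scope.
// The '.author' class is an assumption about the page markup.
function quotesFromElements(elements) {
  return Array.from(elements).map((element) => {
    const text = element.querySelector('.text');
    const author = element.querySelector('.author');
    return {
      text: text ? text.textContent.trim() : '',
      author: author ? author.textContent.trim() : '',
    };
  });
}

// Hypothetical helper: `page` is a Puppeteer Page already showing the quotes.
async function scrapeQuotes(page) {
  return page.$$eval('.quote', quotesFromElements);
}
```

Because the result is plain data, it can be serialized with JSON.stringify() or written to a file without further parsing.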

5. Closing the browser

Puppeteer offers the close() method to close the browser instance, while Selenium provides the quit() method to exit the browser instance and destroy the driver session.

Puppeteer

await headlessBrowser.close();

Selenium

await driver.quit();

Puppeteer vs Selenium: key differences

Now that we’ve laid out the most important features, advantages, and drawbacks of both frameworks, let’s group together all of their key differences.

Puppeteer	Selenium
Is a Node.js library.	Is a framework for web applications testing.
Works only with Chrome/Chromium and doesn’t support any other browsers.	Supports a wide range of browsers.
No cross-platform support available.	Cross-platform support available with all browsers.
Supports only JavaScript.	Supports different programming languages.
Faster in execution.	Slower in execution.
Easy to install with npm or Yarn.	Setup is relatively harder for new users.
Supports only web automation.	Supports web automation and, together with Appium, mobile automation.
Provides various performance management capabilities.	Offers no built-in performance management capabilities.
No built-in recording of browser interactions.	Can record interactions with browsers using Selenium IDE.
Can capture pages as both images and PDFs.	Can capture pages as images, but not as PDFs.

Which one should you choose?

At this point, it’s clear that Puppeteer and Selenium are both powerful tools with strong capabilities for test automation. They do have several differences, however, and the final decision on which to use will depend on your or your organization’s specific needs.

If you work exclusively with Chrome, Puppeteer is the go-to choice. Its high-level API gives you fine-grained control over the browser, and its speed and focus help you set up tests efficiently. What’s more, because Puppeteer is more of a web automation library than a testing library, it’s better suited to activities such as web crawling and scraping.

On the other hand, if you need to support other browsers and programming languages, you should choose Selenium. With Selenium WebDriver offering cross-browser support, you’ll be able to interact with any supported browser directly, significantly extending the test scope without relying on other external tools.
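
To make the cross-browser point concrete, here is a minimal sketch that runs the same check in several browsers by changing only the name passed to forBrowser(). The helper titlesAcrossBrowsers is hypothetical, and Builder is passed in as a parameter (it is the class exported by selenium-webdriver) so that the sketch has no top-level imports. It assumes each named browser and its driver are available on the machine.

```javascript
// Hypothetical helper: runs the same check in several browsers, one at a time.
// `Builder` is the class exported by the selenium-webdriver package; it is
// passed in as a parameter so the sketch stays free of top-level imports.
async function titlesAcrossBrowsers(Builder, browserNames, url) {
  const titles = {};
  for (const name of browserNames) {
    const driver = await new Builder().forBrowser(name).build();
    try {
      await driver.get(url);
      titles[name] = await driver.getTitle();
    } finally {
      await driver.quit(); // Always end the session, even if the page failed.
    }
  }
  return titles;
}

// Usage sketch (assumes selenium-webdriver is installed):
// const { Builder } = require('selenium-webdriver');
// titlesAcrossBrowsers(Builder, ['chrome', 'firefox'], 'http://quotes.toscrape.com/js/');
```

The browsers run sequentially here to keep the sketch simple; larger suites typically distribute such runs with Selenium Grid.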

Wrapping up

This article discussed and compared two of the most popular automation frameworks, Puppeteer and Selenium, each offering a number of distinct features and advantages. We hope you’ll use it as a guideline to identify your requirements and choose the most suitable tool for your future projects.

If you're stuck deciding whether Puppeteer or Selenium will work better for your public data collection project, see our other comparisons of web scraping tools, such as Scrapy vs. Selenium, and a comprehensive overview of the top website testing tools. You can also try our advanced web scraping solutions for free – Web Scraper API and Web Unblocker, including their built-in feature, Headless Browser. And in case you have any questions in the process, don’t hesitate to contact us at hello@oxylabs.io.

Frequently asked questions

Which is faster: Selenium or Puppeteer?

Both Puppeteer and Selenium are powerful tools for web and test automation. However, Puppeteer is generally faster than Selenium: it drives Chrome directly over the DevTools Protocol, whereas Selenium is a more complex tool that supports many browsers and programming languages through separate browser drivers.

Does Puppeteer use Selenium?

Puppeteer and Selenium are two separate open-source tools used for browser automation and testing. While Puppeteer is designed specifically for Chrome, Selenium can work with different browsers and languages.

About the author

Yelyzaveta Nechytailo

Senior Content Manager

Yelyzaveta Nechytailo is a Senior Content Manager at Oxylabs. After working as a writer in fashion, e-commerce, and media, she decided to switch her career path and immerse in the fascinating world of tech. And believe it or not, she absolutely loves it! On weekends, you’ll probably find Yelyzaveta enjoying a cup of matcha at a cozy coffee shop, scrolling through social media, or binge-watching investigative TV series.


All information on Oxylabs Blog is provided on an "as is" basis and for informational purposes only. We make no representation and disclaim all liability with respect to your use of any information contained on Oxylabs Blog or any third-party websites that may be linked therein. Before engaging in scraping activities of any kind you should consult your legal advisors and carefully read the particular website's terms of service or receive a scraping license.
