OxyCon 2019: The Top Takeaways From Day One

Vytautas Kirjazovas

2019-10-01 · 4 min read

It was a sunny fall morning in Vilnius, Lithuania, where the first annual OxyCon commenced today. A two-day event, OxyCon by Oxylabs is a data extraction industry conference dedicated to sharing knowledge and know-how, with expert speakers from some of the best market-leading companies. A total of 56 attendees from 28 companies and 12 countries are participating in the event, together with speakers and participants from Oxylabs itself.

We had a bit of everything – good coffee, a national nuclear accident response drill (you read that right), truly in-depth presentations and, of course, some friendly mingling, socializing and entertainment.

Whether you participated in the event yourself or are just curious, we have prepared a handy recap of the first day. Without further ado, let’s get to it.

Sustainability of datacenter proxies in the scraping industry

After an introductory speech by one of the Oxylabs founders, detailing our history, work culture and some beautiful traditions we have here, the mic was passed to Rimgaudas Mazgelis, an Oxylabs data analyst.

Mr. Mazgelis shared some insights into how data analysis has evolved here at Oxylabs, with the goal of keeping our datacenter proxies operating sustainably.

It all started back in 2015 with a search for basic patterns in how websites respond to scraping. At first the work was done simply in Microsoft Excel, before the team switched to R and Python and stuck with them, as it should be.

Here at Oxylabs, quadratic equations are used to help find out the limit of requests a target website can accept before a specific IP gets blocked. Today, ratios are being continuously calculated for all of the most popular targets.
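To make the idea concrete, here is a minimal sketch of how a quadratic equation could yield a request limit. It is not the model presented at OxyCon: it assumes the share of blocked requests has already been fitted as a curve `a*r**2 + b*r + c` of the request rate `r`, and the function name, coefficients and threshold are all hypothetical.

```python
import math


def max_safe_request_rate(a, b, c, max_block_rate):
    """Largest rate r >= 0 where a*r**2 + b*r + c <= max_block_rate.

    Hypothetical model: a, b, c come from a curve fitted elsewhere.
    Assumes a > 0, i.e. blocking accelerates as the load grows.
    Returns None when no non-negative rate satisfies the threshold.
    """
    if a <= 0:
        raise ValueError("this sketch assumes a convex curve (a > 0)")
    # Solve a*r**2 + b*r + (c - max_block_rate) = 0 with the quadratic formula.
    discriminant = b * b - 4 * a * (c - max_block_rate)
    if discriminant < 0:
        return None  # the curve never drops to the threshold
    # For a > 0 the curve stays under the threshold between the two roots,
    # so the larger root is the limit we are after.
    upper_root = (-b + math.sqrt(discriminant)) / (2 * a)
    return upper_root if upper_root >= 0 else None


# Made-up coefficients: block rate = r**2, tolerated block rate = 100.
limit = max_safe_request_rate(1, 0, 0, 100)
```

With these made-up numbers the limit works out to 10 requests per unit of time. In practice the coefficients would have to be re-fitted for each target as new response data arrives.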

Our in-house software, coupled with a convenient dashboard, helps us see scraping trends for each city, which also makes it easier to optimize the stability of the infrastructure.

Here are a couple of interesting facts from Rimgaudas’ presentation:

There is a strong trend of scrapers taking increasingly more data from their targets.

During the summer, companies scrape less, but when fall starts, the seasonal celebrations (e.g. Black Friday) lead to a dramatic increase in scraping rates.

Fingerprinting: the web’s dirty little secret

Allen O’Neill, a big data cloud engineer, discussed the many ways in which user data is collected and combined into unique profiles, called fingerprints, and how those fingerprints are used to identify bots. Here are some key takeaways from his rich presentation:

Browser fingerprinting is currently emerging as the primary method of identifying bots.
Browser data, such as resolution, supported fonts, languages and much more, help build this fingerprint. In other words, this unique profile helps identify and track individual users across the web.
Cookies are losing relevance as a method of identification/tracking, thanks to fingerprinting.
WebRTC, TCP/IP fingerprinting and Wasm are some of the technologies increasingly leveraged to tell bots apart from real users.
Hyper-personalization is the next big thing, requiring huge amounts of data and making separating real users from bots even easier.
Creating realistic personas that act like real web users will eventually become the only way to bypass bot detection systems.
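
The core idea of the takeaways above can be sketched in a few lines: several browser attributes are combined into one stable identifier. This is only an illustration, not a method shown in the talk. The attribute names are hypothetical, and real detection systems rely on many more signals than a single hash.

```python
import hashlib
import json


def fingerprint(attributes):
    """Derive a stable identifier from a dict of browser attributes.

    The keys used below are hypothetical examples, not a real schema.
    """
    # Serialize with sorted keys so the same attributes always hash the same,
    # regardless of the order in which they were collected.
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


visitor_a = {
    "resolution": "1920x1080",
    "fonts": ["Arial", "Helvetica"],
    "languages": ["en-US", "lt"],
}
# Same attributes, collected in a different order.
visitor_b = {
    "languages": ["en-US", "lt"],
    "fonts": ["Arial", "Helvetica"],
    "resolution": "1920x1080",
}
same_profile = fingerprint(visitor_a) == fingerprint(visitor_b)
```

Here `same_profile` is true because both visitors expose identical attributes. Changing any single value, such as the resolution, produces a different identifier, which is why a bot with an unusual or inconsistent attribute set stands out.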

We will end the summary of Mr. O’Neill’s talk with a great quote that perfectly captures his presentation:

“If it looks like a duck, swims like a duck, and quacks like a duck, then it probably is a duck.”

Let’s get technical

The rest of the presentations and a workshop were much more technical and geared towards practical use cases, so to keep this recap within reasonable length, we will only cover some of the main points:

System administrator Karolis Pabijanskas put forward an argument that every successful company using their own server infrastructure will eventually face the challenge of scaling it.
As the infrastructure grows, a certain point can be reached in which your automation solution might not be able to keep up with your needs.
Mr. Pabijanskas suggested migrating from Ansible to the SaltStack IT automation solution, lauding its impressive speed and flexibility.
Another speaker covered the ins and outs of using a browser as a service. His preferred solution is to run cloud-based Chrome browsers for web scraping, with VNC and XFCE providing smooth window management and rotating proxies ensuring a high scraping success rate.
Paul Felby, CTO at AdThena, described how machine learning can be used efficiently to optimize web scraping. In his own words, “scraping less is an optimisation challenge” and machine learning is a perfect solution.
Mr. Felby detailed the process of optimizing scraping efforts, which includes preparing data the right way, the caveats of training your model, evaluating its accuracy and more.
It all boils down to using linear and decision tree regression models, random forests, gradient boosting and some more statistical wizardry. We would like to get more in-depth, but, as the saying goes, you just had to be there.

The official part of day one was concluded with Oxylabs software developer Paulius Stundžia leading a workshop on how to save precious resources with the help of Oxylabs Real-Time Crawler (now known as Scraper APIs) and a callback handler. But the fun was not over yet.
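
For readers unfamiliar with the pattern, a callback handler saves resources because the client does not have to keep polling for results: the service sends an HTTP request to the client once a job has finished. Below is a minimal sketch using only the Python standard library. It is not the code from the workshop, and the payload fields `id` and `status` are assumptions made for illustration, not the actual Oxylabs schema.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

finished_jobs = []  # in-memory store, good enough for a sketch


def parse_callback(body):
    """Return (job_id, status) from a JSON callback body, or None if malformed.

    The "id" and "status" fields are hypothetical.
    """
    try:
        payload = json.loads(body)
    except ValueError:  # covers invalid JSON and undecodable bytes
        return None
    if not isinstance(payload, dict) or "id" not in payload:
        return None
    return str(payload["id"]), payload.get("status", "unknown")


class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly the number of bytes announced by the sender.
        length = int(self.headers.get("Content-Length", 0))
        result = parse_callback(self.rfile.read(length))
        if result is None:
            self.send_response(400)  # reject payloads we cannot understand
        else:
            finished_jobs.append(result)
            self.send_response(200)
        self.end_headers()


def make_server(port=8080):
    """Build a server that listens for job-finished notifications."""
    return HTTPServer(("", port), CallbackHandler)
```

Calling `make_server().serve_forever()` would start listening on port 8080. A real deployment would also need to authenticate the sender and then fetch the job results, both of which are left out here.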

After getting all the knowledge of the day out in the open, OxyCon attendees are currently gathering at the beautiful Hotel PACAI for some quality evening entertainment. Tomorrow, another inspiring day of know-how sharing awaits us! If you want to stay updated, be sure to follow us on Twitter.

About the author

Vytautas Kirjazovas

Head of PR

Vytautas Kirjazovas is Head of PR at Oxylabs, and he takes a strong personal interest in technology due to its potential to make everyday business processes easier and more efficient. Vytautas is fascinated by new digital tools and approaches, in particular for web data harvesting, so feel free to drop him a message if you have any questions on this topic. He appreciates a tasty meal, enjoys traveling and writing about himself in the third person.

All information on Oxylabs Blog is provided on an "as is" basis and for informational purposes only. We make no representation and disclaim all liability with respect to your use of any information contained on Oxylabs Blog or any third-party websites that may be linked therein. Before engaging in scraping activities of any kind you should consult your legal advisors and carefully read the particular website's terms of service or receive a scraping license.
