Set up Selenium for scraping under WSL2

This post describes how to run Selenium under WSL2, which may help anyone who wants to scrape dynamic websites while developing under WSL2. It assumes Selenium itself is already installed (the usual pip install selenium). TL;DR: install and configure Selenium's dependencies, Firefox and geckodriver, and use the FAQ below to get past roadblocks.

Set up Firefox

You need to install the Firefox browser inside WSL2. Specifically, a Firefox version <= 67 (but not too old) is required; pick one from the release history, or follow this instruction.

FAQ:

Unable to find matching capabilities: update Firefox; for more details see this SO link.
For other bugs, see this GitHub issue.
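
To confirm which Firefox build WSL2 actually picks up, a small check along these lines may help. It is only a sketch: the helper names parse_firefox_version and installed_firefox_version are made up for this post, and it assumes a firefox binary on the PATH that answers --version with a line like "Mozilla Firefox 67.0".

```python
import re
import subprocess


def parse_firefox_version(text):
    """Extract the numeric version from text such as 'Mozilla Firefox 67.0'."""
    match = re.search(r"Firefox\s+(\d+(?:\.\d+)*)", text)
    if match is None:
        raise ValueError(f"no Firefox version found in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def installed_firefox_version(binary="firefox"):
    """Run `<binary> --version` and return the parsed version tuple.

    Assumption: `binary` is on the PATH inside WSL2 (the Linux build,
    not the Windows one).
    """
    result = subprocess.run(
        [binary, "--version"], capture_output=True, text=True, check=True
    )
    return parse_firefox_version(result.stdout)
```

Comparing the major number, for example installed_firefox_version()[0] <= 67, is enough to tell whether the installed build falls in the range mentioned above.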

Set up Geckodriver

Once Firefox is in place, install and configure geckodriver. Follow the guide for installation or check the releases. If you cannot get it running, use trace mode (ref) to debug and to find implicit dependencies. Note that in trace mode the output goes to a log file rather than to standard output. In my case I additionally needed the libraries libgtk-3 (ref) and libdbus-glib-1-2 (ref). After that, it was good to go.
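
As a rough illustration of the trace-mode step, the sketch below first checks that both binaries are on the PATH and then starts Firefox with geckodriver's log level set to trace. The helper names missing_binaries, start_firefox_with_trace and fetch_title are invented for this example, and the -headless argument is my assumption for a WSL2 session without a display. Where the trace output ends up depends on the Selenium version in use.

```python
import shutil


def missing_binaries(names=("firefox", "geckodriver")):
    """Return the entries of `names` that cannot be found on the PATH."""
    return [name for name in names if shutil.which(name) is None]


def start_firefox_with_trace():
    """Start Firefox through geckodriver with trace-level logging."""
    # Imported here so the PATH check above still works without Selenium.
    from selenium.webdriver import Firefox
    from selenium.webdriver.firefox.options import Options

    opts = Options()
    opts.add_argument("-headless")  # assumption: no display inside WSL2
    opts.log.level = "trace"        # ask geckodriver for trace-level logs
    return Firefox(options=opts)


def fetch_title(url):
    """Open `url` and return the page title, always closing the browser."""
    missing = missing_binaries()
    if missing:
        raise RuntimeError("not on PATH: " + ", ".join(missing))
    driver = start_firefox_with_trace()
    try:
        driver.get(url)
        return driver.title
    finally:
        driver.quit()
```

If fetch_title fails at startup, the trace log is the place to look for a missing shared library such as the two named above.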

Tips:

Use the Linux builds rather than the Windows builds of Firefox and geckodriver (just assume everything is Linux-native under WSL2). When I hit a bug, I wondered whether the cause was running a Linux build of Firefox on a Windows machine. It turned out not to be.
If you are using Selenium for scraping, it's easier to develop the scraping pipeline within a notebook, where you can render the web page and detect elements step by step.
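
For the notebook workflow, one option is a tiny helper that prints what a lookup matched before you commit to a selector. This is only a sketch: summarize_elements is a hypothetical name, and it relies solely on the tag_name and text attributes that Selenium's WebElement provides.

```python
def summarize_elements(elements, limit=5, width=60):
    """Return one short line per element: its tag name and trimmed text."""
    lines = []
    for element in list(elements)[:limit]:
        text = " ".join(element.text.split())  # collapse runs of whitespace
        if len(text) > width:
            text = text[: width - 3] + "..."
        lines.append(f"<{element.tag_name}> {text}")
    return lines
```

In a notebook cell, passing it the result of driver.find_elements(By.CSS_SELECTOR, "a") (with By imported from selenium.webdriver.common.by) gives a quick preview of the first few matches, so each selector can be adjusted one cell at a time.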
