====Selenium====
Selenium is a browser automation toolkit, used here for web scraping.
===Debian===
==Install Selenium==
apt update
apt full-upgrade
apt install python3-selenium
===Windows 10===
==Install Python==
[[https://www.python.org/ftp/python/3.13.2/python-3.13.2-amd64.exe|Python 3.13.2 for AMD64]]
==Install Selenium==
python -m pip install --upgrade pip
python -m pip install selenium
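To confirm that the install is visible to the interpreter, a small check like the following may help. It is a minimal sketch using only the standard library; the helper names ''installed_version'' and ''describe'' are made up for this page, and the same check should also work for the Debian package.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version string, or None when it is not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


def describe(distribution):
    """Build a one-line report for a distribution name (hypothetical helper)."""
    version = installed_version(distribution)
    if version is None:
        return f"{distribution}: not installed"
    return f"{distribution}: {version}"


# Report on the package installed above.
print(describe("selenium"))
```

Run it with the same interpreter that pip was run with, otherwise the report may describe a different Python installation.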
==Install tools==
Here are a few tools which can be useful for tidying HTML content
for evaluation and for storing scraped data in a [[:tools:wikibase|Wikibase]].
python -m pip install lxml
python -m pip install beautifulsoup4
python -m pip install "WikibaseIntegrator>=0.12"
python -m pip install python-dotenv
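As an illustration of tidying HTML for evaluation, the sketch below strips scripts and styles from a page and collapses whitespace. It assumes beautifulsoup4 is installed and uses lxml as the parser when available; the function names ''pick_parser'' and ''visible_text'' are hypothetical and only one of many ways to do this.

```python
def pick_parser():
    """Prefer lxml when it is installed, else use the standard library parser."""
    try:
        import lxml  # noqa: F401  (only checking that it can be imported)
        return "lxml"
    except ImportError:
        return "html.parser"


def visible_text(html):
    """Return the readable text of an HTML string with whitespace collapsed."""
    from bs4 import BeautifulSoup  # provided by the beautifulsoup4 package

    soup = BeautifulSoup(html, pick_parser())
    # Remove elements whose content is never shown to a reader.
    for tag in soup(["script", "style"]):
        tag.decompose()
    # Join the remaining strings and squeeze runs of whitespace.
    return " ".join(soup.get_text(separator=" ").split())


print(visible_text("<p>Hello <b>world</b></p><script>x = 1</script>"))
```

With Selenium, the input would typically be ''driver.page_source'' taken after the page has loaded.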
==WSL1 alias==
Chromium can be installed within WSL1 but cannot run there, so creating an alias to the
Windows Python executable allows scripts to run from WSL1 using the Windows browser.
__TCSH__
alias py "/mnt/c/Users/username/AppData/Local/Programs/Python/Python313/python.exe"
__bash__
alias py="/mnt/c/Users/username/AppData/Local/Programs/Python/Python313/python.exe"
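One thing to keep in mind with this alias is that the Windows interpreter expects Windows-style paths, so an absolute ''/mnt/c/...'' path passed as an argument may not be understood. The sketch below shows the mapping under the assumption that the file lives on a mounted Windows drive; the function name ''wsl_to_windows_path'' is hypothetical, and the ''wslpath -w'' command does the same job from the shell.

```python
import re


def wsl_to_windows_path(path):
    """Convert /mnt/<drive>/... to <DRIVE>:\\... (mounted Windows drives only)."""
    match = re.fullmatch(r"/mnt/([a-zA-Z])(/.*)?", path)
    if match is None:
        raise ValueError(f"not a mounted Windows drive path: {path!r}")
    drive = match.group(1).upper()
    rest = match.group(2) or "/"
    # Windows uses backslashes as the path separator.
    return drive + ":" + rest.replace("/", "\\")


print(wsl_to_windows_path("/mnt/c/Users/username/scrape.py"))
```

Relative paths avoid the problem when the script is started from a directory on the Windows drive.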
===Test===
This test opens chrome.exe or Chromium, visits a page and then quits.
#! /usr/bin/env python3
from selenium import webdriver
# CHROME
from selenium.webdriver.chrome.options import Options
# FIREFOX
#from selenium.webdriver.firefox.options import Options
options = Options()
# CHROME
options.add_argument("--incognito")
driver = webdriver.Chrome(options=options)
# FIREFOX
#options.add_argument("-private")
#driver = webdriver.Firefox(options=options)
# Wait up to 60 seconds for elements to appear when they are looked up
driver.implicitly_wait(60)
driver.get("https://www.kewl.org/")
driver.quit()
Chrome incognito mode was found to be necessary on a Linux host; without it, Chromium waited
about 30 seconds before opening the URL.
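To see whether a given host shows this delay, the start-up and page load can be timed with and without the option. This is a rough sketch, not a benchmark: the function names ''chrome_arguments'' and ''time_page_load'' are hypothetical, it assumes Chrome or Chromium and a matching driver are available, and the measured time includes browser start-up.

```python
import time


def chrome_arguments(private):
    """Return the Chrome command-line arguments for one run."""
    return ["--incognito"] if private else []


def time_page_load(url, private):
    """Start Chrome, load url, quit, and return the seconds taken up to the load."""
    from selenium import webdriver  # imported here so the helper above needs no browser

    options = webdriver.ChromeOptions()
    for argument in chrome_arguments(private):
        options.add_argument(argument)

    start = time.perf_counter()
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        return time.perf_counter() - start
    finally:
        # Always close the browser, even if the page load fails.
        driver.quit()


# Example use (opens a real browser twice):
#   for private in (True, False):
#       print(private, round(time_page_load("https://www.kewl.org/", private), 1))
```

A single run can be noisy, so repeating each case a few times gives a fairer comparison.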
===Resources===
[[https://github.com/vim/vim-win32-installer/releases/download/v9.1.0/gvim_9.1.0_x64_signed.exe|VIM for AMD64]]