Selenium get

A simple tool that fetches a page using Selenium and exposes Selenium's wait parameters on the command line.

Install

sudo apt install python3-full mercurial
hg clone https://hg.kewl.org/pub/sget
cd sget

script

Demo

The web page used in this demo first loads temporary placeholder content and, after a delay, reloads.

The reloaded page initially shows only a subset of the full content; the rest is revealed by scrolling.

Fetch the temporary page

py sget.py "https://www.deckenmalerei.eu/e8a3bf28-6365-42fe-a4d3-41608ed870e8" d0.html

Fetch the reloaded page

py sget.py "https://www.deckenmalerei.eu/e8a3bf28-6365-42fe-a4d3-41608ed870e8" -d10 d10.html
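
The fixed-delay fetch above can be approximated in a few lines of Python. This is a minimal sketch, not the actual sget.py code: it assumes that -d10 means "wait 10 seconds before saving the page", and the names fetch_after_delay and demo are hypothetical. The helper only relies on the get() method and page_source attribute that Selenium's WebDriver provides.

```python
import time


def fetch_after_delay(driver, url, delay=0.0, sleep=time.sleep):
    """Load url, pause for `delay` seconds, then return the rendered HTML.

    `driver` is any object with Selenium's WebDriver interface
    (a get(url) method and a page_source attribute).
    """
    driver.get(url)
    if delay > 0:
        sleep(delay)  # give client-side scripts time to replace the placeholder
    return driver.page_source


def demo(url, delay):
    """Hypothetical usage with a real browser; needs Selenium and Firefox."""
    # Imported lazily so fetch_after_delay has no hard dependency on Selenium.
    from selenium import webdriver

    driver = webdriver.Firefox()
    try:
        return fetch_after_delay(driver, url, delay)
    finally:
        driver.quit()  # always release the browser, even on errors
```

A fixed delay is simple but fragile: too short and the placeholder is saved, too long and time is wasted on every fetch.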

Compare the temporary and reloaded pages

diff d0.html d10.html | head -30
9c9
<    CbDD - lade e8a3bf28-6365-42fe-a4d3-41608ed870e8
---
>    CbDD - Abtsgmünd, Gartenpavillon Schloss Hohenstadt [Text]
50,54c50,1386
<     <div class="fullScreenCentered loadingIndicator">
<      <div class="spinningCircleBasic spinningCircleTop spinningCircleRight spinningCircleBottom">
<      </div>
<      <div class="fullScreenText">
<       lade Daten ...
---
>     <div class="dataPage">
>      <div class="document">
>       <div style="margin-top: 0.5rem;">
>        <div class="text-right">
>         <div class="qrView">
>          <div>
>           <a href="/e8a3bf28-6365-42fe-a4d3-41608ed870e8">
>            <span>
>             QR Code
>            </span>
>            <span class="icon-align-big">
>             <i class="material-icons md-36">
>              arrow_drop_down
>             </i>
>            </span>
>           </a>
>          </div>
>         </div>
>         <div class="mapView">
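
Where diff is not available, a similar comparison can be done with Python's standard difflib module. This is only a sketch: the function name first_differences is hypothetical, and it produces a unified diff rather than the normal diff format shown above.

```python
import difflib
from itertools import islice


def first_differences(old_html, new_html, limit=30):
    """Return up to `limit` lines of a unified diff between two HTML strings."""
    diff = difflib.unified_diff(
        old_html.splitlines(),
        new_html.splitlines(),
        fromfile="d0.html",
        tofile="d10.html",
        lineterm="",  # input lines carry no newline, so the headers should not either
    )
    return list(islice(diff, limit))  # same idea as piping through `head`
```

To compare the two saved files, read each with open(...).read() and print the returned lines.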

Fetch the page again, this time waiting for the reloaded content to appear, identified by a tag attribute

py sget.py "https://www.deckenmalerei.eu/e8a3bf28-6365-42fe-a4d3-41608ed870e8" -b XPATH -v "//div[@class='dataPage']" --scroll dataPage.html
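
For readers unfamiliar with Selenium's explicit waits, the following sketch shows one way such a wait followed by a scroll could look in Python. It is an illustration under assumptions, not the sget.py implementation: the function names, the 30 second timeout, and the scroll-until-height-stops-growing strategy are all hypothetical. The Selenium calls used (WebDriverWait, presence_of_element_located, By.XPATH, execute_script) are part of Selenium's Python bindings.

```python
import time


def wait_for_xpath(driver, xpath, timeout=30):
    """Block until an element matching `xpath` is present.

    Raises selenium.common.exceptions.TimeoutException after `timeout` seconds.
    """
    # Imported lazily so scroll_to_bottom can be used without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    return WebDriverWait(driver, timeout).until(
        EC.presence_of_element_located((By.XPATH, xpath))
    )


def scroll_to_bottom(driver, pause=1.0, max_rounds=20, sleep=time.sleep):
    """Scroll until the page height stops growing; return the number of scrolls."""
    height = driver.execute_script("return document.body.scrollHeight")
    for rounds in range(1, max_rounds + 1):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        sleep(pause)  # let lazily loaded content render
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == height:
            return rounds  # nothing new appeared, so the page is fully revealed
        height = new_height
    return max_rounds  # gave up: the page kept growing
```

With a real driver, calling wait_for_xpath(driver, "//div[@class='dataPage']") and then scroll_to_bottom(driver) before reading driver.page_source mirrors the command above. Unlike a fixed delay, the wait returns as soon as the element exists.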