-
Notifications
You must be signed in to change notification settings - Fork 283
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix scrolling timeout issue #678
base: 1.13
Are you sure you want to change the base?
Conversation
This change addresses a potential issue where the maximum scrolling distance may change between calculating it, sending the scroll message, and verifying the new position. If verifying the position times out, the code now checks if the scrolling distance has changed. If it has, scrolling is retried once. Additionally, the maximum wait time for verifying the new position has been reduced from 30 seconds to 3 seconds.
By looking at your code, you are just trying to scroll again one more time if you encounter a timeout if the mouse has moved on the screen? So there is nothing preventing the issue from happening a second time, it would just be less likely to happen. |
@enricodias In my case, due to the overlay that opens, the maximum distance always ends up being 0. In this situation, scrolling is not repeated. If something is rendered that only reduces the maximum distance but doesn’t set it to 0, an attempt to scroll again will be made. Here is a simplified example of the code I use to load a page and scroll to the bottom: <?php
use HeadlessChromium\BrowserFactory;
use HeadlessChromium\Exception\CommunicationException;
use HeadlessChromium\Exception\CommunicationException\ResponseHasError;
use HeadlessChromium\Exception\NoResponseAvailable;
use HeadlessChromium\Exception\OperationTimedOut;
use HeadlessChromium\Page;
include __DIR__ . '/vendor/autoload.php';
class ScrollDownPage
{
/**
* @throws Exception
*/
public function invoke(Page $page): ?string
{
$distance = $this->waitAndGetMaxScrollingDistance($page);
if ($distance === 0) { // Retry once
$distance = $this->waitAndGetMaxScrollingDistance($page);
}
if ($distance > 0) {
$scrollingEvents = 0;
while ($distance > 0 && $scrollingEvents < 1_000) {
$page->mouse()->scrollDown($distance);
$distance = $this->waitAndGetMaxScrollingDistance($page);
$scrollingEvents++;
}
} else {
throw new Exception('Scrolling down failed. Couldn’t scroll down even once.');
}
return $page->getHtml();
}
/**
* @throws OperationTimedOut
* @throws CommunicationException
* @throws NoResponseAvailable
* @throws ResponseHasError
*/
private function waitAndGetMaxScrollingDistance(Page $page): int
{
$distance = $this->getMaxYScrollingDistance($page);
if ($distance > 0) {
return $distance;
}
for ($i = 1; ($i * 50_000) <= 1_000_000; $i++) {
usleep(50_000);
$distance = $this->getMaxYScrollingDistance($page);
if ($distance > 0) {
return $distance;
}
}
return 0;
}
/**
* @throws CommunicationException
* @throws CommunicationException\ResponseHasError
* @throws NoResponseAvailable
* @throws OperationTimedOut
*/
private function getMaxYScrollingDistance(Page $page): int
{
$scrollableArea = $page->getLayoutMetrics()->getCssContentSize();
$visibleArea = $page->getLayoutMetrics()->getCssVisualViewport();
$maximumY = $scrollableArea['height'] - $visibleArea['clientHeight'];
return (int) $maximumY - (int) $visibleArea['pageY'];
}
}
$url = 'https://www.example.com/something';
$browserFactory = new BrowserFactory('chromium');
$browser = $browserFactory->createBrowser(['windowSize' => [1920, 1000], 'headless' => false]);
$page = $browser->createPage();
$page->navigate($url)->waitForNavigation();
// Scroll down
$html = (new ScrollDownPage())->invoke($page); If you’d like to try it yourself, you might want to modify the private function scroll(int $distanceY, int $distanceX = 0): self
{
$this->page->assertNotClosed();
// make sure the mouse is on the screen
$this->move($this->x, $this->y);
[$distances, $targets] = $this->getScrollDistancesAndTargets($distanceY, $distanceX);
error_log('try to scroll distances: ' . var_export($distances, true));
// scroll
$this->sendScrollMessage($distances);
try {
// wait until the scroll is done
Utils::tryWithTimeout(3_000_000, $this->waitForScroll($targets['x'], $targets['y']));
} catch (\HeadlessChromium\Exception\OperationTimedOut $exception) {
error_log('failed to verify scroll distances');
// Maybe the possible max scroll distances changed in the meantime.
$prevDistances = $distances;
[$distances, $targets] = $this->getScrollDistancesAndTargets($distanceY, $distanceX);
error_log('new distances: ' . var_export($distances, true));
if ($prevDistances === $distances) {
throw $exception;
}
if (0 !== $distanceY || 0 !== $distanceX) { // Try with the new values.
$this->sendScrollMessage($distances);
// wait until the scroll is done
Utils::tryWithTimeout(3_000_000, $this->waitForScroll($targets['x'], $targets['y']));
}
}
// set new position after move
$this->x += $distances['x'];
$this->y += $distances['y'];
return $this;
} I thought that since the maximum possible scroll distance is already determined and replaces the user-requested distance before the message is sent to the browser, it wouldn’t be desirable for an exception to be thrown if it can't scroll the full distance (or no distance at all). Additionally, I found it a bit excessive to try verifying the position for 30 seconds, especially since it might often fail due to overlays - which, unfortunately, are everywhere on the internet nowadays. |
This change addresses an issue where the maximum scrolling distance may change between calculating it, sending the scroll message, and verifying the new position. If verifying the position times out, the code now checks if the scrolling distance has changed. If it has, scrolling is retried once.
I encountered this issue on pages where an overlay pops up. The problem occurs randomly, approximately once every 10 executions of the same code. I suspect it happens when the overlay is rendered precisely between calculating the maximum possible scroll distance and verifying it after sending the scroll message.
This change appears to have completely resolved the issue in my specific case. However, I haven’t added a test case, as I don't know how to reliably reproduce this behavior.
I also reduced the wait time for verifying the new position, as 30 seconds seemed unnecessarily long.
I created this pull request directly without opening an issue first. If you’d prefer that I create an issue as well, please let me know.