Commit 6a88bd0

[TF-3446] add infinite scroll example
Summary: adding example which corresponds with new doc tutorial

Test Plan: ran the script locally

```
INFO:__main__:Navigating to the page...
INFO:__main__:Scrolling to the bottom of the page... (times = 1)
INFO:__main__:Content loaded!
INFO:__main__:Scrolling to the bottom of the page... (times = 2)
INFO:__main__:Content loaded!
INFO:__main__:Scrolling to the bottom of the page... (times = 3)
INFO:__main__:Content loaded!
INFO:__main__:Issuing AgentQL data query...
DEBUG:agentql:Querying data: { page_title post_headers[] }
INFO:agentql:AgentQL query execution may take longer than expected, especially for complex queries and lengthy webpages. If you notice no activity in the logs, please be patient—the query is still in progress and has not frozen. The current timeout is set to 900 seconds, so you can expect a response within that timeframe. If a timeout error occurs, consider extending the timeout duration to give AgentQL backend more time to finish the work.
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): agentql.tinyfish.io:443
DEBUG:urllib3.connectionpool:https://agentql.tinyfish.io:443 "POST /api/v2/query-data HTTP/11" 200 477
INFO:__main__:AgentQL response: {'page_title': 'Infinite Scroll · Full page demo', 'post_headers': ['1a - Infinite Scroll full page demo', '1b - RGB Schemes logo in Computer Arts', '2a - RGB Schemes logo', '2b - Masonry gets horizontalOrder', '2c - Every vector 2016', '3a - Logo Pizza delivered', '3b - Some CodePens', '3c - 365daysofmusic.com', '3d - Holograms', '4a - Huebee: 1-click color picker', '4b - Word is Flickity is good']}
```
1 parent cf63ee0 commit 6a88bd0

3 files changed, +73 -1 lines changed

README.md

Lines changed: 1 addition & 1 deletion
@@ -33,7 +33,7 @@ This list contains basic use case examples that demonstrate the fundamental func
 | Leverage List Query | [list_query_usage](https://github.com/tinyfish-io/fish-tank/tree/main/examples/list_query_usage) |
 | Leverage get_by_prompt method | [get_by_prompt](https://github.com/tinyfish-io/fish-tank/tree/main/examples/get_by_prompt) |
 | Log into Site | [log_into_sites](https://github.com/tinyfish-io/fish-tank/tree/main/examples/log_into_sites) |
-
+| Infinite scrolling content load | [infinite_scroll](https://github.com/tinyfish-io/fish-tank/tree/main/examples/infinite_scroll) |
 
 ## Application Examples
 

examples/infinite_scroll/README.md

Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
# Example script: load additional content on page by scrolling

This example demonstrates how to load additional content on pages that load content based on scroll position.

## Run the script

- [Install AgentQL SDK](https://docs.agentql.com/installation/sdk-installation)
- Save this Python file locally as **infinite_scroll.py**
- Run the following command from the project's folder:

```bash
python3 infinite_scroll.py
```

## Adjust the scrolling method

Dynamically loading content can be tricky to get right, as websites have a lot of ways to customize how this interaction looks on their sites.

Scrolling to the end of a page by pressing the `End` key is not always reliable: a page may have multiple scrollable areas, or may map the `End` key to a different function, such as video playback. Try replacing `key_press_end_scroll(page)` in the example with `mouse_wheel_scroll(page)` and observe how the browser behaves differently, or change the URL passed to `page.goto` to test against your own site.
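
When a page has several scrollable areas, one option is to scroll the specific container that holds the content instead of the window. The following is a minimal sketch and is not part of this commit: `scroll_container_to_bottom` and the `#feed` selector are hypothetical names, and it assumes `page` is a Playwright sync page (or the AgentQL wrapper around one) that exposes `evaluate`.

```python
def scroll_container_to_bottom(page, selector: str) -> bool:
    """Scroll the first element matching `selector` to its bottom.

    Hypothetical helper: `page` is assumed to expose Playwright's
    `evaluate(expression, arg)`. Returns False when nothing matches.
    """
    return page.evaluate(
        """(selector) => {
            const el = document.querySelector(selector);
            if (!el) return false;
            // Setting scrollTop to scrollHeight moves to the end of the element.
            el.scrollTop = el.scrollHeight;
            return true;
        }""",
        selector,
    )
```

In the example script this could stand in for `key_press_end_scroll(page)`, for instance as `scroll_container_to_bottom(page, "#feed")`, where `#feed` is a placeholder for whatever container the target site actually scrolls.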
Lines changed: 53 additions & 0 deletions
@@ -0,0 +1,53 @@
import logging
import random
import time

import agentql
from agentql.ext.playwright.sync_api import Page
from playwright.sync_api import sync_playwright

logging.basicConfig(level=logging.DEBUG)
log = logging.getLogger(__name__)


def key_press_end_scroll(page: Page):
    page.keyboard.press("End")  # Jump to the bottom of the focused scrollable area


def mouse_wheel_scroll(page: Page):
    viewport_height, total_height, scroll_height = page.evaluate(
        "() => [window.innerHeight, document.body.scrollHeight, window.scrollY]"
    )
    while scroll_height < total_height:  # Step down one viewport at a time
        scroll_height = scroll_height + viewport_height
        page.mouse.wheel(delta_x=0, delta_y=viewport_height)
        time.sleep(random.uniform(0.05, 0.1))  # Brief random pause between wheel events


if __name__ == "__main__":
    QUERY = """
    {
        page_title
        post_headers[]
    }
    """
    with sync_playwright() as playwright, playwright.chromium.launch(headless=False) as browser:
        page = agentql.wrap(browser.new_page())

        log.info("Navigating to the page...")

        page.goto("https://infinite-scroll.com/demo/full-page/")
        page.wait_for_page_ready_state()

        num_extra_pages_to_load = 3

        for times in range(num_extra_pages_to_load):
            log.info(f"Scrolling to the bottom of the page... (num_times = {times+1})")
            key_press_end_scroll(page)
            page.wait_for_page_ready_state()  # Wait for the newly loaded content to settle
            log.info("Content loaded!")

        log.info("Issuing AgentQL data query...")
        response = page.query_data(QUERY)

        log.info(f"AgentQL response: {response}")
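
The script above loads a fixed number of extra pages. As an alternative sketch, not part of this commit, scrolling can continue until the document height stops growing. `scroll_until_stable`, `max_rounds`, and `settle_seconds` are hypothetical names. The fixed sleep is a simplifying assumption, and the script's `page.wait_for_page_ready_state()` could take its place.

```python
import time


def scroll_until_stable(page, scroll, max_rounds=10, settle_seconds=1.0) -> int:
    """Call `scroll(page)` until document height stops growing.

    Hypothetical helper: `page` is assumed to expose Playwright's
    `evaluate`, and `scroll` is any function taking the page.
    Returns the number of rounds that loaded new content.
    """
    loaded_rounds = 0
    previous_height = page.evaluate("() => document.body.scrollHeight")
    for _ in range(max_rounds):
        scroll(page)
        time.sleep(settle_seconds)  # Give the site time to append content
        current_height = page.evaluate("() => document.body.scrollHeight")
        if current_height <= previous_height:
            break  # Nothing new was appended, so assume the end was reached
        loaded_rounds += 1
        previous_height = current_height
    return loaded_rounds
```

With the functions from the script, this might be called as `scroll_until_stable(page, key_press_end_scroll)` or `scroll_until_stable(page, mouse_wheel_scroll)`. The `max_rounds` cap keeps a truly endless feed from looping forever.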
