A browser-agnostic browser automation framework inspired by Ansible
browser-playbook is a declarative browser automation framework that lets you define automation tasks in simple YAML playbooks, just like Ansible. Instead of writing repetitive boilerplate code for Selenium or Playwright, you describe what you want to automate, not how to code it.
- ✅ Write Less Code: Define automation tasks in YAML instead of hundreds of lines of Python
- ✅ Browser Agnostic: Switch between Selenium, Playwright, and Puppeteer without changing your playbooks
- ✅ Declarative: Focus on what you want to achieve, not implementation details
- ✅ Reusable: Create playbook templates that work across different projects
- ✅ Maintainable: YAML playbooks are easier to read and modify than traditional automation code
Traditional Selenium Code:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
driver = webdriver.Chrome()
driver.get("https://example.com")
search_box = driver.find_element(By.CSS_SELECTOR, "input[name='q']")
search_box.click()
search_box.send_keys("browser automation")
# ... more boilerplate code

With browser-playbook:

tasks:
  - name: Navigate to website
    action: browser.goto
    url: https://example.com
  - name: Find search box
    action: dom.get_element
    selector: "input[name='q']"
    output: search_box
  - name: Click search box
    action: $search_box.click
  - name: Type search query
    action: keyboard.type
    text: "browser automation"

Designed to work with multiple browser automation engines. Write your playbooks once and run them with any supported engine:
- Selenium (fully supported)
- Playwright (coming soon)
- Puppeteer (coming soon)
Define automation tasks using intuitive YAML syntax, inspired by Ansible. No need to write repetitive browser automation code.
Built-in support for iterations and conditional execution:
- Loop over lists of elements with `map`
- Execute tasks conditionally with `when` clauses
- Nested task execution for complex workflows
Reusable task system with pre-built actions for:
- Browser navigation
- DOM manipulation
- Keyboard and mouse interactions
- Element queries and data extraction
- Wait operations
Powerful variable management system that maintains state across tasks:
- Store element references
- Pass data between tasks
- Use variables in conditions and loops
Pluggable browser automation strategies make it easy to:
- Add support for new engines
- Switch engines without changing playbooks
- Extend functionality with custom tasks
git clone https://github.com/Axel77g/scrapping-playbook-framework.git
cd scrapping-playbook-framework
pip install -r requirements.txt  # if requirements.txt exists

Requirements:
- Python 3.x
- Selenium WebDriver (for Selenium engine)
Create a file named my_automation.yaml:
tasks:
  - name: Navigate to website
    action: browser.goto
    url: https://example.com
    output: page
  - name: Find search box
    action: dom.get_element
    selector: "input[name='q']"
    output: search_box
  - name: Click search box
    action: $search_box.click
  - name: Type search query
    action: keyboard.type
    text: "browser automation"

Then run the playbook from Python:

from scrapping_playbook_framework.worker import Worker, WorkerEngine
from scrapping_playbook_framework.playbook_reader import from_yaml_file

# Load playbook
playbook = from_yaml_file("my_automation.yaml")
# Create worker with desired engine
worker = Worker(playbook, WorkerEngine.SELENIUM)
# Execute playbook
results = worker.start()
print(results)

The framework is built on a clean, modular architecture:
The Worker is the main orchestrator that executes playbooks. It:
- Loads and validates playbooks
- Manages the execution context
- Coordinates task execution
- Handles loops and conditions
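As a rough illustration of the responsibilities listed above, the following sketch walks a task list, skips tasks whose `when` conditions fail, and expands `map` loops. It is not the framework's actual Worker: `run_tasks`, `conditions_hold`, the `handlers` dictionary, and the `log` action are hypothetical names, and only the `is_defined` operator is handled.

```python
from typing import Any, Callable

# Hypothetical handler signature: receives the task dict and the variables.
Handler = Callable[[dict, dict], Any]


def conditions_hold(task: dict, variables: dict) -> bool:
    """Return True when every `when` entry passes (only `is_defined` here)."""
    for cond in task.get("when", []):
        if "is_defined" in cond:
            if (cond["variable"] in variables) != cond["is_defined"]:
                return False
    return True


def run_tasks(tasks: list, variables: dict, handlers: dict) -> list:
    """Execute tasks in order and return the names of the tasks that ran."""
    executed = []
    for task in tasks:
        if not conditions_hold(task, variables):
            continue
        executed.append(task["name"])
        if "map" in task:
            # Loop: run the nested tasks once per item, in a copied scope.
            for item in variables[task["map"]]:
                scope = dict(variables)
                scope[task.get("item_name", "item")] = item
                executed += run_tasks(task.get("tasks", []), scope, handlers)
            continue
        result = handlers[task["action"]](task, variables)
        if "output" in task:
            variables[task["output"]] = result
    return executed


# Stand-in handlers so the sketch runs without a browser.
# "log" is a made-up action used only for this illustration.
handlers = {
    "dom.get_elements": lambda task, variables: ["card-1", "card-2"],
    "log": lambda task, variables: print("visiting", variables["product"]),
}

playbook = [
    {"name": "Get all products", "action": "dom.get_elements",
     "selector": ".product-card", "output": "products"},
    {"name": "Close popup if exists", "action": "log",
     "when": [{"variable": "popup", "is_defined": True}]},
    {"name": "Process all products", "map": "products", "item_name": "product",
     "tasks": [{"name": "Visit product", "action": "log"}]},
]

executed = run_tasks(playbook, {}, handlers)
```

The popup task is skipped because `popup` was never stored, and the nested task runs once per product in its own copied scope.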
The playbook reader parses YAML playbooks into structured task definitions, using Pydantic models for validation.
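To make the validation step concrete, here is a minimal sketch of turning one already-parsed task mapping into a typed object. It uses standard-library dataclasses instead of Pydantic so that it runs without dependencies, and `TaskDefinition` and `parse_task` are hypothetical names. The field names follow the task attributes documented in this README, but which attributes are mandatory is an assumption: `action` is left optional for `map` tasks because the loop example defines one without it. The framework's real models may differ.

```python
from dataclasses import dataclass, field
from typing import Optional

KNOWN_FIELDS = {"name", "action", "output", "when", "map", "tasks", "item_name"}


@dataclass
class TaskDefinition:
    name: str
    action: Optional[str] = None
    output: Optional[str] = None
    when: list = field(default_factory=list)
    map: Optional[str] = None
    tasks: list = field(default_factory=list)
    item_name: str = "item"
    params: dict = field(default_factory=dict)  # action-specific attributes


def parse_task(raw: dict) -> TaskDefinition:
    """Validate one task mapping and split off action-specific attributes."""
    if not isinstance(raw.get("name"), str):
        raise ValueError("every task needs a string 'name'")
    # Assumption: a task must either perform an action or loop with `map`.
    if "action" not in raw and "map" not in raw:
        raise ValueError(f"task {raw['name']!r} needs an 'action' or a 'map'")
    known = {k: v for k, v in raw.items() if k in KNOWN_FIELDS}
    known["tasks"] = [parse_task(t) for t in raw.get("tasks", [])]
    params = {k: v for k, v in raw.items() if k not in KNOWN_FIELDS}
    return TaskDefinition(**known, params=params)


task = parse_task({
    "name": "Find search box",
    "action": "dom.get_element",
    "selector": "input[name='q']",
    "output": "search_box",
})
```

Anything outside the common attributes, such as `selector`, ends up in `params`, which mirrors the note that additional attributes depend on the action.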
Tasks are the individual actions that can be performed:
- BrowserTask: Navigation operations (`browser.goto`)
- DOMTask: Element queries (`dom.get_element`, `dom.get_elements`)
- KeyboardTask: Text input (`keyboard.type`, `keyboard.press`)
- ClickTask: Mouse operations (`mouse.click`)
- WaitTask: Delays and waits (`wait`)
Strategies are the browser-specific implementations of the task system:
- SeleniumWorkerStrategy: Selenium WebDriver implementation
- PlaywrightWorkerStrategy: Playwright implementation (planned)
- PuppeteerWorkerStrategy: Puppeteer implementation (planned)
Each strategy provides a get_available_tasks() method that returns the engine-specific task implementations.
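The strategy idea can be sketched as follows. This is an illustration under assumptions, not the framework's code: the base class name `WorkerStrategy`, the `FakeStrategy` stand-in, the `run_action` helper, and the choice to return a dictionary from `get_available_tasks()` are all hypothetical. The README only states that the method returns the engine-specific task implementations.

```python
from abc import ABC, abstractmethod


class WorkerStrategy(ABC):
    """Hypothetical base class: one subclass per browser engine."""

    @abstractmethod
    def get_available_tasks(self) -> dict:
        """Map action names (e.g. 'browser.goto') to engine-specific callables."""


class FakeStrategy(WorkerStrategy):
    """In-memory stand-in used here instead of a real browser engine."""

    def __init__(self) -> None:
        self.visited = []

    def goto(self, url: str) -> str:
        self.visited.append(url)
        return url

    def get_available_tasks(self) -> dict:
        return {"browser.goto": self.goto}


def run_action(strategy: WorkerStrategy, action: str, **params):
    """Look up the action in the strategy's task table and call it."""
    tasks = strategy.get_available_tasks()
    if action not in tasks:
        name = type(strategy).__name__
        raise KeyError(f"action {action!r} is not supported by {name}")
    return tasks[action](**params)


strategy = FakeStrategy()
result = run_action(strategy, "browser.goto", url="https://example.com")
```

Because the caller only sees action names, swapping `FakeStrategy` for another engine's strategy would not require changing the playbook.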
The context manages variables and state throughout playbook execution:
- Store and retrieve variables
- Pass data between tasks
- Maintain isolated contexts for loops
- Variable injection and cloning for sub-tasks
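A minimal sketch of such a variable store is shown below, assuming a shallow copy is enough to isolate a loop iteration. The class name `Context` and its methods `set`, `get`, `is_defined`, and `clone` are hypothetical and may not match the framework's real interface.

```python
from typing import Any, Optional


class Context:
    """Hypothetical variable store; not the framework's actual class."""

    def __init__(self, variables: Optional[dict] = None) -> None:
        self._variables = dict(variables or {})

    def set(self, name: str, value: Any) -> None:
        self._variables[name] = value

    def get(self, name: str) -> Any:
        if name not in self._variables:
            raise KeyError(f"undefined variable: {name}")
        return self._variables[name]

    def is_defined(self, name: str) -> bool:
        return name in self._variables

    def clone(self, **injected: Any) -> "Context":
        """Shallow-copy the variables and inject extra ones for a sub-task."""
        child = Context(self._variables)
        for name, value in injected.items():
            child.set(name, value)
        return child


parent = Context()
parent.set("products", ["card-1", "card-2"])

seen = []
for product in parent.get("products"):
    loop_ctx = parent.clone(product=product)
    loop_ctx.set("title", product.upper())  # stays local to this iteration
    seen.append(loop_ctx.get("title"))
```

The copy is shallow on purpose: stored element references are shared with the child, while names set inside the loop do not leak back into the parent.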
| Engine | Status | Description |
|---|---|---|
| ✅ Selenium | Implemented | Full support for Selenium WebDriver |
| 🚧 Playwright | Planned | Microsoft's modern browser automation tool |
| 🚧 Puppeteer | Planned | Google's headless Chrome automation library |
Every task in a playbook can have these attributes:
| Attribute | Required | Description |
|---|---|---|
| `name` | ✅ | Human-readable task name |
| `action` | ✅ | The action to perform (e.g., `browser.goto`, `dom.get_element`) |
| `output` | ❌ | Variable name to store the result |
| `when` | ❌ | List of conditions for conditional execution |
| `map` | ❌ | Variable name of list to iterate over |
| `tasks` | ❌ | Nested tasks for loops |
| `item_name` | ❌ | Variable name for each item in a loop (default: `item`) |
Additional attributes depend on the specific action being performed.
The when attribute accepts a list of conditions:
- name: Close popup if exists
  action: $popup.click
  when:
    - variable: popup
      is_defined: true

Available condition operators:
- `is_defined: true/false` - Check if variable exists
- `equals: value` - Check equality
- `not_equals: value` - Check inequality
- `greater_than: number` - Numeric comparison
- `less_than: number` - Numeric comparison
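One way the operators above could be evaluated is sketched here. The function names `check_condition` and `should_run` are hypothetical, and two behaviors are assumptions rather than documented facts: all conditions in a `when` list must pass (logical AND), and a comparison against an undefined variable fails.

```python
_MISSING = object()  # sentinel for "variable not defined"

# Each operator compares the variable's current value with the expected one.
OPERATORS = {
    "equals": lambda actual, expected: actual == expected,
    "not_equals": lambda actual, expected: actual != expected,
    "greater_than": lambda actual, expected: actual > expected,
    "less_than": lambda actual, expected: actual < expected,
}


def check_condition(condition: dict, variables: dict) -> bool:
    """Evaluate every operator present in a single condition entry."""
    actual = variables.get(condition["variable"], _MISSING)
    for operator, expected in condition.items():
        if operator == "variable":
            continue
        if operator == "is_defined":
            passed = (actual is not _MISSING) == expected
        elif operator not in OPERATORS:
            raise ValueError(f"unknown operator: {operator}")
        elif actual is _MISSING:
            passed = False  # assumption: comparisons on undefined variables fail
        else:
            passed = OPERATORS[operator](actual, expected)
        if not passed:
            return False
    return True


def should_run(when: list, variables: dict) -> bool:
    """Assumption: all conditions in the list must pass (logical AND)."""
    return all(check_condition(c, variables) for c in when)


variables = {"count": 3}
popup_defined = should_run([{"variable": "popup", "is_defined": True}], variables)
count_ok = should_run(
    [{"variable": "count", "greater_than": 2},
     {"variable": "count", "not_equals": 5}],
    variables,
)
```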
Use map to iterate over lists:
- name: Process all products
  map: products
  item_name: product
  tasks:
    - name: Get product title
      action: $product.get_element
      selector: ".title"
      output: title

Reference variables using the `$` prefix:
- `$variable_name` - Use variable value
- `$element.method` - Call method on variable (e.g., `$element.click`)
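The two reference forms can be resolved with a few lines of string handling. The sketch below is an assumption about how such a lookup might work, not the framework's implementation. `resolve` and `FakeElement` are hypothetical names, and `FakeElement` stands in for a real element reference.

```python
from typing import Any


class FakeElement:
    """Stand-in for a stored element reference."""

    def __init__(self, text: str) -> None:
        self.text = text
        self.clicks = 0

    def click(self) -> None:
        self.clicks += 1

    def get_text(self) -> str:
        return self.text


def resolve(reference: str, variables: dict) -> Any:
    """Resolve '$name' to a value, or '$name.method' to a bound method."""
    if not reference.startswith("$"):
        return reference  # plain literal, not a variable reference
    name, _, method = reference[1:].partition(".")
    if name not in variables:
        raise KeyError(f"undefined variable: {name}")
    value = variables[name]
    return getattr(value, method) if method else value


variables = {"search_box": FakeElement("Search"), "query": "browser automation"}

resolve("$search_box.click", variables)()  # call the bound method
text = resolve("$search_box.get_text", variables)()
query = resolve("$query", variables)
```

Returning the bound method rather than calling it immediately leaves the caller free to pass action-specific arguments such as `selector` or `attribute_name`.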
`browser.goto` - Navigate to a URL:
- name: Navigate to page
  action: browser.goto
  url: https://example.com
  output: page

`dom.get_element` - Find a single element:
- name: Find search box
  action: dom.get_element
  selector: "input[name='q']"
  output: search_box

`dom.get_elements` - Find multiple elements:
- name: Get all products
  action: dom.get_elements
  selector: ".product-card"
  output: products

`keyboard.type` - Type text:
- name: Type search query
  action: keyboard.type
  text: "browser automation"

`keyboard.press` - Press a key:
- name: Press Enter
  action: keyboard.press
  key: "Enter"

`mouse.click` - Click at coordinates:
- name: Click at position
  action: mouse.click
  x: 100
  y: 200

`wait` - Wait for a duration:
- name: Wait for page load
  action: wait
  duration: 2
Call methods on stored element references:
`$element.click` - Click an element

`$element.get_text` - Extract text content

`$element.get_attribute` - Get element attribute:
- name: Get element attribute
  action: $element.get_attribute
  attribute_name: "href"
  output: link_url

`$element.get_element` - Find child element:
- name: Find child element
  action: $product.get_element
  selector: ".price"
  output: price
Check out the examples/ directory for complete playbook examples:
- simple_navigation.yaml - Basic browser navigation
- form_filling.yaml - Form interaction and submission
- loop_scraping.yaml - Processing multiple elements with loops
- conditional_tasks.yaml - Using conditions to handle dynamic content
See examples/README.md for detailed explanations.
We welcome contributions! You can:
- 🐛 Report bugs
- 💡 Suggest features
- 📝 Improve documentation
- 🔧 Add support for new browser engines
Please read CONTRIBUTING.md for guidelines on how to contribute.
This project is licensed under the MIT License - see the LICENSE file for details.
- Playwright Support - Add full Playwright engine implementation
- Puppeteer Support - Add full Puppeteer engine implementation
- More Built-in Tasks - Screenshot capture, file downloads, cookie management
- Better Error Handling - Detailed error messages and recovery strategies
- Documentation Website - Comprehensive docs with tutorials and API reference
- Retry Mechanisms - Automatic retry on failures
- Parallel Execution - Run multiple tasks concurrently
- Plugin System - Easy custom task registration
- Visual Testing - Screenshot comparison and visual regression testing
- Package distribution via PyPI
- IDE extensions for playbook authoring
- Cloud-based execution platform
- Playbook marketplace
Made with ❤️ by the open source community