A TypeScript package that combines Selenium WebDriver with AI capabilities for intelligent browser automation.
It lets you describe browser interactions as a sequence of steps and can use an LLM to detect and interact with web elements automatically. Supported LLM providers are OpenAI, Ollama, and Grok.
For a detailed list of changes and versions, see our Changelog.
- Selenium-based browser automation
- AI-powered element detection and interaction
- Support for multiple LLM providers (OpenAI, Ollama, Grok)
- Screenshot capture capability
- Flexible step-based automation configuration
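
Because more than one LLM provider is supported, it can be convenient to keep the provider name and API key out of source code. The following is a minimal sketch, not part of the package: the option names mirror the constructor call in the usage example in this README, while the `AUTOMATION_*` variable names, the `AutomationOptions` interface, and the `optionsFromEnv` helper are hypothetical and exist only for illustration.

```typescript
// Option names copied from the constructor call in this README's usage example.
// The package's real options type may contain more (or different) fields.
interface AutomationOptions {
  llmProvider: string;
  apiKey: string;
  browser: string;
  headless: boolean;
}

// Hypothetical helper: builds the options object from environment-style
// key/value pairs. The AUTOMATION_* names are assumptions, not something
// the package is known to read on its own.
function optionsFromEnv(
  env: Record<string, string | undefined>
): AutomationOptions {
  const apiKey = env["AUTOMATION_API_KEY"];
  if (apiKey === undefined || apiKey === "") {
    throw new Error("AUTOMATION_API_KEY is not set");
  }
  return {
    // 'OpenAI' is the only provider string shown in this README.
    llmProvider: env["AUTOMATION_LLM_PROVIDER"] ?? "OpenAI",
    apiKey,
    browser: env["AUTOMATION_BROWSER"] ?? "chrome",
    // Headless unless explicitly disabled with the string "false".
    headless: env["AUTOMATION_HEADLESS"] !== "false",
  };
}
```

In a Node.js script the result could be passed to the constructor as `new AiBrowserAutomation(optionsFromEnv(process.env))`, assuming the constructor accepts exactly these four fields.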
npm install ai-browser-automation
# or
pnpm add ai-browser-automation
import { AiBrowserAutomation } from 'ai-browser-automation';
const automation = new AiBrowserAutomation({
  llmProvider: 'OpenAI',
  apiKey: 'your-api-key',
  browser: 'chrome',
  headless: true
});
const steps = [
  {
    action: 'navigate',
    description: 'Go to Google',
    url: 'https://google.com'
  },
  {
    action: 'write',
    description: 'Search for something',
    solve_with_ai: true
  }
];
const result = await automation.execute(steps);
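
The steps array above is plain data, so it can be checked before it is handed to `execute`. The sketch below is an illustration under stated assumptions: the `AutomationStep` interface is inferred only from the fields used in the example (`action`, `description`, `url`, `solve_with_ai`), and `validateSteps` is a hypothetical helper, not part of the package. The rule that a `navigate` step needs an `http(s)` URL is likewise an assumption drawn from the example.

```typescript
// Hypothetical step shape, inferred from the fields used in the example above.
// The package's own type may differ.
interface AutomationStep {
  action: string;
  description: string;
  url?: string;
  solve_with_ai?: boolean;
}

// Returns a list of human-readable problems; an empty list means no issue was found.
function validateSteps(steps: AutomationStep[]): string[] {
  const problems: string[] = [];
  steps.forEach((step, index) => {
    if (step.description.trim() === "") {
      problems.push(`step ${index}: description is empty`);
    }
    // Assumption: a 'navigate' step must carry an http(s) URL.
    if (step.action === "navigate") {
      if (step.url === undefined) {
        problems.push(`step ${index}: navigate requires a url`);
      } else if (!/^https?:\/\//.test(step.url)) {
        problems.push(`step ${index}: url must start with http:// or https://`);
      }
    }
  });
  return problems;
}

// The two steps from the example above pass these checks.
const exampleProblems = validateSteps([
  { action: "navigate", description: "Go to Google", url: "https://google.com" },
  { action: "write", description: "Search for something", solve_with_ai: true },
]);
console.log(exampleProblems); // prints []
```

Running a check like this first keeps a malformed step from launching a browser session only to fail partway through.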
For detailed examples with screenshots and execution outputs, see our Examples Documentation.
You can find example scripts in the examples directory. To run a specific example:
pnpm run-example google