An AI-powered visual testing tool for web apps. Users describe tests in plain English, Playwright runs the actions and captures screenshots, and Gemini evaluates whether the flow worked correctly.
- Frontend: Next.js, TypeScript
- Backend: Python, FastAPI
- Browser Automation: Playwright (Python)
- Real-Time Streaming: Socket.io
- AI/LLM: Google Gemini Vision API
This project was built at Hack Western 2025.
- User inputs a URL and a test prompt.
- Backend starts a Playwright browser instance and navigates to the URL.
- Backend starts an agent loop that:
  a. Captures a screenshot of the current page.
  b. Sends the screenshot, URL, and test prompt to the Gemini Vision API.
  c. Receives an action from Gemini (e.g., "click", "type", "scroll") along with its arguments.
  d. Executes the action using Playwright.
  e. Sends the new screenshot and action details to the frontend.
- This loop continues until Gemini decides the test is complete or a timeout is reached.
- Frontend displays the test session in a split-view with the live browser and action history.
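The agent loop described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the names `Action`, `parse_action`, and `run_agent_loop`, the JSON reply shape, the `"done"` action, and the `max_steps` cap (standing in for the timeout) are all assumptions. The Playwright, Gemini, and Socket.io calls are passed in as plain callables so the control flow can be read and run on its own.

```python
from __future__ import annotations

import json
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Action:
    name: str  # e.g. "click", "type", "scroll", or the assumed "done"
    args: dict = field(default_factory=dict)


def parse_action(reply: str) -> Action:
    """Parse a model reply, assumed to be JSON like
    {"action": "click", "args": {"selector": "#login"}}."""
    data = json.loads(reply)
    return Action(name=data["action"], args=data.get("args", {}))


def run_agent_loop(
    capture: Callable[[], bytes],              # would wrap a Playwright screenshot
    decide: Callable[[bytes, str, str], str],  # would wrap the Gemini request
    execute: Callable[[Action], None],         # would map the action to Playwright
    report: Callable[[bytes, Action], None],   # would emit to the frontend
    url: str,
    prompt: str,
    max_steps: int = 20,                       # assumed stand-in for the timeout
) -> list[Action]:
    history: list[Action] = []
    for _ in range(max_steps):
        screenshot = capture()
        action = parse_action(decide(screenshot, url, prompt))
        if action.name == "done":  # the model judged the test complete
            break
        execute(action)
        report(capture(), action)  # send the post-action screenshot
        history.append(action)
    return history


if __name__ == "__main__":
    # Stubbed dependencies: no browser, model, or socket is involved here.
    canned = iter([
        '{"action": "click", "args": {"selector": "#login"}}',
        '{"action": "type", "args": {"selector": "#user", "text": "demo"}}',
        '{"action": "done"}',
    ])
    steps = run_agent_loop(
        capture=lambda: b"fake-png",
        decide=lambda shot, url, prompt: next(canned),
        execute=lambda action: None,
        report=lambda shot, action: None,
        url="https://example.com",
        prompt="Log in as the demo user",
    )
    print([step.name for step in steps])
```

Injecting the four callables keeps the loop independent of any particular browser or model client, which also makes it easy to exercise with canned replies as in the stub at the bottom.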
- Run `npm install` in the `frontend` directory.
- Run `pip install -r requirements.txt` in the `backend` directory.
- Run `npm run dev` in the `frontend` directory.
- Run `uvicorn app.main:socket_app --reload --port 8000` in the `backend` directory.