Setup - Usage - Demo - Ecosystem - AgentLab - Contributors - Paper - Citation
pip install browsergym
Warning
BrowserGym is meant to provide an open, easy-to-use and extensible framework to accelerate the field of web agent research. It is not meant to be a consumer product. Use with caution!
Tip
Check out AgentLab! A seamless framework to implement, test, and evaluate your web agents on all BrowserGym benchmarks.
Video (4x4.grid.mp4): Example of a GPT4-V agent executing open-ended tasks (top row, interactive chat), as well as WebArena and WorkArena tasks (bottom row).
BrowserGym includes the following benchmarks by default:
- MiniWoB
- WebArena
- VisualWebArena
- WorkArena
- AssistantBench
- WebLINX (static benchmark)
 
Designing new web benchmarks with BrowserGym is easy: simply inherit from the AbstractBrowserTask class.
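As a rough illustration of what inheriting from AbstractBrowserTask could look like, here is a minimal sketch. It assumes the class is importable from `browsergym.core.task` and that a task implements `get_task_id`, `setup`, `validate` and `teardown` with the signatures shown; check these against your installed version. `ReachUrlTask`, its parameters and its URLs are hypothetical, and the fallback base class is only a stand-in so the sketch can be read and run without browsergym installed.

```python
try:
    # Assumed import path; verify against your installed browsergym version.
    from browsergym.core.task import AbstractBrowserTask
except ImportError:
    # Stand-in base class so this sketch still runs without browsergym.
    class AbstractBrowserTask:
        def __init__(self, seed: int) -> None:
            self.seed = seed


class ReachUrlTask(AbstractBrowserTask):
    """Hypothetical task: succeed once the browser is on a target URL."""

    def __init__(
        self,
        seed: int,
        start_url: str = "https://example.com/",
        target_substring: str = "iana.org",
    ) -> None:
        super().__init__(seed)
        self.start_url = start_url
        self.target_substring = target_substring

    @classmethod
    def get_task_id(cls) -> str:
        # Identifier under which the task would be registered (assumed hook).
        return "reach-url"

    def setup(self, page):
        # Open the starting page and return (goal, info) for the agent.
        page.goto(self.start_url)
        goal = f"Navigate to a page whose URL contains '{self.target_substring}'."
        return goal, {}

    def teardown(self) -> None:
        # Nothing to clean up in this sketch.
        pass

    def validate(self, page, chat_messages):
        # Return (reward, done, message, info) based on the current URL.
        done = self.target_substring in page.url
        reward = 1.0 if done else 0.0
        return reward, done, "", {}
```

Here `page` is expected to behave like a Playwright page (only `goto` and `url` are used), which keeps the validation logic easy to exercise with a small fake object before wiring the task into an environment.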
To use browsergym, install one of the following packages:
pip install browsergym  # (recommended) everything below
pip install browsergym-experiments  # experiment utilities (agent, loop, benchmarks) + everything below
pip install browsergym-core  # core functionalities only (no benchmark, just the openended task)
pip install browsergym-miniwob  # core + miniwob
pip install browsergym-webarena  # core + webarena
pip install browsergym-visualwebarena  # core + visualwebarena
pip install browsergym-workarena  # core + workarena
pip install browsergym-assistantbench  # core + assistantbench
pip install weblinx-browsergym  # core + weblinx
Then set up playwright by running
playwright install chromium
Finally, each benchmark comes with its own specific setup that requires additional steps.
- for MiniWoB++, see miniwob/README.md
- for WebArena, see webarena/README.md
- for VisualWebArena, see visualwebarena/README.md
- for WorkArena, see WorkArena
- for AssistantBench, see assistantbench/README.md
 
To install browsergym locally for development, use the following commands:
git clone git@github.com:ServiceNow/BrowserGym.git
cd BrowserGym
make install
Contributions are welcome!
Boilerplate code to run an agent on an interactive, open-ended task:
import gymnasium as gym
import browsergym.core  # register the openended task as a gym environment
# start an openended environment
env = gym.make(
    "browsergym/openended",
    task_kwargs={"start_url": "https://www.google.com/"},  # starting URL
    wait_for_user_message=True,  # wait for a user message after each agent message sent to the chat
)
# run the environment <> agent loop until termination
obs, info = env.reset()
while True:
    action = ...  # implement your agent here
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        break
# release the environment
env.close()
MiniWoB
import gymnasium as gym
import browsergym.miniwob  # register miniwob tasks as gym environments
# start a miniwob task
env = gym.make("browsergym/miniwob.choose-list")
...
# list all the available miniwob tasks
env_ids = [id for id in gym.envs.registry.keys() if id.startswith("browsergym/miniwob")]
print("\n".join(env_ids))
WorkArena
import gymnasium as gym
import browsergym.workarena  # register workarena tasks as gym environments
# start a workarena task
env = gym.make("browsergym/workarena.servicenow.order-ipad-pro")
...
# list all the available workarena tasks
env_ids = [id for id in gym.envs.registry.keys() if id.startswith("browsergym/workarena")]
print("\n".join(env_ids))
WebArena
import gymnasium as gym
import browsergym.webarena  # register webarena tasks as gym environments
# start a webarena task
env = gym.make("browsergym/webarena.310")
...
# list all the available webarena tasks
env_ids = [id for id in gym.envs.registry.keys() if id.startswith("browsergym/webarena")]
print("\n".join(env_ids))
VisualWebArena
import gymnasium as gym
import browsergym.visualwebarena  # register visualwebarena tasks as gym environments
# start a visualwebarena task
env = gym.make("browsergym/visualwebarena.721")
...
# list all the available visualwebarena tasks
env_ids = [id for id in gym.envs.registry.keys() if id.startswith("browsergym/visualwebarena")]
print("\n".join(env_ids))
AssistantBench
import gymnasium as gym
import browsergym.assistantbench  # register assistantbench tasks as gym environments
# start an assistantbench task
env = gym.make("browsergym/assistantbench.validation.3")
...
# list all the available assistantbench tasks
env_ids = [id for id in gym.envs.registry.keys() if id.startswith("browsergym/assistantbench")]
print("\n".join(env_ids))
If you want to experiment with a demo agent in BrowserGym, follow these steps:
# conda setup
conda env create -f demo_agent/environment.yml
conda activate demo_agent
# or pip setup
pip install -r demo_agent/requirements.txt
# then download the browser for playwright
playwright install chromium
Our demo agent uses OpenAI as a backend, so be sure to set your OPENAI_API_KEY.
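A missing key usually only shows up once the agent makes its first API call, so it can help to check for it up front. The snippet below is a small sketch of such a check; `require_api_key` is a hypothetical helper and not part of BrowserGym or the demo agent.

```python
import os
from typing import Mapping


def require_api_key(
    env: Mapping[str, str] = os.environ,
    name: str = "OPENAI_API_KEY",
) -> str:
    """Return the value stored under `name`, or raise if it is unset or blank."""
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before launching the demo agent.")
    return value


if __name__ == "__main__":
    try:
        require_api_key()
        print("OPENAI_API_KEY found.")
    except RuntimeError as exc:
        print(exc)
```

Passing the environment as a mapping keeps the helper easy to test with a plain dictionary instead of the real process environment.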
Launch the demo agent as follows
# openended (interactive chat mode)
python demo_agent/run_demo.py --task_name openended --start_url https://www.google.com
# miniwob
python demo_agent/run_demo.py --task_name miniwob.click-test
# workarena
python demo_agent/run_demo.py --task_name workarena.servicenow.order-standard-laptop
# webarena
python demo_agent/run_demo.py --task_name webarena.4
# visualwebarena
python demo_agent/run_demo.py --task_name visualwebarena.398
You can customize your experience by changing the model_name to your preferred LLM (it uses gpt-4o-mini by default), adding screenshots for your VLMs with use_screenshot, and much more!
python demo_agent/run_demo.py --help
- AgentLab: Seamlessly run agents on benchmarks, collect and analyse traces.
- WorkArena(++): A benchmark for web agents on the ServiceNow platform.
- WebArena: A benchmark of realistic web tasks on self-hosted domains.
- VisualWebArena: A benchmark of realistic visual web tasks on self-hosted domains.
- MiniWoB(++): A collection of over 100 web tasks on synthetic web pages.
- WebLINX: A dataset of real-world web interaction traces.
- AssistantBench: A benchmark of realistic and time-consuming tasks on the open web.
- DoomArena: A framework for AI agent security testing which supports injecting attacks into web pages from BrowserGym environments.
 
Please use the following two BibTeX entries if you wish to cite BrowserGym:
@article{
    chezelles2025browsergym,
    title={The BrowserGym Ecosystem for Web Agent Research},
    author={Thibault Le Sellier de Chezelles and Maxime Gasse and Alexandre Lacoste and Massimo Caccia and Alexandre Drouin and L{\'e}o Boisvert and Megh Thakkar and Tom Marty and Rim Assouel and Sahar Omidi Shayegan and Lawrence Keunho Jang and Xing Han L{\`u} and Ori Yoran and Dehan Kong and Frank F. Xu and Siva Reddy and Graham Neubig and Quentin Cappart and Russ Salakhutdinov and Nicolas Chapados},
    journal={Transactions on Machine Learning Research},
    issn={2835-8856},
    year={2025},
    url={https://openreview.net/forum?id=5298fKGmv3},
    note={Expert Certification}
}
@inproceedings{workarena2024,
    title = {{W}ork{A}rena: How Capable are Web Agents at Solving Common Knowledge Work Tasks?},
    author = {Drouin, Alexandre and Gasse, Maxime and Caccia, Massimo and Laradji, Issam H. and Del Verme, Manuel and Marty, Tom and Vazquez, David and Chapados, Nicolas and Lacoste, Alexandre},
    booktitle = {Proceedings of the 41st International Conference on Machine Learning},
    pages = {11642--11662},
    year = {2024},
    editor = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
    volume = {235},
    series = {Proceedings of Machine Learning Research},
    month = {21--27 Jul},
    publisher = {PMLR},
    url = {https://proceedings.mlr.press/v235/drouin24a.html},
}
Here is an example of how they can be used:
We use the BrowserGym framework for our experiments \cite{workarena2024,chezelles2025browsergym}.