Commit 67e13f7

Use browserbase in github action eval (#84)
* add simple google search eval
* add 2 more evals
* make sure extract continues to use the same model on repeated call
* add twitter sign up eval case
* update eval
* add basic bananalyzer eval system
* add server
* update package jsons
* clean up the files
* clean up cleanup add gitignore cleanup
* fix the bananalyzer eval system + add it to the main eval script
* remove all public files on server exit
* fix the package.json playwright issue
* clean up logs
* remove .vscode
* cleanup
* move the test evals to the playground script
* cleanup
* cleanup
* add server/public to gitignore
* test -> playground (much better name)
* fix the resource deletion issue
* update readme + cleanup
* cleanup of readme
* remove the changes in the lib folder
* cleanup readme
* cleanup
* cleanup
* update readme
* use browserbase browser in github action eval
* force eval env to browserbase for github action
* Set peeler env to local
1 parent 46c379b commit 67e13f7

2 files changed: +19 -8 lines changed

.github/workflows/ci.yml

Lines changed: 2 additions & 0 deletions
@@ -13,7 +13,9 @@ jobs:
       OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
       ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
       BRAINTRUST_API_KEY: ${{ secrets.BRAINTRUST_API_KEY }}
+      BROWSERBASE_API_KEY: ${{ secrets.BROWSERBASE_API_KEY }}
       HEADLESS: true
+      EVAL_ENV: browserbase
 
     steps:
       - name: Check out repository code
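
The workflow change above exports `EVAL_ENV: browserbase`, and the eval script in this commit maps that value, case-insensitively, to the Stagehand `env` option, falling back to `"LOCAL"` for anything else. A minimal sketch of that mapping as a standalone function follows; `resolveEvalEnv` and `EvalEnv` are hypothetical names introduced here for illustration (the commit inlines the expression as a top-level `const env`).

```typescript
// Hypothetical helper: the commit itself inlines this logic as `const env = ...`.
type EvalEnv = "BROWSERBASE" | "LOCAL";

function resolveEvalEnv(raw: string | undefined): EvalEnv {
  // Case-insensitive match on "browserbase"; an unset or unrecognized value
  // falls back to "LOCAL", matching the ternary in the diff.
  return raw?.toLowerCase() === "browserbase" ? "BROWSERBASE" : "LOCAL";
}

// Example inputs and what they resolve to.
const fromCi: EvalEnv = resolveEvalEnv("browserbase"); // "BROWSERBASE"
const fromUnset: EvalEnv = resolveEvalEnv(undefined); // "LOCAL"
```

In the eval script this would presumably be called as `resolveEvalEnv(process.env.EVAL_ENV)`. Note that the match is exact after lowercasing, so a value with stray whitespace would silently resolve to `"LOCAL"`.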

evals/index.eval.ts

Lines changed: 17 additions & 8 deletions
@@ -5,9 +5,14 @@ import { evaluateExample, chosenBananalyzerEvals } from "./bananalyzer-ts";
 import { createExpressServer } from "./bananalyzer-ts/server/expressServer";
 import process from "process";
 
+const env =
+  process.env.EVAL_ENV?.toLowerCase() === "browserbase"
+    ? "BROWSERBASE"
+    : "LOCAL";
+
 const vanta = async () => {
   const stagehand = new Stagehand({
-    env: "LOCAL",
+    env,
     headless: process.env.HEADLESS !== "false",
   });
   await stagehand.init();
@@ -17,7 +22,10 @@ const vanta = async () => {
 
   const observation = await stagehand.observe("find the request demo button");
 
-  if (!observation) return false;
+  if (!observation) {
+    await stagehand.context.close();
+    return false;
+  }
 
   const observationResult = await stagehand.page
     .locator(stagehand.observations[observation].result)
@@ -38,7 +46,7 @@ const vanta = async () => {
 
 const vanta_h = async () => {
   const stagehand = new Stagehand({
-    env: "LOCAL",
+    env,
     headless: process.env.HEADLESS !== "false",
   });
   await stagehand.init();
@@ -56,7 +64,7 @@ const vanta_h = async () => {
 
 const simple_google_search = async () => {
   const stagehand = new Stagehand({
-    env: "LOCAL",
+    env,
     headless: process.env.HEADLESS !== "false",
   });
   await stagehand.init();
@@ -69,6 +77,7 @@ const simple_google_search = async () => {
 
   const expectedUrl = "https://www.google.com/search?q=OpenAI";
   const currentUrl = await stagehand.page.url();
+
   await stagehand.context.close();
 
   return currentUrl.startsWith(expectedUrl);
@@ -97,7 +106,7 @@ const peeler_simple = async () => {
 
 const peeler_complex = async () => {
   const stagehand = new Stagehand({
-    env: "LOCAL",
+    env,
     verbose: 1,
     headless: process.env.HEADLESS !== "false",
   });
@@ -202,7 +211,7 @@ const extract_last_twenty_github_commits = async () => {
 
 const wikipedia = async () => {
   const stagehand = new Stagehand({
-    env: "LOCAL",
+    env,
     verbose: 2,
     headless: process.env.HEADLESS !== "false",
   });
@@ -222,7 +231,7 @@ const wikipedia = async () => {
 
 const costar = async () => {
   const stagehand = new Stagehand({
-    env: "LOCAL",
+    env,
     verbose: 2,
     debugDom: true,
     headless: process.env.HEADLESS !== "false",
@@ -270,7 +279,7 @@ const costar = async () => {
 
 const google_jobs = async () => {
   const stagehand = new Stagehand({
-    env: "LOCAL",
+    env,
     verbose: 2,
     debugDom: true,
     headless: process.env.HEADLESS !== "false",
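
In `vanta`, this commit adds `await stagehand.context.close()` before the early `return false`, so the browser context is released on that exit path too. One possible alternative, sketched below, is to wrap the eval body in `try`/`finally` so every exit path closes the context without repeating the call. `Closable` and `withCleanup` are hypothetical names introduced for illustration and are not part of Stagehand or of this commit.

```typescript
// Hypothetical minimal shape of the resource; not Stagehand's actual type.
interface Closable {
  close(): Promise<void>;
}

// Runs `task`, then closes `resource` whether the task returns early,
// resolves normally, or throws.
async function withCleanup<R extends Closable, T>(
  resource: R,
  task: (resource: R) => Promise<T>,
): Promise<T> {
  try {
    return await task(resource);
  } finally {
    await resource.close();
  }
}

// Usage with a stub standing in for a browser context.
let closeCalls = 0;
const stubContext: Closable = {
  close: async () => {
    closeCalls += 1;
  },
};

// The early `return false` no longer needs its own close() call.
const earlyExit: Promise<boolean> = withCleanup(stubContext, async () => {
  const observation: string | null = null; // pretend nothing was observed
  if (!observation) return false;
  return true;
});
```

The trade-off is one extra level of nesting per eval in exchange for not having to remember the close call on each new return path.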
