Commit f68b6f3

yuyutaotao and zhoushaw authored
feat(chrome-devtool): allow tracking active tab in bridge mode (#282)
--------- Co-authored-by: zhouxiao.shaw <zhouxiao.shaw@bytedance.com>
1 parent fb2b9d1 commit f68b6f3

8 files changed: +80 -56 lines changed

apps/site/docs/en/automate-with-scripts-in-yaml.mdx

Lines changed: 2 additions & 2 deletions
@@ -34,13 +34,13 @@ Config the OpenAI API key in the environment variable
 export OPENAI_API_KEY="sk-abcdefghijklmnopqrstuvwxyz"
 ```
 
-or you can use a `.env` file to store the configuration
+or you can use a `.env` file to store the configuration, Midscene command line tool will automatically load it when running yaml scripts.
 
 ```env filename=.env
 OPENAI_API_KEY="sk-abcdefghijklmnopqrstuvwxyz"
 ```
 
-or you may [customize model and provider](./model-provider)
+For more details about model and provider, see [customize model and provider](./model-provider)
 
 ## Start

apps/site/docs/en/bridge-mode-by-chrome-extension.mdx

Lines changed: 4 additions & 2 deletions
@@ -79,11 +79,13 @@ You should always call `connectCurrentTab` or `connectNewTabWithUrl` before doin
 Each of the agent instance can only connect to one tab instance, and it cannot be reconnected after destroy.
 :::
 
-### `connectCurrentTab`
+### `connectCurrentTab(options?: { trackingActiveTab?: boolean })`
 
 Connect to the current active tab on Chrome.
 
-### `connectNewTabWithUrl(ur: string)`
+If `trackingActiveTab` is true, the agent will always track the active tab. For example, if you switch to another tab or a new tab is opened, the agent will track the latest active tab. Otherwise, the agent will only track the tab you connected to initially.
+
+### `connectNewTabWithUrl(url: string, options?: { trackingActiveTab?: boolean })`
 
 Create a new tab with url and connect to immediately.
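
Reviewer note: the `trackingActiveTab` semantics described in this doc change can be sketched with a small in-memory model. This is an illustration only, not Midscene's implementation: `TabTargetModel`, `ConnectTabOptions`, `onTabActivated`, and `currentTarget` are hypothetical names, and tab ids are plain numbers standing in for Chrome tab ids.

```typescript
// Hypothetical model of the documented behaviour; not taken from the Midscene source.
interface ConnectTabOptions {
  trackingActiveTab?: boolean;
}

class TabTargetModel {
  private targetTabId: number;
  private readonly tracking: boolean;

  constructor(connectedTabId: number, options?: ConnectTabOptions) {
    this.targetTabId = connectedTabId;
    // The option is treated as false when omitted.
    this.tracking = options?.trackingActiveTab ?? false;
  }

  // Called whenever the browser reports a newly activated tab.
  onTabActivated(tabId: number): void {
    if (this.tracking) {
      this.targetTabId = tabId;
    }
  }

  // The tab that later agent actions would operate on.
  currentTarget(): number {
    return this.targetTabId;
  }
}

// Without tracking, the target stays on the initially connected tab.
const pinned = new TabTargetModel(1);
pinned.onTabActivated(2);
console.log(pinned.currentTarget()); // 1

// With tracking, the target follows the most recently activated tab.
const following = new TabTargetModel(1, { trackingActiveTab: true });
following.onTabActivated(2);
console.log(following.currentTarget()); // 2
```

In the model the flag is read once at connect time, which matches the diff insofar as the option is passed only to the two connect methods.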

apps/site/docs/zh/automate-with-scripts-in-yaml.mdx

Lines changed: 2 additions & 2 deletions
@@ -34,13 +34,13 @@ tasks:
 export OPENAI_API_KEY="sk-abcdefghijklmnopqrstuvwxyz"
 ```
 
-或使用 `.env` 文件存储配置
+或使用 `.env` 文件存储配置,Midscene 命令行工具在运行 yaml 脚本时会自动加载它
 
 ```env filename=.env
 OPENAI_API_KEY="sk-abcdefghijklmnopqrstuvwxyz"
 ```
 
-[自定义模型和服务商](./model-provider)
+更多关于模型和服务商的配置,请参阅 [自定义模型和服务商](./model-provider)
 
 ## 开始

apps/site/docs/zh/bridge-mode-by-chrome-extension.mdx

Lines changed: 4 additions & 2 deletions
@@ -79,11 +79,13 @@ tsx demo-new-tab.ts
 每个 agent 实例只能连接到一个标签页实例,并且一旦被销毁,就无法重新连接。
 :::
 
-### `connectCurrentTab`
+### `connectCurrentTab(options?: { trackingActiveTab?: boolean })`
 
 连接到当前已激活的标签页。
 
-### `connectNewTabWithUrl(ur: string)`
+如果 `trackingActiveTab` 为 true,则 agent 将始终跟踪当前激活的标签页。例如,如果你切换到另一个标签页或打开一个新的标签页,agent 将跟踪最新激活的标签页。否则,agent 将只跟踪你最初连接的标签页。
+
+### `connectNewTabWithUrl(url: string, options?: { trackingActiveTab?: boolean })`
 
 创建一个新标签页,并立即连接到它。

packages/web-integration/src/bridge-mode/agent-cli-side.ts

Lines changed: 5 additions & 5 deletions
@@ -1,8 +1,8 @@
 import assert from 'node:assert';
 import { PageAgent } from '@/common/agent';
-import { paramStr, typeStr } from '@/common/ui-utils';
 import type { KeyboardAction, MouseAction } from '@/page';
 import {
+  type BridgeConnectTabOptions,
   BridgeEvent,
   BridgePageType,
   DefaultBridgeServerPort,
@@ -101,13 +101,13 @@ export class AgentOverChromeBridge extends PageAgent<ChromeExtensionPageCliSide>
     });
   }
 
-  async connectNewTabWithUrl(url: string) {
-    await this.page.connectNewTabWithUrl(url);
+  async connectNewTabWithUrl(url: string, options?: BridgeConnectTabOptions) {
+    await this.page.connectNewTabWithUrl(url, options);
     await sleep(500);
   }
 
-  async connectCurrentTab() {
-    await this.page.connectCurrentTab();
+  async connectCurrentTab(options?: BridgeConnectTabOptions) {
+    await this.page.connectCurrentTab(options);
     await sleep(500);
   }

packages/web-integration/src/bridge-mode/common.ts

Lines changed: 8 additions & 0 deletions
@@ -13,6 +13,14 @@ export enum BridgeEvent {
   ConnectCurrentTab = 'connectCurrentTab',
 }
 
+export interface BridgeConnectTabOptions {
+  /**
+   * If true, the page will always track the active tab.
+   * @default false
+   */
+  trackingActiveTab?: boolean;
+}
+
 export enum MouseEvent {
   PREFIX = 'mouse.',
   Click = 'mouse.click',
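
Reviewer note: because `trackingActiveTab` is optional and documented with `@default false`, a caller-side normalization step could look like the sketch below. `resolveConnectTabOptions` is a hypothetical helper that does not appear in this commit, and the interface is redeclared locally only so the snippet stands alone.

```typescript
// Local copy of the interface added in this commit, so the sketch is self-contained.
interface BridgeConnectTabOptions {
  trackingActiveTab?: boolean;
}

// Hypothetical helper: fills in the documented default for an omitted option.
function resolveConnectTabOptions(
  options?: BridgeConnectTabOptions,
): Required<BridgeConnectTabOptions> {
  return { trackingActiveTab: options?.trackingActiveTab ?? false };
}

console.log(resolveConnectTabOptions().trackingActiveTab); // false
console.log(resolveConnectTabOptions({ trackingActiveTab: true }).trackingActiveTab); // true
```

The browser-side code in this commit gets the same effect without a helper: `options?.trackingActiveTab` is undefined when the option is omitted, so the `if` branch is skipped.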

packages/web-integration/src/bridge-mode/page-browser-side.ts

Lines changed: 14 additions & 2 deletions
@@ -2,6 +2,7 @@ import assert from 'node:assert';
 import type { KeyboardAction, MouseAction } from '@/page';
 import ChromeExtensionProxyPage from '../chrome-extension/page';
 import {
+  type BridgeConnectTabOptions,
   BridgeEvent,
   DefaultBridgeServerPort,
   KeyboardEvent,
@@ -94,22 +95,33 @@ export class ChromeExtensionPageBrowserSide extends ChromeExtensionProxyPage {
     return await this.setupBridgeClient();
   }
 
-  public async connectNewTabWithUrl(url: string) {
+  public async connectNewTabWithUrl(
+    url: string,
+    options?: BridgeConnectTabOptions,
+  ) {
     const tab = await chrome.tabs.create({ url });
     const tabId = tab.id;
     assert(tabId, 'failed to get tabId after creating a new tab');
 
     // new tab
     this.onLogMessage(`Creating new tab: ${url}`, 'log');
+
+    if (options?.trackingActiveTab) {
+      this.trackingActiveTab = true;
+    }
   }
 
-  public async connectCurrentTab() {
+  public async connectCurrentTab(options?: BridgeConnectTabOptions) {
     const tabs = await chrome.tabs.query({ active: true, currentWindow: true });
     console.log('current tab', tabs);
     const tabId = tabs[0]?.id;
     assert(tabId, 'failed to get tabId');
 
     this.onLogMessage(`Connected to current tab: ${tabs[0]?.url}`, 'log');
+
+    if (options?.trackingActiveTab) {
+      this.trackingActiveTab = true;
+    }
   }
 
   async destroy() {

packages/web-integration/tests/ai/bridge/agent.test.ts

Lines changed: 41 additions & 41 deletions
@@ -2,8 +2,11 @@ import {
   AgentOverChromeBridge,
   getBridgePageInCliSide,
 } from '@/bridge-mode/agent-cli-side';
-import { describe, expect, it } from 'vitest';
+import { describe, expect, it, vi } from 'vitest';
 
+vi.setConfig({
+  testTimeout: 60 * 1000,
+});
 const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));
 describe.skipIf(process.env.CI)(
   'fully functional agent in server(cli) side',
@@ -16,54 +19,51 @@ describe.skipIf(process.env.CI)(
       await page.destroy();
     });
 
-    it(
-      'page in cli side',
-      async () => {
-        const page = getBridgePageInCliSide();
+    it('page in cli side', async () => {
+      const page = getBridgePageInCliSide();
+
+      // make sure the extension bridge is launched before timeout
+      await page.connectNewTabWithUrl('https://www.baidu.com');
 
-        // make sure the extension bridge is launched before timeout
-        await page.connectNewTabWithUrl('https://www.baidu.com');
+      // sleep 3s
+      await sleep(3000);
+
+      await page.destroy();
+    });
 
-        // sleep 3s
-        await sleep(3000);
+    it('agent in cli side, new tab', async () => {
+      const agent = new AgentOverChromeBridge();
 
-        await page.destroy();
-      },
-      40 * 1000, // longer than the timeout of the bridge io
-    );
+      await agent.connectNewTabWithUrl('https://www.bing.com');
+      await sleep(3000);
 
-    it(
-      'agent in cli side, new tab',
-      async () => {
-        const agent = new AgentOverChromeBridge();
+      await agent.ai('type "AI 101" and hit Enter');
+      await sleep(3000);
 
-        await agent.connectNewTabWithUrl('https://www.bing.com');
-        await sleep(3000);
+      await agent.aiAssert('there are some search results');
+      await agent.destroy();
+    });
 
-        await agent.ai('type "AI 101" and hit Enter');
-        await sleep(3000);
+    it('agent in cli side, current tab', async () => {
+      const agent = new AgentOverChromeBridge();
+      await agent.connectCurrentTab();
+      await sleep(3000);
+      const answer = await agent.aiQuery(
+        'name of the current page? return {name: string}',
+      );
 
-        await agent.aiAssert('there are some search results');
-        await agent.destroy();
-      },
-      60 * 1000,
-    );
+      console.log(answer);
+      expect(answer.name).toBeTruthy();
+      await agent.destroy();
+    });
 
-    it(
-      'agent in cli side, current tab',
-      async () => {
-        const agent = new AgentOverChromeBridge();
-        await agent.connectCurrentTab();
-        await sleep(3000);
-        const answer = await agent.aiQuery(
-          'name of the current page? return {name: string}',
-        );
+    it('agent in cli side, current tab, tracking active tab', async () => {
+      const agent = new AgentOverChromeBridge();
+      await agent.connectCurrentTab({ trackingActiveTab: true });
 
-        console.log(answer);
-        expect(answer.name).toBeTruthy();
-        await agent.destroy();
-      },
-      60 * 1000,
-    );
+      await agent.ai('click "文库",sleep 1500ms,type "AI 101" and hit Enter');
+      await sleep(3000);
+      await agent.destroy();
+    });
   },
 );
