-
Notifications
You must be signed in to change notification settings - Fork 41
feat: add dynamic context control for get-pointed-element tool #13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat: add dynamic context control for get-pointed-element tool #13
Conversation
cc @elieteyssedou would love to hear your thoughts & review. Saw feature 1 on the roadmap and thought I'd give it a go. This helps a lot with context management in my personal use so I thought I'd PR it. |
Wow @ashtonchew, thank you for submitting the PR! 🤩 Cheers, |
Hi @ashtonchew , I just merged something that was WIP since last week, which is a rework of DOM element parsing/extraction in MCP. Now, the chrome-extension will be able to send "RawPointedDOMElement" and the server will extract/parse properties from this. This will allow MCP Pointer to work with images easily for example. Do you want to try to rebase on this? (I guess that it will cause the frontend part to be very much simplified, as the "filtering" will only happen on the server side) |
d564b32
to
e4f5d46
Compare
Hey @elieteyssedou! Successfully rebased on the new architecture. The dynamic context control now works perfectly with ProcessedPointedDOMElement. Here are the new changes:
Bug Fix (PR #14):
Tested with 4 detail levels on the same element:
And also tested on the test suites and they passed. The extension-side filtering code is still there but not actively used yet, ready for when we switch to sending RawPointedDOMElement with DOM_ELEMENT_POINTED message type. To my understanding, it seemed like server-side RawPointedDOMElement implementation was complete but the extension-side was still on the legacy |
Hi @ashtonchew, Thank you for updating the code. The 4 detail levels example is speaking by itself, well done! 🎉 We need this. I saw that you were using using What I suggest, is that you rebase one last time from main branch, and implement filtering on the fly (on tool calls) on the server side, not on chrome extension.
So, full css details would be accessible anytime without having to re-point to something. Tell me if I can help in any way. Thank you a lot for collaborating on MCP Pointer :) |
- Extension was referencing PointerMessageType.ELEMENT_SELECTED which doesn't exist - Changed to PointerMessageType.LEGACY_ELEMENT_SELECTED in element-sender-service.ts - Fixed websocket-server.ts to remove invalid ELEMENT_CLEARED reference (dead code) - Fixed lint error in mcp-service.ts (line length) - Fixes bug introduced in PR etsd-tech#14
…ment - Add cssComputed and textContent fields to ProcessedPointedDOMElement type - Update ElementProcessor to capture full computed styles from raw data - Adapt element-detail.ts to shape ProcessedPointedDOMElement instead of TargetedElement - Update tests to use ProcessedPointedDOMElement - All shaping logic now works server-side on processed elements
- Modified getRelevantStyles() to return all computed styles instead of filtering to 5 properties - Updated test mocks to reflect full CSS data in cssProperties - Enables full CSS details without re-pointing, with server-side filtering on MCP tool calls
- Remove LEGACY_ELEMENT_SELECTED message handling - only DOM_ELEMENT_POINTED supported - Delete unused mcp-handler.ts and websocket-server.ts files - Remove StateDataV1 and LegacySharedState types - Simplify SharedStateService to only handle V2 format - Update tests to remove legacy test cases
- Add changeset for minor version bump (0.5.2 → 0.6.0) - Update CONTRIBUTING.md project structure to reflect current architecture - Document services/ and utils/ directories - Remove references to deleted files
e4f5d46
to
ca7a516
Compare
Hey @elieteyssedou, Thanks for the guidance! I've implemented both steps you suggested:
While implementing this, I also removed the legacy code:
I've created a changeset for v0.6.0 and pushed everything to the PR. All tests are passing. Ran a quick sanity check in Claude Code with the new MCP server:
Always happy to contribute. |
- Chrome extensions require return true when using sendResponse asynchronously - Without it, message channel closes immediately causing chrome.runtime.lastError - Fixes bug introduced in commit 155a07c where refactor removed the return statement
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @ashtonchew,
Great job!
Here some comments, feel free to discuss if you disagree on some suggestions. :)
|
||
export function adaptTargetToElement(element: HTMLElement): TargetedElement { | ||
return { | ||
export function adaptTargetToElement( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @ashtonchew,
I see that adaptTargetToElement
still exists, my mistake. adaptTargetToElement
is no longer used, and should be removed.
Would you like to remove it in your PR or do you want me to clean-up in a separate commit so you can rebase?
element: HTMLElement, | ||
options: ElementSerializationOptions = {}, | ||
): TargetedElement { | ||
const textDetail = options.textDetail ?? DEFAULT_TEXT_DETAIL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that confusion with adaptTargetToElement
still existing led to maintain the notion of detail parameter on chrome-extension side.
The idea is to completely remove the idea of filtering parameters on the front-end/extension side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(this will probably lead to a 0 added line to chrome-extension in this PR)
} as StateDataV2, | ||
}); | ||
|
||
export const createStateV1 = ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI, the state versioning was existing to maintain non-updated chrome-extension with new server code.
As the event code is deployed from 1 or 2 days, that the update will continue to spread in the coming days, it is okay to remove legacy in this PR. Just wanted to give you insights on why this existed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. 👌🏻
element: ProcessedPointedDOMElement, | ||
cssLevel: CSSDetailLevel, | ||
): CSSProperties | undefined { | ||
if (cssLevel === 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could have a numeric enum for cssLevel (and textDetailLevel too), what do you think?
@@ -0,0 +1,76 @@ | |||
import { CSSDetailLevel, TextDetailLevel } from './types'; | |||
|
|||
export const TEXT_DETAIL_OPTIONS: readonly TextDetailLevel[] = ['full', 'visible', 'none']; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Enums would be an interesting choice here (as mentioned on the upper comment).
timestamp: new Date(raw.timestamp).toISOString(), | ||
|
||
cssProperties: this.getRelevantStyles(raw.computedStyles), | ||
cssComputed: raw.computedStyles ? { ...raw.computedStyles } : undefined, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that we should only have either cssProperties
or cssComputed
, isn't it redundant to have both? (i'd store full computed properties in a single property)
}; | ||
} | ||
|
||
private getRelevantStyles(styles?: Record<string, string>): CSSProperties | undefined { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is not relevant anymore I guess.
return undefined; | ||
} | ||
|
||
export function shapeElementForDetail( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think of renaming shapeElementForDetail
to serializeElement
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function could return a SerializedDOMElement
(new type), but may be not needed for now.
Summary
The server previously returned a single, fully populated element payload, forcing agents to accept hidden text and all CSS even when they wanted smaller responses. This implements roadmap feature item #1 (Dynamic Context Control) to stop wasted tokens in downstream LLMs.
This PR introduces request-time controls so clients decide how much text and styling to receive while we maintain the existing “full detail” defaults for backward compatibility.
Changes Made
packages/shared/src/types.ts
,packages/shared/src/detail.ts
).packages/chrome-extension/src/utils/element.ts
).textDetail
andcssLevel
, normalize inputs, and prune responses accordingly; added shaping utilities and focused Jest coverage (packages/server/src/utils/element-detail.ts
,packages/server/src/{mcp-handler.ts,services/mcp-service.ts}
,packages/server/src/__tests__/utils/element-detail.test.ts
).get-pointed-element
(identical change in handler and service):No breaking changes; no new dependencies.
Testing
Screenshots (if applicable)