ashtonchew

Summary

The server previously returned a single, fully populated element payload, forcing agents to accept hidden text and all CSS even when they wanted smaller responses. This implements roadmap feature item #1 (Dynamic Context Control) to stop wasted tokens in downstream LLMs.

This PR introduces request-time controls so clients decide how much text and styling to receive while we maintain the existing “full detail” defaults for backward compatibility.

Changes Made

  • Extended shared element types with optional text/CSS variants plus reusable detail constants (packages/shared/src/types.ts, packages/shared/src/detail.ts).
  • Enhanced the Chrome extension serializer to capture full text snapshots and computed styles so the server can down-sample without re-fetching (packages/chrome-extension/src/utils/element.ts).
  • Updated both MCP server entry points to accept textDetail and cssLevel, normalize inputs, and prune responses accordingly; added shaping utilities and focused Jest coverage (packages/server/src/utils/element-detail.ts, packages/server/src/{mcp-handler.ts,services/mcp-service.ts}, packages/server/src/__tests__/utils/element-detail.test.ts).
  • Documented the optional parameters in the README to guide tool authors.
  • Tool schema diff for get-pointed-element (identical change in handler and service):
-          description: 'Get information about the currently pointed/shown DOM element from the browser extension, in order to let you see a specific element the user is showing you on his/her the browser.'
-          inputSchema: {
-            type: 'object',
-            properties: {},
-            required: [],
-          }
+          description: 'Get information about the currently pointed/shown DOM element. Control returned payload size with optional textDetail (full|visible|none) and cssLevel (0-3).'
+          inputSchema: {
+            type: 'object',
+            properties: {
+              textDetail: {
+                type: 'string',
+                enum: [...TEXT_DETAIL_OPTIONS],
+                description: 'Controls how much text is returned. full (default) includes hidden text fallback, visible uses only rendered text, none omits text fields.',
+              },
+              cssLevel: {
+                type: 'integer',
+                enum: [...CSS_DETAIL_OPTIONS],
+                description: 'Controls CSS payload detail. 0 omits CSS, 1 includes layout basics, 2 adds box model, 3 returns the full computed style.',
+              },
+            },
+            required: [],
+          }

No breaking changes; no new dependencies.
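A minimal sketch of how the two optional parameters could be normalized before shaping the response. The constants below are local stand-ins for the shared ones named in the diff, and `normalizeTextDetail`, `normalizeCssLevel`, and the CSS default of 3 are assumptions for illustration, not the PR's actual code.

```typescript
type TextDetailLevel = 'full' | 'visible' | 'none';
type CSSDetailLevel = 0 | 1 | 2 | 3;

// Stand-ins for the shared constants referenced in the schema diff.
const TEXT_DETAIL_OPTIONS: readonly TextDetailLevel[] = ['full', 'visible', 'none'];
const CSS_DETAIL_OPTIONS: readonly CSSDetailLevel[] = [0, 1, 2, 3];

const DEFAULT_TEXT_DETAIL: TextDetailLevel = 'full'; // stated as the default in the schema
const DEFAULT_CSS_LEVEL: CSSDetailLevel = 3; // assumption: the thread does not state the CSS default

// Hypothetical helper: fall back to the default for anything outside the enum.
function normalizeTextDetail(value: unknown): TextDetailLevel {
  for (const option of TEXT_DETAIL_OPTIONS) {
    if (option === value) return option;
  }
  return DEFAULT_TEXT_DETAIL;
}

// Hypothetical helper: also tolerates numeric strings such as "2",
// in case a client sends tool arguments as strings.
function normalizeCssLevel(value: unknown): CSSDetailLevel {
  const numeric = typeof value === 'string' && value.trim() !== '' ? Number(value) : value;
  for (const option of CSS_DETAIL_OPTIONS) {
    if (option === numeric) return option;
  }
  return DEFAULT_CSS_LEVEL;
}
```

With this shape, a call that omits both parameters or passes unknown values gets the defaults, which is one way to keep older clients working unchanged.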

Testing

  • Tested locally with Chrome extension
  • Tested MCP server functionality
  • Tested with Claude Code integration
  • All existing tests pass

Screenshots (if applicable)

Screenshot 2025-09-27 at 2 10 01 PM

@ashtonchew changed the title from "Feature/dynamic context control" to "feat: add dynamic context control for get-pointed-element tool" on Sep 27, 2025
@ashtonchew
Author

cc @elieteyssedou would love to hear your thoughts & review. Saw feature 1 on the roadmap and thought I'd give it a go. This helps a lot with context management in my personal use so I thought I'd PR it.

@elieteyssedou
Collaborator

Wow @ashtonchew, thank you for submitting the PR! 🤩
I’ll take a deep dive into your code soon — it looks very promising. I’ll keep you updated.

Cheers,

@elieteyssedou
Collaborator

Hi @ashtonchew ,

I just merged something that was WIP since last week, which is a rework of DOM element parsing/extraction in MCP.

Now, the chrome-extension will be able to send "RawPointedDOMElement" and the server will extract/parse properties from this. This will allow MCP Pointer to work with images easily for example.
So the full raw data will now be sent from the browser to the server, and we will read from the shared state (.json) at data.processedPointedDOMElement. You could apply the dynamic context logic on this object.

Do you want to try to rebase on this? (I guess that it will cause the frontend part to be very much simplified, as the "filtering" will only happen on the server side)

@ashtonchew ashtonchew force-pushed the feature/dynamic-context-control branch from d564b32 to e4f5d46 Compare October 1, 2025 01:57
@ashtonchew
Author

Hey @elieteyssedou!

Successfully rebased on the new architecture. The dynamic context control now works perfectly with ProcessedPointedDOMElement.

Here are the new changes:

  • Moved all shaping logic to work on ProcessedPointedDOMElement instead of TargetedElement
  • Extended ProcessedPointedDOMElement type with cssComputed and textContent fields
  • ElementProcessor now captures full computed styles from raw browser data
  • Server-side element-detail.ts shapes the processed element based on textDetail/cssLevel params

Bug Fix (PR #14):

  • Found and fixed: Extension was using PointerMessageType.ELEMENT_SELECTED which doesn't exist after the enum rename
  • Changed to PointerMessageType.LEGACY_ELEMENT_SELECTED in element-sender-service.ts
  • Also cleaned up websocket-server.ts (removed invalid ELEMENT_CLEARED reference)

Tested with 4 detail levels on the same element:

  • textDetail: "none", cssLevel: 0 → 17 lines
  • textDetail: "visible", cssLevel: 1 → 24 lines
  • textDetail: "full", cssLevel: 2 → 53 lines
  • textDetail: "full", cssLevel: 3 → 393 lines

And also tested on the test suites and they passed.

The extension-side filtering code is still there but not actively used yet, ready for when we switch to sending RawPointedDOMElement with DOM_ELEMENT_POINTED message type.

To my understanding, the server-side RawPointedDOMElement implementation was complete but the extension side was still on the legacy element-selected path. Let me know if these changes sound good to you.

@elieteyssedou
Collaborator

Hi @ashtonchew,

Thank you for updating the code. The 4 detail levels example speaks for itself, well done! 🎉 We need this.

I saw that you were using LEGACY_ELEMENT_SELECTED. Actually, we have a new way to transfer data from chrome-extension, with a raw data payload that will then be processed server side.
So I've made the change, now, the chrome-extension is using the new event and transfers a RawPointedDOMElement using the DOM_ELEMENT_POINTED event.

What I suggest is that you rebase one last time from the main branch and implement filtering on the fly (on tool calls) on the server side, not in the chrome extension.
This would be a 2-step implementation:

  • make full CSSProperties (key cssProperties) available in ProcessedPointedDOMElement
  • filtering level detail on server tool use, filtering the sharedState.data.processedPointedDOMElement on the fly based on the tool params

So, full css details would be accessible anytime without having to re-point to something.
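As a rough sketch of that flow, assuming trimmed-down shapes for the shared state and the processed element: the stored element keeps its full CSS, and each tool call derives a filtered copy from it. `SharedStateSketch`, `filterCss`, and `handleGetPointedElement` are hypothetical names, and the filter is a placeholder that only distinguishes level 0 from the rest.

```typescript
type CSSProperties = Record<string, string>;

// Hypothetical, trimmed-down shapes; the real types live in the shared package.
interface ProcessedElementSketch {
  selector: string;
  cssProperties?: CSSProperties;
}
interface SharedStateSketch {
  data: { processedPointedDOMElement?: ProcessedElementSketch };
}

// Placeholder filter: level 0 drops CSS, any other level returns a full copy.
function filterCss(styles: CSSProperties | undefined, cssLevel: number): CSSProperties | undefined {
  if (cssLevel === 0 || styles === undefined) return undefined;
  return { ...styles };
}

// Builds the tool response from the stored element without mutating it,
// so a later call can still ask for more detail.
function handleGetPointedElement(state: SharedStateSketch, args: { cssLevel?: number }) {
  const stored = state.data.processedPointedDOMElement;
  if (stored === undefined) {
    return { content: [{ type: 'text', text: 'No element is currently pointed.' }] };
  }
  const cssLevel = args.cssLevel === undefined ? 3 : args.cssLevel; // assumed default
  const response: ProcessedElementSketch = { selector: stored.selector };
  const css = filterCss(stored.cssProperties, cssLevel);
  if (css !== undefined) response.cssProperties = css;
  return { content: [{ type: 'text', text: JSON.stringify(response, null, 2) }] };
}
```

Calling the handler with `cssLevel: 0` and then `cssLevel: 3` against the same state returns a small payload first and the full one second, with no re-pointing in between.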

Tell me if I can help in any way. Thank you a lot for collaborating on MCP Pointer :)

- Extension was referencing PointerMessageType.ELEMENT_SELECTED which doesn't exist
- Changed to PointerMessageType.LEGACY_ELEMENT_SELECTED in element-sender-service.ts
- Fixed websocket-server.ts to remove invalid ELEMENT_CLEARED reference (dead code)
- Fixed lint error in mcp-service.ts (line length)
- Fixes bug introduced in PR etsd-tech#14
…ment

- Add cssComputed and textContent fields to ProcessedPointedDOMElement type
- Update ElementProcessor to capture full computed styles from raw data
- Adapt element-detail.ts to shape ProcessedPointedDOMElement instead of TargetedElement
- Update tests to use ProcessedPointedDOMElement
- All shaping logic now works server-side on processed elements
- Modified getRelevantStyles() to return all computed styles instead of filtering to 5 properties
- Updated test mocks to reflect full CSS data in cssProperties
- Enables full CSS details without re-pointing, with server-side filtering on MCP tool calls
- Remove LEGACY_ELEMENT_SELECTED message handling - only DOM_ELEMENT_POINTED supported
- Delete unused mcp-handler.ts and websocket-server.ts files
- Remove StateDataV1 and LegacySharedState types
- Simplify SharedStateService to only handle V2 format
- Update tests to remove legacy test cases
- Add changeset for minor version bump (0.5.2 → 0.6.0)
- Update CONTRIBUTING.md project structure to reflect current architecture
- Document services/ and utils/ directories
- Remove references to deleted files
@ashtonchew ashtonchew force-pushed the feature/dynamic-context-control branch from e4f5d46 to ca7a516 Compare October 1, 2025 21:36
@ashtonchew
Author

Hey @elieteyssedou,

Thanks for the guidance! I've implemented both steps you suggested:

  1. Full CSS Properties Storage
    Modified element-processor.ts to store all computed CSS properties instead of filtering to just 5. The cssProperties field now contains the complete CSS data.

  2. Server-Side Filtering
    The filtering now happens on-the-fly during tool calls based on the cssLevel parameter. The existing buildCssProperties() function in element-detail.ts handles this, so full CSS details are accessible without re-pointing.

While implementing this, I also removed the legacy code:

  • Removed LEGACY_ELEMENT_SELECTED support - only DOM_ELEMENT_POINTED now
  • Deleted unused files (mcp-handler.ts, websocket-server.ts)
  • Cleaned up V1 types and legacy tests
  • Updated the CONTRIBUTING.md docs to reflect new architecture (with added notes on functionalities per file)

I've created a changeset for v0.6.0 and pushed everything to the PR. All tests are passing. Ran a quick sanity check in Claude Code with the new MCP server:

Pointed Element: GitHub README div for MCP Pointer project at https://github.com/ashtonchew/mcp-pointer

  Summary by Parameter Permutation:

  textDetail (text content control):

  - full: Both innerText and textContent fields with complete README markdown (~4800 chars)
  - visible: Only innerText with rendered text; textContent omitted
  - none: Text fields present but empty strings

  cssLevel (CSS detail control):

  - 0: No cssProperties field
  - 1: Basic layout CSS only (display, position, fontSize, color, backgroundColor) — 5 properties
  - 2: Box model expanded (margins, padding, dimensions, borders, flex, overflow, font details) — ~30 properties
  - 3: Full computed style dump (all browser CSS including animations, transforms, grid, SVG properties) — 400+ properties

  Constant across all modes:

  - selector, tagName, classes, attributes, position, timestamp, url

  Use cases:
  - textDetail=none, cssLevel=0: Minimal payload for structure/selector only
  - textDetail=visible, cssLevel=1: Lightweight for quick style inspection
  - textDetail=full, cssLevel=2: Balanced for component cloning
  - textDetail=full, cssLevel=3: Complete forensics for exact visual replication
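The behavior summarized above can be sketched roughly as follows. This is an illustration under assumptions, not the PR's element-detail.ts: the element shape is trimmed down, `shapeElementSketch` and `buildCssPropertiesSketch` are hypothetical names, and the level 2 key list is an illustrative subset of the roughly 30 properties mentioned.

```typescript
type TextDetailLevel = 'full' | 'visible' | 'none';
type CSSDetailLevel = 0 | 1 | 2 | 3;
type CSSProperties = Record<string, string>;

// Hypothetical, trimmed-down element shape.
interface ProcessedElementSketch {
  selector: string;
  tagName: string;
  innerText: string;
  textContent?: string;
  cssProperties?: CSSProperties;
}

// Level 1 keys are the five named in the summary; level 2 extras are an illustrative subset.
const LEVEL_1_KEYS: readonly string[] = ['display', 'position', 'fontSize', 'color', 'backgroundColor'];
const LEVEL_2_EXTRA_KEYS: readonly string[] = ['margin', 'padding', 'width', 'height', 'border', 'overflow'];

function pick(styles: CSSProperties, keys: readonly string[]): CSSProperties {
  const picked: CSSProperties = {};
  for (const key of keys) {
    const value = styles[key];
    if (value !== undefined) picked[key] = value;
  }
  return picked;
}

function buildCssPropertiesSketch(styles: CSSProperties | undefined, level: CSSDetailLevel): CSSProperties | undefined {
  if (level === 0 || styles === undefined) return undefined; // 0: no cssProperties field
  if (level === 3) return { ...styles }; // 3: full computed style
  return pick(styles, level === 1 ? LEVEL_1_KEYS : LEVEL_1_KEYS.concat(LEVEL_2_EXTRA_KEYS));
}

function shapeElementSketch(
  element: ProcessedElementSketch,
  textDetail: TextDetailLevel,
  cssLevel: CSSDetailLevel,
): ProcessedElementSketch {
  const shaped: ProcessedElementSketch = {
    selector: element.selector,
    tagName: element.tagName,
    innerText: textDetail === 'none' ? '' : element.innerText,
  };
  if (textDetail === 'none') {
    shaped.textContent = ''; // none: text fields present but empty
  } else if (textDetail === 'full' && element.textContent !== undefined) {
    shaped.textContent = element.textContent; // visible: textContent omitted
  }
  const css = buildCssPropertiesSketch(element.cssProperties, cssLevel);
  if (css !== undefined) shaped.cssProperties = css;
  return shaped;
}
```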

Always happy to contribute.

- Chrome extensions require return true when using sendResponse asynchronously
- Without it, message channel closes immediately causing chrome.runtime.lastError
- Fixes bug introduced in commit 155a07c where refactor removed the return statement
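For readers unfamiliar with this rule, here is a minimal sketch of a message listener that responds asynchronously. The listener is written as a plain function so it can be read outside an extension; the `GET_STATUS` message type and the function name are hypothetical, and this is not the code from the commit.

```typescript
type SendResponse = (response: unknown) => void;

// Returns true only when a response will be sent later, which tells Chrome
// to keep the message channel open after the listener returns.
function onPointerMessage(message: { type: string }, _sender: unknown, sendResponse: SendResponse): boolean {
  if (message.type !== 'GET_STATUS') {
    return false; // not handled here; no response will be sent
  }
  // Simulated async work: sendResponse runs after the listener has returned.
  setTimeout(() => sendResponse({ ok: true }), 0);
  return true;
}

// In the extension this would be registered with:
// chrome.runtime.onMessage.addListener(onPointerMessage);
```

If the final `return true` is dropped, the listener returns before `sendResponse` runs, which matches the failure described in the commit message above.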
Collaborator

@elieteyssedou elieteyssedou left a comment

Hey @ashtonchew,

Great job!
Here are some comments; feel free to discuss if you disagree with any of the suggestions. :)


export function adaptTargetToElement(element: HTMLElement): TargetedElement {
return {
export function adaptTargetToElement(
Collaborator

Hey @ashtonchew,

I see that adaptTargetToElement still exists, my mistake. adaptTargetToElement is no longer used, and should be removed.
Would you like to remove it in your PR or do you want me to clean-up in a separate commit so you can rebase?

element: HTMLElement,
options: ElementSerializationOptions = {},
): TargetedElement {
const textDetail = options.textDetail ?? DEFAULT_TEXT_DETAIL;
Collaborator

I think the confusion caused by adaptTargetToElement still existing led you to keep the notion of a detail parameter on the chrome-extension side.

The idea is to completely remove the idea of filtering parameters on the front-end/extension side.

Collaborator

(this will probably lead to 0 added lines in chrome-extension in this PR)

} as StateDataV2,
});

export const createStateV1 = (
Collaborator

FYI, the state versioning existed to keep non-updated chrome-extension installs working with new server code.

As the new event code has been deployed for 1 or 2 days and the update will continue to spread in the coming days, it is okay to remove the legacy code in this PR. Just wanted to give you some insight into why this existed.

Collaborator

Good catch. 👌🏻

element: ProcessedPointedDOMElement,
cssLevel: CSSDetailLevel,
): CSSProperties | undefined {
if (cssLevel === 0) {
Collaborator

We could have a numeric enum for cssLevel (and textDetailLevel too), what do you think?

@@ -0,0 +1,76 @@
import { CSSDetailLevel, TextDetailLevel } from './types';

export const TEXT_DETAIL_OPTIONS: readonly TextDetailLevel[] = ['full', 'visible', 'none'];
Collaborator

Enums would be an interesting choice here (as mentioned in the comment above).
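One possible shape for that suggestion, as a sketch only: a numeric enum for the CSS level and a string enum for the text detail, with option lists derived from the members so the tool schema can still spread them into `enum`. The member names and the `toCssDetailLevel` helper are hypothetical.

```typescript
// Hypothetical member names; the values match the existing 0-3 and full|visible|none options.
enum CSSDetailLevel {
  None = 0,
  Layout = 1,
  BoxModel = 2,
  Full = 3,
}

enum TextDetailLevel {
  Full = 'full',
  Visible = 'visible',
  None = 'none',
}

const CSS_DETAIL_OPTIONS: readonly CSSDetailLevel[] = [
  CSSDetailLevel.None,
  CSSDetailLevel.Layout,
  CSSDetailLevel.BoxModel,
  CSSDetailLevel.Full,
];

const TEXT_DETAIL_OPTIONS: readonly TextDetailLevel[] = [
  TextDetailLevel.Full,
  TextDetailLevel.Visible,
  TextDetailLevel.None,
];

// Numeric enums get a reverse mapping (value -> member name),
// so a value that is not a member indexes to undefined.
function toCssDetailLevel(value: number): CSSDetailLevel | undefined {
  return CSSDetailLevel[value] === undefined ? undefined : (value as CSSDetailLevel);
}
```

A comparison such as `cssLevel === CSSDetailLevel.None` then reads a little more clearly than `cssLevel === 0`, while the serialized values sent by clients stay the same.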

timestamp: new Date(raw.timestamp).toISOString(),

cssProperties: this.getRelevantStyles(raw.computedStyles),
cssComputed: raw.computedStyles ? { ...raw.computedStyles } : undefined,
Collaborator

I think that we should only have either cssProperties or cssComputed. Isn't it redundant to have both? (I'd store the full computed properties in a single property.)

};
}

private getRelevantStyles(styles?: Record<string, string>): CSSProperties | undefined {
Collaborator

This function is not relevant anymore I guess.

return undefined;
}

export function shapeElementForDetail(
Collaborator

What do you think of renaming shapeElementForDetail to serializeElement?

Collaborator

The function could return a SerializedDOMElement (new type), but that may not be needed for now.
