
Smart default snapshot scoping to main content area #45

@avifenesh

Problem

Running goto against a typical website returns the full accessibility tree, including sidebars, navigation, footers, banners, and other chrome. On dev.to's homepage this produced 40KB of output, of which the relevant content (the article feed) was roughly 30%.

The --snapshot-selector flag exists but requires foreknowledge of the DOM structure, which the agent doesn't have on first visit.

Proposal

When no --snapshot-selector is provided, auto-detect and prefer the <main> element if one exists. Most modern websites use semantic HTML with a <main> element containing the primary content. This would cut output size significantly without requiring the agent to know the page structure upfront.

Fallback chain:

  1. <main> element (if exists and is non-empty)
  2. [role="main"] (ARIA equivalent)
  3. Full page (current behavior)

This could be a default that users override with --snapshot-selector or disable with --snapshot-full.
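
The fallback chain and the two flags could be resolved roughly as in the sketch below. It is a minimal illustration, not the tool's actual code: the function name resolve_snapshot_scope, the has_content probe, and the choice to let an explicit --snapshot-selector take precedence over --snapshot-full when both are passed are all assumptions made for the example.

```python
from typing import Callable, Optional

# Candidate selectors in priority order, taken from the fallback chain above.
MAIN_CONTENT_SELECTORS = ("main", '[role="main"]')


def resolve_snapshot_scope(
    has_content: Callable[[str], bool],
    snapshot_selector: Optional[str] = None,
    snapshot_full: bool = False,
) -> Optional[str]:
    """Return the selector to scope the snapshot to, or None for the full page.

    has_content is a hypothetical probe supplied by the caller. It should
    return True when the selector matches an element that exists and is
    non-empty on the current page.
    """
    # An explicit --snapshot-selector always wins (assumed precedence).
    if snapshot_selector is not None:
        return snapshot_selector
    # --snapshot-full disables auto-detection and keeps current behavior.
    if snapshot_full:
        return None
    # Walk the fallback chain: <main>, then [role="main"].
    for candidate in MAIN_CONTENT_SELECTORS:
        if has_content(candidate):
            return candidate
    # Nothing suitable found: fall back to the full page.
    return None
```

Keeping the page probe behind a callable means the decision logic can be exercised without a browser; in a real integration the probe would query the live page.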

Impact

On the dev.to test case, scoping to <main> would have reduced output from ~40KB to ~15KB, keeping it under the 30K tool output limit and saving significant context tokens.
