Compact snapshot format for token-efficient LLM consumption #46

@avifenesh

Problem

The ARIA snapshot format is verbose: every link, button, image, and text node gets its own indented line. A sidebar with 50 links alone produces 100+ lines of tree structure (a link line plus a url line for each) that an LLM must process.

Example of current verbosity for a single article card:

- link "Article Title":
    - /url: /path/to/article
- link "author profile":
    - /url: /author
    - img "author profile"
- button "Author Name profile details": Author Name
- link "Feb 23":
    - /url: /path/to/article
    - time: Feb 23
- heading "Article Title" [level=2]:
    - link "Article Title":
        - /url: /path/to/article
- link "# tag1":
    - /url: /t/tag1

That's 14 lines for one card. With 15 cards on a page, that is over 200 lines before the sidebar and footer are counted.

Proposal

Add a --snapshot-compact flag that produces a denser format optimized for LLM consumption:

  • Collapse link + url into one line: link "Title" -> /path
  • Omit duplicate links (the same URL appearing multiple times in a card)
  • Omit decorative images (img without meaningful alt text)
  • Inline simple children: heading [h2] "Article Title" -> /path

Example compact output for the same card:

link "Article Title" -> /path/to/article
  by Author Name, Feb 23
  tags: #tag1

Impact

Rough estimate: 50-70% reduction in snapshot token count for content-heavy pages, directly reducing agent costs per page visit.
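
One way to sanity-check the estimate is to compare the two card examples from this issue directly. The sketch below uses character count as a crude stand-in for tokens, which is an assumption: real token counts depend on the tokenizer, and a single card is not representative of a full page. The name `size_reduction` is hypothetical.

```python
def size_reduction(before: str, after: str) -> float:
    """Fractional reduction in size, using character count as a crude token proxy."""
    if not before:
        return 0.0
    return 1 - len(after) / len(before)


# The verbose card and the proposed compact output, copied from this issue.
VERBOSE_CARD = """\
- link "Article Title":
    - /url: /path/to/article
- link "author profile":
    - /url: /author
    - img "author profile"
- button "Author Name profile details": Author Name
- link "Feb 23":
    - /url: /path/to/article
    - time: Feb 23
- heading "Article Title" [level=2]:
    - link "Article Title":
        - /url: /path/to/article
- link "# tag1":
    - /url: /t/tag1
"""

COMPACT_CARD = """\
link "Article Title" -> /path/to/article
  by Author Name, Feb 23
  tags: #tag1
"""

print(f"{size_reduction(VERBOSE_CARD, COMPACT_CARD):.0%} fewer characters")
```

Running the same comparison over whole-page snapshots, ideally with the target model's tokenizer, would give a firmer number than the single-card case.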
