preserving raw output for analysis? #33

roycewilliams · 2026-03-02T16:05:19Z

(apologies if this is already a solved problem -- if so, maybe it should be surfaced more prominently?)

I regularly benefit from retroactively analyzing the incidental output of tool use, to harvest the observations made as items for the issue queue.

As the tools chatter amongst themselves, they often observe things that should probably be fixed or added. Analyzing this chatter surfaces issues that are often quite important -- that I wouldn't even know to look for.

So it might be useful to save all of that raw output, so that it could be analyzed separately, in a single pass, in the same manner. Something like a --log-raw-output option, whose output I could point a shorthand analysis command at.

rjkaes · 2026-03-02T19:30:46Z

This is an interesting idea. When you say "raw" output, what kind of information are you looking for?

We could do something with an environment variable that you pass to Claude so the MCP server knows you want to record raw output.

0 replies

roycewilliams · 2026-03-02T22:28:31Z

Well, I am not sure I understand the model well enough to give a clear answer 😅, but ... roughly:

If the core value prop here is that context burden is reduced by suppressing some output from tool use, then whatever that full output would have been -- the prompts, the tool use, and the responses, all chronological -- could be written somewhere (with full context) for separate analysis.

0 replies

rjkaes · 2026-03-03T01:10:10Z

Got it! Well, we have a couple of options:

1. Write out to stderr, which Claude Code surfaces in debug mode.
2. Write out a structured JSONL file to something like ~/.claude/context-mode-calls.jsonl.

Both are doable, so it comes down to whether we just want to see the raw logs or whether we want something structured that we can query with jq or similar.

Option 1 is the easiest, but it does mean you have to turn on debug mode to see it, whereas option 2 would always write the data. However, option 2 requires us to somehow manage the amount of data we write, to prevent the file from growing without bound.
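To make the trade-off concrete, here is a minimal sketch of both options. It is an illustration only, not the project's implementation: the function names, the record fields, and the single-generation rotation used to cap the file size are all assumptions.

```python
import json
import os
import sys
from pathlib import Path


def log_to_stderr(record: dict) -> None:
    """Option 1: emit one JSON line on stderr (only seen where stderr is surfaced)."""
    print(json.dumps(record), file=sys.stderr)


def log_to_jsonl(record: dict, path: Path, max_bytes: int = 5_000_000) -> None:
    """Option 2: append one JSON line, rotating the file once it reaches max_bytes."""
    path.parent.mkdir(parents=True, exist_ok=True)
    if path.exists() and path.stat().st_size >= max_bytes:
        # Keep one previous generation; os.replace overwrites any older one.
        os.replace(path, path.with_name(path.name + ".1"))
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
```

If each record carried a `tool` field (an assumed name), the structured file could then be filtered with something like `jq 'select(.tool == "ctx_execute")' context-mode-calls.jsonl`.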

1 reply

roycewilliams Mar 3, 2026
Author

Naively, it feels like it could be a sidecar to the session itself, such that it would be opened at session start, recorded during the session, and closed out at session end (so naturally bounded). And the user could just have it as an overall setting, whether or not they want such files to be created. (And independent from whether or not debugging is enabled.)
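A rough sketch of that sidecar idea, assuming the session lifetime can be wrapped in a single scope (which may not match how the server and hooks are actually structured). The `session_sidecar` helper and the `CONTEXT_MODE_LOG_RAW` variable are hypothetical names, not existing settings.

```python
import json
import os
import time
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def session_sidecar(session_id: str, log_dir: Path, enabled: bool):
    """Yield a record(**fields) callable; it is a no-op when logging is disabled."""
    if not enabled:
        yield lambda **fields: None
        return
    log_dir.mkdir(parents=True, exist_ok=True)
    # Opened at session start, closed when the with-block (the session) ends.
    with (log_dir / f"{session_id}.jsonl").open("a", encoding="utf-8") as fh:
        def record(**fields):
            fh.write(json.dumps({"ts": time.time(), **fields}) + "\n")
            fh.flush()  # keep the file useful even if the session dies midway
        yield record


# Hypothetical opt-in switch, independent of any debug flag.
ENABLED = os.environ.get("CONTEXT_MODE_LOG_RAW") == "1"
```

Because the file lives and dies with one session, its size is bounded by the session itself, and nothing is written at all unless the user opts in.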

roycewilliams · 2026-03-07T17:18:25Z

@rjkaes, what are your thoughts on the per-session model? As an option, it would sidestep any scale issues, and would be just lightweight enough to allow end users to overlay their own process/tooling.

2 replies

rjkaes Mar 8, 2026

With Claude Code (at least), we can use SessionStart to parse the session_id and use that to key the logs (~/.claude/context-mode-logs/<session_id>.jsonl).

(We could also use SessionStart to prune old files so we don't fill up the disk.)
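As a sketch of those two SessionStart duties, assuming the hook receives a JSON payload that carries a `session_id` field. The helper names and the 14-day retention default are made up for illustration.

```python
import json
import time
from pathlib import Path


def session_log_path(hook_payload: str, log_dir: Path) -> Path:
    """Derive <log_dir>/<session_id>.jsonl from a JSON hook payload."""
    session_id = json.loads(hook_payload)["session_id"]
    # Refuse ids that could escape log_dir.
    if (not session_id or "/" in session_id or "\\" in session_id
            or session_id.startswith(".")):
        raise ValueError(f"unsafe session_id: {session_id!r}")
    return log_dir / f"{session_id}.jsonl"


def prune_old_logs(log_dir: Path, max_age_days: float = 14) -> list[Path]:
    """Delete *.jsonl files not modified within max_age_days; return those removed."""
    cutoff = time.time() - max_age_days * 86400
    removed = []
    for path in log_dir.glob("*.jsonl"):
        if path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed
```

Pruning by modification time rather than creation time means a long-running session that is still being written to would not be deleted out from under itself.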

Then each tool call would write out something like:

timestamp
tool name
input
raw output
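One way those four fields might look as a JSONL record. The field names and the ISO 8601 UTC timestamp format are assumptions; nothing here is settled.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Any


@dataclass
class ToolCallRecord:
    timestamp: str   # ISO 8601, UTC
    tool: str        # e.g. "ctx_execute"
    input: Any       # arguments the tool was called with
    raw_output: Any  # the unsuppressed output


def to_jsonl_line(tool: str, tool_input: Any, raw_output: Any) -> str:
    """Serialize one tool call as a single newline-terminated JSON line."""
    record = ToolCallRecord(
        timestamp=datetime.now(timezone.utc).isoformat(),
        tool=tool,
        input=tool_input,
        raw_output=raw_output,
    )
    return json.dumps(asdict(record), ensure_ascii=False) + "\n"
```

Newlines inside the raw output are escaped by the JSON encoder, so each call still occupies exactly one line of the file.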

One issue is sub-agents. From what I can find, they are invoked in a "separate process" that has a different session_id (and maybe no session_id at all!). If a sub-agent has its own session_id, we could at least use that as the key; otherwise, we'd have nothing. If we're fine with sub-agents having their own logs, and with leaving it up to the user to seek out what they're looking for, then this isn't a deal-breaker.

(Thinking out loud, we likely need to enforce some kind of file locking since multiple tools can be called in parallel. If we do use locking, then maybe the lack of session id in subagents is fine since we could write to a "shared" JSONL file.)
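A minimal sketch of the locking idea, using an advisory `flock` around each append. This is POSIX-only (the `fcntl` module is not available on Windows), and `append_locked` is a hypothetical helper name.

```python
import fcntl  # POSIX only
import json
from pathlib import Path


def append_locked(path: Path, record: dict) -> None:
    """Append one JSON line while holding an exclusive advisory lock on the file."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fcntl.flock(fh, fcntl.LOCK_EX)  # blocks until other writers release
        try:
            fh.write(json.dumps(record) + "\n")
            fh.flush()  # push the line out before unlocking
        finally:
            fcntl.flock(fh, fcntl.LOCK_UN)
```

The lock is advisory, so it only protects against writers that also take it. With a shared file, a record written without a session_id would still land intact, just without a key to group it by.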

We'd also need to decide on the shape of the logs for each tool (ctx_execute → stdout/stderr, ctx_search → FTS5 results, ctx_fetch_and_index → fetched content) without enforcing a single schema.
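One possible way to square per-tool shapes with a common file format is a thin envelope around a free-form payload. The sketch below assumes that approach; the payload field names are illustrative guesses, not the tools' actual output formats.

```python
import json


# Hypothetical per-tool payload builders; field names are illustrative only.
def execute_payload(stdout: str, stderr: str) -> dict:
    return {"stdout": stdout, "stderr": stderr}


def search_payload(results: list[dict]) -> dict:
    return {"results": results}


def fetch_payload(url: str, content: str) -> dict:
    return {"url": url, "content": content}


def envelope(tool: str, payload: dict) -> str:
    """Wrap a tool-specific payload in a small common envelope."""
    return json.dumps({"tool": tool, "payload": payload})
```

A consumer can then branch on `tool` and interpret `payload` accordingly, so adding a new tool does not force a change to existing records.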

I'm also not sure how this would work with the other coding agents we support.

But the bones are here. 😄

roycewilliams Mar 8, 2026
Author

Just for background / correlation, I recently found out about this project (I'm unaffiliated), which (in the conceptual sense, anyway) would be an example of a consumer of the preserved context we're talking about.

https://github.com/jacob-dietle/tastematter
