Conversation

MkDev11 (Contributor) commented Jan 23, 2026

What this does

This adds performance metrics so you can see how long queries take and how many tokens they use. It's especially helpful for understanding performance when running local models with Ollama.

The problem

Right now when you ask Dexter a question, you have no idea:

  • How long it took to think
  • How many tokens were used
  • How fast the model is responding

This makes it hard to optimize your setup, especially with local models.

The solution

After each answer, you'll now see a line like this:

✻ 2s · 1,297 tokens (718.6 tok/s)

This shows:

  • 2s - Total time from question to answer
  • 1,297 tokens - How many tokens were used (input + output)
  • 718.6 tok/s - Throughput (tokens per second)
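
One note on how the two numbers relate (inferred from the example output, not stated in the PR): the tok/s figure appears to be computed from the unrounded elapsed time, while the duration shown is rounded to whole seconds; otherwise 1,297 tokens over 2s would read as roughly 648 tok/s.

// Inferred relationship between the displayed numbers (an assumption, not from the diff)
const totalTokens = 1_297;
const elapsedSeconds = 1.805;                           // displayed rounded as "2s"
const tokensPerSecond = totalTokens / elapsedSeconds;   // ≈ 718.6 tok/s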

Why this matters

  • Local model users can see if their setup is fast enough
  • Cost tracking - Know how many tokens you're burning through
  • Performance tuning - Spot slow queries and optimize
  • Model comparison - Compare different models side-by-side

Technical details

  • Tracks timing from start to finish
  • Accumulates tokens across all LLM calls (including tool summaries)
  • Extracts usage from LangChain responses (works with OpenAI, Anthropic, etc.)
  • Only shows stats when token data is available from the provider
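
A rough sketch of the shapes this implies, pieced together from the commit notes below and the review discussion (field and event names may differ from the actual code):

// Token usage accumulated across all LLM calls for one query (sketch)
interface TokenUsage {
  inputTokens: number;
  outputTokens: number;
  totalTokens: number;
}

// DoneEvent extended with the new performance fields (sketch; existing fields omitted)
interface DoneEvent {
  totalTime: number;          // elapsed milliseconds from question to answer
  tokenUsage?: TokenUsage;    // only set when the provider reports usage
  tokensPerSecond?: number;   // derived throughput
}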

Testing

Tested with a real query and got accurate metrics:

❯ What is 2 + 2?

⏺  4

✻ 2s · 1,297 tokens (718.6 tok/s)

Closes #72

- Add TokenUsage interface and extend DoneEvent with totalTime, tokenUsage, tokensPerSecond
- Modify callLlm to return LlmResult with usage metadata extracted from LangChain
- Track start time and accumulate token usage across all LLM calls in agent
- Display performance stats in UI after completion (duration, token count, tok/s)
- Update all callLlm call sites to handle new LlmResult return type

Closes virattt#72
MkDev11 (Contributor) commented Jan 23, 2026

@virattt Please have a look at the implementation and let me know your feedback. Thanks.

virattt (Owner) left a comment


This is a great idea and thank you for proposing the change.

Can we make the progress view identical to what Claude Code does? The seconds and tokens are shown while CC is working. This would be a great enhancement for Dexter, as well.

We currently only show the seconds taken while Dexter is in the "Answering" state, but it would be nice to have this for the "Thinking" state as well

Addresses PR review feedback to abstract token counting logic out of agent.ts into a dedicated class.
- Extract token counting into dedicated TokenCounter class (per review)
- Add real-time elapsed timer during processing state (like Claude Code)
- Show progress indicator while agent is thinking/working
MkDev11 (Contributor) commented Jan 27, 2026

This is a great idea and thank you for proposing the change.

Can we make the progress view identical to what Claude Code does? The seconds and tokens are shown while CC is working. This would be a great enhancement for Dexter, as well.

We currently only show the seconds taken while Dexter is in the "Answering" state, but it would be nice to have this for the "Thinking" state as well

Great! Added a real-time elapsed timer that shows during the Thinking/Tool states - updates every 100ms so users can see progress as it happens. Also extracted token counting into a dedicated TokenCounter class per your other feedback. Let me know if you'd like any adjustments to the display style.
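
For reference, a minimal sketch of what such a TokenCounter could look like (hypothetical; the class in the PR may be shaped differently):

// Accumulates usage reported by individual LLM calls (sketch; reuses the TokenUsage shape above)
class TokenCounter {
  private inputTokens = 0;
  private outputTokens = 0;

  add(usage?: TokenUsage): void {
    if (!usage) return;                  // some providers report no usage at all
    this.inputTokens += usage.inputTokens;
    this.outputTokens += usage.outputTokens;
  }

  total(): TokenUsage | undefined {
    const totalTokens = this.inputTokens + this.outputTokens;
    if (totalTokens === 0) return undefined;   // nothing meaningful to display
    return { inputTokens: this.inputTokens, outputTokens: this.outputTokens, totalTokens };
  }
}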

MkDev11 requested a review from virattt January 27, 2026 15:48
Resolve conflict in agent.ts: keep skill deduplication AND tokenCounter
- Keep TokenCounter for performance metrics
- Add ToolLimitEvent and limit checking from upstream
- Use improved thinking check (skip whitespace-only)
MkDev11 (Contributor) commented Feb 5, 2026

@virattt Sorry for tagging you; could you please review the changes once more?

…rch.ts

- Keep TokenCounter for performance metrics tracking
- Integrate upstream's progress channel and context management
- Preserve tokenUsage/tokensPerSecond in done events
MkDev11 (Contributor) commented Feb 5, 2026

@virattt I wanted to follow up on my previous message.

virattt (Owner) left a comment


Nice work on this overall!

A few things I noticed:

Bug: financial-metrics.ts and read-filings.ts will break

The callLlm return type changed to { response, usage } but these two files still do await callLlm(...) as AIMessage. That cast won't catch the issue at compile time, but at runtime .tool_calls will be undefined since you're reading it off the wrapper object instead of the actual AIMessage. Both tools will silently fail every time.

Should be a quick fix — just destructure like you already did in financial-search.ts:

const { response } = await callLlm(input.query, { ... });
const aiMessage = response as AIMessage;

tokenCounter in executeToolCalls seems unused

It gets passed in but those methods just invoke tools — they never call callLlm directly. Also worth noting that any LLM calls happening inside tools (like financial-search has its own callLlm call) won't get counted, so the totals will be lower than actual usage. Might be worth a comment or just removing the parameter for now.

Dep bumps

The @langchain/exa jump from 0.1 to 1.0 is a major version bump — any reason to include it here? Might be cleaner as a separate PR so if something breaks it's easy to bisect.

Minor UX things

The old code hid the duration line for fast queries (< 15s). Now it always shows, which means even a quick "what is 2+2" gets a stats line. Also formatDuration rounds to whole seconds, so anything sub-second shows as "0s" which looks a bit odd. Maybe only show it when there's actual token data from the provider, or add a small threshold?
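
For the sub-second case, one possible shape for that tweak (a sketch only, assuming the duration is tracked in milliseconds; the project's actual formatDuration may differ):

// Sketch: fall back to milliseconds below one second instead of rounding to "0s"
function formatDuration(durationMs: number): string {
  if (durationMs < 1000) return `${Math.round(durationMs)}ms`;   // e.g. "500ms"
  return `${Math.round(durationMs / 1000)}s`;                    // e.g. "2s"
}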

Type casts

There are quite a few response as AIMessage casts — since the type is AIMessage | string, a typeof check would be safer than assuming it's always an AIMessage. Not a big deal but worth cleaning up.
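
A small sketch of the safer pattern (illustrative only; assumes the same imports and callLlm signature as the surrounding files):

const { response } = await callLlm(input.query, { /* ... */ });
if (typeof response === 'string') {
  // plain text answer; there are no tool calls to read
} else {
  // here response is an AIMessage, so tool_calls is safe to access
  const toolCalls = response.tool_calls ?? [];
}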


Overall this is a great addition, just needs the two broken files fixed before merging. Looking forward to seeing this land!

virattt (Owner) left a comment


Added some nits

virattt (Owner) commented Feb 5, 2026

Additionally, the overall counter is right below the user query, which is weird:

❯ Walk me through PLTR's earnings 
  ⏺  6s

Let me pull up Palantir's latest earnings data and recent news.

⏺ Financial Metric("Palantir PLTR income statement last 8 quarters and 
                  annual...")
  ⎿  ⠴ Searching...

⠏ Pondering... (esc to interrupt)

Can we do exactly what Claude Code does instead?

Critical:
- financial-metrics.ts: destructure callLlm to get AIMessage
- read-filings.ts: destructure callLlm in both step1/step2

Medium:
- agent.ts: remove unused tokenCounter from executeToolCalls/executeToolCall

Low/UI:
- HistoryItemView.tsx: show ms for sub-second durations
- HistoryItemView.tsx: remove real-time timer below query (not Claude Code style)
MkDev11 (Contributor) commented Feb 5, 2026

Great @virattt - removed it. Now just shows final stats after the answer. Also fixed the callLlm destructuring in both files and cleaned up the unused tokenCounter param. Can you please review the changes again?

MkDev11 requested a review from virattt February 5, 2026 21:39
virattt added the run-ci (Runs CI) label Feb 6, 2026
MkDev11 (Contributor) commented Feb 6, 2026

@virattt I'm not sure why you're ignoring me; I noticed you've merged other PRs but not mine. Please let me know the reason.

MkDev11 (Contributor) commented Feb 6, 2026

https://gittensor.io/miners/details?githubId=94194147
I know a few other people have submitted PRs on this repo. You can review my Gittensor profile at the link above, and theirs as well.

virattt (Owner) commented Feb 6, 2026

Thanks for the updates. A few more optional suggestions, mostly around type safety and keeping things tidy.

src/model/llm.ts -- type LlmResult.response more narrowly

response: unknown forces as AIMessage casts at every call site (seven of them in agent.ts alone). Since callLlm already knows the shape, we can tighten the return type.

export interface LlmResult {
  response: AIMessage | string;
  usage?: TokenUsage;
}

Then at the bottom of the function:

  if (!outputSchema && !tools && result && typeof result === 'object' && 'content' in result) {
    return { response: (result as { content: string }).content, usage };
  }
  return { response: result as AIMessage, usage };

Why better: the single as AIMessage lives in one place (where we actually know the type), and every downstream consumer gets proper types without casting.

src/model/llm.ts -- safer extractUsage

The current implementation casts nested objects without checking their runtime types, which could silently produce NaN if a provider returns an unexpected shape.

function extractUsage(result: unknown): TokenUsage | undefined {
  if (!result || typeof result !== 'object') return undefined;
  const msg = result as Record<string, unknown>;

  const usageMetadata = msg.usage_metadata;
  if (usageMetadata && typeof usageMetadata === 'object') {
    const u = usageMetadata as Record<string, unknown>;
    const input = typeof u.input_tokens === 'number' ? u.input_tokens : 0;
    const output = typeof u.output_tokens === 'number' ? u.output_tokens : 0;
    const total = typeof u.total_tokens === 'number' ? u.total_tokens : input + output;
    return { inputTokens: input, outputTokens: output, totalTokens: total };
  }

  const responseMetadata = msg.response_metadata;
  if (responseMetadata && typeof responseMetadata === 'object') {
    const rm = responseMetadata as Record<string, unknown>;
    if (rm.usage && typeof rm.usage === 'object') {
      const u = rm.usage as Record<string, unknown>;
      const input = typeof u.prompt_tokens === 'number' ? u.prompt_tokens : 0;
      const output = typeof u.completion_tokens === 'number' ? u.completion_tokens : 0;
      const total = typeof u.total_tokens === 'number' ? u.total_tokens : input + output;
      return { inputTokens: input, outputTokens: output, totalTokens: total };
    }
  }

  return undefined;
}

Why better: guards against NaN propagation when a provider omits a field or returns a string instead of a number. Defensive parsing here saves confusing UI output downstream.

src/components/HistoryItemView.tsx -- consider gating the stats line

Nit. Since duration is now always set via doneEvent.totalTime, the condition item.duration !== undefined || item.tokenUsage is true for every completed query. For a quick "What is 2+2?" with no token data, showing "500ms" on its own is a bit noisy. One option:

{item.status === 'complete' && item.tokenUsage && (
  <Box marginTop={1}>
    <Text color={colors.muted}>
      {'✻ '}
      {item.duration !== undefined && formatDuration(item.duration)}
      {item.duration !== undefined && ' · '}
      {`${item.tokenUsage.totalTokens.toLocaleString()} tokens`}
      {item.tokensPerSecond !== undefined && ` (${item.tokensPerSecond.toFixed(1)} tok/s)`}
    </Text>
  </Box>
)}

Why better: the stats line only appears when there is meaningful token data to show, which is the interesting part. Duration alone is less useful without the token context.

src/components/HistoryItemView.tsx -- optional: extract a helper for the stats string

Nit. The inline conditionals for building the stats text are a little dense. A small helper keeps the JSX focused on layout.

function formatPerformanceStats(
  duration?: number,
  tokenUsage?: TokenUsage,
  tokensPerSecond?: number
): string {
  const parts: string[] = [];
  if (duration !== undefined) parts.push(formatDuration(duration));
  if (tokenUsage) parts.push(`${tokenUsage.totalTokens.toLocaleString()} tokens`);
  if (tokensPerSecond !== undefined) parts.push(`(${tokensPerSecond.toFixed(1)} tok/s)`);
  return parts.join(' · ');
}

Then in the JSX:

<Text color={colors.muted}>{formatPerformanceStats(item.duration, item.tokenUsage, item.tokensPerSecond)}</Text>

Why better: easier to read and test independently. Pure formatting logic stays out of the component tree.

Summary

  • LlmResult.response type: use AIMessage | string instead of unknown. Impact: eliminates 7 as AIMessage casts.
  • extractUsage: add runtime type checks on nested fields. Impact: prevents silent NaN from unexpected shapes.
  • Stats display gate: only show the line when tokenUsage is present. Impact: avoids a noisy "500ms" line on trivial queries.
  • Stats helper: extract formatPerformanceStats(). Impact: readability and testability.

- Keep TokenUsage tracking for performance metrics
- Add Anthropic cache_control for ~90% input token savings
- Type LlmResult.response as AIMessage | string (eliminates 7 casts)
- Add runtime type checks in extractUsage (prevents NaN from unexpected shapes)
- Gate stats display on tokenUsage presence (avoids noisy duration-only line)
- Extract formatPerformanceStats helper for readability
MkDev11 (Contributor) commented Feb 6, 2026

@virattt Thanks for the suggestions! I've applied all of them.

virattt (Owner) commented Feb 6, 2026

Thanks!

virattt merged commit b3142ae into virattt:main Feb 6, 2026
2 checks passed
MkDev11 (Contributor) commented Feb 6, 2026

Thanks!

Appreciate it!
