feat: agent mode tool-calling via Vercel AI SDK #451
Open
gabrielste1n wants to merge 15 commits into main
Transform Agent Mode from text-only chat into a full agentic experience with native tool/function calling. The agent can now search notes, copy to clipboard, search the web (via cloud API), and check calendar events.

- Tool registry with OpenAI-compatible function calling format
- ReAct execution loop with parallel read-only tool execution
- SSE streaming with incremental tool call argument accumulation
- Inline tool execution UI (compact pills with status animations)
- Text input field alongside voice input for tool-heavy workflows
- Dynamic system prompt with tool usage instructions
- IPC handler for web search via OpenWhispr cloud API
- Database migration for tool message metadata
- i18n strings for all 10 supported locales
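The "incremental tool call argument accumulation" item above can be illustrated with a small sketch. It assumes the OpenAI-compatible streaming shape, where each delta carries an `index` plus optional `id`, `function.name`, and a fragment of `function.arguments`. The type and function names (`ToolCallDelta`, `accumulateToolCall`, `finalizeToolCalls`) are hypothetical and are not taken from this PR.

```typescript
// One streamed tool-call fragment in the OpenAI-compatible delta format.
interface ToolCallDelta {
  index: number;
  id?: string;
  function?: { name?: string; arguments?: string };
}

// Accumulator entry; argsText may be incomplete JSON until the stream ends.
interface PendingToolCall {
  id: string;
  name: string;
  argsText: string;
}

interface FinalToolCall {
  id: string;
  name: string;
  args: Record<string, unknown>;
}

// Merge one delta into the accumulator slot given by the delta's index.
function accumulateToolCall(pending: PendingToolCall[], delta: ToolCallDelta): void {
  const entry = pending[delta.index] ?? { id: "", name: "", argsText: "" };
  if (delta.id) entry.id = delta.id;
  if (delta.function?.name) entry.name = delta.function.name;
  if (delta.function?.arguments) entry.argsText += delta.function.arguments;
  pending[delta.index] = entry;
}

// Parse arguments only after the stream has finished; partial JSON would throw.
function finalizeToolCalls(pending: PendingToolCall[]): FinalToolCall[] {
  return pending
    .filter((call) => call !== undefined)
    .map((call) => ({
      id: call.id,
      name: call.name,
      args: JSON.parse(call.argsText || "{}") as Record<string, unknown>,
    }));
}
```

Deferring `JSON.parse` until the stream ends avoids having to handle half-received argument strings in the UI layer.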
Replace manual SSE parsing and AgentLoop with AI SDK streamText + stepCountIs for tool-calling agent mode. Add unified provider factory supporting OpenAI, Groq, Anthropic, Gemini, and custom endpoints.

- Add ai, @ai-sdk/openai, @ai-sdk/groq, @ai-sdk/anthropic, @ai-sdk/google
- Add src/services/ai/providers.ts with getAIModel factory
- Add ToolRegistry.toAISDKFormat() using jsonSchema wrapper
- Add ReasoningService.processTextStreamingAI() with full tool support
- Remove AgentLoop.ts (replaced by stepCountIs)
- Remove dead toOpenAIFormat/OpenAIFunctionTool from ToolRegistry
- Simplify AgentOverlay to 3 streaming paths (tools/cloud/BYOK)
…isableThinking

Uses AI SDK's providerOptions API to send reasoning_effort: "none" to Groq for models flagged with disableThinking in the model registry.
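As a rough sketch of the commit above: the `{ groq: { reasoningEffort: "none" } }` object is the shape quoted in this PR's description, while the `ModelEntry` type and the `groqProviderOptions` helper are hypothetical names used only for illustration.

```typescript
// Hypothetical registry entry; only the disableThinking flag comes from the PR text.
interface ModelEntry {
  id: string;
  disableThinking?: boolean;
}

type GroqProviderOptions = { groq: { reasoningEffort: "none" } };

// Return provider options only for models flagged with disableThinking,
// so other models keep the provider's default reasoning behavior.
function groqProviderOptions(model: ModelEntry): GroqProviderOptions | undefined {
  return model.disableThinking ? { groq: { reasoningEffort: "none" } } : undefined;
}
```

The result would presumably be passed as the `providerOptions` argument of the streaming call, and left out when the helper returns `undefined`.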
- Add mounted ref to guard state updates after overlay unmount
- Add AudioManager.cleanup() call on unmount
- Handle tool-result stream chunks to show actual results in UI
- Set tool status to "executing" until result arrives (fix state thrashing)
- Add try/catch in ToolRegistry.toAISDKFormat() execute wrapper
- Extend AgentStreamChunk type with tool_result variant
- Add success field to IPC web-search response for contract consistency
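The try/catch execute wrapper mentioned above might look roughly like the following. This is a sketch of the general pattern (catch the failure and return it as data so the model can react to it), not the code in ToolRegistry.ts; `ToolDefinition` and `withErrorCapture` are hypothetical names.

```typescript
// Hypothetical minimal tool shape for illustration.
interface ToolDefinition {
  name: string;
  execute: (args: Record<string, unknown>) => Promise<unknown>;
}

// Wrap a tool's execute function so that failures become an { error } result
// instead of a thrown exception that would abort the whole agent run.
function withErrorCapture(
  tool: ToolDefinition
): (args: Record<string, unknown>) => Promise<unknown> {
  return async (args) => {
    try {
      return await tool.execute(args);
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      return { error: message };
    }
  };
}
```

Returning the error as a normal result lets the model see what went wrong and retry or explain, which a thrown exception would not allow.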
- Add get_note tool to fetch full note content by ID
- Add create_note tool with folder resolution and cloud sync
- Add update_note tool for title, content, and folder changes
- Add shared resolveFolderId utility for folder name lookup
- Include note ID in search_notes results for cross-tool reference
- Register tools and add system prompt instructions
- Add translation keys for all 10 locales
- Enable tool calling for cloud agent mode (clipboard, notes, web search, calendar)
- Implement NDJSON streaming via IPC batch approach for reliable event delivery
- Add multi-step tool-calling loop with AI SDK v6 message format
- Fix mountedRef StrictMode bug causing empty renders
- Return actual tool result data to LLM instead of just display text
- Extract MAX_TOOL_STEPS constant, remove redundant comments
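For readers unfamiliar with NDJSON streaming, the core difficulty is that a network chunk can end in the middle of a JSON line. A minimal buffering parser, under the assumption that events are newline-delimited JSON objects, could look like this. `NdjsonBuffer` is a hypothetical name, and the event field names used with it are not taken from the PR.

```typescript
// Buffers incoming text and emits one parsed object per complete line.
class NdjsonBuffer {
  private partial = "";

  // Feed a raw chunk; returns every event completed by this chunk.
  push(chunk: string): unknown[] {
    this.partial += chunk;
    const lines = this.partial.split("\n");
    // The last element is either "" (chunk ended on a newline) or an
    // incomplete line that must wait for the next chunk.
    this.partial = lines.pop() ?? "";
    const events: unknown[] = [];
    for (const line of lines) {
      const trimmed = line.trim();
      if (trimmed.length > 0) events.push(JSON.parse(trimmed));
    }
    return events;
  }
}
```

A caller would create one buffer per response and pass each received chunk to `push`, handling the returned events in order.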
When cloud backup is enabled and user is signed in, the search_notes agent tool now uses the cloud hybrid search (pgvector + FTS) instead of local SQLite FTS5 keyword search. Falls back to local search transparently on cloud failure.
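The cloud-first lookup with local fallback described above can be sketched as follows. The search functions are injected, so nothing here depends on the real cloud API or SQLite layer; `NoteHit`, `SearchFn`, and `searchNotesWithFallback` are hypothetical names, and `useCloud` stands in for the "cloud backup enabled and user signed in" check.

```typescript
// Hypothetical result shape; the PR only states that results include the note ID.
interface NoteHit {
  id: number;
  title: string;
}

type SearchFn = (query: string) => Promise<NoteHit[]>;

interface SearchOptions {
  useCloud: boolean; // stands in for "cloud backup enabled and user signed in"
  cloudSearch: SearchFn;
  localSearch: SearchFn;
}

// Prefer cloud search when available; on any cloud failure, fall back to
// local search without surfacing the error to the caller.
async function searchNotesWithFallback(query: string, opts: SearchOptions): Promise<NoteHit[]> {
  if (opts.useCloud) {
    try {
      return await opts.cloudSearch(query);
    } catch {
      // Intentionally swallowed: the local search below is the fallback.
    }
  }
  return opts.localSearch(query);
}
```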
Replace the buffered IPC middleman with direct fetch from the renderer to the API. Text now streams token-by-token instead of arriving all at once after the full response completes. Adds AbortController support for instant cancellation on unmount or user stop.
The renderer's fetch() can't access auth cookies stored on the Neon Auth domain. Add a get-session-cookies IPC handler to retrieve them from the main process cookie jar and forward as an explicit Cookie header.
Browser fetch() forbids setting the Cookie header, so direct renderer-to-API streaming can't authenticate. Switch to event-based IPC: main process reads the API stream and forwards each chunk via webContents.send() as it arrives. This matches the pattern used for AssemblyAI/Deepgram streaming.
Replace basic tool pills with step-based visualization showing the full tool lifecycle: shimmer accent while executing, checkmark pop on completion, expandable detail, and contextual clipboard confirmation.

- ToolCallStep: left-border accent, per-tool icons, shimmer animation
- Clipboard: inline "Copied to your clipboard" with green check
- Input bar: thinking shimmer bar, tool icon in executing state
- Empty state: mic icon with dual-line CTA
- Title bar: shows agent name, softer shadow
- Tighter spacing throughout (gaps, heights, bubble widths)
- New CSS: tool-step-shimmer, tool-check-pop, tool-status-sweep
- i18n: copiedToClipboard + orType in all 10 locales
…ults
The executeToolCall callback was returning raw data (full JSON) which was
displayed in the tool step UI. Now returns { data, displayText } so the
LLM gets structured data while the UI shows a human-readable summary.
Also truncates Exa web search article text to 500 chars per result to
prevent massive payloads.
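A minimal sketch of the `{ data, displayText }` split and the 500-character truncation described above. The `WebResult` shape, the helper names, and the wording of the summary string are assumptions for illustration; only the `{ data, displayText }` pair and the 500-character limit come from the commit message.

```typescript
// Limit taken from the commit message; the constant name is hypothetical.
const MAX_ARTICLE_CHARS = 500;

// Hypothetical shape of one web search result.
interface WebResult {
  title: string;
  url: string;
  text: string;
}

// What a tool call hands back: structured data for the LLM,
// and a short human-readable summary for the tool step UI.
interface ToolCallOutcome {
  data: unknown;
  displayText: string;
}

// Cap each article body so a handful of results cannot flood the context.
function truncateResults(results: WebResult[]): WebResult[] {
  return results.map((r) => ({ ...r, text: r.text.slice(0, MAX_ARTICLE_CHARS) }));
}

function toWebSearchOutcome(results: WebResult[]): ToolCallOutcome {
  const data = truncateResults(results);
  return { data, displayText: `Found ${data.length} result(s)` };
}
```

Keeping the two fields separate means the UI never has to render raw JSON, and the model never has to parse a prose summary.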
- Increase MAX_TOOL_STEPS from 5 to 20 to prevent agent from getting stuck on multi-step workflows
- Add metadata field to ToolCallInfo so tool results can carry structured data (e.g. note ID) to the UI without mixing it into displayText
- Created/updated/fetched notes show as clickable steps with a primary accent; clicking opens the note in the control panel
- New agent-open-note IPC handler navigates the control panel to the note
- Note cards: when create/update/get_note completes, render a compact clickable card at the bottom of the message bubble with title, icon, and "Open note" label. Clicking opens the note in the control panel.
- Input bar: replace generic loading dots with dictation-panel-inspired indicators: pulsing blue circle for listening, accent wave bars for transcribing, shimmer sweep for thinking.
- Fix duplicate copiedToClipboard keys in all locale files.
…mode

The agent-open-note IPC was reusing navigate-to-meeting-note, which activates meeting recording mode. Add a dedicated navigate-to-note event that only sets the active note and view without starting the recorder.
Summary
- streamText() with stepCountIs() for multi-step tool calling

Changes

New files:
- src/services/ai/providers.ts: getAIModel() factory wrapping all AI SDK providers
- src/services/tools/: ToolRegistry, searchNotesTool, webSearchTool, calendarTool, clipboardTool

Modified:
- src/services/ReasoningService.ts: Add processTextStreamingAI() with tool support, AgentStreamChunk type with content/tool_calls/tool_result/done variants
- src/components/AgentOverlay.tsx: 3 streaming paths (tools/cloud/BYOK), mounted ref guard, AudioManager cleanup, proper tool-result display
- src/services/tools/ToolRegistry.ts: toAISDKFormat() with error handling via try/catch

Deleted:
- src/services/AgentLoop.ts: Replaced by AI SDK stepCountIs()

Details:
- Groq models flagged with disableThinking pass providerOptions: { groq: { reasoningEffort: "none" } }
- Tool execution errors return { error } to AI SDK instead of throwing
- mountedRef to prevent state updates on unmounted component
- New dependencies: ai, @ai-sdk/openai, @ai-sdk/groq, @ai-sdk/anthropic, @ai-sdk/google

Test plan