Summary
Umbrella issue tracking the evaluation of FunctionGemma for the tool-calling layer only; reasoning stays on OpenAI.
Intent
Move tool dispatch off the API outbound path:
Current:
tool_model → gpt-4o-mini (API call, ~$0.15/1M tokens)
prompt_model → o4-mini (API call, reasoning)
Target:
tool_model → FunctionGemma (LOCAL, $0, ~10ms)
prompt_model → o4-mini (API call, reasoning - unchanged)
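A minimal sketch of the current/target split as a config toggle. `ModelConfig`, `build_models`, and the `use_local_tools` flag are hypothetical names for illustration, not the existing code in `base_agent.py`; the `functiongemma` model name is taken from the Ollama library URL under Reference.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelConfig:
    """Hypothetical container: which model to use and where it runs."""
    name: str
    backend: str  # "openai" (API call) or "ollama" (local)


def build_models(use_local_tools: bool) -> dict[str, ModelConfig]:
    """Return the dual-model setup; only tool_model changes with the flag."""
    if use_local_tools:
        tool_model = ModelConfig(name="functiongemma", backend="ollama")
    else:
        tool_model = ModelConfig(name="gpt-4o-mini", backend="openai")
    # Reasoning stays on o4-mini in both the current and target setups.
    prompt_model = ModelConfig(name="o4-mini", backend="openai")
    return {"tool_model": tool_model, "prompt_model": prompt_model}
```

Keeping the switch in one place would let the benchmarks run both backends against the same inputs.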
Motivation
- Cost: Eliminate API costs for ~70% of LLM calls (tool dispatch)
- Latency: Local inference ~10ms vs ~500ms cloud
- Reliability: Tool calling works during API outages
- Privacy: Command patterns stay local
What This IS
- Replacing structured function dispatch (input → function call)
- Pattern: "buy AAPL" → place_order(symbol="AAPL", side="buy")
- Leveraging existing dual-model architecture in base_agent.py:46-52
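To illustrate the "buy AAPL" pattern, here is a rough sketch of the dispatch step that runs after the tool model has emitted a call. It assumes the call arrives as JSON with `name` and `arguments` keys; that shape, the `TOOLS` registry, and the stub `place_order` are assumptions for illustration, and the real output format depends on how FunctionGemma is served.

```python
import json
from typing import Any, Callable


def place_order(symbol: str, side: str) -> dict[str, str]:
    """Stub for illustration; a real version would call the broker."""
    return {"symbol": symbol, "side": side, "status": "accepted"}


# Hypothetical registry mapping tool names to Python callables.
TOOLS: dict[str, Callable[..., Any]] = {"place_order": place_order}
ALLOWED_SIDES = {"buy", "sell"}


def dispatch(raw_call: str) -> Any:
    """Validate a model-produced tool call and run the matching function."""
    call = json.loads(raw_call)
    name = call["name"]
    args = call.get("arguments", {})
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    if name == "place_order" and args.get("side") not in ALLOWED_SIDES:
        raise ValueError(f"invalid side: {args.get('side')!r}")
    return TOOLS[name](**args)


# Assumed model output for the input "buy AAPL".
result = dispatch('{"name": "place_order", "arguments": {"symbol": "AAPL", "side": "buy"}}')
```

Validating the name and arguments before calling anything matters more with a small local model, since a malformed call should fail loudly rather than reach execution.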
What This IS NOT
- Replacing reasoning/analysis (o4-mini stays)
- Making trading decisions smarter
- Full offline capability (execution still needs Alpaca)
Reference
- Blog: https://blog.google/technology/developers/functiongemma/
- Ollama: https://ollama.com/library/functiongemma
- AutoGen Ollama: https://microsoft.github.io/autogen/stable/reference/python/autogen_ext.models.ollama.html
Sub-Issues
- #533 spike: install Ollama + FunctionGemma local inference stack
- #534 feat: add LLM backend abstraction layer (OpenAI/Ollama toggle), tool_model only
- #535 test: benchmark IntentClassifier with FunctionGemma vs gpt-4o-mini
- #536 test: benchmark AutoGenLLMParser with FunctionGemma vs gpt-4o-mini
Decision Criteria
After benchmarks:
- Proceed: ≥95% accuracy on tool dispatch → integrate as tool_model
- Hybrid: Use FunctionGemma for simple patterns, fall back to gpt-4o-mini for complex ones
- Defer: Accuracy insufficient → revisit when model improves
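One possible way to turn benchmark results into these three outcomes is sketched below. The 95% threshold comes from the criteria above; the rule for Hybrid (at least one pattern category clearing the threshold on its own) is an assumption, since this issue does not define it, and the category labels are hypothetical.

```python
from collections import defaultdict
from typing import Iterable

PROCEED_THRESHOLD = 0.95  # from the Proceed criterion


def accuracy_by_category(results: Iterable[tuple[str, bool]]) -> dict[str, float]:
    """Compute per-category accuracy from (category, correct) pairs."""
    totals: dict[str, int] = defaultdict(int)
    hits: dict[str, int] = defaultdict(int)
    for category, correct in results:
        totals[category] += 1
        hits[category] += int(correct)
    return {c: hits[c] / totals[c] for c in totals}


def decide(results: list[tuple[str, bool]]) -> str:
    """Map benchmark results to 'proceed', 'hybrid', or 'defer'."""
    if not results:
        raise ValueError("no benchmark results")
    overall = sum(ok for _, ok in results) / len(results)
    if overall >= PROCEED_THRESHOLD:
        return "proceed"
    # Assumed Hybrid rule: some category is accurate enough to route locally.
    per_category = accuracy_by_category(results)
    if any(acc >= PROCEED_THRESHOLD for acc in per_category.values()):
        return "hybrid"
    return "defer"
```

For example, a run where every "simple" case passes but half of the "complex" cases fail would come out as "hybrid" under this rule.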