research: FunctionGemma local inference evaluation #537

@iAmGiG

Description

Summary

Umbrella issue tracking the evaluation of FunctionGemma for the tool-calling layer only; reasoning stays on OpenAI.

Intent

Move tool dispatch off the outbound API path:

Current:
  tool_model    → gpt-4o-mini  (API call, ~$0.15/1M tokens)
  prompt_model  → o4-mini      (API call, reasoning)

Target:
  tool_model    → FunctionGemma (LOCAL, $0, ~10ms)
  prompt_model  → o4-mini       (API call, reasoning - unchanged)
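Under the dual-model split, the change amounts to a config swap for the tool role only. A minimal sketch of the before/after routing, using a hypothetical `ModelConfig` shape (the actual config schema in base_agent.py isn't shown here):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelConfig:
    """Hypothetical per-role model configuration."""
    provider: str  # "openai" (API call) or "local" (on-device inference)
    name: str

# Current: both roles go out over the API.
current = {
    "tool_model": ModelConfig(provider="openai", name="gpt-4o-mini"),
    "prompt_model": ModelConfig(provider="openai", name="o4-mini"),
}

# Target: tool dispatch runs locally; the reasoning model is unchanged.
target = {
    "tool_model": ModelConfig(provider="local", name="functiongemma"),
    "prompt_model": ModelConfig(provider="openai", name="o4-mini"),
}
```

The point of the sketch: only `tool_model` changes provider, so the reasoning path is untouched by the migration.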

Motivation

  • Cost: Eliminate API costs for ~70% of LLM calls (tool dispatch)
  • Latency: Local inference ~10ms vs ~500ms cloud
  • Reliability: Tool calling works during API outages
  • Privacy: Command patterns stay local

What This IS

  • Replacing structured function dispatch (input → function call)
  • Pattern: "buy AAPL" → place_order(symbol="AAPL", side="buy")
  • Leveraging existing dual-model architecture in base_agent.py:46-52
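The dispatch task above is purely structural: map a natural-language command to a function name plus typed arguments. A rough sketch of the contract the local model would have to satisfy; `place_order` comes from the example, while `parse_tool_call` and its regex body are illustrative stand-ins (the real model would emit the JSON directly):

```python
import re

def parse_tool_call(user_input: str) -> dict:
    """Illustrative stand-in for FunctionGemma's structured output.

    Handles only the 'buy AAPL' / 'sell MSFT' pattern, just to show
    the expected output shape: {"name": ..., "arguments": {...}}.
    """
    m = re.match(r"(?i)^(buy|sell)\s+([A-Z]{1,5})$", user_input.strip())
    if not m:
        raise ValueError(f"no tool match for: {user_input!r}")
    side, symbol = m.group(1).lower(), m.group(2).upper()
    return {"name": "place_order", "arguments": {"symbol": symbol, "side": side}}
```

Benchmarking then reduces to checking how often the model's emitted call matches this structure exactly (function name and every argument) against a labeled set of commands.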

What This IS NOT

  • Replacing reasoning/analysis (o4-mini stays)
  • Making trading decisions smarter
  • Full offline capability (execution still needs Alpaca)

Reference

Sub-Issues

Decision Criteria

After benchmarks:

  1. Proceed: ≥95% accuracy on tool dispatch → integrate as tool_model
  2. Hybrid: Use for simple patterns, fallback to gpt-4o-mini for complex
  3. Defer: Accuracy insufficient → revisit when model improves
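Option 2 (hybrid) amounts to a confidence-gated router: try the local model first, fall back to gpt-4o-mini when it abstains or scores low. A sketch under assumed interfaces; `local_dispatch`, `cloud_dispatch`, and the per-call confidence score are hypothetical (the threshold here mirrors the 95% accuracy bar, though a deployed cutoff would be tuned on the benchmark):

```python
from typing import Callable, Optional, Tuple

CONFIDENCE_THRESHOLD = 0.95  # assumed cutoff, to be tuned against benchmarks

def route_tool_call(
    user_input: str,
    local_dispatch: Callable[[str], Tuple[Optional[dict], float]],
    cloud_dispatch: Callable[[str], dict],
) -> dict:
    """Try local FunctionGemma first; fall back to the API on low confidence."""
    call, confidence = local_dispatch(user_input)
    if call is not None and confidence >= CONFIDENCE_THRESHOLD:
        return call
    return cloud_dispatch(user_input)  # gpt-4o-mini handles the hard cases
```

This keeps the common, simple patterns free and fast while preserving gpt-4o-mini's accuracy on complex inputs.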

Metadata

Assignees

Labels

enhancement (New feature or request)

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests