@OscarArroyoVega (Member)
feat: Add local LLM support with Ollama and rebuild query for production schema

Added Ollama as a local LLM provider option alongside OpenAI, Anthropic, and
Gemini, enabling development without API costs and improving response times
for local deployments.

Rebuilt the query builder to match the current production database schema with
complex joins across events, venues, and artists tables. The new query uses CTEs
to unnest artist relationships and aggregates artist data as both comma-separated
strings and JSON objects for flexible frontend rendering.
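The shape of such a query can be sketched as follows. This is an illustrative sketch only: table and column names (`events`, `venues`, `artists`, `artist_ids`, etc.) are assumptions, not the actual production schema.

```python
# Hypothetical sketch of the rebuilt query: a CTE unnests the
# event->artist relationships, then the outer query aggregates artists
# both as a comma-separated string and as a JSON array per event.
# All identifiers are illustrative, not the real production schema.
EVENTS_QUERY = """
WITH event_artists AS (
    SELECT e.id AS event_id, unnest(e.artist_ids) AS artist_id
    FROM events e
)
SELECT
    e.id,
    e.name,
    v.name AS venue_name,
    string_agg(a.name, ', ') AS artist_names,
    json_agg(json_build_object('id', a.id, 'name', a.name)) AS artists_json
FROM events e
LEFT JOIN venues v  ON v.id = e.venue_id
LEFT JOIN event_artists ea ON ea.event_id = e.id
LEFT JOIN artists a ON a.id = ea.artist_id
GROUP BY e.id, e.name, v.name;
"""
```

Returning both `artist_names` and `artists_json` lets the frontend pick whichever representation is cheaper to render.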

Why these changes:

  • Local LLM support reduces API costs during development and testing
  • The previous query structure didn't match the production schema, which splits data across venues and artists tables
  • Rich event data was needed, including venue details and multiple artist relationships per event
  • Performance monitoring was insufficient to identify bottlenecks

Implementation details:

  • Added Ollama provider with timeout configuration in llm_factory()
  • Built complex SQL with CTE for artist unnesting and LEFT JOINs
  • Implemented comprehensive timing logs across query building, DB execution, and LLM calls
  • Modified API schema to accept date lists instead of date_range objects

Known limitations:

  • Date filtering generates N OR conditions (one per date in range) - inefficient for large ranges
  • Query and schema should be reviewed by data engineer for optimization opportunities
  • Current implementation blocks on date list generation in frontend
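To make the first limitation concrete, here is a sketch of what the per-date filter expansion looks like (the `e.start_date` column name is illustrative, and real code should use bound parameters rather than string interpolation):

```python
# Illustration of the current date-list filter: every date in the list
# becomes its own OR condition, so a 214-day range produces 214 clauses.
# Column name is illustrative; real code should use bound parameters.
from datetime import date, timedelta

def date_filter(dates: list[date]) -> str:
    conditions = [f"e.start_date = '{d.isoformat()}'" for d in dates]
    return "(" + " OR ".join(conditions) + ")"

dates = [date(2025, 1, 1) + timedelta(days=i) for i in range(214)]
clause = date_filter(dates)  # 214 OR'd equality conditions
```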

Performance metrics (214-day range, 2 events):

  • Query building: 18ms
  • DB execution: 137ms
  • LLM call: 1031ms (Gemini)
  • Total: 1.2s backend, plus a ~60s cold start on the first request (needs attention!)
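The timing instrumentation behind metrics like these can be as simple as a context manager around each stage. Names and the log format below are assumptions, not the actual implementation:

```python
# Minimal timing helper of the kind used for the stage metrics above.
# Label names and log format are illustrative assumptions.
import time
from contextlib import contextmanager

@contextmanager
def timed(label: str, sink: list[str]):
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000
        sink.append(f"{label}: {elapsed_ms:.0f}ms")

logs: list[str] = []
with timed("query_build", logs):
    sum(range(100_000))  # stand-in for real query-building work
```

Wrapping query building, DB execution, and the LLM call each in `timed(...)` yields exactly the per-stage breakdown reported above.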

Next steps for improvement:

  1. Add LLM pre-processing layer to extract filter fields from natural language
  2. Implement conversational context/memory for multi-turn chat
  3. Replace date list with simple date range in SQL (start_date/end_date)
  4. Add voice-to-text input capability
  5. Implement NLP sentiment analysis on responses
  6. Create feedback collection system for negative interactions (RLHF dataset)
  7. Add streaming responses for better UX during LLM processing
  8. Optimize frontend loading with spinners and timeout handling
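Next-step item 3 can be sketched directly: replacing the date list with a single bounded range keeps the WHERE clause constant-size regardless of how many days the range covers (column name and placeholder style are illustrative):

```python
# Sketch of next-step item 3: a single BETWEEN range instead of N OR
# conditions. Column name and %s placeholder style are illustrative.
from datetime import date

def date_range_filter(start: date, end: date) -> tuple[str, tuple]:
    # Bound parameters avoid SQL injection and keep the clause O(1)
    # in the number of days covered.
    return "e.start_date BETWEEN %s AND %s", (start, end)

clause, params = date_range_filter(date(2025, 1, 1), date(2025, 8, 2))
```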

Refs: #17

Added local LLM support via Ollama and rebuilt query builder to match
production DB schema with events/venues/artists joins. Includes timing
instrumentation for performance monitoring.

Known issue: Date filtering uses N OR conditions - needs optimization.
Schema and query require data engineer review.

Next: Add filter extraction LLM, conversational memory, voice input,
sentiment analysis, and RLHF feedback collection system.
@OscarArroyoVega OscarArroyoVega self-assigned this Nov 7, 2025
@OscarArroyoVega OscarArroyoVega merged commit ec4026b into main Nov 7, 2025
2 checks passed
@OscarArroyoVega OscarArroyoVega deleted the feature/retriever-llamaindex-Sql branch November 7, 2025 20:41