⚠️ This repository is archived and kept for historical purposes. Active development continues in ARES: https://github.com/gabeparra/ares
Meeting caption listener + summarizer with Glup personality and pluggable LLM backends.
Glup is an AI meeting assistant that listens to meeting captions (or generates fake transcript segments for testing) and produces rolling summaries in Glup's calculated, analytical persona. It supports OpenAI/ChatGPT, Grok, Gemini, and local models via Ollama, with full GPU acceleration for the RTX 4090 and other NVIDIA GPUs.
```
┌─────────────┐
│   Capture   │──┐
│  (Whisper/  │  │
│   Browser)  │  │
└─────────────┘  │
                 ▼
            ┌─────────┐
            │   Bus   │──┐
            │ (Queue) │  │
            └─────────┘  │
                         │
            ┌─────────┐  │
            │ Storage │◄─┘
            │(SQLite) │
            └─────────┘
                 │
                 ▼
          ┌─────────────┐
          │ Summarizer  │──┐
          │   (Loop)    │  │
          └─────────────┘  │
                           │
          ┌─────────────┐  │
          │ LLM Router  │◄─┘
          └─────────────┘
                 │
        ┌────────┼────────┐
        ▼        ▼        ▼
     OpenAI   Gemini   Ollama
      Grok
```
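The flow above (capture → bus → storage → summarizer window) can be sketched in a few lines of Python. This is an illustrative sketch only, not the project's actual API: the queue, table schema, and function names here are invented for the example.

```python
import queue
import sqlite3

# Hypothetical pipeline sketch: Capture pushes transcript segments
# onto a Bus (queue); a consumer persists them to Storage (SQLite),
# where the Summarizer later reads a rolling window.

bus: "queue.Queue[str]" = queue.Queue()

def capture_fake(lines: list[str]) -> None:
    """Stand-in for the capture stage: enqueue transcript segments."""
    for line in lines:
        bus.put(line)

def drain_to_storage(db: sqlite3.Connection) -> int:
    """Move everything on the bus into SQLite; return the count stored."""
    db.execute("CREATE TABLE IF NOT EXISTS segments (text TEXT)")
    n = 0
    while not bus.empty():
        db.execute("INSERT INTO segments (text) VALUES (?)", (bus.get(),))
        n += 1
    db.commit()
    return n

def rolling_window(db: sqlite3.Connection, last_n: int = 2) -> list[str]:
    """The slice of transcript the summarizer would hand to the LLM router."""
    rows = db.execute(
        "SELECT text FROM segments ORDER BY rowid DESC LIMIT ?", (last_n,)
    ).fetchall()
    return [r[0] for r in reversed(rows)]

if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    capture_fake(["Alice: status update", "Bob: blockers resolved"])
    print(drain_to_storage(db))  # 2
    print(rolling_window(db, last_n=2))
```

The real components are async and persistent, but the shape is the same: a producer/consumer queue in front of an append-only segment store.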
- Python 3.11+
- pip (Python package manager)
- NVIDIA GPU (RTX 4090 recommended) with drivers installed
For RTX 4090 systems, see INSTALL_4090.md for optimized GPU setup.
- Clone the repository:

  ```bash
  git clone https://github.com/gabeparra/AiListener.git
  cd AiListener
  ```

- Create and activate a virtual environment:

  ```bash
  python3 -m venv .venv
  source .venv/bin/activate  # On Windows: .venv\Scripts\activate
  ```

- Install dependencies and the package:

  ```bash
  pip install -r requirements.txt
  pip install -e .  # Install the package in editable mode
  ```

Create a `.env` file in the project root:
```bash
# LLM Provider (openai, grok, gemini, local)
LLM_PROVIDER=local

# OpenAI (if using OpenAI)
OPENAI_API_KEY=sk-...

# Grok (if using Grok)
GROK_API_KEY=xai-...

# Gemini (if using Gemini)
GEMINI_API_KEY=...

# Local Ollama (if using local)
# For RTX 4090, recommended models: llama3.1:70b, llama3:70b, mistral-nemo:12b
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=llama3.1:70b

# Storage (optional, defaults to ~/.caption_ai/segments.db)
STORAGE_PATH=~/.caption_ai/segments.db
```

Glup now includes a React frontend with interactive chat!
Development mode (with hot reload):

- Install Node.js dependencies:

  ```bash
  npm install
  ```

- Start the React dev server (terminal 1):

  ```bash
  npm run dev
  ```

- Start the Glup backend (terminal 2):

  ```bash
  source .venv/bin/activate
  python -m caption_ai --web --port 8000
  ```

- Open http://localhost:3000 in your browser

Production mode:

- Build the React app:

  ```bash
  npm run build
  ```

- Start Glup:

  ```bash
  python -m caption_ai --web
  ```

The React web UI provides:
- Interactive Chat: Talk directly with Glup using the chat panel
- Real-time conversation segments display
- Live Glup analysis updates
- WebSocket-based streaming for instant updates
- Dark theme with Glup's distinctive styling
- Hot Module Replacement (HMR) for instant development updates
Run without the web UI:

```bash
python -m caption_ai
```

This will:
- Generate fake meeting transcript segments
- Store them in SQLite
- Produce rolling summaries every 15 seconds using the configured LLM
- Display output in the terminal
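The rolling-summary behavior can be pictured with a small loop. This is a hedged sketch, not the actual `summarizer.py`: the callback names and signature are made up for illustration, and the real loop is async.

```python
import time
from typing import Callable, List, Optional

def rolling_summarizer(
    get_segments: Callable[[], List[str]],
    summarize: Callable[[str], str],
    interval_s: float = 15.0,
    max_ticks: Optional[int] = None,
) -> List[str]:
    """Every `interval_s` seconds, join the latest transcript segments
    and pass them to the configured LLM. `max_ticks` bounds the loop
    so the sketch is testable; the real loop runs until shutdown."""
    summaries: List[str] = []
    ticks = 0
    while max_ticks is None or ticks < max_ticks:
        transcript = "\n".join(get_segments())
        if transcript:
            summaries.append(summarize(transcript))
        ticks += 1
        if max_ticks is None or ticks < max_ticks:
            time.sleep(interval_s)
    return summaries

if __name__ == "__main__":
    fake_segments = lambda: ["Alice: shipped v2", "Bob: demo Friday"]
    fake_llm = lambda text: f"summary of {len(text.splitlines())} lines"
    print(rolling_summarizer(fake_segments, fake_llm, interval_s=0, max_ticks=2))
```

In the real application, `summarize` would be a call through the LLM router to whichever provider `LLM_PROVIDER` selects.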
- Install Ollama from https://ollama.ai

- Start Ollama manually (it's disabled from auto-start):

  ```bash
  # Use the control script
  ./scripts/control_ollama.sh start

  # Or manually
  ollama serve &
  ```

- Pull a model:

  ```bash
  ollama pull llama3.2:3b
  ```

- Set `LLM_PROVIDER=local` in `.env` and run:

  ```bash
  python -m caption_ai --web
  ```

Important: Ollama does NOT auto-start to prevent memory issues. Use `./scripts/control_ollama.sh stop` to terminate it when done.
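For reference, a minimal non-streaming call against Ollama's REST endpoint (`POST /api/generate`) looks roughly like this. The helper names are ours, not the project's, and the second function assumes an Ollama server is already running on the default port:

```python
import json
import urllib.request

def build_generate_request(prompt: str, model: str,
                           base_url: str = "http://localhost:11434"):
    """Build the URL and JSON payload for Ollama's /api/generate endpoint."""
    payload = json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for one JSON object, not a token stream
    }).encode()
    return f"{base_url}/api/generate", payload

def ollama_generate(prompt: str, model: str = "llama3.2:3b") -> str:
    """Send the request and return the generated text (needs a live server)."""
    url, payload = build_generate_request(prompt, model)
    req = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

With `"stream": False`, Ollama returns a single JSON object whose `response` field holds the full completion; omit it and you get newline-delimited JSON chunks instead.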
```bash
# Install with dev dependencies
pip install -r requirements-dev.txt

# Run linting
make lint

# Run tests
make test

# Run the application
make run
```

```
caption-ai/
├── src/
│   └── caption_ai/
│       ├── __init__.py
│       ├── config.py          # Configuration management
│       ├── bus.py             # Segment queue
│       ├── storage.py         # SQLite storage
│       ├── prompts.py         # Prompt templates
│       ├── summarizer.py      # Rolling summarizer loop
│       ├── main.py            # CLI entrypoint
│       ├── capture/           # Audio capture (future)
│       └── llm/               # LLM clients
│           ├── base.py        # LLM interface
│           ├── router.py      # Provider router
│           ├── openai_api.py
│           ├── gemini_api.py
│           ├── grok_api.py
│           └── local_ollama.py
├── tests/                     # Test suite
├── scripts/                   # Utility scripts
├── pyproject.toml             # Project metadata
├── Makefile                   # Development commands
└── README.md
```
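The split between `llm/base.py` (LLM interface) and `llm/router.py` (provider router) suggests a small interface-plus-registry pattern. Here is a hedged sketch of that shape; the class and function names are illustrative, not the repository's actual code:

```python
from abc import ABC, abstractmethod
from typing import Dict

class LLMClient(ABC):
    """Minimal provider interface, as base.py presumably defines one."""
    @abstractmethod
    def complete(self, prompt: str) -> str: ...

class EchoClient(LLMClient):
    """Stand-in provider so the sketch is runnable without an API key."""
    def complete(self, prompt: str) -> str:
        return f"[echo] {prompt}"

def route(provider: str, registry: Dict[str, LLMClient]) -> LLMClient:
    """Pick a client by name, mirroring the LLM_PROVIDER setting."""
    try:
        return registry[provider]
    except KeyError:
        raise ValueError(f"Unknown LLM_PROVIDER: {provider!r}") from None

if __name__ == "__main__":
    registry = {"local": EchoClient()}
    client = route("local", registry)
    print(client.complete("Summarize the last two minutes."))
```

In the real package, the registry would map `openai`, `grok`, `gemini`, and `local` to their respective client modules.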
- Whisper Audio Capture: Real-time audio transcription using faster-whisper
- Teams UI Captions: Extract captions from Microsoft Teams UI
- Browser Automation: Use Playwright to capture captions from web meetings
- Full LLM Implementations: Complete OpenAI, Gemini, and Grok API integrations
- Speaker Diarization: Identify and label speakers
- Export Formats: Export summaries to Markdown, PDF, etc.
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch
- Make your changes with proper type hints and async patterns
- Run `make lint` and `make test`
- Submit a pull request
```bash
# Clone and setup
git clone https://github.com/gabeparra/AiListener.git
cd AiListener
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements-dev.txt

# Run tests
pytest

# Format and lint
ruff check .
ruff format .
```

MIT License - see LICENSE file for details.