6 changes: 1 addition & 5 deletions .env.template
@@ -90,16 +90,12 @@ CHAT_TEMPERATURE=0.7
# SPEECH-TO-TEXT CONFIGURATION
# ========================================

# Primary transcription provider: deepgram, mistral, or parakeet
# Primary transcription provider: deepgram or parakeet
TRANSCRIPTION_PROVIDER=deepgram

# Deepgram configuration
DEEPGRAM_API_KEY=your-deepgram-key-here

# Mistral configuration (when TRANSCRIPTION_PROVIDER=mistral)
MISTRAL_API_KEY=your-mistral-key-here
MISTRAL_MODEL=voxtral-mini-2507

# Parakeet ASR configuration (when TRANSCRIPTION_PROVIDER=parakeet)
PARAKEET_ASR_URL=http://host.docker.internal:8767

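After this change, only `deepgram` and `parakeet` remain valid values for `TRANSCRIPTION_PROVIDER`. A minimal sketch of how a consumer of this template might validate the setting — the function name and error message are illustrative assumptions, not the project's actual code:

```python
import os

VALID_PROVIDERS = {"deepgram", "parakeet"}

def resolve_transcription_provider(default: str = "deepgram") -> str:
    """Read TRANSCRIPTION_PROVIDER and reject values removed from the template."""
    provider = os.environ.get("TRANSCRIPTION_PROVIDER", default).strip().lower()
    if provider not in VALID_PROVIDERS:
        # "mistral" now lands here, since Voxtral support was removed
        raise ValueError(
            f"Unsupported TRANSCRIPTION_PROVIDER {provider!r}; "
            f"expected one of {sorted(VALID_PROVIDERS)}"
        )
    return provider
```

With this in place, a stale `TRANSCRIPTION_PROVIDER=mistral` in an old `.env` fails fast at startup instead of silently misrouting audio.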
15 changes: 5 additions & 10 deletions CLAUDE.md
@@ -18,7 +18,7 @@ This supports a comprehensive web dashboard for management.
Chronicle includes an **interactive setup wizard** for easy configuration. The wizard guides you through:
- Service selection (backend + optional services)
- Authentication setup (admin account, JWT secrets)
- Transcription provider configuration (Deepgram, Mistral, or offline ASR)
- Transcription provider configuration (Deepgram or offline ASR)
- LLM provider setup (OpenAI or Ollama)
- Memory provider selection (Chronicle Native with Qdrant or OpenMemory MCP)
- Network configuration and HTTPS setup
@@ -184,12 +184,12 @@ docker compose up --build
## Architecture Overview

### Key Components
- **Audio Pipeline**: Real-time Opus/PCM → Application-level processing → Deepgram/Mistral transcription → memory extraction
- **Audio Pipeline**: Real-time Opus/PCM → Application-level processing → Deepgram transcription → memory extraction
- **Wyoming Protocol**: WebSocket communication uses Wyoming protocol (JSONL + binary) for structured audio sessions
- **Unified Pipeline**: Job-based tracking system for all audio processing (WebSocket and file uploads)
- **Job Tracker**: Tracks pipeline jobs with stage events (audio → transcription → memory) and completion status
- **Task Management**: BackgroundTaskManager tracks all async tasks to prevent orphaned processes
- **Unified Transcription**: Deepgram/Mistral transcription with fallback to offline ASR services
- **Unified Transcription**: Deepgram transcription with fallback to offline ASR services
- **Memory System**: Pluggable providers (Chronicle native or OpenMemory MCP)
- **Authentication**: Email-based login with MongoDB ObjectId user system
- **Client Management**: Auto-generated client IDs as `{user_id_suffix}-{device_name}`, centralized ClientManager
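The client-ID convention described above can be illustrated with a small sketch — the helper name and the six-character suffix length are assumptions for illustration, not the project's actual implementation:

```python
def make_client_id(user_id: str, device_name: str, suffix_len: int = 6) -> str:
    """Build a client ID as {user_id_suffix}-{device_name}, per the convention above."""
    # Assumed: the suffix is the trailing characters of the MongoDB ObjectId string
    suffix = user_id[-suffix_len:]
    return f"{suffix}-{device_name}"
```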
@@ -205,7 +205,7 @@ Required:

Recommended:
- Vector Storage: Qdrant (Chronicle provider) or OpenMemory MCP server
- Transcription: Deepgram, Mistral, or offline ASR services
- Transcription: Deepgram or offline ASR services

Optional:
- Parakeet ASR: Offline transcription service
@@ -329,12 +329,7 @@ Chronicle supports multiple transcription services:
TRANSCRIPTION_PROVIDER=deepgram
DEEPGRAM_API_KEY=your-deepgram-key-here

# Option 2: Mistral (Voxtral models)
TRANSCRIPTION_PROVIDER=mistral
MISTRAL_API_KEY=your-mistral-key-here
MISTRAL_MODEL=voxtral-mini-2507

# Option 3: Local ASR (Parakeet)
# Option 2: Local ASR (Parakeet)
PARAKEET_ASR_URL=http://host.docker.internal:8767
```

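The "Deepgram transcription with fallback to offline ASR" behavior described in this file could be selected along these lines — a hedged sketch under the assumption that presence of `DEEPGRAM_API_KEY` gates the cloud path, not the project's actual routing code:

```python
import os

def pick_transcription_backend() -> str:
    """Prefer Deepgram when an API key is configured; otherwise fall back to offline ASR."""
    if os.environ.get("DEEPGRAM_API_KEY"):
        return "deepgram"
    if os.environ.get("PARAKEET_ASR_URL"):
        return "parakeet"
    raise RuntimeError("No transcription backend configured")
```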
21 changes: 6 additions & 15 deletions Docs/getting-started.md
@@ -36,7 +36,7 @@ cd backends/advanced

**The setup wizard will guide you through:**
- **Authentication**: Admin email/password setup
- **Transcription Provider**: Choose Deepgram, Mistral, or Offline (Parakeet)
- **Transcription Provider**: Choose Deepgram or Offline (Parakeet)
- **LLM Provider**: Choose OpenAI or Ollama for memory extraction
- **Memory Provider**: Choose Chronicle Native or OpenMemory MCP
- **Optional Services**: Speaker Recognition and other extras
@@ -52,14 +52,13 @@ cd backends/advanced
Admin email [admin@example.com]: john@company.com
Admin password (min 8 chars): ********

► Speech-to-Text Configuration
► Speech-to-Text Configuration
-------------------------------
Choose your transcription provider:
1) Deepgram (recommended - high quality, requires API key)
2) Mistral (Voxtral models - requires API key)
3) Offline (Parakeet ASR - requires GPU, runs locally)
4) None (skip transcription setup)
Enter choice (1-4) [1]: 1
2) Offline (Parakeet ASR - requires GPU, runs locally)
3) None (skip transcription setup)
Enter choice (1-3) [1]: 1

Get your API key from: https://console.deepgram.com/
Deepgram API key: dg_xxxxxxxxxxxxx
@@ -154,20 +153,14 @@ OLLAMA_BASE_URL=http://ollama:11434
TRANSCRIPTION_PROVIDER=deepgram
DEEPGRAM_API_KEY=your-deepgram-api-key-here

# Option 2: Mistral (Voxtral models for transcription)
TRANSCRIPTION_PROVIDER=mistral
MISTRAL_API_KEY=your-mistral-api-key-here
MISTRAL_MODEL=voxtral-mini-2507

# Option 3: Local ASR service
# Option 2: Local ASR service
PARAKEET_ASR_URL=http://host.docker.internal:8080
```

**Important Notes:**
- **OpenAI is strongly recommended** for LLM processing as it provides much better memory extraction and eliminates JSON parsing errors
- **TRANSCRIPTION_PROVIDER** determines which service to use:
- `deepgram`: Uses Deepgram's Nova-3 model for high-quality transcription
- `mistral`: Uses Mistral's Voxtral models for transcription
- If not set, system falls back to offline ASR service
- The system requires either online API keys or offline ASR service configuration

@@ -312,7 +305,6 @@ curl -X POST "http://localhost:8000/api/process-audio-files" \

### Transcription Options
- **Deepgram API**: Cloud-based batch processing, high accuracy (recommended)
- **Mistral API**: Voxtral models for transcription with REST API processing
- **Self-hosted ASR**: Local Wyoming protocol services with real-time processing
- **Collection timeout**: 1.5 minute collection for optimal online processing quality

@@ -407,7 +399,6 @@ uv sync --group (whatever group you want to sync)

**Transcription Issues:**
- **Deepgram**: Verify API key is valid and `TRANSCRIPTION_PROVIDER=deepgram`
- **Mistral**: Verify API key is valid and `TRANSCRIPTION_PROVIDER=mistral`
- **Self-hosted**: Ensure ASR service is running on port 8765
- Check transcription service connection in health endpoint

2 changes: 1 addition & 1 deletion backends/advanced/Docs/memory-configuration-guide.md
@@ -65,7 +65,7 @@ memory:
- **Embeddings**: `text-embedding-3-small`, `text-embedding-3-large`

#### Ollama Models (Local)
- **LLM**: `llama3`, `mistral`, `qwen2.5`
- **LLM**: `llama3`, `qwen2.5`
- **Embeddings**: `nomic-embed-text`, `all-minilm`

## Hot Reload
21 changes: 6 additions & 15 deletions backends/advanced/Docs/quickstart.md
@@ -34,7 +34,7 @@ cd backends/advanced

**The setup wizard will guide you through:**
- **Authentication**: Admin email/password setup
- **Transcription Provider**: Choose Deepgram, Mistral, or Offline (Parakeet)
- **Transcription Provider**: Choose Deepgram or Offline (Parakeet)
- **LLM Provider**: Choose OpenAI or Ollama for memory extraction
- **Memory Provider**: Choose Chronicle Native or OpenMemory MCP
- **Optional Services**: Speaker Recognition and other extras
@@ -50,14 +50,13 @@ cd backends/advanced
Admin email [admin@example.com]: john@company.com
Admin password (min 8 chars): ********

► Speech-to-Text Configuration
► Speech-to-Text Configuration
-------------------------------
Choose your transcription provider:
1) Deepgram (recommended - high quality, requires API key)
2) Mistral (Voxtral models - requires API key)
3) Offline (Parakeet ASR - requires GPU, runs locally)
4) None (skip transcription setup)
Enter choice (1-4) [1]: 1
2) Offline (Parakeet ASR - requires GPU, runs locally)
3) None (skip transcription setup)
Enter choice (1-3) [1]: 1

Get your API key from: https://console.deepgram.com/
Deepgram API key: dg_xxxxxxxxxxxxx
@@ -152,20 +151,14 @@ OLLAMA_BASE_URL=http://ollama:11434
TRANSCRIPTION_PROVIDER=deepgram
DEEPGRAM_API_KEY=your-deepgram-api-key-here

# Option 2: Mistral (Voxtral models for transcription)
TRANSCRIPTION_PROVIDER=mistral
MISTRAL_API_KEY=your-mistral-api-key-here
MISTRAL_MODEL=voxtral-mini-2507

# Option 3: Local ASR service
# Option 2: Local ASR service
PARAKEET_ASR_URL=http://host.docker.internal:8080
```

**Important Notes:**
- **OpenAI is strongly recommended** for LLM processing as it provides much better memory extraction and eliminates JSON parsing errors
- **TRANSCRIPTION_PROVIDER** determines which service to use:
- `deepgram`: Uses Deepgram's Nova-3 model for high-quality transcription
- `mistral`: Uses Mistral's Voxtral models for transcription
- If not set, system falls back to offline ASR service
- The system requires either online API keys or offline ASR service configuration

@@ -310,7 +303,6 @@ curl -X POST "http://localhost:8000/api/audio/upload" \

### Transcription Options
- **Deepgram API**: Cloud-based batch processing, high accuracy (recommended)
- **Mistral API**: Voxtral models for transcription with REST API processing
- **Self-hosted ASR**: Local Wyoming protocol services with real-time processing
- **Collection timeout**: 1.5 minute collection for optimal online processing quality

@@ -405,7 +397,6 @@ uv sync --group (whatever group you want to sync)

**Transcription Issues:**
- **Deepgram**: Verify API key is valid and `TRANSCRIPTION_PROVIDER=deepgram`
- **Mistral**: Verify API key is valid and `TRANSCRIPTION_PROVIDER=mistral`
- **Self-hosted**: Ensure ASR service is running on port 8765
- Check transcription service connection in health endpoint

2 changes: 1 addition & 1 deletion backends/advanced/README.md
@@ -31,7 +31,7 @@ Modern React-based web dashboard located in `./webui/` with:

**The setup wizard guides you through:**
- **Authentication**: Admin email/password setup with secure keys
- **Transcription Provider**: Choose between Deepgram, Mistral, or Offline (Parakeet)
- **Transcription Provider**: Choose between Deepgram or Offline (Parakeet)
- **LLM Provider**: Choose between OpenAI (recommended) or Ollama for memory extraction
- **Memory Provider**: Choose between Friend-Lite Native or OpenMemory MCP
- **Optional Services**: Speaker Recognition, network configuration
9 changes: 4 additions & 5 deletions backends/advanced/SETUP_SCRIPTS.md
@@ -15,7 +15,7 @@ This document explains the different setup scripts available in Friend-Lite and

### What it does:
- ✅ **Authentication Setup**: Admin email/password with secure key generation
- ✅ **Transcription Provider Selection**: Choose between Deepgram, Mistral, or Offline (Parakeet)
- ✅ **Transcription Provider Selection**: Choose between Deepgram or Offline (Parakeet)
- ✅ **LLM Provider Configuration**: Choose between OpenAI (recommended) or Ollama
- ✅ **Memory Provider Setup**: Choose between Friend-Lite Native or OpenMemory MCP
- ✅ **API Key Collection**: Prompts for required keys with helpful links to obtain them
Expand Down Expand Up @@ -43,10 +43,9 @@ Admin password (min 8 chars): ********
-------------------------------
Choose your transcription provider:
1) Deepgram (recommended - high quality, requires API key)
2) Mistral (Voxtral models - requires API key)
3) Offline (Parakeet ASR - requires GPU, runs locally)
4) None (skip transcription setup)
Enter choice (1-4) [1]: 1
2) Offline (Parakeet ASR - requires GPU, runs locally)
3) None (skip transcription setup)
Enter choice (1-3) [1]: 1

Get your API key from: https://console.deepgram.com/
Deepgram API key: dg_xxxxxxxxxxxxx
@@ -21,7 +21,6 @@ class Conversation(Document):
class TranscriptProvider(str, Enum):
"""Supported transcription providers."""
DEEPGRAM = "deepgram"
MISTRAL = "mistral"
PARAKEET = "parakeet"
SPEECH_DETECTION = "speech_detection" # Legacy value
UNKNOWN = "unknown" # Fallback value
@@ -63,7 +62,7 @@ class TranscriptVersion(BaseModel):
transcript: Optional[str] = Field(None, description="Full transcript text")
segments: List["Conversation.SpeakerSegment"] = Field(default_factory=list, description="Speaker segments")
provider: Optional["Conversation.TranscriptProvider"] = Field(None, description="Transcription provider used")
model: Optional[str] = Field(None, description="Model used (e.g., nova-3, voxtral-mini-2507)")
model: Optional[str] = Field(None, description="Model used (e.g., nova-3, parakeet)")
created_at: datetime = Field(description="When this version was created")
processing_time_seconds: Optional[float] = Field(None, description="Time taken to process")
metadata: Dict[str, Any] = Field(default_factory=dict, description="Additional provider-specific metadata")
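The `TranscriptProvider` enum above subclasses `str`, so members compare equal to their raw values — which is what the assertions in `test_conversation_models.py` rely on. A minimal standalone reproduction of the enum after the `MISTRAL` member was dropped:

```python
from enum import Enum

class TranscriptProvider(str, Enum):
    """Mirror of the model's enum, minus the removed Mistral member."""
    DEEPGRAM = "deepgram"
    PARAKEET = "parakeet"
    SPEECH_DETECTION = "speech_detection"  # Legacy value
    UNKNOWN = "unknown"  # Fallback value

# Because the class mixes in str, plain string comparison works:
#   TranscriptProvider.DEEPGRAM == "deepgram"
```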
@@ -52,7 +52,7 @@ async def init_session(
user_id: User identifier
client_id: Client identifier
mode: Processing mode (streaming/batch)
provider: Transcription provider ("deepgram", "mistral", etc.)
provider: Transcription provider ("deepgram", "parakeet", etc.)
"""
# Client-specific stream naming (one stream per client for isolation)
stream_name = f"audio:stream:{client_id}"
@@ -36,7 +36,6 @@ class TranscriptionProvider(Enum):
"""Available transcription providers for audio stream routing."""
DEEPGRAM = "deepgram"
PARAKEET = "parakeet"
MISTRAL = "mistral"


class BaseTranscriptionProvider(abc.ABC):
5 changes: 2 additions & 3 deletions backends/advanced/tests/test_conversation_models.py
@@ -134,7 +134,7 @@ def test_add_transcript_version(self):
version_id="v2",
transcript="Updated transcript",
segments=segments,
provider=TranscriptProvider.MISTRAL,
provider=TranscriptProvider.PARAKEET,
set_as_active=False
)

@@ -170,7 +170,7 @@ def test_set_active_versions(self):
segments2 = [SpeakerSegment(start=0.0, end=5.0, text="Version 2", speaker="Speaker A")]

conversation.add_transcript_version("v1", "Transcript 1", segments1, TranscriptProvider.DEEPGRAM)
conversation.add_transcript_version("v2", "Transcript 2", segments2, TranscriptProvider.MISTRAL, set_as_active=False)
conversation.add_transcript_version("v2", "Transcript 2", segments2, TranscriptProvider.PARAKEET, set_as_active=False)

# Should be v1 active
assert conversation.active_transcript_version == "v1"
@@ -213,7 +213,6 @@ def test_provider_enums(self):
"""Test that provider enums work correctly."""
# Test TranscriptProvider enum
assert TranscriptProvider.DEEPGRAM == "deepgram"
assert TranscriptProvider.MISTRAL == "mistral"
assert TranscriptProvider.PARAKEET == "parakeet"

# Test MemoryProvider enum
6 changes: 1 addition & 5 deletions config.env.template
@@ -65,16 +65,12 @@ OPENAI_API_KEY = sk-xxxxx
# SPEECH-TO-TEXT CONFIGURATION
# ========================================

# Primary transcription provider: deepgram, mistral, or parakeet
# Primary transcription provider: deepgram or parakeet
TRANSCRIPTION_PROVIDER = deepgram

# Deepgram configuration
DEEPGRAM_API_KEY = 90xxxxxx

# Mistral configuration (when TRANSCRIPTION_PROVIDER=mistral)
MISTRAL_API_KEY =
MISTRAL_MODEL = voxtral-mini-2507

# Parakeet ASR configuration (when TRANSCRIPTION_PROVIDER=parakeet)
PARAKEET_ASR_URL = http://host.docker.internal:8767

4 changes: 2 additions & 2 deletions tests/configs/README.md
@@ -60,7 +60,7 @@ done

When creating a new test configuration:

1. **Name it descriptively**: `{stt}-{llm}.yml` (e.g., `mistral-openai.yml`)
1. **Name it descriptively**: `{stt}-{llm}.yml` (e.g., `deepgram-openai.yml`)
2. **Use environment variables**: Always use `${VAR:-default}` pattern for secrets
3. **Set appropriate defaults**: Update the `defaults:` section to match your provider combo
4. **Include only required models**: Don't include models that aren't used
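The `${VAR:-default}` pattern in step 2 follows shell parameter-expansion syntax. A small sketch of how a config loader might expand it — this is an illustration under that assumption, not the project's actual loader:

```python
import os
import re

# Matches ${NAME} and ${NAME:-default}
_PATTERN = re.compile(r"\$\{(\w+)(?::-([^}]*))?\}")

def expand_env(text: str) -> str:
    """Expand ${VAR} and ${VAR:-default} references using the environment."""
    def _sub(match: re.Match) -> str:
        name, default = match.group(1), match.group(2) or ""
        return os.environ.get(name, default)
    return _PATTERN.sub(_sub, text)
```

Unlike the shell, this sketch does not treat a set-but-empty variable as unset; for test configs populated from CI secrets that distinction rarely matters.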
@@ -124,7 +124,7 @@ Test configs use environment variable substitution to avoid hardcoding secrets:

As you add support for new providers, create corresponding test configs:

- `mistral-openai.yml` - Mistral Voxtral STT + OpenAI LLM
- `deepgram-openai.yml` - Deepgram STT + OpenAI LLM
- `deepgram-ollama.yml` - Deepgram STT + Local Ollama LLM
- `parakeet-openai.yml` - Local Parakeet STT + OpenAI LLM
- etc.