137 changes: 137 additions & 0 deletions AUDIO_PROVIDER_SUMMARY.md
@@ -0,0 +1,137 @@
# Audio Provider System - Corrected Architecture

## Summary

Audio is now a proper provider capability system with **two separate capabilities**:

1. **`audio_input`** - Audio SOURCES (mobile, Omi, desktop, file, UNode)
2. **`audio_consumer`** - Audio DESTINATIONS (Chronicle, Mycelia, relay, webhooks)

## Architecture

```
┌─────────────────────┐           ┌──────────────────────┐
│  Audio INPUT        │           │  Audio CONSUMER      │
│  (Source/Provider)  │ ────────> │  (Destination)       │
└─────────────────────┘           └──────────────────────┘

• Mobile App Mic                  • Chronicle
• Omi Device                      • Mycelia
• Desktop Mic                     • Multi-Destination
• Audio File Upload               • Custom WebSocket
• UNode Device                    • Webhook
```
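
The "any source to one or more consumers" routing in the diagram can be sketched as a small fanout function. This is only an illustration: the `AudioConsumer` interface, the `fanout` function, and the `FanoutResult` shape are hypothetical names, not types from the ushadow codebase.

```typescript
// Hypothetical consumer abstraction; the real relay lives in audio_relay.py.
interface AudioConsumer {
  name: string;
  // Returns true when the chunk was accepted by the destination.
  send(chunk: Uint8Array): boolean;
}

interface FanoutResult {
  delivered: string[];
  failed: string[];
}

// Forward one audio chunk to every consumer. A consumer that throws or
// rejects the chunk is recorded as failed and does not block the others.
function fanout(chunk: Uint8Array, consumers: AudioConsumer[]): FanoutResult {
  const result: FanoutResult = { delivered: [], failed: [] };
  for (const consumer of consumers) {
    let ok = false;
    try {
      ok = consumer.send(chunk);
    } catch {
      ok = false;
    }
    if (ok) {
      result.delivered.push(consumer.name);
    } else {
      result.failed.push(consumer.name);
    }
  }
  return result;
}
```

Isolating failures per consumer is a design choice for this sketch: one unreachable destination should not interrupt delivery to the rest.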

## Files Created/Modified

### Configuration Files
- ✅ `config/capabilities.yaml` - Added `audio_input` and `audio_consumer` capabilities
- ✅ `config/providers/audio_input.yaml` - 5 input providers (mobile, omi, desktop, file, unode)
- ✅ `config/providers/audio_consumer.yaml` - 5 consumer providers (chronicle, mycelia, multi-dest, custom, webhook)
- ✅ `config/config.defaults.yaml` - Default selections

### Backend API
- ✅ `ushadow/backend/src/routers/audio_provider.py` - Audio consumer API
  - `GET /api/providers/audio_consumer/active` - Get where to send audio
  - `GET /api/providers/audio_consumer/available` - List consumers
  - `PUT /api/providers/audio_consumer/active` - Switch consumer
- ✅ `ushadow/backend/src/routers/audio_relay.py` - Multi-destination relay
  - `WS /ws/audio/relay` - Fanout to multiple consumers
- ✅ `ushadow/backend/main.py` - Registered routers

### Mobile App Integration
- ✅ `ushadow/mobile/app/services/audioProviderApi.ts` - Consumer discovery API
- ✅ `ushadow/mobile/app/hooks/useMultiDestinationStreamer.ts` - Multi-cast support

### Documentation
- ✅ `docs/AUDIO_PROVIDER_ARCHITECTURE.md` - Complete architecture guide
- ✅ `MULTI_DESTINATION_AUDIO_EXAMPLE.md` - Relay examples

## How It Works

### Mobile App (Audio Input Provider)

```typescript
// 1. Mobile app asks: "Where should I send my audio?"
const consumer = await getActiveAudioConsumer(baseUrl, token);
// Returns: { provider_id: "chronicle", websocket_url: "ws://chronicle:5001/...", ...}

// 2. Mobile app connects to that consumer
const wsUrl = buildAudioStreamUrl(consumer, token);
await audioStreamer.startStreaming(wsUrl, 'streaming');

// 3. Mobile app sends audio
recorder.startRecording((audioData) => {
  audioStreamer.sendAudio(audioData); // Goes to Chronicle
});
```
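
The summary does not show how `buildAudioStreamUrl` combines the consumer and the token. As a rough sketch, assuming the token is passed as a `token` query parameter (an assumption; the real helper may authenticate differently), a stand-in could look like this. `ActiveConsumer` and `buildStreamUrlSketch` are hypothetical names.

```typescript
// Shape assumed from the fields shown in the API response; not the app's real type.
interface ActiveConsumer {
  provider_id: string;
  websocket_url: string;
}

// Hypothetical stand-in for buildAudioStreamUrl: validates the scheme and
// appends the token as a query parameter (assumed auth mechanism).
function buildStreamUrlSketch(consumer: ActiveConsumer, token: string): string {
  const url = consumer.websocket_url;
  if (!/^wss?:\/\//.test(url)) {
    throw new Error("Not a WebSocket URL: " + url);
  }
  // Reuse an existing query string if the consumer URL already has one.
  const separator = url.indexOf("?") === -1 ? "?" : "&";
  return url + separator + "token=" + encodeURIComponent(token);
}
```

Rejecting non-`ws://`/`wss://` URLs early makes a misconfigured consumer fail before any audio is recorded.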

### Configuration Examples

**Send to Chronicle** (default):
```yaml
selected_providers:
  audio_consumer: chronicle
```

**Send to Mycelia**:
```yaml
selected_providers:
  audio_consumer: mycelia
```

**Send to BOTH (multi-destination)**:
```yaml
selected_providers:
  audio_consumer: multi-destination

audio_consumer:
  multi_dest_destinations: '[
    {"name":"chronicle","url":"ws://chronicle:5001/chronicle/ws_pcm"},
    {"name":"mycelia","url":"ws://mycelia:5173/ws_pcm"}
    ]'
```
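
Because `multi_dest_destinations` is a JSON array stored inside a YAML string, it is easy to misquote. The following sketch shows one way such a value could be parsed and checked before opening any sockets. `Destination` and `parseDestinations` are hypothetical names, and the validation rules (WebSocket scheme, unique names) are assumptions rather than documented behaviour of the relay.

```typescript
interface Destination {
  name: string;
  url: string;
}

// Parse the JSON string from multi_dest_destinations and validate each entry.
function parseDestinations(raw: string): Destination[] {
  const parsed: unknown = JSON.parse(raw);
  if (!Array.isArray(parsed)) {
    throw new Error("multi_dest_destinations must be a JSON array");
  }
  const seen: string[] = [];
  return parsed.map((item: unknown, index: number): Destination => {
    const entry = item as { name?: unknown; url?: unknown } | null;
    if (!entry || typeof entry.name !== "string" || typeof entry.url !== "string") {
      throw new Error("Destination " + index + " needs string \"name\" and \"url\"");
    }
    if (!/^wss?:\/\//.test(entry.url)) {
      throw new Error("Destination " + entry.name + " is not a ws:// or wss:// URL");
    }
    if (seen.indexOf(entry.name) !== -1) {
      throw new Error("Duplicate destination name: " + entry.name);
    }
    seen.push(entry.name);
    return { name: entry.name, url: entry.url };
  });
}
```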

## Testing

```bash
# Start backend
cd ushadow/backend
uvicorn main:app --reload

# Test API
curl http://localhost:8000/api/providers/audio_consumer/active

# Response:
# {
#   "capability": "audio_consumer",
#   "selected_provider": "chronicle",
#   "config": {
#     "provider_id": "chronicle",
#     "websocket_url": "ws://chronicle-backend:5001/chronicle/ws_pcm",
#     "protocol": "wyoming",
#     "format": "pcm_s16le_16khz_mono"
#   }
# }

# Switch to Mycelia
curl -X PUT http://localhost:8000/api/providers/audio_consumer/active \
  -H "Authorization: Bearer TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"provider_id":"mycelia"}'
```
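
On the client side, the response shown above can be checked before it is trusted. The type guard below is a minimal sketch based only on the fields in that sample response; `ActiveConsumerResponse` and `isActiveConsumerResponse` are hypothetical names, and the real API may return additional fields.

```typescript
// Field names taken from the sample response; the type itself is an assumption.
interface ActiveConsumerResponse {
  capability: string;
  selected_provider: string;
  config: {
    provider_id: string;
    websocket_url: string;
    protocol: string;
    format: string;
  };
}

// Runtime check that an unknown JSON value has the expected shape.
function isActiveConsumerResponse(value: unknown): value is ActiveConsumerResponse {
  if (typeof value !== "object" || value === null) return false;
  const body = value as Record<string, unknown>;
  const config = body["config"];
  if (typeof config !== "object" || config === null) return false;
  const cfg = config as Record<string, unknown>;
  return (
    body["capability"] === "audio_consumer" &&
    typeof body["selected_provider"] === "string" &&
    typeof cfg["provider_id"] === "string" &&
    typeof cfg["websocket_url"] === "string" &&
    typeof cfg["protocol"] === "string" &&
    typeof cfg["format"] === "string"
  );
}
```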

## Key Benefits

- ✅ **Correct Semantics**: Audio sources are inputs, processors are consumers
- ✅ **Flexible Routing**: Any source → any consumer(s)
- ✅ **No Hardcoding**: Mobile app discovers consumer dynamically
- ✅ **Multi-Destination**: Built-in fanout support
- ✅ **Follows Pattern**: Same structure as LLM/transcription providers
- ✅ **Provider Discovery**: Mobile apps query API instead of hardcoded URLs

## Next Steps

1. **Configure default consumer** in `config/config.defaults.yaml`
2. **Mobile app integration** - Use `getActiveAudioConsumer()` to discover endpoint
3. **Test routing** - Send mobile audio to Chronicle, then switch to Mycelia
4. **Try multi-destination** - Send audio to both simultaneously
125 changes: 125 additions & 0 deletions compose/MYCELIA-INTEGRATION.md
@@ -0,0 +1,125 @@
# Mycelia Integration with ushadow

## Overview

Mycelia has been integrated with ushadow's provider/instance model to support stateless configuration via environment variables.

## Changes Made

### 1. Schema Updates (`mycelia/myceliasdk/config.ts`)

Updated the server config schema to support separate LLM and transcription providers:

```typescript
export const zProviderConfig = z.object({
  baseUrl: z.string().optional(),
  apiKey: z.string().optional(),
  model: z.string().optional(),
});

export const zServerConfig = z.object({
  llm: zProviderConfig.optional().nullable(), // New: LLM-specific config
  transcription: zProviderConfig.optional().nullable(), // New: Transcription-specific config
  inference: zInferenceProviderConfig.optional().nullable(), // Deprecated: kept for backward compatibility
  // ...
});
```

### 2. Resource Updates

Both `LLMResource` and `TranscriptionResource` now follow ushadow's stateless pattern:

**Priority:** Environment variables → MongoDB (fallback)

```typescript
async getInferenceProvider() {
  // 1. Read from env vars (stateless - ushadow pattern)
  const envBaseUrl = Deno.env.get("OPENAI_BASE_URL");
  const envApiKey = Deno.env.get("OPENAI_API_KEY");
  const envModel = Deno.env.get("OPENAI_MODEL");

  if (envBaseUrl && envApiKey) {
    return { baseUrl: envBaseUrl, apiKey: envApiKey, model: envModel };
  }

  // 2. Fallback to MongoDB for backward compatibility
  const config = await getServerConfig();
  // ...
}
```
```
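
The same priority rule applies to both resources, so it can be expressed once and parameterised by the env var prefix. The sketch below is an illustration under assumptions: `resolveProvider`, `ProviderSettings`, and `EnvLookup` are hypothetical names, and the stored-config lookup is modelled as a synchronous callback to keep the example small (the real MongoDB read is asynchronous).

```typescript
interface ProviderSettings {
  baseUrl: string;
  apiKey: string;
  model?: string;
}

interface Resolution {
  settings: ProviderSettings;
  source: "env" | "stored";
}

type EnvLookup = (name: string) => string | undefined;

// prefix is "OPENAI" for the LLM resource or "TRANSCRIPTION" for transcription.
function resolveProvider(
  prefix: string,
  env: EnvLookup,
  loadStored: () => ProviderSettings | null,
): Resolution | null {
  // 1. Env vars win, but only when both base URL and API key are present.
  const baseUrl = env(prefix + "_BASE_URL");
  const apiKey = env(prefix + "_API_KEY");
  if (baseUrl && apiKey) {
    const settings: ProviderSettings = { baseUrl, apiKey };
    const model = env(prefix + "_MODEL");
    if (model) settings.model = model;
    return { settings, source: "env" };
  }
  // 2. Otherwise fall back to the stored (MongoDB) configuration, if any.
  const stored = loadStored();
  return stored ? { settings: stored, source: "stored" } : null;
}
```

In a Deno service the lookup could be supplied as `(name) => Deno.env.get(name)`; passing it in as a parameter keeps the function testable without touching the process environment.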

### 3. Compose File Configuration

**Environment Variables** (compose/mycelia-compose.yml):

```yaml
# LLM Provider Configuration
- OPENAI_BASE_URL=${OPENAI_BASE_URL}
- OPENAI_API_KEY=${OPENAI_API_KEY}
- OPENAI_MODEL=${OPENAI_MODEL}

# Transcription Provider Configuration
- TRANSCRIPTION_BASE_URL=${TRANSCRIPTION_BASE_URL}
- TRANSCRIPTION_API_KEY=${TRANSCRIPTION_API_KEY}
- TRANSCRIPTION_MODEL=${TRANSCRIPTION_MODEL}
```

**ushadow Metadata:**

```yaml
x-ushadow:
  mycelia-backend:
    requires: ["llm", "transcription"] # Declares capability requirements
```

## How It Works

1. **Service Definition**: Mycelia declares it needs `llm` and `transcription` capabilities in x-ushadow metadata
2. **Provider Resolution**: ushadow's capability resolver maps these to provider instances
3. **Env Var Injection**: ushadow injects the mapped env vars into the container
4. **Runtime**: Mycelia reads configuration from env vars (stateless)

## Backward Compatibility

Mycelia maintains backward compatibility with its original MongoDB-based configuration:
- If env vars are not set, it falls back to reading from MongoDB
- Existing Mycelia installations continue to work unchanged
- The `inference` field is deprecated but still supported

## Provider Requirements

### LLM Provider
- **Base URL**: OpenAI-compatible API endpoint (e.g., http://ollama:11434/v1)
- **API Key**: Authentication key for the provider
- **Model**: Optional model name (e.g., "llama3")
- **Endpoint Used**: `/v1/chat/completions`

### Transcription Provider
- **Base URL**: OpenAI-compatible Whisper API endpoint
- **API Key**: Authentication key for the provider
- **Model**: Optional model name (defaults to "whisper-1")
- **Endpoint Used**: `/v1/audio/transcriptions`
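
Note that the example base URL already ends in `/v1`, while the endpoints are listed with a `/v1` prefix. This document does not say how Mycelia joins the two, so the helper below is only one possible approach: a hypothetical `endpointUrl` function that tolerates base URLs with or without the `/v1` suffix.

```typescript
// Hypothetical helper: join a provider base URL and an endpoint path without
// producing a doubled "/v1/v1" segment.
function endpointUrl(baseUrl: string, path: string): string {
  // Drop trailing slashes from the base and ensure the path starts with "/".
  const base = baseUrl.replace(/\/+$/, "");
  const suffix = path.charAt(0) === "/" ? path : "/" + path;
  // If the base already ends in /v1 and the path starts with /v1/, keep one.
  if (/\/v1$/.test(base) && /^\/v1\//.test(suffix)) {
    return base + suffix.slice(3);
  }
  return base + suffix;
}
```

With this sketch, both `http://ollama:11434` and `http://ollama:11434/v1` would map to the same chat completions URL.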

## Example ushadow Provider Configuration

```yaml
providers:
  llm:
    instances:
      - id: ollama-local
        base_url: http://ollama:11434/v1
        api_key: ollama
        model: llama3

  transcription:
    instances:
      - id: whisper-local
        base_url: http://whisper:8000/v1
        api_key: whisper
```

When Mycelia is started, ushadow will:
1. Resolve `llm` → ollama-local instance
2. Resolve `transcription` → whisper-local instance
3. Inject env vars: `OPENAI_BASE_URL`, `OPENAI_API_KEY`, `TRANSCRIPTION_BASE_URL`, etc.
4. Start Mycelia with stateless configuration
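
The resolution and injection steps above can be sketched as a pure function from the provider configuration to an env var map. This is an illustration, not ushadow's resolver: `buildEnv`, `ProviderInstance`, and `ENV_PREFIX` are hypothetical names, the capability-to-prefix mapping is inferred from the compose file's variable names, and picking the first instance is an assumption.

```typescript
interface ProviderInstance {
  id: string;
  base_url: string;
  api_key: string;
  model?: string;
}

type Capability = "llm" | "transcription";

// Assumed mapping, inferred from the env var names in mycelia-compose.yml.
const ENV_PREFIX: Record<Capability, string> = {
  llm: "OPENAI",
  transcription: "TRANSCRIPTION",
};

function buildEnv(
  requires: Capability[],
  providers: Record<Capability, { instances: ProviderInstance[] }>,
): Record<string, string> {
  const env: Record<string, string> = {};
  for (const capability of requires) {
    // Assumption: the first configured instance is the selected one.
    const instance = providers[capability].instances[0];
    if (!instance) {
      throw new Error("No provider instance for \"" + capability + "\"");
    }
    const prefix = ENV_PREFIX[capability];
    env[prefix + "_BASE_URL"] = instance.base_url;
    env[prefix + "_API_KEY"] = instance.api_key;
    // The model is optional, so the variable is only set when configured.
    if (instance.model !== undefined) {
      env[prefix + "_MODEL"] = instance.model;
    }
  }
  return env;
}
```

Applied to the example configuration, this would yield `OPENAI_*` variables from `ollama-local` and `TRANSCRIPTION_*` variables from `whisper-local`, with no `TRANSCRIPTION_MODEL` because that instance does not set one.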
6 changes: 3 additions & 3 deletions compose/agent-zero-compose.yaml
@@ -52,7 +52,7 @@ services:
       - agent_zero_data:/a0

     networks:
-      - infra-network
+      - ushadow-network

     # Enable access to host network for Ollama and other local services
     extra_hosts:
@@ -72,6 +72,6 @@ volumes:
     name: ${COMPOSE_PROJECT_NAME:-ushadow}_agent_zero_data

 networks:
-  infra-network:
+  ushadow-network:
     external: true
-    name: ${COMPOSE_PROJECT_NAME:-ushadow}_infra-network
+    name: ushadow-network
7 changes: 4 additions & 3 deletions compose/backend.yml
@@ -32,12 +32,13 @@ services:
       - ../ushadow/backend:/app
       - ../config:/config # Mount config directory (read-write for feature flags)
       - ../compose:/compose # Mount compose files for service management
+      - ../mycelia:/mycelia # Mount mycelia for building mycelia-backend service
       - /app/__pycache__
       - /app/.pytest_cache
       # Docker socket for container management (Tailscale container control)
       - /var/run/docker.sock:/var/run/docker.sock
     networks:
-      - infra-network
+      - ushadow-network
     healthcheck:
       test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
       interval: 10s
@@ -47,6 +48,6 @@ services:
     restart: unless-stopped

 networks:
-  infra-network:
-    name: infra-network
+  ushadow-network:
+    name: ushadow-network
     external: true
10 changes: 5 additions & 5 deletions compose/chronicle-compose.yaml
@@ -10,7 +10,7 @@ x-ushadow:
   chronicle-backend:
     display_name: "Chronicle"
     description: "AI-powered voice journal and life logger with transcription and LLM analysis"
-    requires: [llm, transcription]
+    requires: [llm, transcription, audio_input]
     optional: [memory] # Uses memory if available, works without it
     route_path: /chronicle # Tailscale Serve route - all /chronicle/* requests go here
   chronicle-webui:
Expand Down Expand Up @@ -72,7 +72,7 @@ services:
- ${PROJECT_ROOT}/config/defaults.yml:/app/config/defaults.yml:ro

networks:
- infra-network
- ushadow-network

# NOTE: Depends on shared infrastructure services (mongo, redis, qdrant)
# These must be started separately via docker-compose.infra.yml
@@ -103,7 +103,7 @@ services:
       - VITE_BACKEND_URL=http://localhost:${CHRONICLE_PORT:-8080}
       - BACKEND_URL=http://chronicle-backend:8000
     networks:
-      - infra-network
+      - ushadow-network
     depends_on:
       - chronicle-backend
     restart: unless-stopped
@@ -119,6 +119,6 @@ volumes:

 # Use existing shared infrastructure network
 networks:
-  infra-network:
-    name: infra-network
+  ushadow-network:
+    name: ushadow-network
     external: true