Commit 7a9df1d (2 parents: 64265c7 + f03faf1)

60 files changed: +9608 -4846 lines

.github/README.md

Lines changed: 213 additions & 1 deletion
@@ -1,3 +1,215 @@
# Own your Friend DevKit, DevKit 2, OMI

# Intro
The idea of this repo is to provide just enough to be useful for developers.
It should provide the minimal requirements to either:
1. Provide firmware, SDKs, and examples so you can make your own software
2. Provide an advanced solution or tutorial so that you can get full use of your devices

Instead of going the way of making OMI's ecosystem compatible and more open,
this repo will attempt to make it easy to roll your own - and then try to make that compatible with OMI's ecosystem.

The app uses the React Native SDK (a fork of it, since the original wasn't updated fast enough).
The backend uses the Python SDK (a fork of it, since the original isn't published to PyPI).

# Vision
This fits as a small part of the larger idea of
"Have various sensors feeding the state of YOUR world to computers/AI and get some use out of it."

Use cases are numerous - OMI Mentor is one of them.
Friend/OMI/pendants are a small but important part of this, since they record personal spoken context best.
OMI-like devices with a camera can also capture visual context - as can smart glasses, which double as a display.

Regardless, this repo will try to do the minimal version of this - multiple OMI-like audio devices feeding audio data - and from it derive:
- Memories
- Action items
- Home automation

# Arch
Check out `backends/advanced-backend/Docs/arch.md`.

## Arch description
The current DevKit2 streams audio over Bluetooth to some device, encoded with the OPUS codec.
Once you have audio, you need transcription (speech-to-text, aka STT, or automatic speech recognition, aka ASR; Deepgram is an API-based service where you stream your audio to them and they return transcripts, though you can also host STT locally). From there you can add anything else you want, such as conversation summarization (typically done via LLMs, so Ollama or a call to OpenAI).
You also need to store these things somewhere - and you need to store different things:
1. Transcripts
2. Conversation summaries
3. Maybe the audio itself?
4. Memories

Memories are stored in Qdrant.
Conversations and general logging live in MongoDB.
It's a little complicated to turn OPUS into PCM, which is what most apps use.
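Decoding OPUS itself requires a codec binding (a libopus wrapper, which is outside the standard library), but once you have raw PCM, wrapping it in a WAV container - the format most apps consume - is straightforward with the stdlib `wave` module. A minimal sketch:

```python
import io
import wave

def pcm_to_wav_bytes(pcm: bytes, sample_rate: int = 16000,
                     channels: int = 1, sample_width: int = 2) -> bytes:
    """Wrap raw 16-bit PCM in a WAV container, in memory."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)   # 2 bytes = 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buf.getvalue()

# One second of silence at 16 kHz mono, 16-bit:
silence = b"\x00\x00" * 16000
wav_bytes = pcm_to_wav_bytes(silence)
print(len(wav_bytes))  # PCM payload plus a 44-byte WAV header
```

This is essentially the packaging step the simple backend's OPUS → PCM → WAV chunk pipeline ends with.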
# Repository Structure

## Core Components

### 📱 Mobile App (`friend-lite/`)
- **React Native app** for connecting to OMI devices via Bluetooth
- Streams audio in OPUS format to the selected backend
- Cross-platform (iOS/Android) support
- Uses the React Native Bluetooth SDK

### 🖥️ Backends (`backends/`)
Choose one based on your needs:

#### **Simple Backend** (`backends/simple-backend/`)
**Best for:** Getting started, basic audio processing, learning

**Features:**
- ✅ Basic audio ingestion (OPUS → PCM → WAV chunks)
- ✅ File-based storage (30-second segments)
- ✅ Minimal dependencies
- ✅ Quick setup

**Pros:**
- Easiest to understand and modify
- Minimal resource requirements
- No external services needed
- Good for prototyping

**Cons:**
- No transcription built in
- No memory/conversation management
- No speaker recognition
- Manual file management required

---

#### **Advanced Backend** (`backends/advanced-backend/`) ⭐ **RECOMMENDED**
**Best for:** Production use, full feature set, comprehensive AI features

**Features:**
- ✅ Full audio processing pipeline
- ✅ **Memory system** (mem0 + Qdrant vector storage)
- ✅ **Speaker recognition & enrollment**
- ✅ **Action items extraction** from conversations
- ✅ **Audio cropping** (removes silence, keeps speech)
- ✅ **Conversation management** with timeouts
- ✅ **Web UI** for management and monitoring
- ✅ **Multiple ASR options** (Deepgram API + offline ASR)
- ✅ **MongoDB** for structured data storage
- ✅ **RESTful API** for all operations
- ✅ **Real-time processing** with WebSocket support

**Pros:**
- Complete AI-powered solution
- Scalable architecture
- Rich feature set
- Web interface included
- Speaker identification
- Memory and action item extraction
- Audio optimization

**Cons:**
- More complex setup
- Requires multiple services (MongoDB, Qdrant, Ollama)
- Higher resource requirements
- Steeper learning curve
- Authentication setup required
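The timeout-based conversation management can be illustrated with a small self-contained sketch (the 90-second threshold and the event shape are invented for illustration, not the backend's actual values): consecutive transcript segments belong to one conversation until the silence gap between them exceeds the timeout.

```python
def group_into_conversations(segments, timeout_s=90.0):
    """Group (timestamp, text) segments into conversations.

    A gap between consecutive segments longer than timeout_s starts
    a new conversation. Purely illustrative logic.
    """
    conversations = []
    current = []
    last_ts = None
    for ts, text in segments:
        if last_ts is not None and ts - last_ts > timeout_s:
            conversations.append(current)
            current = []
        current.append(text)
        last_ts = ts
    if current:
        conversations.append(current)
    return conversations

segments = [(0.0, "hi"), (5.0, "how are you"), (200.0, "new topic")]
print(group_into_conversations(segments))  # two conversations
```

The real backend does this on a live audio/transcript stream rather than a finished list, but the grouping rule is the same idea.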
---

#### **OMI-Webhook-Compatible Backend** (`backends/omi-webhook-compatible/`)
**Best for:** Existing OMI users, migration from the official OMI backend

**Features:**
- ✅ Compatible with the official OMI app's webhook system
- ✅ Drop-in replacement for the OMI backend
- ✅ Audio file storage
- ✅ ngrok integration for public endpoints

**Pros:**
- Easy migration from official OMI
- Works with the existing OMI mobile app
- Simple webhook-based architecture

**Cons:**
- Limited features compared to the advanced backend
- Depends on ngrok for public access
- No built-in AI features
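A webhook backend of this kind boils down to an HTTP endpoint that accepts posted audio bytes and writes them to disk. A minimal sketch of the storage step, using only the standard library (the filename scheme and directory are hypothetical, not the repo's actual layout):

```python
import time
from pathlib import Path

def save_webhook_audio(body: bytes, uid: str,
                       out_dir: str = "audio_files") -> Path:
    """Persist one webhook POST body as a timestamped audio file."""
    dest = Path(out_dir)
    dest.mkdir(parents=True, exist_ok=True)
    path = dest / f"{uid}_{int(time.time())}.bin"
    path.write_bytes(body)
    return path
```

In the real setup, ngrok simply exposes whatever server wraps this handler at a public URL that the OMI app can reach.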
---

#### **Example Satellite Backend** (`backends/example-satellite/`)
**Best for:** Distributed setups, Wyoming protocol integration

**Features:**
- ✅ Wyoming protocol satellite
- ✅ Streams audio to remote Wyoming servers
- ✅ Bluetooth OMI device discovery
- ✅ Integration with the Home Assistant/Wyoming ecosystem

**Pros:**
- Integrates with existing Wyoming setups
- Good for distributed architectures
- Home Assistant compatible

**Cons:**
- Requires a separate Wyoming ASR server
- Limited standalone functionality

### 🔧 Additional Services (`extras/`)

#### **ASR Services** (`extras/asr-services/`)
- **Wyoming-compatible** ASR services
- **Moonshine** - fast offline ASR
- **Parakeet** - alternative offline ASR
- Self-hosted transcription options

#### **Speaker Recognition Service** (`extras/speaker-recognition/`)
- Standalone speaker identification service
- Used by the advanced backend
- REST API for speaker operations

#### **HAVPE Relay** (`extras/havpe-relay/`)
- Audio relay service
- Protocol bridging capabilities

# Wyoming Protocol Compatibility

Both the backends and the ASR services use the **Wyoming protocol** for standardized communication:
- Consistent audio streaming format
- Interoperable with Home Assistant
- Modular ASR service architecture
- Easy to swap ASR providers
180+
# Quick Start Recommendations
181+
182+
## For Beginners
183+
1. Start with **Simple Backend** to understand the basics
184+
2. Use **friend-lite mobile app** to connect your OMI device
185+
3. Examine saved audio chunks in `./audio_chunks/`
186+
187+
## For Production Use
188+
1. Use **Advanced Backend** for full features
189+
2. Set up the complete stack: MongoDB + Qdrant + Ollama
190+
3. Access the Web UI for conversation management
191+
4. Configure speaker enrollment for multi-user scenarios
192+
193+
## For OMI Users
194+
1. Use **OMI-Webhook-Compatible Backend** for easy migration
195+
2. Configure ngrok for public webhook access
196+
3. Point your OMI app to the webhook URL
197+
198+
## For Home Assistant Users
199+
1. Use **Example Satellite Backend** with Wyoming integration
200+
2. Set up ASR services from `extras/asr-services/`
201+
3. Configure Home Assistant Wyoming integration
202+
203+
# Getting Started
204+
205+
1. **Clone the repository**
206+
2. **Choose your backend** based on the recommendations above
207+
3. **Follow the README** in your chosen backend directory
208+
4. **Configure the mobile app** to connect to your backend
209+
5. **Start streaming audio** from your OMI device
210+
211+
Each backend directory contains detailed setup instructions and docker-compose files for easy deployment.
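As a rough idea of what such a deployment looks like, a compose file for the advanced stack might resemble the following (service names and ports are illustrative only; use the `docker-compose.yml` shipped in each backend directory):

```yaml
# Illustrative sketch - not the repo's actual compose file
services:
  backend:
    build: .
    ports: ["8000:8000"]
    environment:
      - DEEPGRAM_API_KEY=${DEEPGRAM_API_KEY}
    depends_on: [mongo, qdrant]
  mongo:
    image: mongo:7
  qdrant:
    image: qdrant/qdrant
```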
# GitHub Actions CI/CD Setup for Friend Lite

This sets up **automatic GitHub releases** with APK/IPA files whenever you push code.
@@ -23,4 +235,4 @@ This sets up **automatic GitHub releases** with APK/IPA files whenever you push
4. Value: Paste your token from Step 1
5. Click **Add secret**

## ⚡ That's It!

.github/workflows/README.md

Lines changed: 91 additions & 0 deletions
@@ -0,0 +1,91 @@
# GitHub Actions CI/CD Setup for Friend Lite

This sets up **automatic GitHub releases** with APK/IPA files whenever you push code.

## 🚀 How This Works

1. You push code to GitHub
2. GitHub automatically builds **both the Android APK and the iOS IPA**
3. It **creates GitHub Releases** with both files attached
4. You download directly from the **Releases** tab!

## 🎯 Quick Setup (2 Steps)

### Step 1: Get an Expo Token
1. Go to [expo.dev](https://expo.dev) and sign in / create an account
2. Go to Access Tokens (`https://expo.dev/accounts/[account]/settings/access-tokens`)
3. Create a new token and copy it

### Step 2: Add a GitHub Secret
1. In your GitHub repo: **Settings** → **Secrets and variables** → **Actions**
2. Click **New repository secret**
3. Name: `EXPO_TOKEN`
4. Value: Paste your token from Step 1
5. Click **Add secret**

## ⚡ That's It!

# GitHub Actions Workflows

## Integration Tests

### Automatic Integration Tests (`integration-tests.yml`)
- **Triggers**: Push/PR to the `main` or `develop` branches affecting backend code
- **Timeout**: 15 minutes
- **Mode**: Cached mode (better for the CI environment)
- **Dependencies**: Requires the `DEEPGRAM_API_KEY` and `OPENAI_API_KEY` secrets

## Required Secrets

Add these secrets in your GitHub repository settings:

```
DEEPGRAM_API_KEY=your-deepgram-api-key
OPENAI_API_KEY=your-openai-api-key
```

## Test Environment

- **Runtime**: Ubuntu latest with Docker support
- **Python**: 3.12 with the uv package manager
- **Services**: MongoDB (port 27018), Qdrant (ports 6335/6336), Backend (port 8001)
- **Test Data**: Isolated test directories and databases
- **Audio**: A 4-minute glass blowing tutorial for end-to-end validation

## Modes

### Cached Mode (Recommended for CI)
- Reuses containers and data between test runs
- Faster startup time
- Better for containerized CI environments
- Used by default in the automatic workflows

### Fresh Mode (Recommended for Local Development)
- Completely clean environment on each run
- Removes all test data and containers
- Slower but more reliable for debugging
- Can be selected in the manual workflow

## Troubleshooting

1. **Test timeout**: Increase `timeout_minutes` in the manual workflow
2. **Memory issues**: Check the container logs in the failed run's artifacts
3. **API key issues**: Verify the secrets are set correctly in the repository settings
4. **Fresh mode fails**: Try cached mode for comparison

## Local Testing

To run the same tests locally:

```bash
cd backends/advanced-backend

# Install dependencies
uv sync --dev

# Set up environment (copy from .env.template)
cp .env.template .env.test
# Add your API keys to .env.test

# Run the test (modify CACHED_MODE in test_integration.py if needed)
uv run pytest test_integration.py::test_full_pipeline_integration -v -s
```

.github/workflows/android-apk-build.yml

Lines changed: 6 additions & 0 deletions
@@ -1,10 +1,15 @@
 name: Android APK Build

+permissions:
+  contents: write
+
 on:
   push:
     branches: [main, develop]
+    paths: ['app/**']
   pull_request:
     branches: [main]
+    paths: ['app/**']
   workflow_dispatch:

 jobs:
@@ -59,6 +64,7 @@ jobs:
 - name: Create Release
   id: create_release
+  if: github.event_name == 'push' && github.ref == 'refs/heads/main'
   uses: actions/create-release@v1
   env:
     GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

.github/workflows/build-all-platforms.yml

Lines changed: 1 addition & 0 deletions
@@ -3,6 +3,7 @@ name: Build All Platforms
 on:
   push:
     branches: [main]
+    paths: ['app/**']
   workflow_dispatch:
     inputs:
       build_android:
