Commit 7a9df1d (2 parents: 64265c7 + f03faf1)

60 files changed: +9608 -4846 lines

.github/README.md

Lines changed: 213 additions & 1 deletion
@@ -1,3 +1,215 @@
# Own your Friend DevKit, DevKit 2, OMI

# Intro
The idea of this repo is to provide just enough to be useful for developers.
It should provide the minimal requirements to either:
1. Provide firmware, SDKs, and examples so you can make your own software
2. Provide an advanced solution or tutorial so that you can get full use of your devices

Instead of going the way of making OMI's ecosystem compatible and more open,
this repo will attempt to make it easy to roll your own - and then try to make that compatible with OMI's ecosystem.

The app uses the React Native SDK (a fork of it, since the original wasn't updated fast enough).
The backend uses the Python SDK (a fork of it, since the original isn't published to PyPI).

# Vision
This fits as a small part of the larger idea of
"Have various sensors feeding the state of YOUR world to computers/AI and get some use out of it."

Use cases are numerous - OMI Mentor is one of them.
Friend/OMI/pendants are a small but important part of this, since they record personal spoken context best.
OMI-like devices with a camera can also capture visual context - as can smart glasses, which double as a display.

Regardless, this repo will try to do the minimal version of this - multiple OMI-like audio devices feeding audio data - and from it derive:
- Memories
- Action items
- Home automation

# Arch
Check out `backends/advanced-backend/Docs/arch.md`.

## Arch description
The current DevKit2 streams audio over Bluetooth to some device, encoded with the OPUS codec.
Once you have audio, you need transcription (speech-to-text, aka STT, or automatic speech recognition, aka ASR; Deepgram is an API-based service where you stream your audio to them and they return transcripts, though you can also host STT locally). From there you can add anything else you want, such as conversation summarization (typically done via LLMs, so Ollama or a call to OpenAI).
You also need to store these things somewhere - and you need to store different things:
1. Transcripts
2. Conversation summaries
3. Maybe the audio itself?
4. Memories

Memories are stored in Qdrant.
Conversations and general logging live in MongoDB.
It's a little complicated to turn OPUS into PCM, which is what most apps use.
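Decoding OPUS itself requires a codec binding (a libopus wrapper, which is outside the standard library), but once you have raw PCM, wrapping it in a WAV container - the format most apps consume - is straightforward with the stdlib `wave` module. A minimal sketch:

```python
import io
import wave

def pcm_to_wav_bytes(pcm: bytes, sample_rate: int = 16000,
                     channels: int = 1, sample_width: int = 2) -> bytes:
    """Wrap raw 16-bit PCM in a WAV container, in memory."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)   # 2 bytes = 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buf.getvalue()

# One second of silence at 16 kHz mono, 16-bit:
silence = b"\x00\x00" * 16000
wav_bytes = pcm_to_wav_bytes(silence)
print(len(wav_bytes))  # PCM payload plus a 44-byte WAV header
```

This is essentially the packaging step the simple backend's OPUS → PCM → WAV chunk pipeline ends with.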
# Repository Structure

## Core Components

### 📱 Mobile App (`friend-lite/`)
- **React Native app** for connecting to OMI devices via Bluetooth
- Streams audio in OPUS format to the selected backend
- Cross-platform (iOS/Android) support
- Uses the React Native Bluetooth SDK

### 🖥️ Backends (`backends/`)
Choose one based on your needs:

#### **Simple Backend** (`backends/simple-backend/`)
**Best for:** Getting started, basic audio processing, learning

**Features:**
- ✅ Basic audio ingestion (OPUS → PCM → WAV chunks)
- ✅ File-based storage (30-second segments)
- ✅ Minimal dependencies
- ✅ Quick setup

**Pros:**
- Easiest to understand and modify
- Minimal resource requirements
- No external services needed
- Good for prototyping

**Cons:**
- No transcription built in
- No memory/conversation management
- No speaker recognition
- Manual file management required

---

#### **Advanced Backend** (`backends/advanced-backend/`) ⭐ **RECOMMENDED**
**Best for:** Production use, full feature set, comprehensive AI features

**Features:**
- ✅ Full audio processing pipeline
- ✅ **Memory system** (mem0 + Qdrant vector storage)
- ✅ **Speaker recognition & enrollment**
- ✅ **Action items extraction** from conversations
- ✅ **Audio cropping** (removes silence, keeps speech)
- ✅ **Conversation management** with timeouts
- ✅ **Web UI** for management and monitoring
- ✅ **Multiple ASR options** (Deepgram API + offline ASR)
- ✅ **MongoDB** for structured data storage
- ✅ **RESTful API** for all operations
- ✅ **Real-time processing** with WebSocket support

**Pros:**
- Complete AI-powered solution
- Scalable architecture
- Rich feature set
- Web interface included
- Speaker identification
- Memory and action item extraction
- Audio optimization

**Cons:**
- More complex setup
- Requires multiple services (MongoDB, Qdrant, Ollama)
- Higher resource requirements
- Steeper learning curve
- Authentication setup required
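The timeout-based conversation management can be illustrated with a small self-contained sketch (the 90-second threshold and the event shape are invented for illustration, not the backend's actual values): consecutive transcript segments belong to one conversation until the silence gap between them exceeds the timeout.

```python
def group_into_conversations(segments, timeout_s=90.0):
    """Group (timestamp, text) segments into conversations.

    A gap between consecutive segments longer than timeout_s starts
    a new conversation. Purely illustrative logic.
    """
    conversations = []
    current = []
    last_ts = None
    for ts, text in segments:
        if last_ts is not None and ts - last_ts > timeout_s:
            conversations.append(current)
            current = []
        current.append(text)
        last_ts = ts
    if current:
        conversations.append(current)
    return conversations

segments = [(0.0, "hi"), (5.0, "how are you"), (200.0, "new topic")]
print(group_into_conversations(segments))  # two conversations
```

The real backend does this on a live audio/transcript stream rather than a finished list, but the grouping rule is the same idea.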
---

#### **OMI-Webhook-Compatible Backend** (`backends/omi-webhook-compatible/`)
**Best for:** Existing OMI users, migration from the official OMI backend

**Features:**
- ✅ Compatible with the official OMI app's webhook system
- ✅ Drop-in replacement for the OMI backend
- ✅ Audio file storage
- ✅ ngrok integration for public endpoints

**Pros:**
- Easy migration from official OMI
- Works with the existing OMI mobile app
- Simple webhook-based architecture

**Cons:**
- Limited features compared to the advanced backend
- Depends on ngrok for public access
- No built-in AI features
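A webhook backend of this kind boils down to an HTTP endpoint that accepts posted audio bytes and writes them to disk. A minimal sketch of the storage step, using only the standard library (the filename scheme and directory are hypothetical, not the repo's actual layout):

```python
import time
from pathlib import Path

def save_webhook_audio(body: bytes, uid: str,
                       out_dir: str = "audio_files") -> Path:
    """Persist one webhook POST body as a timestamped audio file."""
    dest = Path(out_dir)
    dest.mkdir(parents=True, exist_ok=True)
    path = dest / f"{uid}_{int(time.time())}.bin"
    path.write_bytes(body)
    return path
```

In the real setup, ngrok simply exposes whatever server wraps this handler at a public URL that the OMI app can reach.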
---

#### **Example Satellite Backend** (`backends/example-satellite/`)
**Best for:** Distributed setups, Wyoming protocol integration

**Features:**
- ✅ Wyoming protocol satellite
- ✅ Streams audio to remote Wyoming servers
- ✅ Bluetooth OMI device discovery
- ✅ Integration with the Home Assistant/Wyoming ecosystem

**Pros:**
- Integrates with existing Wyoming setups
- Good for distributed architectures
- Home Assistant compatible

**Cons:**
- Requires a separate Wyoming ASR server
- Limited standalone functionality

### 🔧 Additional Services (`extras/`)

#### **ASR Services** (`extras/asr-services/`)
- **Wyoming-compatible** ASR services
- **Moonshine** - fast offline ASR
- **Parakeet** - alternative offline ASR
- Self-hosted transcription options

#### **Speaker Recognition Service** (`extras/speaker-recognition/`)
- Standalone speaker identification service
- Used by the advanced backend
- REST API for speaker operations

#### **HAVPE Relay** (`extras/havpe-relay/`)
- Audio relay service
- Protocol bridging capabilities

# Wyoming Protocol Compatibility

Both the backends and the ASR services use the **Wyoming protocol** for standardized communication:
- Consistent audio streaming format
- Interoperable with Home Assistant
- Modular ASR service architecture
- Easy to swap ASR providers
180+
# Quick Start Recommendations
181+
182+
## For Beginners
183+
1. Start with **Simple Backend** to understand the basics
184+
2. Use **friend-lite mobile app** to connect your OMI device
185+
3. Examine saved audio chunks in `./audio_chunks/`
186+
187+
## For Production Use
188+
1. Use **Advanced Backend** for full features
189+
2. Set up the complete stack: MongoDB + Qdrant + Ollama
190+
3. Access the Web UI for conversation management
191+
4. Configure speaker enrollment for multi-user scenarios
192+
193+
## For OMI Users
194+
1. Use **OMI-Webhook-Compatible Backend** for easy migration
195+
2. Configure ngrok for public webhook access
196+
3. Point your OMI app to the webhook URL
197+
198+
## For Home Assistant Users
199+
1. Use **Example Satellite Backend** with Wyoming integration
200+
2. Set up ASR services from `extras/asr-services/`
201+
3. Configure Home Assistant Wyoming integration
202+
203+
# Getting Started
204+
205+
1. **Clone the repository**
206+
2. **Choose your backend** based on the recommendations above
207+
3. **Follow the README** in your chosen backend directory
208+
4. **Configure the mobile app** to connect to your backend
209+
5. **Start streaming audio** from your OMI device
210+
211+
Each backend directory contains detailed setup instructions and docker-compose files for easy deployment.
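As a rough idea of what such a deployment looks like, a compose file for the advanced stack might resemble the following (service names and ports are illustrative only; use the `docker-compose.yml` shipped in each backend directory):

```yaml
# Illustrative sketch - not the repo's actual compose file
services:
  backend:
    build: .
    ports: ["8000:8000"]
    environment:
      - DEEPGRAM_API_KEY=${DEEPGRAM_API_KEY}
    depends_on: [mongo, qdrant]
  mongo:
    image: mongo:7
  qdrant:
    image: qdrant/qdrant
```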
# GitHub Actions CI/CD Setup for Friend Lite

This sets up **automatic GitHub releases** with APK/IPA files whenever you push code.
@@ -23,4 +235,4 @@ This sets up **automatic GitHub releases** with APK/IPA files whenever you push
4. Value: Paste your token from Step 1
5. Click **Add secret**

## ⚡ That's It!

.github/workflows/README.md

Lines changed: 91 additions & 0 deletions
@@ -0,0 +1,91 @@
# GitHub Actions CI/CD Setup for Friend Lite

This sets up **automatic GitHub releases** with APK/IPA files whenever you push code.

## 🚀 How This Works

1. You push code to GitHub
2. GitHub automatically builds **both the Android APK and the iOS IPA**
3. It **creates GitHub Releases** with both files attached
4. You download directly from the **Releases** tab!

## 🎯 Quick Setup (2 Steps)

### Step 1: Get an Expo Token
1. Go to [expo.dev](https://expo.dev) and sign in / create an account
2. Go to Access Tokens (`https://expo.dev/accounts/[account]/settings/access-tokens`)
3. Create a new token and copy it

### Step 2: Add a GitHub Secret
1. In your GitHub repo: **Settings** → **Secrets and variables** → **Actions**
2. Click **New repository secret**
3. Name: `EXPO_TOKEN`
4. Value: Paste your token from Step 1
5. Click **Add secret**

## ⚡ That's It!

# GitHub Actions Workflows

## Integration Tests

### Automatic Integration Tests (`integration-tests.yml`)
- **Triggers**: Push/PR to the `main` or `develop` branches affecting backend code
- **Timeout**: 15 minutes
- **Mode**: Cached mode (better for the CI environment)
- **Dependencies**: Requires the `DEEPGRAM_API_KEY` and `OPENAI_API_KEY` secrets

## Required Secrets

Add these secrets in your GitHub repository settings:

```
DEEPGRAM_API_KEY=your-deepgram-api-key
OPENAI_API_KEY=your-openai-api-key
```

## Test Environment

- **Runtime**: Ubuntu latest with Docker support
- **Python**: 3.12 with the uv package manager
- **Services**: MongoDB (port 27018), Qdrant (ports 6335/6336), Backend (port 8001)
- **Test Data**: Isolated test directories and databases
- **Audio**: A 4-minute glass blowing tutorial for end-to-end validation

## Modes

### Cached Mode (Recommended for CI)
- Reuses containers and data between test runs
- Faster startup time
- Better for containerized CI environments
- Used by default in the automatic workflows

### Fresh Mode (Recommended for Local Development)
- Completely clean environment on each run
- Removes all test data and containers
- Slower but more reliable for debugging
- Can be selected in the manual workflow

## Troubleshooting

1. **Test timeout**: Increase `timeout_minutes` in the manual workflow
2. **Memory issues**: Check the container logs in the failed run's artifacts
3. **API key issues**: Verify the secrets are set correctly in the repository settings
4. **Fresh mode fails**: Try cached mode for comparison

## Local Testing

To run the same tests locally:

```bash
cd backends/advanced-backend

# Install dependencies
uv sync --dev

# Set up environment (copy from .env.template)
cp .env.template .env.test
# Add your API keys to .env.test

# Run the test (modify CACHED_MODE in test_integration.py if needed)
uv run pytest test_integration.py::test_full_pipeline_integration -v -s
```

.github/workflows/android-apk-build.yml

Lines changed: 6 additions & 0 deletions
@@ -1,10 +1,15 @@
 name: Android APK Build

+permissions:
+  contents: write
+
 on:
   push:
     branches: [main, develop]
+    paths: ['app/**']
   pull_request:
     branches: [main]
+    paths: ['app/**']
   workflow_dispatch:

 jobs:
@@ -59,6 +64,7 @@ jobs:
 - name: Create Release
   id: create_release
+  if: github.event_name == 'push' && github.ref == 'refs/heads/main'
   uses: actions/create-release@v1
   env:
     GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

.github/workflows/build-all-platforms.yml

Lines changed: 1 addition & 0 deletions
@@ -3,6 +3,7 @@ name: Build All Platforms
 on:
   push:
     branches: [main]
+    paths: ['app/**']
   workflow_dispatch:
     inputs:
       build_android:
