feat: voice message recording and send (Phase 8)#551

Draft
torlando-tech wants to merge 12 commits into main from
feat/phase-8-voice-recording

Conversation

@torlando-tech
Owner

Summary

  • Add VoiceMessageRecorder with Opus encoding (24kHz mono, VOICE_MEDIUM profile) via LXST-kt
  • Add AudioPermissionManager for RECORD_AUDIO permission checks
  • Add VoiceMessageViewModel (Hilt) wrapping the recorder
  • Wire mic button with hold-to-record gesture into MessagingScreen
  • Wire sendMessage() to accept VoiceRecording with audio data passthrough

Features

  • 300ms minimum duration (tap-to-discard)
  • 30s auto-stop for mesh-friendly message sizes (~30KB)
  • Audio focus management (silences notifications during recording)
  • Per-frame waveform peak capture for future visualization
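
The per-frame waveform peak capture can be illustrated with a minimal sketch. It is written in Python for brevity (the recorder in this PR is Kotlin) and assumes 16-bit signed little-endian PCM frames; the function names `frame_peak` and `waveform_peaks` are hypothetical and not taken from the PR.

```python
import struct


def frame_peak(pcm_frame: bytes) -> int:
    """Return the peak absolute amplitude of one PCM frame.

    Assumption: samples are 16-bit signed little-endian integers.
    A trailing odd byte, if any, is ignored.
    """
    sample_count = len(pcm_frame) // 2
    if sample_count == 0:
        return 0
    samples = struct.unpack(f"<{sample_count}h", pcm_frame[: sample_count * 2])
    return max(abs(s) for s in samples)


def waveform_peaks(frames):
    """Collect one peak value per frame, in recording order."""
    return [frame_peak(frame) for frame in frames]
```

One peak per encoded frame keeps the waveform small relative to the audio itself, which seems to be the point of capturing it at record time rather than decoding later.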

Test plan

  • On-device: hold mic button, record, release → message sends with audio
  • Tap mic briefly (< 300ms) → silently discarded, no message sent
  • Record for 30s → auto-stops and sends
  • Verify audio focus acquired/released (notifications silenced during recording)
  • Verify RECORD_AUDIO permission prompt on first use

🤖 Generated with Claude Code

@torlando-tech torlando-tech linked an issue Feb 25, 2026 that may be closed by this pull request
@torlando-tech torlando-tech added this to the v0.10.0 milestone Feb 26, 2026
torlando-tech and others added 12 commits March 11, 2026 18:23
- Add hasAudioAttachment field for audio bubble rendering
- Add audioDurationMs field for playback duration display
- Add audioCodecId field for decoder selection (e.g., "opus_vm")
- Add audioWaveform field for pre-computed amplitude peaks
- All fields have defaults so existing callers are unaffected
Wire audioData/audioCodecId through the full IPC stack:
- ReticulumProtocol interface: add audioData/audioCodecId params with defaults
- MockReticulumProtocol: match updated interface signature
- IReticulumService.aidl: add audioData, audioCodecId, audioDataPath params
- ServiceReticulumProtocol: handle large audio via temp file (same as image pattern)
- ReticulumServiceBinder: pass audio params through to Python callAttr

Large audio (>FILE_TRANSFER_THRESHOLD) is written to a temp file and
passed as audioDataPath to bypass Android Binder IPC size limits,
mirroring the existing imageDataPath pattern.
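
A rough sketch of the temp-file handoff described above, in Python for brevity (the PR does this in Kotlin inside ServiceReticulumProtocol). The threshold value, the helper name `prepare_audio_for_ipc`, and the file naming are illustrative assumptions, not the PR's actual code.

```python
import os
import tempfile

# Placeholder value for illustration only; the PR refers to a constant
# named FILE_TRANSFER_THRESHOLD without stating its value here.
FILE_TRANSFER_THRESHOLD = 512 * 1024


def prepare_audio_for_ipc(audio: bytes, threshold: int = FILE_TRANSFER_THRESHOLD,
                          tmp_dir=None):
    """Return (inline_bytes, path); exactly one of the two is not None.

    Small payloads are passed inline. Larger ones are written to a temp
    file so that only the path has to cross the IPC boundary.
    """
    if len(audio) <= threshold:
        return audio, None
    fd, path = tempfile.mkstemp(prefix="audio_", suffix=".bin", dir=tmp_dir)
    with os.fdopen(fd, "wb") as handle:
        handle.write(audio)
    return None, path
```

The receiver is then responsible for reading and deleting the file, so the sender does not need to track its lifetime.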
- Add hasAudioField() to detect LXMF field 7 audio (array format)
  distinguishing from legacy location data (string format)
- Add AudioMetadata data class with codecId, durationMs, waveform
- Add extractAudioBytes() for raw audio data extraction from field 7
- Add extractAudioMetadata() to read codec_id and optional waveform peaks
- Update toMessageUi() to populate hasAudioAttachment and audio metadata
- Preserve fieldsJson when hasAudio is true for lazy byte loading
Send path:
- Add audio_data, audio_codec_id, audio_data_path params to
  send_lxmf_message_with_method()
- Large audio files (via audio_data_path) are read from disk and temp
  file deleted after reading (same pattern as image_data_path)
- Audio bytes packed into fields[FIELD_AUDIO] = [codec_id, audio_bytes]
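
The send-path steps above could look roughly like the following. This is a sketch under assumptions: `resolve_audio` and `add_audio_field` are hypothetical helper names, and `FIELD_AUDIO = 7` is taken from the "LXMF field 7" wording elsewhere in this PR rather than imported from the library.

```python
import os

FIELD_AUDIO = 7  # assumed from the PR's "LXMF field 7" wording


def resolve_audio(audio_data=None, audio_data_path=None):
    """Return audio bytes from the inline value or from a temp file.

    When a path is given, the file is read and then removed, including
    when the read itself raises.
    """
    if audio_data_path:
        try:
            with open(audio_data_path, "rb") as handle:
                audio_data = handle.read()
        finally:
            if os.path.exists(audio_data_path):
                os.remove(audio_data_path)
    return audio_data


def add_audio_field(fields, codec_id, audio_data=None, audio_data_path=None):
    """Pack audio as fields[FIELD_AUDIO] = [codec_id, audio_bytes]."""
    audio = resolve_audio(audio_data, audio_data_path)
    if audio:
        fields[FIELD_AUDIO] = [codec_id, audio]
    return fields
```

When neither `audio_data` nor `audio_data_path` is supplied, `fields` is returned unchanged, which matches the "nullable with defaults" approach used on the Kotlin side.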

Receive path:
- Disambiguate field 7 before attempting legacy location parse
- Audio data arrives as [codec_id, audio_bytes] (list with string first)
- Legacy location data arrives as bytes/string containing JSON
- Type check prevents audio from being misinterpreted as location JSON
  and vice versa, ensuring both features coexist without breakage
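
The type check in the receive path can be sketched as follows. `classify_field_7` is a hypothetical name, and the assumption that legacy location JSON decodes to an object (a dict) is mine, not stated in the PR.

```python
import json


def classify_field_7(value):
    """Classify a field 7 payload.

    Returns ("audio", (codec_id, audio_bytes)), ("location", dict),
    or ("unknown", None).
    """
    # Audio arrives as a list whose first element is the codec id string.
    if isinstance(value, (list, tuple)):
        if (len(value) >= 2 and isinstance(value[0], str)
                and isinstance(value[1], (bytes, bytearray))):
            return "audio", (value[0], bytes(value[1]))
        return "unknown", None
    # Legacy location data arrives as bytes or a string containing JSON.
    if isinstance(value, (bytes, bytearray, str)):
        text = value if isinstance(value, str) else value.decode("utf-8", errors="replace")
        try:
            parsed = json.loads(text)
        except ValueError:
            return "unknown", None
        if isinstance(parsed, dict):
            return "location", parsed
    return "unknown", None
```

Checking the container type before attempting a JSON parse means audio bytes are never fed to the location parser, and a location string is never indexed as if it were a list.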
- Add extractAudioBytes/extractAudioMetadata imports from MessageMapper
- Add audioData, audioCodecId, audioWaveform params to buildFieldsJson()
- Pack Field 7 as JSONArray ["codec_id", "hex"] with optional [waveform]
- Large audio (>512KB) uses _file_ref pattern to avoid OOM
- Declare audio vars in sendMessage() (null placeholders, Phase 8 wires them)
- Pass audioData/audioCodecId to sendLxmfMessageWithMethod() in sendMessage()
- Add audioData/audioCodecId/audioWaveform params to handleSendSuccess()
- handleSendSuccess() stores audio in local fieldsJson via buildFieldsJson()
- retryFailedMessage() extracts audio via extractAudioBytes/extractAudioMetadata
- retryFailedMessage() passes extracted audio to sendLxmfMessageWithMethod()
- All audio params nullable with null defaults, no breaking changes to callers
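
To illustrate the Field 7 JSON layout described above (`["codec_id", "hex"]` with an optional waveform array), here is a small round-trip sketch in Python. The real `buildFieldsJson()` is Kotlin; the top-level key `"7"`, the position of the waveform as a third element, and both function names are assumptions made for this example.

```python
import json


def build_audio_field_json(codec_id, audio, waveform=None):
    """Serialize audio as {"7": [codec_id, hex_string, [waveform]?]}."""
    entry = [codec_id, audio.hex()]
    if waveform is not None:
        entry.append(list(waveform))
    return json.dumps({"7": entry})


def parse_audio_field_json(fields_json):
    """Inverse of build_audio_field_json; returns None if no audio entry."""
    entry = json.loads(fields_json).get("7")
    if not isinstance(entry, list) or len(entry) < 2:
        return None
    waveform = entry[2] if len(entry) > 2 else None
    return entry[0], bytes.fromhex(entry[1]), waveform
```

Hex encoding doubles the stored size of the audio, which is presumably why payloads above the 512KB limit use the `_file_ref` pattern instead of inline data.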

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- 3/3 plans executed across 2 waves
- PROTO-01 through PROTO-05 requirements marked Complete
- Phase 7 verified: 5/5 must-haves confirmed against codebase

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…heck

- Mirrors MediaPermissionManager pattern with single hasPermission(context) method
- Checks ContextCompat.checkSelfPermission for RECORD_AUDIO
- Permission launcher to be added in UI layer (Plan 02)
- AudioRecord captures PCM at 24kHz mono, encodes to Opus VOICE_MEDIUM
- 2-byte big-endian length-prefixed Opus frames accumulated in ByteArrayOutputStream
- 300ms minimum duration: shorter recordings return null from stop() (silent discard)
- 30s auto-stop via coroutine delay(MAX_DURATION_MS)
- Audio focus AUDIOFOCUS_GAIN_TRANSIENT_EXCLUSIVE acquired before recording, released in stop()
- Audio focus released BEFORE 300ms discard check (covers both send and discard paths)
- stop() is suspend fun dispatching AudioRecord.stop()/release() to Dispatchers.IO (ANR safety)
- Waveform peak amplitudes captured per frame for visualization
- StateFlow<RecordingUiState> for observable recording state
- VoiceRecording data class with audioBytes, codecId, durationMs, waveformPeaks
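
The framing and minimum-duration rules listed above can be sketched like this. The sketch is in Python rather than Kotlin, treats the Opus frames as opaque byte strings, and uses hypothetical names (`pack_frames`, `unpack_frames`, `finish_recording`); only the 2-byte big-endian prefix and the 300ms cutoff come from the commit message.

```python
import struct

MIN_DURATION_MS = 300  # from the commit message


def pack_frames(frames):
    """Concatenate frames, each preceded by a 2-byte big-endian length."""
    out = bytearray()
    for frame in frames:
        if len(frame) > 0xFFFF:
            raise ValueError("frame too large for a 2-byte length prefix")
        out += struct.pack(">H", len(frame))
        out += frame
    return bytes(out)


def unpack_frames(blob):
    """Split a length-prefixed blob back into individual frames."""
    frames = []
    offset = 0
    while offset < len(blob):
        if offset + 2 > len(blob):
            raise ValueError("truncated length prefix")
        (length,) = struct.unpack_from(">H", blob, offset)
        offset += 2
        if offset + length > len(blob):
            raise ValueError("truncated frame")
        frames.append(blob[offset:offset + length])
        offset += length
    return frames


def finish_recording(frames, duration_ms):
    """Return packed audio, or None for recordings under the minimum."""
    if duration_ms < MIN_DURATION_MS:
        return None
    return pack_frames(frames)
```

The length prefix is needed because Opus frames vary in size, so a decoder cannot split the stream at fixed offsets.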
…params

- Add VoiceMessageViewModel (HiltViewModel) with startRecording/stopRecording
  delegating to VoiceMessageRecorder
- Add optional voiceRecording parameter to MessagingViewModel.sendMessage()
- Wire audioBytes/codecId/waveformPeaks from VoiceRecording to send pipeline
- Add voice-only message guard: bypass validateAndSanitizeContent with " "
  when audio present but no text/image/files (matches image-only pattern)
…n handling

- Add VoiceMessageViewModel + recording state to MessagingScreen
- Add RECORD_AUDIO permission launcher (requests on first mic press)
- Add mic/send button toggle in MessageInputBar: mic when empty, send when
  text/attachments present
- Mic button uses detectTapGestures onPress/tryAwaitRelease for hold gesture
- onMicRelease launches coroutine to call suspend stopRecording() and sends
  VoiceRecording through sendMessage pipeline
- Wire hasTextOrAttachments, isRecording, onMicPress, onMicRelease params
Extract helpers from sendMessage/buildFieldsJson/sendLxmfMessageWithMethod
to satisfy LongMethod, LongParameterList, and CyclomaticComplexMethod
thresholds. Fix ktlint parse error in MessageUi KDoc comment.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…nd bubble rendering

- Fix mic button gesture: replace IconButton with Box so pointerInput
  receives press/release events (IconButton's internal clickable was
  swallowing them)
- Fix recomposition killing gesture: keep mic button in composition
  during recording so tryAwaitRelease() completes
- Add recording indicator: pulsing red dot + duration timer replaces
  text field during active recording
- Add preview bar after recording: play/pause, waveform with progress,
  discard (trash), and send buttons
- Add VoiceMessagePlayer: decodes length-prefixed Opus frames via
  LXST-kt Opus decoder, plays through AudioTrack MODE_STATIC
- Add VoiceMessageBubble: play/pause, waveform, duration in message
  bubbles for sent/received voice messages
- Extract shared WaveformBar composable for preview and bubble reuse
- Hide text content for voice-only messages (content is " " for
  Sideband compatibility)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@torlando-tech torlando-tech force-pushed the feat/phase-8-voice-recording branch from c9fe3ef to b462e15 Compare March 11, 2026 23:09

Development

Successfully merging this pull request may close these issues.

Add Voice Messages

1 participant