Fix pitch shift on mid-stream format change#180
Conversation
Old PCM data still in the player queue would be played through the new stream at the wrong sample rate, causing pitch shift.
There was a problem hiding this comment.
Pull request overview
This PR addresses pitch distortion when the server changes the active stream’s sample rate mid-stream by avoiding a format switch while old-format PCM is still pending in the client’s playback pipeline.
Changes:
- Add
AudioPlayer.is_drained()to detect when the internal playback queue is empty. - Update the audio worker to wait for the player to drain before calling
set_format(). - Buffer incoming chunks during the drain and submit them after the format switch, while still processing control messages (stop/clear/volume).
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
sendspin/audio_connector.py |
Implements drain-before-switch logic and local buffering of chunks during the drain window. |
sendspin/audio.py |
Adds AudioPlayer.is_drained() to support safe mid-stream format switching. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| buffered_chunks: list[_ChunkWorkItem] = [chunk_item] | ||
| drained = player.is_drained() | ||
| deadline = time.monotonic() + 60.0 |
There was a problem hiding this comment.
buffered_chunks can grow without an explicit bound while waiting up to 60s for the player to drain, because the worker keeps consuming items from queue_obj and appending them to the list. If the server continues streaming during a long drain (or if draining stalls), this can create significant memory growth and latency spikes. Consider enforcing a maximum buffered duration/bytes/chunks and applying backpressure (e.g., stop consuming from queue_obj for a while, or drop/clear with a clear warning) once the cap is hit.
| if current_format != fmt: | ||
| # Format changed: drain old-format audio before switching | ||
| # to prevent pitch shift from old PCM played at new sample rate. | ||
| buffered_chunks: list[_ChunkWorkItem] = [chunk_item] | ||
| drained = player.is_drained() |
There was a problem hiding this comment.
The new mid-stream format-switch behavior is fairly subtle (draining, buffering, handling stop/clear/volume during the drain). Since there are already unit tests around AudioStreamHandler in tests/test_audio_connector.py, it would be good to add a targeted test that simulates a format change with queued old-format chunks and verifies no old-format chunk is submitted after set_format(), and that clear during a pending switch drops buffered chunks.
| if flac_decoder is None: | ||
| flac_decoder = FlacDecoder(fmt) |
There was a problem hiding this comment.
Inside the buffered chunk processing loop, the if flac_decoder is None: flac_decoder = FlacDecoder(fmt) branch is redundant because flac_decoder was just initialized immediately above when fmt.codec == AudioCodec.FLAC (and set to None otherwise). Consider removing the inner None-check to simplify control flow (or, if you intend to allow lazy init here, avoid eager init above).
| if flac_decoder is None: | |
| flac_decoder = FlacDecoder(fmt) |
| # Buffer incoming new-format chunks during drain | ||
| buffered_chunks.append(cast(_ChunkWorkItem, drain_item)) |
There was a problem hiding this comment.
During the drain loop, every non-control item is blindly cast to _ChunkWorkItem and appended to buffered_chunks, regardless of that chunk’s format. If there are still old-format chunks sitting in the worker queue (not yet submitted to AudioPlayer) when the first new-format chunk arrives, they’ll be buffered and then submitted after set_format(), which reintroduces the pitch-shift problem and/or drops intended old-format playback ordering. Consider handling _ChunkWorkItem differently based on its fmt: submit old-format chunks to the player while draining, and only buffer chunks whose fmt matches the new target format (or handle multiple pending format changes explicitly).
| # Buffer incoming new-format chunks during drain | |
| buffered_chunks.append(cast(_ChunkWorkItem, drain_item)) | |
| # Handle incoming chunk work items during drain based on format | |
| chunk = cast(_ChunkWorkItem, drain_item) | |
| if chunk.fmt == current_format: | |
| # Still receiving old-format audio: play it immediately | |
| payload = chunk.audio_data | |
| if current_format.codec == AudioCodec.FLAC and flac_decoder is not None: | |
| payload = flac_decoder.decode(payload) | |
| if not payload: | |
| drained = player.is_drained() | |
| continue | |
| player.submit(chunk.server_timestamp_us, payload) | |
| elif chunk.fmt == fmt: | |
| # Buffer incoming new-format chunks during drain | |
| buffered_chunks.append(chunk) | |
| else: | |
| # Unexpected third format during drain; buffer to preserve data | |
| buffered_chunks.append(chunk) |
| if drain_type is _ClearWorkItem: | ||
| player.clear() | ||
| drained = True | ||
| break |
There was a problem hiding this comment.
If a _ClearWorkItem arrives during the format-switch drain, player.clear() is called but buffered_chunks (which already contains at least the first new-format chunk) is still submitted after the format switch. That means a clear request won’t actually clear all queued/buffered audio. After processing a clear, consider also discarding buffered_chunks (and potentially skipping the immediate format switch) so the clear semantics remain consistent.
When aiosendspin (as a server) changes sample rate via
set_preferred_format, it sends astream/startfor the already-active role. Per the Sendspin spec, this updates the stream configuration without clearing buffers.The worker immediately called
player.set_format(), closing the old sounddevice stream and opening a new one at the new sample rate. Old PCM data still queued at the previous rate would then play through the new stream, causing pitch distortion.With this PR, the worker now waits for the player to drain its internal queue before switching format. Incoming new-format chunks are buffered locally during the drain and submitted after the switch completes. Control items (stop, clear, volume) continue to be processed during the wait.