Conversation
|
davidzhao
left a comment
There was a problem hiding this comment.
this looks great! just a few comments.
we'll also need some error handling in various parts.. how do both sides handle cases where the other side is disconnected. if the avatar participant is gone for longer than a reasonable timeout, then the agent would likely need to report that error and shutdown itself.
similarly.. if the controller is gone, the service on the other side might want to avoid consuming resources and also exit
|
|
||
| AUDIO_SENDER_ATTR = "__livekit_avatar_audio_sender" | ||
| AUDIO_RECEIVER_ATTR = "__livekit_avatar_audio_receiver" | ||
| RPC_INTERRUPT_PLAYBACK = "__livekit_avatar_interrupt_playback" |
There was a problem hiding this comment.
nit: for consistency, using lk. namespace to identify livekit specific actions
| RPC_INTERRUPT_PLAYBACK = "__livekit_avatar_interrupt_playback" | |
| RPC_INTERRUPT_PLAYBACK = "lk.interrupt_playback" |
| async def start(self) -> None: | ||
| """Wait for worker participant to join and start streaming""" | ||
| # mark self as sender | ||
| await self._room.local_participant.set_attributes({AUDIO_SENDER_ATTR: "true"}) |
There was a problem hiding this comment.
was thinking we can simplify this step. instead the receiver could just wait for an audio stream of a particular name?
| """Wait for worker participant to join and start streaming""" | ||
| # mark self as sender | ||
| await self._room.local_participant.set_attributes({AUDIO_SENDER_ATTR: "true"}) | ||
| self._remote_participant = await wait_for_participant( |
There was a problem hiding this comment.
what if.. instead of waiting for an attribute, we could:
- take
avatar_identityas a param in the sink (with a sane default) - create a token for that identity and send it to the other side as part of initial handshake
- here we can just wait for that agreed-upon identity
There was a problem hiding this comment.
sounds good! on the avatar side, it wait for the audio stream with a particular name from the participant with kind=='agent'.
| # start new stream | ||
| # TODO: any better option to send the metadata? | ||
| name = f"audio_{frame.sample_rate}_{frame.num_channels}" | ||
| self._stream_writer = await self._room.local_participant.stream_file( |
There was a problem hiding this comment.
this is a good use of stream extensions:
writer = await room.local_participant.stream_file("audio",
extensions={"sample_rate": "48000", "channels": "1"})
or
writer = await room.local_participant.stream_file("audio",
extensions={"audio_settings": json.dumps({"sample_rate": 48000, channels: 1})})
There was a problem hiding this comment.
Oh I see. the extensions is some kind of metadata? then what is the reason it is named extensions?
There was a problem hiding this comment.
yeah.. I think attributes is probably a better name
| # mark self as receiver | ||
| await self._room.local_participant.set_attributes({AUDIO_RECEIVER_ATTR: "true"}) | ||
|
|
||
| self._remote_participant = await wait_for_participant( |
There was a problem hiding this comment.
it seems here we can just wait for participant.kind == agent?
if we wanted to handle multiple avatars in the room, then the integration should take in the controller's identity.
| reader: rtc.FileStreamReader, remote_participant_id: str | ||
| ) -> None: | ||
| if remote_participant_id != self._remote_participant.identity: | ||
| logger.warning( |
There was a problem hiding this comment.
would we really want to warn on any other incoming file stream? that seems like a rather narrow use case for this plugin
There was a problem hiding this comment.
oh I see, I'll filter for the audio stream first, so other data streams can still be processed by other handlers.
Btw, what is the use case of the file_name in data stream, can I pass a tag and the metadata like sample_rate and num_channels using file name, or is there any better option for this.
There was a problem hiding this comment.
see @davidzhao's comment above, the best option is the extensions map on the stream.
| reader = self._stream_readers.pop(0) | ||
| async for data in reader.stream_reader: | ||
| yield rtc.AudioFrame( | ||
| data=data, |
There was a problem hiding this comment.
this pattern would suggest that we're sure a single audio frame never exceeds STREAM_CHUNK_SIZE (~15kb)
There was a problem hiding this comment.
For audio bytes, splitting large chunks into smaller chunks before sending is fine, and should be less than 15kb. but for other use cases, receiving a different number of chunks than it send may not be good behavior. Maybe add a size limit at the send side?
There was a problem hiding this comment.
I originally had a size limit in there, @theomonnom's wish was that we wouldn't enforce such a limit, but I agree, it might make things trickier if we don't have a sender side size limit
| ) -> None: | ||
| """Wait for worker participant to join and start streaming""" | ||
| # create a token for the avatar worker | ||
| # TODO(long): do we need to set agent=True here? in playground if not the video track is not automatically displayed |
There was a problem hiding this comment.
we might want a new kind to distinguish but we probably shouldn't join with the standard participant kind
| .with_name("Avatar Worker") | ||
| .with_grants(api.VideoGrants(room_join=True, room=ctx.room.name, agent=True)) | ||
| .with_metadata("avatar_worker") | ||
| .to_jwt() |
There was a problem hiding this comment.
we should specify the identity of the "original" agent as well so that the integrator can add a guard on their RPC handlers methods (if data.caller_identity !== agent_identity: print("RPC call from unexpected participant"); return)
| logger.info(f"Sending connection info to avatar dispatcher {avatar_dispatcher_url}") | ||
| connection_info = AvatarConnectionInfo( | ||
| room_name=ctx.room.name, url=ctx._info.url, token=token | ||
| ) |
There was a problem hiding this comment.
we should specify the identity of the "original" agent as well so that the integrator can add a guard on their RPC handlers methods (if data.caller_identity !== agent_identity: print("RPC call from unexpected participant"); return)
There was a problem hiding this comment.
This makes sense! I added the check in the rpc hanlders.
| async def _read_stream(): | ||
| async for event in self._audio_stream: | ||
| yield event.frame | ||
| await asyncio.sleep(0) |
There was a problem hiding this comment.
any reason why this is needed?
There was a problem hiding this comment.
It was added for debugging an event loop issue. Will remove it.
Co-authored-by: David Zhao <dz@livekit.io>
Uh oh!
There was an error while loading. Please reload this page.