migrate mww to esphome_audio, bring back volume control and media_player #39
…o chips on board and have media_player functionality
…the needed on on_tts_stream_end when we use speaker
Great work! I've added some comments regarding code quality and naming conventions and I'm testing the config today (hopefully we can get some more beta testers on board to see how it goes).
Thanks for the feedback! Fixed most of it. Left you a comment above too. |
@tetele implemented the stopping of media player on wake word detection. Separately, I have started another branch here that starts to implement audio ducking, i.e. the music would play in the background but softer while the device/system processes the full request, with the end goal being that it would resume playing where it left off after the request is fully done. As I got into this I realized it was going to be a little complex, and I might need to make some changes to media_player in esphome to support it, so I will work on this separately if I get time. |
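For anyone following along, the ducking idea described above can be sketched as a tiny piece of state logic. This is only an illustration in Python, not the code on that branch and not anything in esphome's media_player; the class name, the `duck_factor` parameter and the 0.0 to 1.0 volume scale are all assumptions.

```python
class DuckingVolume:
    """Hypothetical helper: lower the volume during a request, then put it back."""

    def __init__(self, volume=0.5, duck_factor=0.3):
        self.volume = volume              # current playback volume, 0.0 to 1.0 (assumed scale)
        self._duck_factor = duck_factor   # fraction of the original volume kept while ducked
        self._saved = None                # volume to return to, set only while ducked

    def duck(self):
        # Called when the wake word is detected. Ducking twice must not
        # overwrite the saved volume with the already lowered one.
        if self._saved is None:
            self._saved = self.volume
            self.volume = self._saved * self._duck_factor

    def restore(self):
        # Called once the full request (including the TTS reply) is done.
        if self._saved is not None:
            self.volume = self._saved
            self._saved = None
```

The guard in `duck()` matters if a second wake word fires while a request is still in flight; without it the original volume would be lost.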
I doubt it will be easy to do that on the ESPHome That said, I wanted to thank you for your contributions! You, sir, have been amazing! With my current limited availability, I don't even know when I would have been able to test and implement the ADF I'd like to get at least some testing feedback on this implementation and then merge it to the main config. I've found an issue with the volume and reported it upstream. There's a workaround, but I'd like for that change to be merged before proceeding, apart from some testing. If all goes well, I'd also like to port this implementation to the non-MWW config as a longer term plan. None of that would have been possible without your help ❤️ |
Also pushed a method to restore the volume between reboots, as it was driving me crazy. Not sure if there is a better way to do it. Ya, I noticed the speaker sounds bad over 50% volume as well. But strangely only for music, not TTS (for me, at least). So I do wonder whether there is something else going on. Does TTS sound poor above 50% for you, or just music? Lastly, re: audio ducking, I agree, there need to be changes in media_player. I plan to look at this when I get time, upstream in esphome, and perhaps in voice_assistant, too. Might be a fun challenge, but my goal is to fully replace Siri in my house, one day! :p |
Both TTS and media sound just as loud above 50%. Music does sound a bit poorer, but maybe TTS does too, it's just that it's harder to trace |
Can you please hit me up on Discord @cowboyrushforth ? I'm @tetele on both the HA and ESPHome servers |
This reverts commit 76a103f.
Rolled that last commit back. Sent you a DM on Discord. I am on the ESPHome Discord under spot or spotman_, cheers |
I've been following this and porting to non-mww. Hit me up (@discord) if I can help out on that front. 👍 |
That took a legitimately long time to compile, haha, but super awesome work!!! I could successfully test microwakeword with audio control and media player working!!! Also played a bit with adding a beep, and this works really, really well.
The beep is a little sub-500 ms clip and is currently hosted on some rpi in my network, but I am wondering if there is a more elegant way to do this... any ideas? Greetings, and thanks for the amazing work, I will flash this to all 5 onjus I am running as soon as I get to it =D |
I'm having some issues with either MWW or the I2S audio - wake word recognition simply stops after a long enough time. It works after the MWW detection is turned off and on again. So far I haven't been able to trace the root cause of the issue, but I will keep trying. Does anybody else get that? |
Hm... I've had it running since my comment yesterday, and it still works fine here (running esphome 2023.4 btw). I did have an issue like that on my RaspiAudio MuseProto boards that I could never fully fix... |
Re: sound quality. From https://github.com/gnumpi/esphome_audio?tab=readme-ov-file#i2s-settings:
Probably just need to set this to 48000 for music playback. blah blah blah... edit: Never mind. I see you're way ahead of me, and the microphone fails to work with the resampler. So 16000 Hz it is for now. |
So I have 3 Onju devices set up. This happens in one location regardless of which device is in that location. In this location the device is closer to another device. I am curious whether this could be caused by two Onju devices hearing the wake word at once, something getting into a weird state, and neither actually responding. To expand: I took one device that experienced this frequently to another location (downstairs, far away from any other device) and it has never once missed the wake word there. So I started to look through the code, as I read somewhere that HA supports only a single device actually responding, but I can't find any support for this, so the whole thing remains a mystery. |
That is true, I hadn't thought to check this, but it would not explain 1. the fact that toggling the wake word makes it work immediately and 2. the fact that until the toggle it doesn't trigger at all, regardless of how well isolated it is from other devices (i.e. closed doors etc.). |
Seems like your issue is not related to that. Often when I trigger 2 devices at once, one of them does not respond, which is useful sometimes... but mostly feels buggy. Even worse, I had cases where the second device would repeat the question (!!!) and answer of the first again. This is especially strange, as I do not understand how the question gets into the TTS pipeline again at all. |
Perhaps, but perhaps not. Would love to find the source code for whatever functionality HA has for only-one-device-detecting-wake-word at one time. Regarding 1: perhaps toggling the wake word resets something critical? I wouldn't say that rules this out at all. |
Got some info from Discord. For MWW, the code HA uses to stop concurrent requests is here: https://github.com/home-assistant/core/blob/dev/homeassistant%2Fcomponents%2Fassist_pipeline%2Fpipeline.py#L1364-L1381 After expanding my log level for assist_pipeline, I do see this is triggered for me on the trouble unit. Will continue debugging when I get more time. My gut feeling is that there are maybe multiple ways the system can get into a weird state. |
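To make the discussion concrete, here is a minimal sketch of a cooldown-style duplicate check of the kind being discussed: a second detection of the same phrase within a short window is rejected. It is not a copy of the linked HA code; the class, the exception name and the 2 second default are assumptions chosen for illustration.

```python
import time


class DuplicateWakeUpError(Exception):
    """Hypothetical error raised when a detection falls inside the cooldown."""


class WakeWordCooldown:
    def __init__(self, cooldown_seconds=2.0, clock=time.monotonic):
        self._cooldown = cooldown_seconds   # assumed window, in seconds
        self._clock = clock                 # injectable so the logic can be tested
        self._last_wake_up = {}             # phrase -> time of last accepted detection

    def register(self, phrase):
        """Accept a detection, or raise if the same phrase was just accepted."""
        now = self._clock()
        last = self._last_wake_up.get(phrase)
        if last is not None and now - last < self._cooldown:
            # Another device (or the same one) already woke up for this phrase.
            raise DuplicateWakeUpError(phrase)
        # Only accepted detections move the timestamp forward.
        self._last_wake_up[phrase] = now
```

With a check like this, when two nearby devices hear the same phrase, whichever reports second within the window gets rejected, which would match one device staying silent.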
Small update here..
a. upgraded to ESPHome 2024.4.1
The intention of this yaml is just to make the lights red if the connection to HA fails. When I restart HA I do see the lights go to red then back to normal. |
I'm seeing a decryption failure with this PR when trying to play valid media on my Onju voice devices (tested with TTS, but also any responses). I recall that responses worked at some point, but have updated to the ESPHome betas since then.
If I stick the audio URL in a browser, it plays the expected audio. I don't see anything in the HA logs when this happens. |
@rccoleman can you double check that the encryption key in the config is the same as that stored in HA config entry? |
The
If the API key was wrong, I would have expected much bigger issues like HA not being able to talk to the device at all. This is isolated to media playback, as far as I can tell. I believe that I even removed devices and re-added them to ensure they sync properly, but it didn't help. Edit: Weird, it's just that one Onju voice device. The other four that I have running the same ESPHome config work fine. I changed the key (it was actually a dup of a key for another device), removed it from HA, rebuilt/reuploaded the firmware, re-added to HA with the new key, and now it's working. 🤷 |
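Since a duplicated key turned up here, a quick sanity check across devices might be worth having, although it isn't established that the duplicate was the actual cause. Below is a rough sketch, assuming you have already collected the keys into a dict of device name to key by hand; the function names are made up. ESPHome API encryption keys are base64-encoded 32-byte values, which is what the first helper checks.

```python
import base64
from collections import defaultdict


def is_valid_key(key):
    """Return True if key is valid base64 that decodes to exactly 32 bytes."""
    try:
        return len(base64.b64decode(key, validate=True)) == 32
    except (ValueError, TypeError):
        # binascii.Error is a ValueError subclass; TypeError covers non-string input.
        return False


def find_shared_keys(keys_by_device):
    """Return {key: [devices]} for every key used by more than one device."""
    devices_by_key = defaultdict(list)
    for device, key in keys_by_device.items():
        devices_by_key[key].append(device)
    return {k: sorted(v) for k, v in devices_by_key.items() if len(v) > 1}
```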
Is there anything preventing this from being merged into the main branch? |
Some minor comments from testing this: It works nicely most of the time, but sometimes I needed to do a fully clean build and re-upload to get stuff working; not sure what is happening there. Meanwhile I added an HTTP MP3 link served by Home Assistant to the on_wakeword_detected. This works fine, but it is too slow when using HTTPS... so I downgraded to HTTP (hosting it somewhere other than my HA instance now, using a simple HTTP server)
|
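For reference, a "simple http server" for a beep clip can be as small as the sketch below, using only the Python standard library. The directory, host and port are placeholders, and the helper name is made up; this is not necessarily what was used above. It serves plain HTTP with no TLS or authentication, so it only makes sense on a trusted local network.

```python
import functools
import http.server


def make_beep_server(directory, host="0.0.0.0", port=8000):
    """Build an HTTP server that serves static files (e.g. beep.mp3) from directory."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler,
        directory=directory,  # serve this folder instead of the current working dir
    )
    # Threaded so that one slow client does not block another device's request.
    return http.server.ThreadingHTTPServer((host, port), handler)
```

Calling `make_beep_server("./sounds").serve_forever()` would then expose a file such as `./sounds/beep.mp3` at `http://<host>:8000/beep.mp3`.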
I still have a lot of issues with mine, which leave very few traces in the debug log. However, since we have the option of leaving the MWW version marked as experimental, I'm going to merge it into |
NOTE - this is using some undocumented features of esphome_audio, thanks to its author pointing them out. (gain_log2: 3)
So, perhaps we make this another yaml as the previous one with "speaker" component is fairly stable, for me anyways. Open to suggestions.
Ultimately, this probably needs a lot of testing. I did have an issue with it once I added volume controls back in that seemed to vanish on its own, but it's nice to have media_player back, and you can even detect a wake word while a song is playing. However, I can't seem to find the VAD timeout setting for microwakeword, so it appears to take a while to detect the wake word if there is a song playing, but it does work. More nuance on this one too: once the TTS response comes back, it will stop the media player to play it, which makes sense, but we may want to configure it to resume from where it was in the previous file, if it was playing, perhaps.