migrate mww to esphome_audio, bring back volume control and media_player #39

cowboyrushforth · 2024-04-18T01:50:00Z

NOTE - this is using some undocumented features of esphome_audio, thanks to its author pointing them out. (gain_log2: 3)

So, perhaps we make this another yaml as the previous one with "speaker" component is fairly stable, for me anyways. Open to suggestions.

Ultimately, this probably needs a lot of testing. I did have an issue once I added volume controls back in, but it seemed to vanish on its own. It's nice to have media_player back, and you can even detect a wake word while a song is playing. However, I can't seem to find the VAD timeout setting for microwakeword, so it appears to take a while to detect the wake word if a song is playing, but it does work. There is more nuance on this one too: once the TTS response comes back, it stops the media player to play the response, which makes sense, but we may want to configure it to resume the previous file from where it left off, if something was playing.

tetele

Great work! I've added some comments regarding code quality and naming conventions and I'm testing the config today (hopefully we can get some more beta testers on board to see how it goes).

cowboyrushforth · 2024-04-18T15:14:13Z

Thanks for the feedback! Fixed most of it. Left you a comment above too.

cowboyrushforth · 2024-04-19T06:02:54Z

@tetele implemented the stopping of media player on wake word detection.

Separately, I have started another branch that begins to implement audio ducking, i.e. the music would keep playing in the background but softer while the device/system processes the full request, with the end goal being that it resumes where it left off once the request is fully done. As I got into this I realized it was going to be a little complex, and I might need to make some changes to media_player in esphome to support it, so I will work on this separately if I get time.

tetele · 2024-04-19T06:37:00Z

I doubt it will be easy to do that on the ESPHome media_player or that it will be implemented any time soon. It would be lovely, but certainly not trivial (from the ESPHome point of view; from this config's point of view, it just needs upstream support).

That said, I wanted to thank you for your contributions! You, sir, have been amazing! With my current limited availability, I don't even know when I would have been able to test and implement the ADF media_player in the MWW config.

I'd like to get at least some testing feedback on this implementation and then merge it to the main config. I've found an issue with the volume and reported it upstream. There's a workaround, but I'd like for that change to be merged before proceeding, apart from some testing.

If all goes well, I'd also like to port this implementation to the non-MWW config as a longer term plan. None of that would have been possible without your help ❤️

cowboyrushforth · 2024-04-19T06:47:36Z

Also pushed a method to restore the volume between reboots as it was driving me crazy. Not sure if there is a better way to do it.
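For anyone wondering what persisting the volume might look like, here is a minimal sketch. It is not necessarily what the commit above did. It assumes the media player id is `onju_out` (as used elsewhere in this thread), that `media_player.volume_set` accepts a templated volume, and that the ADF media player is ready by the time a low-priority `on_boot` runs. The `saved_volume` id and its entity name are hypothetical.

```yaml
# Sketch only: a template number that stores the last volume in flash
# and re-applies it to the media player after a reboot.
number:
  - platform: template
    id: saved_volume            # hypothetical id, not from the PR
    name: "Saved volume"        # hypothetical entity name
    min_value: 0
    max_value: 1
    step: 0.05
    initial_value: 0.5
    optimistic: true
    restore_value: true         # keep the value across reboots
    set_action:
      - media_player.volume_set:
          id: onju_out
          volume: !lambda return x;

esphome:
  on_boot:
    priority: -100              # run late, after components are set up
    then:
      - media_player.volume_set:
          id: onju_out
          volume: !lambda return id(saved_volume).state;
```

Note that this only remembers volume changes made through the number entity; a change made directly on the media_player entity would not be saved unless it is also written back to the number.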

Ya, I noticed the speaker sounds bad over 50% of volume as well. But strangely only for music, not TTS. (for me, at least). So I do wonder whether there is something else going on.. Does TTS sound poor above 50% for you, or just music?

Lastly, re: audio ducking: I agree, there need to be changes in media_player. I plan to look at this upstream in esphome when I get time, and perhaps in voice_assistant too. Might be a fun challenge, but my goal is to fully replace Siri in my house one day! :p

tetele · 2024-04-19T06:55:52Z

> Does TTS sound poor above 50% for you, or just music?

Both TTS and media sound just as loud above 50%. Music does sound a bit poorer, but maybe TTS does too, it's just that it's harder to trace

tetele · 2024-04-19T06:58:35Z

Can you please hit me up on Discord @cowboyrushforth ? I'm @tetele on both the HA and ESPHome servers

cowboyrushforth · 2024-04-19T07:07:05Z

Rolled that last commit back. Sent you a DM on discord, I am on the esphome discord under spot or spotman_, cheers

tbrasser · 2024-04-19T15:37:26Z

I've been following this and porting to non-mww. Hit me up (@discord) if I can help out on that front. 👍

s00500 · 2024-04-21T20:26:20Z

That took some legitimate long compile time haha, but super awesome work!!!

Could successfully test microwakeword with audio control and media player working!!!
It works without disabling listening manually. Sound quality still seems to be missing higher frequencies... (does not bother me, just noticed it)

Also played a bit with adding a beep, this works really really well:

  on_wake_word_detected:
    - if:
        condition: media_player.is_playing
        then:
          - media_player.pause
    - media_player.play_media:
        id: onju_out
        media_url: "http://192.168.0.111:8000/beep.mp3"
    - delay: 300ms
    #- wait_until:
    #    not:
    #      media_player.is_playing: onju_out
    - voice_assistant.start:
        wake_word: !lambda return wake_word;

The beep is a little sub-500ms clip and is currently hosted on an rpi in my network, but I am wondering if there would be a more elegant way to do this... any ideas?

Greetings, and thanks for the amazing work, I will flash this out to all 5 onjus I am running as soon as I get to it =D

tetele · 2024-04-22T08:13:55Z

I'm having some issues with either MWW or the I2S audio - wake word recognition simply stops after a long enough time. It works after the MWW detection is turned off and on again.

So far I haven't been able to trace the root cause of the issue, but I will keep trying. Does anybody else get that?

s00500 · 2024-04-22T09:00:38Z

Hm... had it running since my comment yesterday, still works fine here (running esphome 2023.4 btw). I did have an issue like that on my RaspiAudio MuseProto boards that I could never fully fix...

jherby2k · 2024-04-22T13:55:30Z

Re: sound quality. From https://github.com/gnumpi/esphome_audio?tab=readme-ov-file#i2s-settings:

sample_rate (Optional, positive integer): I2S sample rate. Defaults to 16000.

Probably just need to set this to 48000 for music playback. blah blah blah...

edit: Never mind. I see you're way ahead of me, and the microphone fails to work with the resampler. So 16000 Hz it is for now.

cowboyrushforth · 2024-04-22T16:00:32Z

> I'm having some issues with either MWW or the I2S audio - wake word recognition simply stops after a long enough time. It works after the MWW detection is turned off and on again.
>
> So far I haven't been able to trace the root cause of the issue, but I will keep trying. Does anybody else get that?

So I have 3 onju devices set up. This happens in one location regardless of which device is in that location. In this location the device is closer to another device. I am curious whether this could be because 2 onju devices hear the wake word, something gets weird, and neither actually responds.

To expand though: one device that experienced this frequently was moved to another location (downstairs, far away from any other device), and it has never once missed the wake word.

So I started to look thru code as I read somewhere that HA supports only a single device actually responding, but I can't actually find any support for this, so the whole thing remains a mystery.

tetele · 2024-04-22T16:33:43Z

> So I started to look thru code as I read somewhere that HA supports only a single device actually responding

That is true, I hadn't thought to check this, but it would not explain 1. the fact that toggling the wake word makes it work immediately and 2. the fact that until the toggle it doesn't trigger at all, regardless of how well isolated it is from other devices (i.e. closed doors etc.).

s00500 · 2024-04-22T16:37:56Z

Seems like your issue is not related to that.
I have however had similar issues as @cowboyrushforth

Often when I trigger 2 devices at once one of them does not respond, which is useful sometimes... but mostly feels buggy.

Even worse, I had cases where the second device would repeat the question (!!!) and the answer of the first again. This is especially strange, as I do not understand how the question gets into the TTS pipeline again at all.

cowboyrushforth · 2024-04-22T17:18:54Z

> So I started to look thru code as I read somewhere that HA supports only a single device actually responding
>
> That is true, I hadn't thought to check this, but it would not explain 1. the fact that toggling the wake word makes it work immediately and 2. the fact that until the toggle it doesn't trigger at all, regardless of how well isolated it is from other devices (i.e. closed doors etc.).

Perhaps, but perhaps not. Would love to find the source code for whatever functionality HA has for only-one-device-detecting-wake-word at one time.

Regarding 1: perhaps toggling the wake word resets something critical? I wouldn't say that rules this out at all.
Regarding 2: for me, it only happens once the device gets into this "bad state". If the device never gets into a bad state, which for me is the case if I keep the door to this room closed, it always triggers, as long as I shut the door behind me when I enter the room.

cowboyrushforth · 2024-04-23T16:06:26Z

Got some info from discord, so for MWW, the code HA uses to stop concurrent requests is here: https://github.com/home-assistant/core/blob/dev/homeassistant%2Fcomponents%2Fassist_pipeline%2Fpipeline.py#L1364-L1381

After expanding my loglevel for assist_pipeline, I do see this is triggered for me on the trouble unit. Will continue debugging when I get more time.

My gut is that there may be multiple ways the system can get into a weird state.
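For others who want to check whether the same duplicate wake-up path is being hit, the log level for assist_pipeline can be raised in Home Assistant's `configuration.yaml`. This is a minimal sketch using the standard `logger` integration; the `default` level shown is just an assumption, keep whatever you already use.

```yaml
# Home Assistant configuration.yaml (not the ESPHome device config)
logger:
  default: warning   # assumed baseline, adjust to taste
  logs:
    # Show debug output from the Assist pipeline, including wake word handling
    homeassistant.components.assist_pipeline: debug
```

After restarting Home Assistant (or reloading the logger settings), trigger the wake word on the affected unit and look for assist_pipeline entries in the HA log.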

cowboyrushforth · 2024-04-26T17:38:11Z

Small update here..

have not been able to replicate for 2 days.. nothing has changed in my house.
the only things I have changed code wise are

a. upgraded to ESPHome 2024.4.1
b. am running this yaml change, with the intention of making it obvious if there is a wifi issue:

  on_client_connected:
    - if:
        condition:
          and:
            - switch.is_on: use_wake_word
            - binary_sensor.is_off: mute_switch
        then:
          - script.execute: reset_led
          - micro_wake_word.start:
  on_client_disconnected:
    - if:
        condition:
          and:
            - switch.is_on: use_wake_word
            - binary_sensor.is_off: mute_switch
        then:
          - light.turn_on:
              id: top_led
              blue: 0%
              red: 100%
              green: 0%
              effect: none
          - voice_assistant.stop:
          - micro_wake_word.stop:

The intention of this yaml is just to make the lights red if the connection to HA fails. When I restart HA I do see the lights go to red then back to normal.

rccoleman · 2024-04-29T17:16:07Z

I'm seeing a decryption failure with this PR when trying to play valid media on my Onju voice devices (tested with TTS, but also any responses). I recall that responses worked at some point, but have updated to the ESPHome betas since then.

[10:08:15][D][media_player:059]: 'Onju Voice Satellite Dining Room' - Setting
[10:08:15][D][media_player:066]:   Media URL: https://xxx/api/tts_proxy/b8b13f9279e4f60bbc005a4c6d66bf220dd2df68_en-us_5c97d21c48_cloud.mp3
[10:08:15][D][adf_media_player:030]: Got control call in state 1
[10:08:15][D][esp_adf_pipeline:050]: Starting request, current state STOPPED
[10:08:15][D][esp_adf_pipeline:302]: State changed from STOPPED to PREPARING
[10:08:15][I][adf_media_player:135]: got new pipeline state: 1
[10:08:15][D][adf_i2s_out:127]: Set final i2s settings: 16000
[10:08:15][D][esp_audio_processors:079]: New settings: SRC: rate: 16000, ch: 2 DST: rate: 16000, ch: 2 
[10:08:16][D][esp-idf:000]: I (7767236) AUDIO_ELEMENT: [http] AEL_MSG_CMD_RESUME,state:1

[10:08:16][D][esp-idf:000]: I (7767239) AUDIO_ELEMENT: [decoder] AEL_MSG_CMD_RESUME,state:1

[10:08:16][D][esp_aud:000]: 
ERROR Fatal error: protocol.data_received() call failed.
protocol: <aioesphomeapi._frame_helper.noise.APINoiseFrameHelper object at 0x7f06254bf060>
transport: <_SelectorSocketTransport fd=6 read=polling write=<idle, bufsize=0>>
Traceback (most recent call last):
  File "/usr/lib/python3.11/asyncio/selector_events.py", line 1009, in _read_ready__data_received
    self._protocol.data_received(data)
  File "aioesphomeapi/_frame_helper/noise.py", line 136, in aioesphomeapi._frame_helper.noise.APINoiseFrameHelper.data_received
  File "aioesphomeapi/_frame_helper/noise.py", line 163, in aioesphomeapi._frame_helper.noise.APINoiseFrameHelper.data_received
  File "aioesphomeapi/_frame_helper/noise.py", line 319, in aioesphomeapi._frame_helper.noise.APINoiseFrameHelper._handle_frame
  File "/usr/local/lib/python3.11/dist-packages/noise/state.py", line 74, in decrypt_with_ad
    plaintext = self.cipher.decrypt(self.k, self.n, ad, ciphertext)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/noise/backends/default/ciphers.py", line 13, in decrypt
    return self.cipher.decrypt(nonce=self.format_nonce(n), data=ciphertext, associated_data=ad)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "src/chacha20poly1305_reuseable/__init__.py", line 127, in chacha20poly1305_reuseable.ChaCha20Poly1305Reusable.decrypt
  File "src/chacha20poly1305_reuseable/__init__.py", line 147, in chacha20poly1305_reuseable.ChaCha20Poly1305Reusable.decrypt
  File "src/chacha20poly1305_reuseable/__init__.py", line 263, in chacha20poly1305_reuseable._decrypt_with_fixed_nonce_len
  File "src/chacha20poly1305_reuseable/__init__.py", line 273, in chacha20poly1305_reuseable._decrypt_data
cryptography.exceptions.InvalidTag
WARNING onju-voice-dr @ 192.168.1.131: Connection error occurred: onju-voice-dr @ 192.168.1.131: Invalid encryption key: received_name=onju-voice-dr
INFO Processing unexpected disconnect from ESPHome API for onju-voice-dr @ 192.168.1.131
WARNING Disconnected from API
INFO Successfully connected to onju-voice-dr @ 192.168.1.131 in 0.010s
INFO Successful handshake with onju-voice-dr @ 192.168.1.131 in 0.109s

If I stick the audio URL in a browser, it plays the expected audio. I don't see anything in the HA logs when this happens.

tetele · 2024-04-29T17:34:37Z

@rccoleman can you double check that the encryption key in the config is the same as that stored in HA config entry?

rccoleman · 2024-04-29T17:43:30Z

> @rccoleman can you double check that the encryption key in the config is the same as that stored in HA config entry?

The noise_psk key in the config_entry is the same as the one specified in the ESPHome config:

api:
  encryption:
    key: "same_as_noise_psk_key"

If the API key was wrong, I would have expected much bigger issues like HA not being able to talk to the device at all. This is isolated to media playback, as far as I can tell. I believe that I even removed devices and re-added them to ensure they sync properly, but it didn't help.

Edit: Weird, it's just that one Onju voice device. The other four that I have running the same ESPHome config work fine. I changed the key (it was actually a dup of a key for another device), removed it from HA, rebuilt/reuploaded the firmware, re-added to HA with the new key, and now it's working. 🤷

VivantSenior · 2024-05-04T06:40:52Z

Is there anything preventing this from being merged into the main branch?

s00500 · 2024-05-07T10:12:56Z

Some minor comments here on testing this: it works nicely most of the time, but sometimes I needed to do a fully clean build and re-upload to get things working. Not sure what is happening there.

Meanwhile I added an HTTP MP3 link served by Home Assistant to on_wake_word_detected. This works fine, but it is too slow when using HTTPS, so I downgraded to HTTP (now hosting it somewhere other than my HA instance, using a simple HTTP server).

micro_wake_word:
  model: okay_nabu
  probability_cutoff: 0.6
  on_wake_word_detected:
    - if:
        condition: media_player.is_playing
        then:
          - media_player.pause
    - media_player.play_media:
        id: onju_out
        media_url: "http://192.168.0.244:8000/beep.mp3"
    - delay: 300ms # tuned to length, works a bit better than waiting for state
    #- wait_until:
    #    not:
    #      media_player.is_playing: onju_out
    - voice_assistant.start:
        wake_word: !lambda return wake_word;

tetele · 2024-05-09T09:15:10Z

I still have a lot of issues with mine, which leave very few traces in the debug log.

However, since we have the option of leaving the MWW version marked as experimental, I'm going to merge it into main.

cowboyrushforth added 3 commits April 17, 2024 18:56

WIP - migrate to esphome_audio so we can have make better use of audio chips on board and have media_player functionality

da61c59

we do need to wait for media_player as on_end works differently than the needed on on_tts_stream_end when we use speaker

212c3f7

bring back volume control.

b4367f4

tetele requested changes Apr 18, 2024

This was referenced Apr 18, 2024

[Feature request] Ability to change volume in microWakeWord version #12

Closed

[Feature request] Switch to disable speaker of the microWakeWord variant #15

Closed

addressing feedback w/r/t naming conventions and whitespace

1a2778f

if media player is playing, stop it when wake word is detected

9c2cccf

save/restore volume between reboots

76a103f

tetele requested changes Apr 19, 2024

Revert "save/restore volume between reboots"

8bb69c3

This reverts commit 76a103f.

Make it easier to resume playback after VA activity

328c670

tetele added the major and refactor labels Apr 19, 2024

tetele self-assigned this Apr 19, 2024

tetele approved these changes Apr 19, 2024

README updates

028f3fb

tetele merged commit e8f276f into tetele:main May 9, 2024
1 check passed

This was referenced May 11, 2024

Ability to change volume in Home Assistant #11

Closed

No audio response / media_player audio playback #45

Open

rccoleman mentioned this pull request May 15, 2024

media_player functionality using this config stopped working as of esphome 2024.5 #46

Closed

3 tasks
