A prototype CLI in Python where a user can collect all of the recordings needed to produce a wakeword

sfu-bigdata/wakeword-data-collector

 
 


Secret Sauce AI

Wakeword Data Collector


Do you want your own personal wakeword?

Wakeword Data Collector is a prototype CLI, written in Python, for recording wakewords, non-wakewords, and background noise. Think of it as your wakeword data collection recipe for creating bulletproof wakeword models. It's part of the Secret Sauce AI Wakeword Project.


The Wakeword Data Collector records WAV files with a sample rate of 16000 Hz for two main categories of data:

  • Wakewords (e.g. 'hey Jarvis'): audio/wake-word/
    • Wakeword variations (e.g. saying 'hey Jarvis' further from or closer to the mic, faster, or slower): audio/wake-word/variations/
  • Not-wakewords: audio/not-wake-word/
    • background noise recordings: audio/not-wake-word/background/
    • syllables (e.g. 'hey', 'jar', 'vis'): audio/not-wake-word/parts/
    • syllable permutations (e.g. 'hey jar', 'jarvis'): audio/not-wake-word/parts/
    • other, longer recordings (the TV and a natural conversation): audio/not-wake-word/
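
To see at a glance how many recordings have landed in each folder, something like the following sketch may help. It is not part of the tool: the function name count_recordings is hypothetical, and it assumes the recordings carry a .wav extension and that the script runs from the directory containing audio/.

```python
from pathlib import Path

# Folder layout described above, relative to the directory holding "audio/".
CATEGORY_DIRS = [
    "audio/wake-word",
    "audio/wake-word/variations",
    "audio/not-wake-word",
    "audio/not-wake-word/background",
    "audio/not-wake-word/parts",
]


def count_recordings(root="."):
    """Return {folder: number of .wav files directly inside it}.

    Hypothetical helper; assumes recordings use the .wav extension.
    """
    counts = {}
    for rel in CATEGORY_DIRS:
        folder = Path(root) / rel
        # glob("*.wav") is non-recursive, so subfolders are counted separately.
        counts[rel] = len(list(folder.glob("*.wav"))) if folder.is_dir() else 0
    return counts


if __name__ == "__main__":
    for rel, n in count_recordings().items():
        print(f"{rel}: {n} recording(s)")
```

A folder that is missing is reported with a count of 0 rather than raising an error, so the script can be run before any recordings exist.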

Installation

You may have to install the PortAudio development package that the pyaudio dependency needs, e.g.: sudo apt-get install portaudio19-dev.

As usual, once you have cloned the repo, it is recommended to create and activate a Python virtual environment before installing the requirements.

python3 -m venv .venv
source .venv/bin/activate
pip3 install -e .

Usage

Simply run wakeword_collect in your console and follow the instructions.

wakeword_collect

For first-time users, it is highly recommended to do a full data collection of all steps (except step 3, non-wake-word recordings, which is optional) to ensure a production-quality wakeword.

  • 1. First set of 16 wakeword recordings
  • 2. Wakeword syllable and syllable permutation recordings
  • 4. Second set of 16 wakeword recordings
  • 5. Wakeword variations
  • 6. Third set of 16 wakeword recordings
  • 7. First random conversation recordings
  • 8. Second random TV recording
  • 9. Second random TV recording
  • The background recordings taken throughout the sessions will be in audio/not-wake-word/background/

If you are collecting data to add another user, it is recommended to record:

  • 1. First set of 16 wakeword recordings
  • 7. First random conversation recording
  • All of the background recordings from this session in audio/not-wake-word/background/
  • 2. Wakeword syllable and syllable permutation recordings (optional, but they can help)

IMPORTANT

  • It is recommended to use your production audio hardware and location to collect the samples.
  • It is best to use a wakeword with at least three syllables.
  • Make sure to check each recording, because it is always possible that something went wrong. Even one bad or missing recording can be the difference between a bulletproof and a craptastic wakeword model.
  • After collecting your data, it is highly recommended to run the Precise Wakeword Model Maker.
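
Listening to each recording is still the real check, but a quick automated pass can flag obvious problems first. The sketch below uses only Python's standard wave module. The function name check_recordings and the min_seconds threshold are hypothetical and not part of the tool, and the script assumes the recordings are .wav files under audio/.

```python
import wave
from pathlib import Path

EXPECTED_RATE = 16000  # sample rate stated in this README


def check_recordings(root="audio", min_seconds=0.5):
    """Return a list of (path, problem) tuples for suspicious WAV files.

    Hypothetical helper; min_seconds is an arbitrary illustrative threshold.
    """
    problems = []
    for path in sorted(Path(root).rglob("*.wav")):
        try:
            with wave.open(str(path), "rb") as wav:
                rate = wav.getframerate()
                seconds = wav.getnframes() / rate if rate else 0.0
        except (wave.Error, EOFError) as exc:
            # Empty or corrupt files cannot be parsed as WAV.
            problems.append((path, f"unreadable: {exc}"))
            continue
        if rate != EXPECTED_RATE:
            problems.append((path, f"sample rate is {rate}, expected {EXPECTED_RATE}"))
        elif seconds < min_seconds:
            problems.append((path, f"only {seconds:.2f} s long"))
    return problems


if __name__ == "__main__":
    for path, problem in check_recordings():
        print(f"{path}: {problem}")
```

An empty result only means the files are readable, use the expected sample rate, and are not suspiciously short. It says nothing about whether the right words were recorded.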

Secret Sauce AI

Special thanks

Although Secret Sauce AI is always about collaboration and community, special thanks should go to Dan "The Man" Borufka for his support since day one. Thanks, Dan!
