Automatic Chord Recognition (ACR) research folder that generates, parses, and stores datasets, and defines and trains models. This part of the project is written in Python in order to use specific Python libraries.
- Python 3.8
- Python libraries - sklearn, tensorflow, librosa, mir_eval
- Dataset - WAV audio, chord annotations, key annotations
The proof of concept (POC) of ACR training on the Isophonics and Billboard datasets was performed in a Jupyter Notebook. All training runs were evaluated on the validation set, NOT the test set. You can see the outcome HERE!
The actual ACR research, inspired by the POC, was evaluated only on the Beatles dataset. mir_eval scores and accuracy are computed and printed HERE!
Model descriptions are listed BELOW.
In order to increase the accuracy of the ACR, harmony segmentation is being researched. The segmentation results were generated in a Jupyter Notebook. You can see the outcome HERE!
Basically, segmentation models are an opportunity for future work. For now, librosa's beat tracker, which returns the BPM together with a list of beats, is used.
Model descriptions are listed BELOW.
Two well-known datasets are included.
- Isophonics (225 songs) - This one is used as a training dataset.
- Billboard (890 songs)
- No audio files provided.
- Inconsistent chord annotations - the annotation B sometimes means B♭ (German Hes) and sometimes B natural (German H).
- Some chords are annotated incorrectly (e.g. "Something" by The Beatles).
- Even when the audio files are available, the dataset is too large.
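One hypothetical way to cope with the ambiguous B annotation is a per-song normalization step; the helper below is an assumed sketch (the `b_means_flat` flag would have to be decided per song, e.g. from the key annotation), not part of the project's code.

```python
def normalize_root(root: str, b_means_flat: bool) -> str:
    """Map a German-style chord root to English note names.

    'H' always means B natural; 'B' means Bb only when the song's
    annotation convention says so (b_means_flat, an assumed flag).
    """
    if root == 'H':
        return 'B'
    if root == 'B' and b_means_flat:
        return 'Bb'
    return root

# Usage: the same written root resolves differently per convention.
normalize_root('B', b_means_flat=True)   # -> 'Bb'
normalize_root('B', b_means_flat=False)  # -> 'B'
normalize_root('H', b_means_flat=True)   # -> 'B'
```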
Datasets have functions for data preprocessing. The audio waveform is parsed and a spectrogram/chromagram is generated. Each model type needs a different kind of preprocessing.
The function generates a moving window of spectrogram frames flattened into one feature set. Optionally, the data can be transposed to the key of C major (or its mode alternatives).
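A minimal sketch of this moving-window flattening, assuming a `(n_frames, n_bins)` spectrogram and a symmetric context of frames around each target frame (the function name and `context` parameter are assumptions):

```python
import numpy as np

def windowed_features(spec: np.ndarray, context: int = 2) -> np.ndarray:
    """Flatten a sliding window of 2*context+1 frames into one vector per frame."""
    # Edge-pad so the first and last frames also get a full window.
    padded = np.pad(spec, ((context, context), (0, 0)), mode='edge')
    windows = [padded[i:i + 2 * context + 1].ravel()
               for i in range(spec.shape[0])]
    return np.stack(windows)

feats = windowed_features(np.random.rand(100, 12), context=2)
# feats.shape == (100, 60): each row is 5 frames x 12 bins, flattened.
```

This layout suits dense (MLP-style) classifiers, which expect one fixed-size vector per training example.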
The function generates a sequence of spectrogram frames as one feature set. Optionally, the data can be transposed to the key of C major (or its mode alternatives).
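The sequence variant can be sketched as chopping the spectrogram into fixed-length chunks for a recurrent model; the function name, `seq_len`, and the drop-the-remainder policy are assumptions:

```python
import numpy as np

def sequence_features(spec: np.ndarray, seq_len: int = 50) -> np.ndarray:
    """Chop a (n_frames, n_bins) spectrogram into (n_seqs, seq_len, n_bins) chunks."""
    n_seqs = spec.shape[0] // seq_len
    # Trailing frames that do not fill a whole sequence are dropped here;
    # padding the last chunk instead would be an equally valid choice.
    return spec[:n_seqs * seq_len].reshape(n_seqs, seq_len, spec.shape[1])

seqs = sequence_features(np.random.rand(230, 12), seq_len=50)
# seqs.shape == (4, 50, 12); the trailing 30 frames are dropped.
```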