ACR Training

This folder holds the Automatic Chord Recognition (ACR) research code. It generates, parses and stores datasets, and defines and trains models. This part of the project is written in Python so that it can use Python-specific libraries.

Prerequisites

Python 3.8
Python libraries - sklearn, tensorflow, librosa, mir_eval
Dataset - WAV audio, chord annotations, key annotations

ACR - Results

The proof of concept (POC) of ACR training on the Isophonics and Billboard datasets was performed in a Jupyter Notebook. All training runs were evaluated on the validation set, NOT the test set. You can see the outcome HERE!

The actual ACR research, inspired by the POC, was evaluated only on the Beatles dataset. mir_eval scores and accuracy are used and printed HERE!
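As an illustration of the accuracy metric, here is a minimal sketch that assumes "accuracy" means the fraction of frames whose predicted chord index matches the reference. The function name `frame_accuracy` and the integer label encoding are assumptions made for this example, not names taken from the project. The mir_eval scores are computed by the mir_eval library and are not reproduced here.

```python
import numpy as np


def frame_accuracy(reference, predicted):
    """Fraction of frames where the predicted chord index equals the reference.

    Both inputs are 1-D sequences of integer chord indices, one per frame.
    (Hypothetical helper; the label encoding is an assumption.)
    """
    reference = np.asarray(reference)
    predicted = np.asarray(predicted)
    if reference.shape != predicted.shape:
        raise ValueError("reference and predicted must have the same shape")
    if reference.size == 0:
        return 0.0
    # Element-wise comparison gives a boolean array; its mean is the hit rate.
    return float(np.mean(reference == predicted))
```

For example, `frame_accuracy([0, 1, 2, 2], [0, 1, 1, 2])` compares four frames, three of which match.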

Model descriptions are listed BELOW.

Segmentation - Results

To increase the accuracy of the ACR, harmony segmentation was also researched. The segmentation results were generated in a Jupyter Notebook. You can see the outcome HERE!

Basically, segmentation models are an opportunity for future work. For now, librosa's beat tracker, which returns the tempo (BPM) together with a list of beats, is used.
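The beat-based segmentation can be sketched roughly as follows. `librosa.load`, `librosa.beat.beat_track` and `librosa.frames_to_time` are real librosa calls. The helper names `detect_beats` and `beats_to_segments`, and the idea of turning the beats into (start, end) intervals, are assumptions made for this example and may differ from what the project actually does.

```python
import numpy as np


def detect_beats(audio_path):
    """Return (tempo_bpm, beat_times_in_seconds) from librosa's beat tracker."""
    # Imported lazily so beats_to_segments below also works without librosa.
    import librosa

    y, sr = librosa.load(audio_path)
    # beat_track returns the estimated tempo and the beat positions as frames.
    tempo, beat_frames = librosa.beat.beat_track(y=y, sr=sr)
    return tempo, librosa.frames_to_time(beat_frames, sr=sr)


def beats_to_segments(beat_times, duration):
    """Turn beat times into (start, end) intervals that cover [0, duration].

    Hypothetical helper: one segment per inter-beat interval, plus the parts
    before the first beat and after the last one.
    """
    inner = [float(t) for t in beat_times if 0.0 < t < duration]
    bounds = np.array([0.0] + inner + [float(duration)])
    return np.column_stack([bounds[:-1], bounds[1:]])
```

With beats at 0.5 s, 1.0 s and 1.5 s in a 2-second clip, `beats_to_segments` yields four half-second segments. Each segment could then be assigned a single chord.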

Model descriptions are listed BELOW.

Datasets

Two well-known datasets are included.

Isophonics (225 songs) - This one is used as a training dataset.
Billboard (890 songs)

Dataset issues

No audio files are provided.
Chord annotations are not consistent - sometimes a B chord means B-flat (Hes), sometimes it means B natural (H).
Some chords are annotated incorrectly (for example, Something by the Beatles).
When the audio files are included, the dataset becomes too large.
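One way to handle the B/H inconsistency is to normalize chord roots before training. The sketch below is only an illustration. The helper `normalize_root`, the `german_notation` flag and the `root:quality` label format are assumptions. Deciding which files use the German-style naming still has to be done per file, for example by hand.

```python
# Roots that differ between German-style and English note naming.
GERMAN_TO_ENGLISH = {"H": "B", "B": "Bb"}


def normalize_root(label, german_notation):
    """Rewrite the root of a chord label such as 'H:min' to English naming.

    label: chord label in the assumed form 'root' or 'root:quality'.
    german_notation: True if the source file uses H for B natural
    (hypothetical flag, set per annotation file).
    """
    if not german_notation:
        return label
    root, sep, rest = label.partition(":")
    # Roots other than H and B are the same in both conventions.
    return GERMAN_TO_ENGLISH.get(root, root) + sep + rest
```

For example, `normalize_root("H:min", True)` gives `"B:min"`, while `normalize_root("B:maj", False)` leaves the label untouched.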

Preprocessing

The datasets provide functions for data preprocessing. The audio waveform is parsed and a spectrogram or chromagram is generated. Each model type needs a different kind of preprocessing.

MLP Preprocessing

The function generates a moving window of spectrogram frames, flattened into a single feature vector. Optionally, the data can be transposed to the key of C major (and its modal alternatives).
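The moving-window idea can be sketched with NumPy as follows. The function name `flatten_windows`, the `context` parameter and the edge padding are assumptions made for this example. The project's own function may use a different window size or padding strategy.

```python
import numpy as np


def flatten_windows(spec, context=2):
    """Build one flattened feature vector per frame from its neighbours.

    spec: array of shape (n_frames, n_bins).
    context: number of frames taken on each side of the centre frame.
    Returns an array of shape (n_frames, (2 * context + 1) * n_bins).
    """
    n_frames, n_bins = spec.shape
    width = 2 * context + 1
    # Repeat the first and last frame so edge frames also get a full window.
    padded = np.pad(spec, ((context, context), (0, 0)), mode="edge")
    windows = np.stack([padded[i:i + width] for i in range(n_frames)])
    return windows.reshape(n_frames, width * n_bins)
```

With `context=1` and 3 bins per frame, every frame becomes a 9-value vector made of the previous, current and next frame. This is the kind of fixed-size input an MLP expects.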

CRNN Preprocessing

The function generates a sequence of spectrogram frames as a single feature set. Optionally, the data can be transposed to the key of C major (and its modal alternatives).
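A rough sketch of both steps, under stated assumptions. `to_sequences` cuts a feature matrix into fixed-length sequences and zero-pads the last one. `transpose_chroma_to_c` assumes 12-bin chroma features with pitch class 0 = C and a known tonic. Both names, the zero padding and the chroma assumption are illustrative and not taken from the project. The chord labels would have to be shifted by the same number of semitones.

```python
import numpy as np


def transpose_chroma_to_c(chroma, tonic):
    """Rotate chroma so that the song's tonic lands on pitch class C.

    chroma: array of shape (n_frames, 12), column 0 = C (assumption).
    tonic: pitch class of the key's tonic, 0 = C, 2 = D, ...
    """
    # Rolling by -tonic moves the column at index `tonic` to index 0.
    return np.roll(chroma, -tonic, axis=1)


def to_sequences(features, seq_len):
    """Split (n_frames, n_bins) features into (n_seq, seq_len, n_bins).

    The last sequence is zero-padded if n_frames is not a multiple of seq_len.
    """
    n_frames, n_bins = features.shape
    n_seq = -(-n_frames // seq_len)  # ceiling division
    padded = np.zeros((n_seq * seq_len, n_bins), dtype=features.dtype)
    padded[:n_frames] = features
    return padded.reshape(n_seq, seq_len, n_bins)
```

For example, 5 frames with `seq_len=2` give 3 sequences, the last of which contains one real frame and one all-zero padding frame.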
