Translation Invariant Network

This is a PyTorch implementation of the baseline model described in "Invariances and Data Augmentation for Supervised Music Transcription". At the time of writing, it is the state-of-the-art multi-pitch estimation model evaluated on the MusicNet dataset.

The implementation details are based on the original repository.

Quick Start

Download MusicNet in the raw format.
Train your model.

# Pass --preprocess only the first time you run training.
python train.py --root where/your/data/is \
                --outfile your_model_name.pth \
                --preprocess \
                --steps 100000

==> Loading Data...
==> Building model..
7159808 of parameters.
Start Training.
steps / mse / avp_train / avp_test
1000 1.1489939180016517 0.35331320842371294 0.6423918783844346
2000 0.8266454307436943 0.6419942976527393 0.6908275697088563
3000 0.7581750468611718 0.6841538815755338 0.7231194980728902
...
99000 0.5766582242846489 0.8026213566978918 0.779892872485184
100000 0.5726349068582058 0.8063233963916538 0.7788928809627774
Finished
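
The log columns are the step count, the MSE loss, and the average precision on the training and test data. As a rough sketch (not part of the repository), the helper below reads a captured log and picks the step with the best `avp_test`. The function name `parse_training_log` is hypothetical, and the code assumes metric rows are exactly four whitespace-separated numbers, as in the sample output above.

```python
def parse_training_log(text):
    """Return a list of (step, mse, avp_train, avp_test) tuples from log text."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Assumption: metric rows have exactly four whitespace-separated fields.
        if len(parts) != 4:
            continue
        try:
            step = int(parts[0])
            mse, avp_train, avp_test = (float(p) for p in parts[1:])
        except ValueError:
            continue  # skip any four-word line that is not numeric
        rows.append((step, mse, avp_train, avp_test))
    return rows


# Sample lines copied from the training output shown above.
log = """\
==> Loading Data...
==> Building model..
Start Training.
steps / mse / avp_train / avp_test
1000 1.1489939180016517 0.35331320842371294 0.6423918783844346
2000 0.8266454307436943 0.6419942976527393 0.6908275697088563
3000 0.7581750468611718 0.6841538815755338 0.7231194980728902
Finished
"""

rows = parse_training_log(log)
best = max(rows, key=lambda r: r[3])  # row with the highest avp_test
print("best step:", best[0], "avp_test:", best[3])
```

To use it on a real run, redirect the output of `train.py` to a file and pass the file contents to the function.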

You can press Ctrl+C to stop training at any time; the model is still saved before the process exits.
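
The README does not show how the save-on-interrupt behaviour is implemented. One common pattern, sketched here under that caveat, is to wrap the training loop in `try` / `finally` so the save step runs whether the loop finishes or is interrupted. The names `run_training`, `train_step`, and `save_model` are hypothetical; in a PyTorch script, `save_model` would typically call something like `torch.save(model.state_dict(), outfile)`.

```python
def run_training(train_step, save_model, total_steps):
    """Run up to total_steps steps and always call save_model before returning."""
    completed = 0
    try:
        for _ in range(total_steps):
            train_step()
            completed += 1
    except KeyboardInterrupt:
        # Ctrl+C raises KeyboardInterrupt in the main thread; stop cleanly.
        print(f"Interrupted after {completed} steps.")
    finally:
        # Runs on normal completion and on interruption alike.
        save_model()
    return completed


# Demo with stand-ins: the fourth step simulates the user pressing Ctrl+C.
saved = []
calls = {"n": 0}


def fake_step():
    calls["n"] += 1
    if calls["n"] == 4:
        raise KeyboardInterrupt


done = run_training(fake_step, lambda: saved.append("checkpoint"), total_steps=10)
print(done, saved)
```

Keeping the save call in `finally` means there is a single save path to maintain, at the cost of also saving after unexpected errors.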

Test the model on the same test data as in the original paper (a pre-trained model is also included in the repository).

python test.py --infile your_model_name.pth

==> Loading ID 2303
==> Loading ID 1819
==> Loading ID 2382
average precision on testset: 0.7691312928576854
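
For readers unfamiliar with the metric, the sketch below computes average precision in its usual step-wise form: the sum over score thresholds of the recall increment times the precision at that threshold. This is an illustration of the standard definition, not the code in `test.py`, which may instead rely on a library routine such as scikit-learn's `average_precision_score`. The function name and the toy labels and scores are made up for the example.

```python
def average_precision(labels, scores):
    """Step-wise average precision for binary labels (0/1) and real-valued scores."""
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")
    # Visit predictions from the highest score to the lowest.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    true_pos = 0
    prev_recall = 0.0
    ap = 0.0
    i = 0
    while i < len(order):
        # Predictions with equal scores share one threshold.
        j = i
        while j < len(order) and scores[order[j]] == scores[order[i]]:
            true_pos += labels[order[j]]
            j += 1
        recall = true_pos / total_pos
        precision = true_pos / j  # j predictions are at or above this threshold
        ap += (recall - prev_recall) * precision
        prev_recall = recall
        i = j
    return ap


# Toy example: two positives ranked first and third out of four predictions.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
ap = average_precision(labels, scores)
print(ap)
```

For multi-pitch estimation, the labels and scores for all time frames and pitches would typically be flattened into two long vectors before the metric is computed; whether the repository does exactly this is an assumption.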
