Releases: ufal/olimpic-icdar24
The Zeus models for (Camera) GrandStaff LMX
- The `zeus-grandstaff-lmx-1.0-2024-02-12.model` is an OMR model for the Zeus recognizer trained on the GrandStaff LMX dataset, available under the CC BY-SA license.
- The `zeus-camera-grandstaff-lmx-1.0-2024-02-12.model` is an OMR model for the Zeus recognizer trained on the Camera GrandStaff LMX dataset, available under the CC BY-SA license.
Please visit https://github.com/ufal/olimpic-icdar24 to see how the models can be used.
The zeus-olimpic-1.0-2024-02-12.model
The `zeus-olimpic-1.0-2024-02-12.model` is an OMR model for the Zeus recognizer, available under the CC BY-SA license.
Please visit https://github.com/ufal/olimpic-icdar24/tree/master/zeus to see how the model can be used.
Datasets
OLiMPiC dataset
The OpenScore Lieder Linearized MusicXML Piano Corpus (OLiMPiC) is a dataset of MusicXML - LMX - PNG triplets for piano systems in the OpenScore Lieder corpus. It contains synthetic images from MuseScore and scanned images from IMSLP. The synthetic dataset contains all the splits (train, dev, test). The scanned dataset contains only the dev and test splits. These splits are aligned across both variants. Both variants are sliced into systems (piano staves), which form the training samples and are grouped into folders by score (song):
samples/
    123456/                 ... one folder per score (song)
        p1-s1.png           ... one image and two annotations for each system `p{page}-s{system}`
        p1-s1.lmx           ... Linearized MusicXML
        p1-s1.musicxml      ... non-compressed MusicXML file
samples.test.txt            ... list of samples for a partition, contains lines: `samples/123456/p1-s1`
statistics.test.yaml
vocabulary.txt              ... list of all vocabulary tokens for LMX annotations
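The layout above can be walked with a few lines of Python. This is only a minimal sketch, not code from the repository: it assumes that the entries in `samples.{partition}.txt` are paths relative to the dataset root (the folder that contains `samples/`), and that the dev and train lists follow the same naming as `samples.test.txt`. The names `Sample` and `iter_samples` are hypothetical.

```python
from pathlib import Path
from typing import Iterator, NamedTuple


class Sample(NamedTuple):
    """Paths of the three files that make up one system (hypothetical helper)."""
    image: Path      # p{page}-s{system}.png
    lmx: Path        # Linearized MusicXML annotation
    musicxml: Path   # non-compressed MusicXML annotation


def iter_samples(dataset_root: Path, partition: str) -> Iterator[Sample]:
    """Yield the file paths for every system listed in samples.{partition}.txt.

    Assumption: each line of the list file is a path relative to dataset_root
    without an extension, e.g. samples/123456/p1-s1.
    """
    list_file = dataset_root / f"samples.{partition}.txt"
    for line in list_file.read_text(encoding="utf-8").splitlines():
        stem = line.strip()
        if not stem:
            continue  # skip blank lines
        base = dataset_root / stem
        yield Sample(
            image=base.with_name(base.name + ".png"),
            lmx=base.with_name(base.name + ".lmx"),
            musicxml=base.with_name(base.name + ".musicxml"),
        )
```

Calling `list(iter_samples(Path("path/to/dataset"), "test"))` would then give one `Sample` per system in the test partition; the function only builds paths and does not check that the files exist.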
A permanent handle to the dataset: http://hdl.handle.net/11234/1-5419
You can preview the scanned dataset test partition here:
https://ufallab.ms.mff.cuni.cz/~mayer/icdar2024/scanned/
And the synthetic dataset train partition here:
https://ufallab.ms.mff.cuni.cz/~mayer/icdar2024/synthetic/
Partition | Systems (samples) | Scores (songs) | In synthetic dataset | In scanned dataset |
---|---|---|---|---|
test | 1 493 | 100 | ✔️ | ✔️ |
dev | 1 438 | 100 | ✔️ | ✔️ |
train | 15 014 | 1095 | ✔️ | ❌ |
To get the source IMSLP PDFs and manually annotated system bounding boxes for the scanned dataset, download the attached `olimpic-1.0-sources-for-scanned.2024-02-12.tar.gz` file.
GrandStaff LMX dataset
We've also added LMX and MusicXML annotations to the GrandStaff dataset. For each `.krn` file we added a `.lmx` file and a `.musicxml` file in the same format as in the datasets described above. These additional files are attached to this release as `grandstaff-lmx.2024-02-12.tar.gz`.
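After unpacking the archive over a GrandStaff checkout, it can be useful to confirm that every `.krn` file has its two new annotations. The sketch below assumes that the added files sit in the same folder as the `.krn` file and share its base name, which the release notes suggest but do not state explicitly. The function names are hypothetical.

```python
from pathlib import Path
from typing import Dict, List


def annotation_paths(krn_path: Path) -> Dict[str, Path]:
    """Return the expected .lmx and .musicxml paths next to a .krn file.

    Assumption: same directory and same base name as the .krn file.
    """
    return {
        "lmx": krn_path.with_suffix(".lmx"),
        "musicxml": krn_path.with_suffix(".musicxml"),
    }


def missing_annotations(root: Path) -> List[Path]:
    """List expected annotation files under root that do not exist."""
    missing = []
    for krn in sorted(root.rglob("*.krn")):
        for path in annotation_paths(krn).values():
            if not path.is_file():
                missing.append(path)
    return missing
```

An empty list from `missing_annotations(Path("path/to/grandstaff"))` would indicate that the annotations line up with the `.krn` files under the stated assumption.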
A permanent handle to the dataset: http://hdl.handle.net/11234/1-5423
The dataset can be previewed here:
https://ufallab.ms.mff.cuni.cz/~mayer/icdar2024/grandstaff/
Be careful when harmonizing it with the other two datasets. There are several issues to be aware of:
- The diversity of semantic content is lower with respect to special symbols and inter-staff interactions. The GrandStaff dataset does not contain slurs; arpeggios are present in the images but not in the LMX; and grace notes have missing stems (probably a JPEG compression artefact or a Verovio bug).
- LMX token sequences can be much longer, partly because some systems regularly have 6 measures, whereas the previous datasets typically cap at 4, and partly because some scores are very dense and contain large numbers of notes.
- The GrandStaff dataset is much larger: ~50K samples, compared to the previous ~15K samples. Only a subset of it should therefore be used when training on both.
- The Humdrum kern format does not seem to support mid-voice staff changes. Even if it does, the converter we used, music21, seems unable to encode them in its internal representation format. Looking at 100 random GrandStaff images, we were not able to find a single instance where a voice crosses between staves. There are places that are almost begging to be represented that way; see measures 4 and 5, middle ascending voice, in `beethoven/piano-sonatas/sonata29-4/maj2_up_m-181-186.jpg`.
In the OpenScore Lieder corpus, mid-voice staff changes are relatively common. We were able to easily find 4 examples in 100 random images from the scanned dataset. See the last measures:
We think that the OpenScore Lieder corpus is more interesting and complex in terms of music notation than the KernScores corpus from which the GrandStaff dataset was made. We believe the choice to use MuseScore as the representation format for OS Lieder was a good one, especially for this area of OMR research.
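Returning to the size and sequence-length issues listed above, one possible way to combine GrandStaff with the other datasets is to draw a fixed-size random subset and to drop overly long sequences. The sketch below is an illustration under stated assumptions, not the procedure used for the released models: it assumes that LMX tokens are separated by whitespace, and the target size, token limit, and function names are hypothetical choices left to the user.

```python
import random
from typing import List, Sequence


def subsample(samples: Sequence[str], target_size: int, seed: int = 0) -> List[str]:
    """Pick a reproducible random subset of at most target_size samples."""
    if target_size >= len(samples):
        return list(samples)
    # A seeded generator keeps the subset identical across runs.
    return random.Random(seed).sample(list(samples), target_size)


def token_count(lmx_text: str) -> int:
    """Count LMX tokens, assuming they are whitespace-separated."""
    return len(lmx_text.split())


def within_length(lmx_text: str, max_tokens: int) -> bool:
    """Return True if the annotation has at most max_tokens tokens."""
    return token_count(lmx_text) <= max_tokens
```

For example, `subsample(grandstaff_ids, 15_000)` would bring the roughly 50K GrandStaff samples down to about the size of the other training set, where `grandstaff_ids` is whatever list of sample identifiers the training pipeline uses.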