Extracts PAE incipits from RISM online catalog.

mangelroman/rism

RISM extractor

This Python script extracts Plaine & Easie (PAE) incipits from the RISM database.

It also creates Humdrum **kern and MIDI files from the extracted incipits, using the paekern and hum2mid tools from the humdrum-tools repository.
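As a rough illustration of the extraction step, the sketch below pulls incipits out of a small MARCXML snippet. It assumes the incipit is stored in MARC field 031, with the PAE notation in subfield p and the clef, key signature and time signature in subfields g, n and o, as the MARC 21 format defines them. The function names, the sample record and the PAE string are made up for illustration and are not taken from rism.py.

```python
import xml.etree.ElementTree as ET

# Illustrative record only; the PAE string is not a real RISM entry.
SAMPLE = """<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record>
    <controlfield tag="001">000000001</controlfield>
    <datafield tag="031" ind1=" " ind2=" ">
      <subfield code="g">G-2</subfield>
      <subfield code="n">xF</subfield>
      <subfield code="o">c</subfield>
      <subfield code="p">'4C8DE4FG</subfield>
    </datafield>
  </record>
</collection>"""


def local(tag):
    """Return the tag name without any '{namespace}' prefix."""
    return tag.rsplit("}", 1)[-1]


def extract_incipits(xml_text):
    """Collect one dict per 031 field that carries PAE notation (subfield p)."""
    incipits = []
    for record in ET.fromstring(xml_text).iter():
        if local(record.tag) != "record":
            continue
        record_id = None
        for field in record:
            name, tag = local(field.tag), field.get("tag")
            if name == "controlfield" and tag == "001":
                # Assumption: the 001 control field precedes the 031 fields.
                record_id = field.text
            elif name == "datafield" and tag == "031":
                sub = {sf.get("code"): sf.text or "" for sf in field}
                if sub.get("p"):
                    incipits.append({
                        "id": record_id,
                        "clef": sub.get("g", ""),
                        "keysig": sub.get("n", ""),
                        "timesig": sub.get("o", ""),
                        "pae": sub["p"],
                    })
    return incipits


print(extract_incipits(SAMPLE))
```

Matching on local tag names keeps the sketch independent of whether the dump declares the MARC namespace with a prefix or as the default namespace.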

Usage

  1. Download XML file from RISM and decompress:
     wget https://opac.rism.info/fileadmin/user_upload/lod/update/rismAllMARCXML.zip
     unzip rismAllMARCXML.zip
  2. Create a new conda environment based on the provided file:
     conda env create --file=environment.yml
  3. Run the script (set --length to the total number of records in the XML file for progress update purposes):
     python rism.py --data-dir=./output_folder --length=1400000 --num-workers=4 rism_201219.xml
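The --length value only needs to match the number of records in the XML file. If that number is unknown, one way to obtain it is to stream through the file once and count the record elements. This is a minimal sketch, assuming each entry is an element whose local name is record. The count_records helper is hypothetical and not part of rism.py.

```python
import io
import xml.etree.ElementTree as ET


def count_records(source):
    """Count <record> elements in a MARCXML file path or binary file object."""
    count = 0
    for _event, elem in ET.iterparse(source, events=("end",)):
        # Compare the local name so a namespace prefix does not matter.
        if elem.tag.rsplit("}", 1)[-1] == "record":
            count += 1
            # Drop the record's children to keep memory use low on large dumps.
            elem.clear()
    return count


# Tiny in-memory stand-in for the real dump, for demonstration only.
demo = io.BytesIO(
    b'<collection xmlns="http://www.loc.gov/MARC21/slim">'
    b"<record/><record/></collection>"
)
print(count_records(demo))
```

For the real file, pass its path instead of the in-memory sample, for example count_records("rism_201219.xml"), and use the result as the --length argument.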
