Disclaimer: this project is meant to be used only with copyright-free audio or audio you have permission to use.
The source code is divided as follows:
- data: contains the Lightning data modules.
- datasets: contains PyTorch datasets.
- models: contains Lightning modules and PyTorch modules.
- notebooks: contains useful notebooks to visualize results.
- preprocess_datasets: contains scripts to preprocess data and extract features.
- specvqgan: contains a modified version of the SpecVQGAN repository, so that its models can be used inside this project.
- training: includes training scripts and configuration files. The configuration is set up using `hydra-core` and `OmegaConf`, similarly to SpecVQGAN (see the sketch below).
A modified version of the `laion_clap` library is also added to work with `colossalai`; it is only needed for the model that computes the CLAP embeddings online.
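For readers unfamiliar with this setup, the snippet below sketches what a `hydra-core`/`OmegaConf` training entry point generally looks like. It is only an illustration: the config path, config names, and the use of `hydra.utils.instantiate` are assumptions, not details taken from this repository.

```python
import hydra
import pytorch_lightning as pl  # `lightning.pytorch` in recent Lightning versions
from omegaconf import DictConfig, OmegaConf


# Hypothetical config location; the real configs live in the training folder.
@hydra.main(config_path="training/configs", config_name="config", version_base=None)
def main(cfg: DictConfig) -> None:
    # Hydra merges the defaults list in config.yaml with any command-line
    # overrides (e.g. ++trainer.default_root_dir=...) into a single DictConfig.
    print(OmegaConf.to_yaml(cfg))

    # Assumes the configs declare _target_ entries for the LightningModule and
    # LightningDataModule; this is a common hydra pattern, not a claim about
    # how this repository is organized.
    model = hydra.utils.instantiate(cfg.model)
    datamodule = hydra.utils.instantiate(cfg.data)
    trainer = pl.Trainer(**OmegaConf.to_container(cfg.trainer, resolve=True))
    trainer.fit(model, datamodule=datamodule)


if __name__ == "__main__":
    main()
```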
The following steps are needed to train a GPT-2 with precomputed CLAP embeddings and EnCodec codes as input:
- Make sure CUDA is installed on your machine.
- Precompute the CLAP embeddings and EnCodec codes from the .wav files using the scripts in `preprocess_datasets` (see the sketch after this list for the general idea).
- Build the Docker container by running `docker build . -t circe` in the root directory of the project. The CUDA version of the container should be less than or equal to the CUDA version on the host machine.
- Run the training with a command similar to the following:
docker run --rm --runtime=nvidia --gpus all -v /home/user/Circe/models-circe-encodec:/Circe/models-circe-encodec -v /home/user/Circe/datasets:/Circe/datasets circe circe.training.training_circe ++trainer.default_root_dir=/Circe/models-circe-encodec/
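The scripts in `preprocess_datasets` are the reference for the preprocessing step; the sketch below only illustrates the general idea using the public `encodec` and `laion_clap` APIs. The file paths, target bandwidth, and checkpoint choice are placeholders, not values used by this project.

```python
import torch
import torchaudio
import laion_clap
from encodec import EncodecModel
from encodec.utils import convert_audio

wav_path = "datasets/example.wav"  # hypothetical input file

# --- EnCodec codes ---
encodec = EncodecModel.encodec_model_24khz()
encodec.set_target_bandwidth(6.0)  # arbitrary bandwidth chosen for illustration

wav, sr = torchaudio.load(wav_path)
wav = convert_audio(wav, sr, encodec.sample_rate, encodec.channels)
with torch.no_grad():
    frames = encodec.encode(wav.unsqueeze(0))
# Each frame is a (codes, scale) tuple; concatenate the discrete codes over time.
codes = torch.cat([codebook for codebook, _ in frames], dim=-1)  # [1, n_q, T]
torch.save(codes.squeeze(0).cpu(), "datasets/example_encodec.pt")

# --- CLAP embedding ---
clap = laion_clap.CLAP_Module(enable_fusion=False)
clap.load_ckpt()  # downloads a default pretrained checkpoint
with torch.no_grad():
    embed = clap.get_audio_embedding_from_filelist(x=[wav_path], use_tensor=True)
torch.save(embed.squeeze(0).cpu(), "datasets/example_clap.pt")
```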
In the `docker run` example above, the folders with the precomputed codes and CLAP embeddings should be located in `/home/user/Circe/datasets`. If the CLAP embeddings are extracted online, the property `++model.path_pretrained_clap` should be included in `circe-gpt.yaml`.
Finally, if CLAP is not wanted and only EnCodec or SpecVQGAN codes are needed, the model `hf-gpt.yaml` should be used by modifying `config.yaml`.
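As a way to inspect the effect of such a change without editing files, hydra's compose API can materialize the merged configuration. The `config_path` below and the assumption that the model configs form a `model` config group are guesses about this repository's layout, shown only for illustration:

```python
from hydra import compose, initialize
from omegaconf import OmegaConf

# Hypothetical config_path; point it to wherever config.yaml actually lives.
with initialize(config_path="training/configs", version_base=None):
    cfg = compose(
        config_name="config",
        overrides=[
            # Equivalent to swapping circe-gpt for hf-gpt in config.yaml's
            # defaults list, assuming a "model" config group.
            "model=hf-gpt",
            # For online CLAP extraction one would instead keep circe-gpt and add:
            # "++model.path_pretrained_clap=/path/to/clap_checkpoint.pt",
        ],
    )
    print(OmegaConf.to_yaml(cfg))
```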