Accepted to Interspeech 2024
This repository hosts the artifacts for our paper SALSA: Speedy ASR-LLM Synchronous Aggregation, accepted to Interspeech 2024.
The main contribution of our paper is 🔎 a simple LLM stitching technique that uses a tokenization-agnostic algorithm to combine the generation capability of LLaMA with the speech comprehension of Whisper.
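To give a flavor of what "tokenization-agnostic" aggregation means, here is a deliberately simplified sketch (all function names and the scoring scheme are illustrative stand-ins, not the paper's actual implementation): because Whisper and LLaMA use different tokenizers, candidate hypotheses are scored on the detokenized text rather than on token IDs, so the two models can be combined without sharing a vocabulary.

```python
# Illustrative sketch only: stand-in scorers replace the real
# Whisper and LLaMA log-probabilities from the paper.

def whisper_score(text: str) -> float:
    """Stand-in for a Whisper decoder log-probability."""
    return -0.1 * len(text)

def llm_score(text: str) -> float:
    """Stand-in for a LLaMA log-probability."""
    return -0.05 * len(text)

def aggregate(candidates, alpha=0.5):
    """Pick the best candidate by combining both scores.

    Scores are computed on the plain text string, so two different
    tokenizations of the same hypothesis get identical scores --
    this is what makes the combination tokenizer-agnostic.
    """
    scored = {
        text: alpha * whisper_score(text) + (1 - alpha) * llm_score(text)
        for text in candidates
    }
    return max(scored, key=scored.get)
```

The real system interleaves the two decoders synchronously during beam search; this sketch only shows the text-level scoring idea.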
Clone the repo:
git clone https://github.com/csalt-research/salsa
cd salsa
Install the minimal dependencies:
pip install -r requirements.txt
Install with all dependencies (including quantization, sentencepiece, tokenizers for Llama models, etc.):
pip install -r requirements-all.txt
Finally, to run the recipes you will also need access to LLAMA weights, which can be obtained here: https://huggingface.co/meta-llama
You are all set! 🎉
Our codebase provides a simple, easily customizable wrapper script, run.sh, that contains the complete experimental setup divided into stages. To run all stages, simply execute:
./run.sh --hf_access_token <huggingface_personal_access_token>
(Please see this blog for details on how to generate a personal access token for Hugging Face.)
To run a specific stage, let's say "dataset creation", you can execute:
./run.sh --hf_access_token <huggingface_personal_access_token> --stage 1 --stop_stage 1
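The --stage/--stop_stage flags follow the common Kaldi-style stage-gating pattern: each stage N runs only when stage <= N <= stop_stage. A minimal sketch of that pattern (the stage names here are illustrative, not the exact stages in run.sh):

```shell
#!/bin/bash
# Kaldi-style stage gating: run only stages in [stage, stop_stage].
stage=1
stop_stage=2

if [ "$stage" -le 1 ] && [ "$stop_stage" -ge 1 ]; then
  echo "Stage 1: dataset creation"
fi
if [ "$stage" -le 2 ] && [ "$stop_stage" -ge 2 ]; then
  echo "Stage 2: training"
fi
if [ "$stage" -le 3 ] && [ "$stop_stage" -ge 3 ]; then
  echo "Stage 3: decoding"
fi
```

With stage=1 and stop_stage=2, only the first two stages execute; setting both to the same value runs a single stage, which is exactly what the dataset-creation example above does.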
For datasets other than FLEURS, CommonVoice, and LibriSpeech, you must first modify data_preparation/dump_dataset_v2.py to handle your dataset, and then run the remaining stages as described above.
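For a custom dataset, the key requirement is producing (audio path, transcript) pairs in the format the pipeline expects; the authoritative schema is whatever data_preparation/dump_dataset_v2.py emits, so check that script before adapting. Purely as an illustration (the file name, field names, and JSON-lines layout below are assumptions, not the repo's actual format):

```python
import json

def dump_manifest(pairs, out_path):
    """Write (audio_path, transcript) pairs as JSON lines.

    Hypothetical manifest layout for illustration only -- the real
    schema lives in data_preparation/dump_dataset_v2.py.
    """
    with open(out_path, "w") as f:
        for audio_path, text in pairs:
            f.write(json.dumps({"audio": audio_path, "text": text}) + "\n")

dump_manifest([("clips/utt1.wav", "hello world")], "manifest.jsonl")
```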
See the open issues for a list of proposed features (and known issues) relevant to this work. For lit-gpt related features/issues, check out their GitHub repository.
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
- If you have suggestions for adding or removing projects, feel free to open an issue to discuss it, or directly create a pull request after you edit the README.md file with necessary changes.
- Please open an individual PR for each suggestion.
- Fork the Project
- Create your Feature Branch (`git checkout -b feature/NewFeature`)
- Commit your Changes (`git commit -m 'Add appropriate commit message'`). The correct way to write your commit message can be found here
- Push to the Branch (`git push origin feature/NewFeature`)
- Open a Pull Request
- Ashish Mittal - Research Scientist, IBM Research & PhD, CSE, IIT Bombay
- Darshan Prabhu - PhD, CSE, IIT Bombay
- Sunita Sarawagi - Associate Professor, CSE, IIT Bombay
- Preethi Jyothi - Associate Professor, CSE, IIT Bombay
If you use this code for your research, please consider citing our work.
@misc{mittal2024salsaspeedyasrllmsynchronous,
      title={SALSA: Speedy ASR-LLM Synchronous Aggregation},
      author={Ashish Mittal and Darshan Prabhu and Sunita Sarawagi and Preethi Jyothi},
      year={2024},
      eprint={2408.16542},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2408.16542},
}
Distributed under the MIT License. See LICENSE for more information.