csalt-research/salsa

CSALT @ IITB

SALSA: Speedy ASR-LLM Synchronous Aggregation

Accepted to Interspeech 2024

About The Repository

This repository hosts the artifacts pertaining to our paper SALSA: Speedy ASR-LLM Synchronous Aggregation accepted to Interspeech 2024.

The main contribution of our paper is 🔎 a simple LLM stitching technique that uses a tokenization-agnostic algorithm to combine the generation capability of LLAMA with the speech comprehension of Whisper.

Getting Started

Clone the repo:

git clone https://github.com/csalt-research/salsa
cd salsa

Install the minimal dependencies:

pip install -r requirements.txt

Install with all dependencies (including quantization, sentencepiece, tokenizers for Llama models, etc.):

pip install -r requirements-all.txt

Finally, to run the recipes you will also need access to LLAMA weights, which can be obtained here: https://huggingface.co/meta-llama

You are all set! 🎉

Running experiments

Our codebase has a simple, easily customizable wrapper script, run.sh, that contains the complete experimental setup divided into stages. To run all stages, execute:

./run.sh --hf_access_token <huggingface_personal_access_token>

(Please see this blog for details on how to generate a personal access token for Hugging Face.)
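
If you would rather not type the token on every invocation, one option is to keep it in an environment variable and wrap the call. The sketch below is only an illustration: the variable name HF_ACCESS_TOKEN and the helper run_salsa are hypothetical and are not defined anywhere in this repository.

```shell
#!/usr/bin/env bash
# Sketch only: HF_ACCESS_TOKEN and run_salsa are assumed names, not part of the repo.

run_salsa() {
  # Fail early with a clear message if the token has not been exported.
  if [ -z "${HF_ACCESS_TOKEN:-}" ]; then
    echo "error: export HF_ACCESS_TOKEN before calling run_salsa" >&2
    return 1
  fi
  # Any extra arguments are forwarded to run.sh unchanged.
  ./run.sh --hf_access_token "${HF_ACCESS_TOKEN}" "$@"
}
```

After `export HF_ACCESS_TOKEN=<huggingface_personal_access_token>`, calling `run_salsa` from the repository root would be equivalent to the command above.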

To run a specific stage, let's say "dataset creation", you can execute:

./run.sh --hf_access_token <huggingface_personal_access_token> --stage 1 --stop_stage 1
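
To make the meaning of --stage and --stop_stage concrete, here is a minimal sketch of how staged recipe scripts commonly gate their steps. It is not the actual contents of run.sh: the in_range helper and the stage 2 and stage 3 labels are placeholders, and it assumes stages are consecutive integers with both bounds inclusive.

```shell
#!/usr/bin/env bash
# Illustrative sketch of stage gating; NOT copied from run.sh.
# Assumption: a stage N runs only when stage <= N <= stop_stage.

stage=${stage:-1}            # first stage to run
stop_stage=${stop_stage:-3}  # last stage to run (inclusive)

in_range() {
  # Succeeds when the stage number in $1 lies within [stage, stop_stage].
  [ "${stage}" -le "$1" ] && [ "${stop_stage}" -ge "$1" ]
}

if in_range 1; then echo "stage 1: dataset creation"; fi
if in_range 2; then echo "stage 2: (placeholder)"; fi
if in_range 3; then echo "stage 3: (placeholder)"; fi
```

Under this convention, passing the same number to both flags runs exactly one stage, and passing a wider interval runs every stage inside it in order.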

For datasets other than FLEURS, Common Voice, and LibriSpeech, you must first modify data_preparation/dump_dataset_v2.py and then run the remaining stages as described above.

Roadmap

See the open issues for a list of proposed features (and known issues) relevant to this work. For lit-gpt related features or issues, check out their GitHub repository.

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

  • If you have suggestions for adding or removing projects, feel free to open an issue to discuss them, or create a pull request directly after editing the README.md file with the necessary changes.
  • Please open an individual PR for each suggestion.

Creating A Pull Request

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/NewFeature)
  3. Commit your Changes (git commit -m 'Add appropriate commit message'). The correct way to write your commit message can be found here
  4. Push to the Branch (git push origin feature/NewFeature)
  5. Open a Pull Request

Authors

  • Ashish Mittal - Research Scientist, IBM Research & PhD, CSE, IIT Bombay
  • Darshan Prabhu - PhD, CSE, IIT Bombay
  • Sunita Sarawagi - Associate Professor, CSE, IIT Bombay
  • Preethi Jyothi - Associate Professor, CSE, IIT Bombay

Citation

If you use this code for your research, please consider citing our work.

@misc{mittal2024salsaspeedyasrllmsynchronous,
      title={SALSA: Speedy ASR-LLM Synchronous Aggregation}, 
      author={Ashish Mittal and Darshan Prabhu and Sunita Sarawagi and Preethi Jyothi},
      year={2024},
      eprint={2408.16542},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2408.16542}, 
}

License

Distributed under the MIT License. See LICENSE for more information.