This guide describes how to build Oneiros (built atop vLLM) from source, together with its required dependencies.
- Python ≥ 3.12
- CUDA 12.4
- A compatible GPU with recent NVIDIA drivers
- Ensure the following environment variables are properly configured (follow the official NVIDIA installation guide for details):

export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
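
As a quick sanity check of the two variables above, the following sketch reports whether the CUDA directories appear in `PATH` and `LD_LIBRARY_PATH`. It is not part of Oneiros; the helper names `path_entries` and `check_cuda_env` are hypothetical, and the default `/usr/local/cuda` prefix is simply the one used in the export lines.

```python
import os
import shutil


def path_entries(value):
    """Split a PATH-style string into normalized, non-empty entries."""
    return [os.path.normpath(p) for p in value.split(os.pathsep) if p]


def check_cuda_env(env, cuda_home="/usr/local/cuda"):
    """Return a list of problems; an empty list means both variables look right.

    `cuda_home` is an assumption taken from the export lines in this guide.
    """
    wanted = {
        "PATH": os.path.join(cuda_home, "bin"),
        "LD_LIBRARY_PATH": os.path.join(cuda_home, "lib64"),
    }
    problems = []
    for var, directory in wanted.items():
        if os.path.normpath(directory) not in path_entries(env.get(var, "")):
            problems.append(f"{var} does not contain {directory}")
    return problems


if __name__ == "__main__":
    for problem in check_cuda_env(os.environ):
        print("WARNING:", problem)
    # shutil.which returns None when nvcc is not reachable through PATH.
    print("nvcc found at:", shutil.which("nvcc"))
```

If your CUDA toolkit lives elsewhere, pass that prefix as `cuda_home` instead of relying on the default.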
Install the PyTorch release built for CUDA 12.4:
pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu124

Download and install the matching libtorch distribution from
👉 https://pytorch.org/get-started/locally/
Make sure to set the correct LIBTORCH path when building dependent modules.
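
To confirm that the environment actually holds the pinned PyTorch packages, a small check along the following lines may help. It is a sketch, not a script shipped with Oneiros: `PINNED`, `base_version`, and `check_pins` are hypothetical names, and the comparison assumes that a local build tag such as `+cu124` may follow the version number.

```python
from importlib import metadata

# Versions pinned by the pip command in this guide.
PINNED = {"torch": "2.5.1", "torchvision": "0.20.1", "torchaudio": "2.5.1"}


def base_version(version):
    """Drop a local build tag such as '+cu124' from a version string."""
    return version.split("+", 1)[0]


def check_pins(pinned, installed_version=metadata.version):
    """Map each package name to (installed version or None, matches pin)."""
    report = {}
    for name, want in pinned.items():
        try:
            have = installed_version(name)
        except metadata.PackageNotFoundError:
            have = None
        report[name] = (have, have is not None and base_version(have) == want)
    return report


if __name__ == "__main__":
    for name, (have, ok) in check_pins(PINNED).items():
        print(f"{name}: installed={have} expected={PINNED[name]} ok={ok}")
    try:
        import torch
        # torch.version.cuda is the CUDA version the wheel was built against.
        print("CUDA build:", torch.version.cuda,
              "| GPU visible:", torch.cuda.is_available())
    except ImportError:
        print("torch is not importable in this environment")
```

For the wheel index used above, the reported CUDA build would be expected to read 12.4; any other value suggests a wheel from a different index was picked up.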
cd $Oneiros_ROOT/vattention

Edit setup.py to point to your libtorch installation, e.g.:

LIBTORCH_PATH = "/path/to/libtorch"

Then install:

pip install --no-build-isolation .

cd $Oneiros_ROOT
pip install -r requirements-build.txt
pip install -e . --no-build-isolation

This will install Oneiros in editable mode for development.
After installation, verify by running:
python test/basic.py

- Use a clean virtual environment (e.g., conda, venv) to avoid dependency conflicts.
- Ensure CUDA and PyTorch versions are consistent.
- If compilation fails, double-check your libtorch path and LD_LIBRARY_PATH settings.
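
When a build fails, a short diagnostic such as the sketch below can narrow down whether the libtorch path is the culprit. It assumes that an unpacked libtorch distribution contains `include` and `lib` subdirectories; the function name `missing_libtorch_parts` and the command-line usage are hypothetical and not part of Oneiros.

```python
import os
import sys
from pathlib import Path


def missing_libtorch_parts(root, expected=("include", "lib")):
    """Return the expected subdirectories that are absent under `root`.

    The default `expected` tuple is an assumption about the libtorch layout.
    """
    root = Path(root)
    return [name for name in expected if not (root / name).is_dir()]


if __name__ == "__main__":
    # Usage (hypothetical): python check_libtorch.py /path/to/libtorch
    if len(sys.argv) > 1:
        missing = missing_libtorch_parts(sys.argv[1])
        if missing:
            print("Missing under", sys.argv[1], "->", ", ".join(missing))
        else:
            print("libtorch layout looks plausible:", sys.argv[1])
    # Print the library search path so it can be compared by eye.
    for entry in os.environ.get("LD_LIBRARY_PATH", "").split(os.pathsep):
        if entry:
            print("LD_LIBRARY_PATH entry:", entry)
```

Pass the same directory that you set as LIBTORCH_PATH in setup.py, so that the check covers the path the build actually uses.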
If you find Oneiros useful in your research, please cite:
@inproceedings{li2025oneiros,
title = {Oneiros: KV Cache Optimization through Parameter Remapping for Multi-tenant LLM Serving},
author = {Li, Ruihao and Pal, Shagnik and Pullu, Vineeth Narayan and Sinha, Prasoon and Ryoo, Jeeho and John, Lizy K. and Yadwadkar, Neeraja J.},
booktitle = {Proceedings of the ACM Symposium on Cloud Computing (SoCC '25)},
year = {2025},
publisher = {Association for Computing Machinery}
}