🧠 Oneiros Installation Guide

This guide describes how to build and install Oneiros (built atop vLLM) from source, together with its required dependencies.


📋 Prerequisites

  • Python ≥ 3.12
  • CUDA 12.4
  • A compatible GPU with recent NVIDIA drivers
  • The CUDA toolkit's bin and lib64 directories on PATH and LD_LIBRARY_PATH (adjust /usr/local/cuda if the toolkit is installed elsewhere):
    export PATH=/usr/local/cuda/bin:$PATH
    export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
    (See the official NVIDIA CUDA installation guide for details.)
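
The prerequisites above can be checked before starting the build. The following is a minimal sketch, not part of Oneiros: the helper name check_prerequisites is hypothetical, and the test for a CUDA entry in LD_LIBRARY_PATH is a simple heuristic that assumes the directory name contains "cuda".

```python
import os
import shutil
import sys


def check_prerequisites(env, python_version, nvcc_path):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    # The guide requires Python 3.12 or newer.
    if tuple(python_version[:2]) < (3, 12):
        problems.append("Python >= 3.12 is required")
    # nvcc should be reachable once /usr/local/cuda/bin is on PATH.
    if nvcc_path is None:
        problems.append("nvcc not found on PATH")
    # Heuristic (assumption): a CUDA library directory has "cuda" in its name.
    lib_dirs = [d for d in env.get("LD_LIBRARY_PATH", "").split(os.pathsep) if d]
    if not any("cuda" in d.lower() for d in lib_dirs):
        problems.append("LD_LIBRARY_PATH has no CUDA library directory")
    return problems


if __name__ == "__main__":
    issues = check_prerequisites(os.environ, sys.version_info, shutil.which("nvcc"))
    for issue in issues:
        print("WARNING:", issue)
    print("OK" if not issues else f"{len(issues)} issue(s) found")
```

The check only inspects the environment; it does not confirm that the installed toolkit is CUDA 12.4 or that the driver is recent enough.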

🧩 Step 1 — Install PyTorch

Install the PyTorch release built for CUDA 12.4:

pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu124
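
To confirm that the expected build was installed, the version string reported by PyTorch can be compared against the release and CUDA tag used above. This is a sketch under the assumption that the wheel reports its version in the form 2.5.1+cu124; the helper names are hypothetical.

```python
def parse_torch_version(version):
    """Split a string such as '2.5.1+cu124' into ('2.5.1', 'cu124')."""
    release, _, local = version.partition("+")
    return release, local


def matches_expected(version, release="2.5.1", cuda_tag="cu124"):
    """True if the version string names the release and CUDA build from Step 1."""
    return parse_torch_version(version) == (release, cuda_tag)


if __name__ == "__main__":
    import torch  # imported here so the helpers above work without PyTorch

    print("torch version:", torch.__version__)
    print("built for CUDA:", torch.version.cuda)
    print("GPU visible:", torch.cuda.is_available())
    if not matches_expected(torch.__version__):
        print("WARNING: expected torch 2.5.1+cu124")
```

If "GPU visible" prints False, the driver or the environment variables from the prerequisites are the first things to re-check.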

⚙️ Step 2 — Install libtorch

Download and unpack the libtorch distribution that matches the PyTorch build from Step 1 (same PyTorch version, CUDA 12.4) from
👉 https://pytorch.org/get-started/locally/

Note the directory where you unpack it: the Vattention build in Step 3 needs this path.
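
Before pointing a build at libtorch, it can help to confirm that the path refers to the unpacked distribution root. The sketch below assumes the usual archive layout with include and lib subdirectories; the function name missing_libtorch_entries is hypothetical.

```python
import sys
from pathlib import Path


def missing_libtorch_entries(libtorch_root):
    """Return the expected subdirectories that are absent under libtorch_root."""
    root = Path(libtorch_root)
    # Assumption: the unpacked archive contains these two directories.
    expected = ["include", "lib"]
    return [name for name in expected if not (root / name).is_dir()]


if __name__ == "__main__":
    if len(sys.argv) != 2:
        sys.exit("usage: python check_libtorch.py /path/to/libtorch")
    missing = missing_libtorch_entries(sys.argv[1])
    if missing:
        print("Missing under", sys.argv[1] + ":", ", ".join(missing))
    else:
        print("libtorch layout looks plausible")
```

A passing check only shows that the directory has the expected shape; it does not verify that the libtorch version matches the installed PyTorch.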


🚀 Step 3 — Build and Install Vattention

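Change into the vattention directory ($Oneiros_ROOT is the root of your Oneiros checkout):

cd $Oneiros_ROOT/vattention

Edit setup.py so that LIBTORCH_PATH points to your libtorch installation from Step 2, e.g.: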

LIBTORCH_PATH = "/path/to/libtorch"

Then install:

pip install --no-build-isolation .

🧱 Step 4 — Build and Install Oneiros

cd $Oneiros_ROOT
pip install -r requirements-build.txt
pip install -e . --no-build-isolation

This installs Oneiros in editable mode, which is convenient for development.


✅ Verification

After installation, verify the build by running the basic test:

python test/basic.py
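
If the test fails at import time, listing the installed distributions can narrow down which step went wrong. This is a sketch only: "torch" is the distribution installed in Step 1, while "vllm" and "vattention" are assumed placeholder names that may differ from what the Oneiros and Vattention builds actually register.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Assumption: "vllm" and "vattention" are placeholders; replace them with
    # the names reported by `pip list` in your environment.
    for name in ("torch", "vllm", "vattention"):
        print(f"{name}: {installed_version(name) or 'not installed'}")
```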

🧾 Notes

  • Use a clean virtual environment (e.g., conda, venv) to avoid dependency conflicts.
  • Ensure the CUDA toolkit, PyTorch, and libtorch versions are consistent (this guide uses CUDA 12.4 and PyTorch 2.5.1).
  • If compilation fails, double-check your libtorch path and LD_LIBRARY_PATH settings.

📚 Citation

If you find Oneiros useful in your research, please cite:

@inproceedings{li2025oneiros,
  title     = {Oneiros: KV Cache Optimization through Parameter Remapping for Multi-tenant LLM Serving},
  author    = {Li, Ruihao and Pal, Shagnik and Pullu, Vineeth Narayan and Sinha, Prasoon and Ryoo, Jeeho and John, Lizy K. and Yadwadkar, Neeraja J.},
  booktitle = {Proceedings of the ACM Symposium on Cloud Computing (SoCC '25)},
  year      = {2025},
  publisher = {Association for Computing Machinery}
}
