SiLLM

About

SiLLM is a high-performance asynchronous inference engine that speeds up model execution through two parallelism mechanisms: GPU-CPU overlapping and sequence-parallel sampling.

GPU-CPU Overlapping
- Fully asynchronous inference scheduling
- Fully asynchronous input processing
- Fully asynchronous output processing
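
The overlapping idea can be illustrated with a minimal sketch. This is not SiLLM's implementation: `prepare_inputs`, `run_model`, and `overlapped_loop` are hypothetical names, and `time.sleep` stands in for real CPU work and GPU kernels. It only shows the scheduling pattern, where the CPU prepares step i+1 while step i is still executing.

```python
import asyncio
import time


def prepare_inputs(step: int) -> str:
    """Hypothetical CPU-side input processing for one step."""
    time.sleep(0.05)
    return f"batch-{step}"


def run_model(batch: str) -> str:
    """Hypothetical stand-in for a GPU forward pass."""
    time.sleep(0.05)
    return f"logits({batch})"


def sequential_loop(num_steps: int) -> list[str]:
    """Baseline: the CPU and the 'GPU' take turns and never overlap."""
    return [run_model(prepare_inputs(step)) for step in range(num_steps)]


async def overlapped_loop(num_steps: int) -> list[str]:
    """Prepare the inputs for step i+1 while step i is still running."""
    outputs = []
    next_batch = await asyncio.to_thread(prepare_inputs, 0)
    for step in range(num_steps):
        # Launch the 'GPU' work without waiting for it to finish.
        gpu_task = asyncio.create_task(asyncio.to_thread(run_model, next_batch))
        if step + 1 < num_steps:
            # CPU-side preparation proceeds concurrently with gpu_task.
            next_batch = await asyncio.to_thread(prepare_inputs, step + 1)
        outputs.append(await gpu_task)
    return outputs


if __name__ == "__main__":
    start = time.perf_counter()
    sequential_loop(4)
    print(f"sequential: {time.perf_counter() - start:.2f}s")

    start = time.perf_counter()
    asyncio.run(overlapped_loop(4))
    print(f"overlapped: {time.perf_counter() - start:.2f}s")
```

Both loops return the same outputs; the overlapped version should finish sooner because input preparation is hidden behind model execution.
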
Sequence-Parallel Sampling
- Fully parallel sampling across GPUs
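
One way to picture sampling in parallel across devices is to shard the batch by sequence, let each worker sample its own shard, and gather the results in the original order. The sketch below assumes that reading of "sequence-parallel"; it is not taken from the SiLLM source. Threads stand in for GPUs, and `sample_token`, `sample_shard`, and `parallel_sample` are hypothetical names.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def sample_token(logits: list[float], seed: int) -> int:
    """Sample one token id from softmax(logits) using a per-sequence seed."""
    rng = random.Random(seed)
    peak = max(logits)
    # Subtract the maximum before exponentiating for numerical stability.
    weights = [math.exp(x - peak) for x in logits]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


def sample_shard(shard: list[tuple[list[float], int]]) -> list[int]:
    """Worker body: sample every (logits, seed) pair in one shard."""
    return [sample_token(logits, seed) for logits, seed in shard]


def parallel_sample(
    batch_logits: list[list[float]], seeds: list[int], num_workers: int
) -> list[int]:
    """Split the batch into contiguous shards and sample them concurrently."""
    rows = list(zip(batch_logits, seeds))
    if not rows:
        return []
    shard_size = math.ceil(len(rows) / num_workers)
    shards = [rows[i:i + shard_size] for i in range(0, len(rows), shard_size)]
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # Executor.map yields results in shard order, so sequence order is kept.
        results = list(pool.map(sample_shard, shards))
    return [token for shard in results for token in shard]
```

Because each sequence carries its own seed, the sampled tokens do not depend on how many workers the batch is split across.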

Getting Started

SiLLM is built on top of vLLM: it uses vLLM's front end for model loading and PagedAttention for model execution, and adds custom plugins for asynchronous scheduling, asynchronous input/output processing, and parallel sampling.

Step 1: Install vLLM with pip

# Install the pinned dependencies, then vLLM
pip install openai==1.45.0 gputil aioprometheus psutil transformers termcolor ipywidgets
pip install vllm==0.6.0
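
After this step it can be useful to confirm that the pinned versions were actually installed. The following is a minimal sketch using the standard library's `importlib.metadata`; the helper `check_pins` and the `PINS` table are illustrative and not part of SiLLM or vLLM.

```python
from importlib import metadata

# Pins copied from the install commands above.
PINS = {"vllm": "0.6.0", "openai": "1.45.0"}


def check_pins(pins: dict[str, str], get_version=metadata.version) -> dict[str, str]:
    """Compare installed package versions against the expected pins."""
    report = {}
    for name, wanted in pins.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if found == wanted:
            report[name] = "ok"
        else:
            report[name] = f"mismatch: found {found}, expected {wanted}"
    return report


if __name__ == "__main__":
    for package, status in check_pins(PINS).items():
        print(f"{package}: {status}")
```

A `mismatch` or `missing` entry suggests rerunning the corresponding `pip install` command before building the plugin.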

Step 2: Install the Albireo plugin from source

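# Install the Albireo plugin
# Run from the repository root, where python_only_dev.py and albireo/ live
python3 python_only_dev.py
# Boost development packages (apt-get may require root privileges)
apt-get install libboost-all-dev
# Build and install the plugin with verbose output
cd albireo
pip install -v .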

License

This library is licensed under the Apache 2.0 License.
