Releases: jllllll/llama-cpp-python-cuBLAS-wheels

MacOS Metal Wheels

17 Sep 22:48
8bfb842

Available for Intel and Apple Silicon CPUs.

Install with:

python -m pip install llama-cpp-python --prefer-binary --extra-index-url=https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels/basic/cpu

0.1.85 builds likely won't work until fixes to the workflow are made.

CPU-only

07 Aug 21:57
1930fa1

While this repo is focused on providing cuBLAS wheels, it has become evident that there is a need for CPU-only wheels that do not require AVX2.

Wheels can be more easily downloaded from: https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels/AVX/cpu
Replace AVX with one of basic, AVX2 or AVX512 depending on what your CPU supports.
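As a minimal sketch of the substitution described above, the following Python helper builds the index URL for a chosen instruction-set level and assembles the matching pip command. The names `cpu_index_url` and `pip_install_args` are hypothetical, and the URL layout is assumed from the example URL above rather than taken from any documented API.

```python
import sys

# Base URL taken from the release notes; the path layout below is an assumption
# inferred from the example ".../AVX/cpu" URL.
BASE_URL = "https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels"

# Instruction-set levels named in the release notes.
CPU_LEVELS = ("basic", "AVX", "AVX2", "AVX512")


def cpu_index_url(level: str) -> str:
    """Return the CPU-only wheel index URL for the given instruction-set level."""
    if level not in CPU_LEVELS:
        raise ValueError(f"level must be one of {CPU_LEVELS}, got {level!r}")
    return f"{BASE_URL}/{level}/cpu"


def pip_install_args(index_url: str, package: str = "llama-cpp-python") -> list:
    """Build the argument list for the pip command shown in the release notes."""
    return [
        sys.executable, "-m", "pip", "install", package,
        "--prefer-binary", f"--extra-index-url={index_url}",
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so nothing is installed by accident.
    print(" ".join(pip_install_args(cpu_index_url("AVX"))))
```

To actually install, the argument list could be passed to `subprocess.run(..., check=True)`; the sketch only prints it so the command can be reviewed first.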

Basic non-AVX Wheels

07 Aug 21:53
1930fa1

Wheels without AVX, FMA and F16C support for compatibility with older CPUs.

AMD ROCm

04 Aug 07:27
7578ef0

All wheels are currently built for AVX2 CPUs only.

Linux

Wheels are built for ROCm 5.4.2, 5.5 and 5.6.1.

Windows

The Windows wheels should be considered experimental and may not work at all. ROCm for Windows is very new.

To test it, you will need ROCm for Windows: https://www.amd.com/en/developer/rocm-hub/hip-sdk.html
Consult the possibly inaccurate GPU compatibility chart here: https://rocm.docs.amd.com/en/docs-5.5.1/release/windows_support.html
If your GPU isn't on that list, or it just doesn't work, you may need to build llama-cpp-python manually and hope your GPU is compatible.
Another option is to do this: ggerganov/llama.cpp#1087 (comment)

Pre-0.1.80 wheels were built using ggerganov/llama.cpp#1087

Installation

To install, you can use this command:

python -m pip install llama-cpp-python --prefer-binary --extra-index-url=https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels/AVX2/rocm5.5

This will install the latest llama-cpp-python version available from this index for ROCm 5.5. To target a different ROCm version, replace rocm5.5 in the URL.
Supported ROCm versions:

  • Windows
    • 5.5.1
  • Linux
    • 5.4.2, 5.5, 5.6.1
    • Some adjacent versions of ROCm may also be compatible.
      For example, 5.4.1 should be compatible with the 5.4.2 wheel.
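The version list above can be turned into a small lookup, sketched here in Python. The helper name `rocm_index_url` is hypothetical, and the `rocm<version>` path segment for versions other than 5.5 is an assumption extrapolated from the `rocm5.5` URL in the install command; check the index itself before relying on it.

```python
# Base URL and the AVX2 path segment come from the install command above.
BASE_URL = "https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels"

# ROCm versions listed as supported in the release notes, per platform.
ROCM_WHEELS = {
    "linux": ("5.4.2", "5.5", "5.6.1"),
    "windows": ("5.5.1",),
}


def rocm_index_url(rocm_version: str, platform: str = "linux") -> str:
    """Return the ROCm wheel index URL for a listed version and platform.

    Assumption: every listed version follows the same "rocm<version>" naming
    as the documented rocm5.5 URL.
    """
    supported = ROCM_WHEELS.get(platform)
    if supported is None:
        raise ValueError(f"unknown platform {platform!r}")
    if rocm_version not in supported:
        # Adjacent versions may still work with a listed wheel (the notes give
        # 5.4.1 with the 5.4.2 wheel as an example); pass the listed version then.
        raise ValueError(
            f"no {platform} wheel listed for ROCm {rocm_version}; "
            f"listed versions: {', '.join(supported)}"
        )
    return f"{BASE_URL}/AVX2/rocm{rocm_version}"


if __name__ == "__main__":
    print(rocm_index_url("5.5"))
```

The lookup is deliberately strict: it does not guess which wheel an unlisted ROCm version should map to, since the notes only say adjacent versions "may" be compatible.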

GitHub Actions workflow here: https://github.com/jllllll/llama-cpp-python-cuBLAS-wheels/blob/main/.github/workflows/build-wheel-rocm.yml

Webui Wheels

20 Jul 01:26
a484311

These are basic/AVX/AVX2 wheels built under a different namespace to allow for simultaneous installation with the main llama-cpp-python package.

Installation can be done with this command:

python -m pip install llama-cpp-python-cuda --prefer-binary --extra-index-url=https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels/textgen/AVX2/cu117

The index URL can be changed similarly to what is described in the main installation instructions.
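Because the two packages are distributed under different names, both can be present in one environment. A rough way to check which of them is installed, sketched with the standard library's `importlib.metadata` (the helper name `installed_versions` is hypothetical):

```python
from importlib import metadata

# Distribution names as they appear in the pip commands in these release notes.
DEFAULT_NAMES = ("llama-cpp-python", "llama-cpp-python-cuda")


def installed_versions(names=DEFAULT_NAMES) -> dict:
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

This only inspects installed package metadata; it does not import either package or verify that its compiled backend loads.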

AVX512

27 Jun 19:50
d06d13a

AVX and AVX2 wheels can be found in a different release.

AVX

27 Jun 06:05
3fcd05f

AVX wheels

AVX2 wheels can be found in the main Wheels release.
AVX512 wheels can be found in a different release.

Wheels

26 Jun 23:03
d0a2be3

AVX2 wheels

AVX and AVX512 wheels can be found in a different release.