go-llama.cpp

Go bindings for llama.cpp.

The go-llama.cpp bindings are high level: most of the work is kept in the C/C++ code to avoid extra computational cost, improve performance, and ease maintenance, while keeping usage as simple as possible.

Check out this and this write-up, which summarize the impact of a low-level interface that calls C functions from Go.

If you are looking for a high-level, OpenAI-compatible API, check out here.

Attention!

Since #180 was merged, go-llama.cpp is no longer compatible with the ggml format; it works ONLY with the new gguf file format. See also the upstream PR: ggml-org/llama.cpp#2398.

If you need to use the ggml format, use the pre-gguf tag: https://github.com/go-skynet/go-llama.cpp/releases/tag/pre-gguf
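
One quick way to tell the two formats apart before loading a model is to look at the first four bytes of the file: gguf files start with the ASCII magic `GGUF`. The following is a minimal, standalone sketch of such a check. It does not use go-llama.cpp, and the helper name `isGGUFHeader` and the program name `ggufcheck` are hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"os"
)

// ggufMagic is the ASCII magic found at the start of a gguf file.
var ggufMagic = []byte("GGUF")

// isGGUFHeader reports whether head begins with the gguf magic.
// Hypothetical helper for illustration; not part of go-llama.cpp.
func isGGUFHeader(head []byte) bool {
	return bytes.HasPrefix(head, ggufMagic)
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: ggufcheck /model/path/here")
		return
	}
	f, err := os.Open(os.Args[1])
	if err != nil {
		fmt.Println("open:", err)
		return
	}
	defer f.Close()

	head := make([]byte, len(ggufMagic))
	// io.ReadFull fails on files shorter than four bytes, which cannot be gguf.
	if _, err := io.ReadFull(f, head); err != nil {
		fmt.Println("not a gguf file (too short or unreadable):", err)
		return
	}
	if isGGUFHeader(head) {
		fmt.Println("looks like gguf")
	} else {
		fmt.Println("not gguf; the pre-gguf tag may be needed")
	}
}
```

This only inspects the magic bytes. It does not validate the rest of the file or guarantee that a given model will load.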

Usage

Note: This repository uses git submodules to keep track of llama.cpp.

Clone the repository locally:

git clone --recurse-submodules https://github.com/go-skynet/go-llama.cpp

To build the bindings locally, run:

cd go-llama.cpp
make libbinding.a

Now you can run the example with:

LIBRARY_PATH=$PWD C_INCLUDE_PATH=$PWD go run ./examples -m "/model/path/here" -t 14
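
The commands in this README pass `-m` (model path) and `-t` (thread count) to the example, and the GPU commands further down also pass `-ngl` (number of layers to offload). As a rough sketch of how a program could accept the same flags with the standard `flag` package: the `options` type, the `parseArgs` function, and the default values here are assumptions for illustration, not the example's actual code.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"io"
	"os"
)

// options holds the flags used by the commands in this README.
// Hypothetical type; the real example may organize this differently.
type options struct {
	Model     string // -m: path to the model file
	Threads   int    // -t: number of threads
	GPULayers int    // -ngl: number of layers to offload to the GPU
}

// parseArgs parses args (without the program name) into options.
func parseArgs(args []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("example", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // errors are returned to the caller instead of printed
	fs.StringVar(&o.Model, "m", "", "path to the gguf model file")
	fs.IntVar(&o.Threads, "t", 4, "number of threads (default is an assumption)")
	fs.IntVar(&o.GPULayers, "ngl", 0, "number of layers to offload to the GPU")
	if err := fs.Parse(args); err != nil {
		return options{}, err
	}
	if o.Model == "" {
		return options{}, errors.New("missing -m /model/path/here")
	}
	return o, nil
}

func main() {
	o, err := parseArgs(os.Args[1:])
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("model=%s threads=%d gpu_layers=%d\n", o.Model, o.Threads, o.GPULayers)
}
```

Go's `flag` package accepts multi-letter names with a single dash, so `-ngl 64` parses as written in the commands here.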

Acceleration

OpenBLAS

To build and run with OpenBLAS, for example:

BUILD_TYPE=openblas make libbinding.a
CGO_LDFLAGS="-lopenblas" LIBRARY_PATH=$PWD C_INCLUDE_PATH=$PWD go run -tags openblas ./examples -m "/model/path/here" -t 14

CuBLAS

To build with CuBLAS:

BUILD_TYPE=cublas make libbinding.a
CGO_LDFLAGS="-lcublas -lcudart -L/usr/local/cuda/lib64/" LIBRARY_PATH=$PWD C_INCLUDE_PATH=$PWD go run ./examples -m "/model/path/here" -t 14

ROCm

To build with ROCm (HIPBLAS):

BUILD_TYPE=hipblas make libbinding.a
CC=/opt/rocm/llvm/bin/clang CXX=/opt/rocm/llvm/bin/clang++ CGO_LDFLAGS="-O3 --hip-link --rtlib=compiler-rt -unwindlib=libgcc -lrocblas -lhipblas" LIBRARY_PATH=$PWD C_INCLUDE_PATH=$PWD go run ./examples -m "/model/path/here" -ngl 64 -t 32

OpenCL

BUILD_TYPE=clblas CLBLAS_DIR=... make libbinding.a
CGO_LDFLAGS="-lOpenCL -lclblast -L/usr/local/lib64/" LIBRARY_PATH=$PWD C_INCLUDE_PATH=$PWD go run ./examples -m "/model/path/here" -t 14

When the GPU is in use, the output should include lines like these:

ggml_opencl: selecting platform: 'Intel(R) OpenCL HD Graphics'
ggml_opencl: selecting device: 'Intel(R) Graphics [0x46a6]'
ggml_opencl: device FP16 support: true

GPU offloading

Metal (Apple Silicon)

BUILD_TYPE=metal make libbinding.a
CGO_LDFLAGS="-framework Foundation -framework Metal -framework MetalKit -framework MetalPerformanceShaders" LIBRARY_PATH=$PWD C_INCLUDE_PATH=$PWD go build ./examples/main.go
cp build/bin/ggml-metal.metal .
./main -m "/model/path/here" -t 1 -ngl 1
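
The acceleration commands above differ mainly in the `CGO_LDFLAGS` value (plus `CC`/`CXX` for ROCm). If you drive builds from a Go tool rather than the shell, one way to keep those values in one place is a small lookup table. This is only a sketch: the `ldflags` map and `buildEnv` function are hypothetical names, and the flag strings are copied from the commands in this README.

```go
package main

import (
	"fmt"
	"os"
	"sort"
)

// ldflags maps each BUILD_TYPE from this README to its CGO_LDFLAGS value.
var ldflags = map[string]string{
	"openblas": "-lopenblas",
	"cublas":   "-lcublas -lcudart -L/usr/local/cuda/lib64/",
	"hipblas":  "-O3 --hip-link --rtlib=compiler-rt -unwindlib=libgcc -lrocblas -lhipblas",
	"clblas":   "-lOpenCL -lclblast -L/usr/local/lib64/",
	"metal":    "-framework Foundation -framework Metal -framework MetalKit -framework MetalPerformanceShaders",
}

// buildEnv returns KEY=VALUE entries for the given build type, with dir
// playing the role of $PWD in the README commands.
func buildEnv(buildType, dir string) ([]string, error) {
	flags, ok := ldflags[buildType]
	if !ok {
		return nil, fmt.Errorf("unknown BUILD_TYPE %q", buildType)
	}
	env := []string{
		"CGO_LDFLAGS=" + flags,
		"LIBRARY_PATH=" + dir,
		"C_INCLUDE_PATH=" + dir,
	}
	if buildType == "hipblas" {
		// The ROCm command also selects the ROCm clang toolchain.
		env = append(env, "CC=/opt/rocm/llvm/bin/clang", "CXX=/opt/rocm/llvm/bin/clang++")
	}
	return env, nil
}

func main() {
	dir, err := os.Getwd()
	if err != nil {
		fmt.Println("getwd:", err)
		return
	}
	types := make([]string, 0, len(ldflags))
	for t := range ldflags {
		types = append(types, t)
	}
	sort.Strings(types) // stable output order
	for _, t := range types {
		env, _ := buildEnv(t, dir)
		fmt.Println(t)
		for _, kv := range env {
			fmt.Println("  " + kv)
		}
	}
}
```

The returned slice can be appended to `os.Environ()` and assigned to the `Env` field of an `exec.Cmd` that runs `go run` or `go build`. Note that the OpenBLAS command also passes `-tags openblas`, which this table does not cover.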

Enjoy!

The documentation is available here and the full example code is here.

License

MIT
