Releases · CISC/llama.cpp

13 Aug 08:08

fc4ca27

b3579 Latest

ci : fix github workflow vulnerable to script injection (#9008)

Signed-off-by: Diogo Teles Sant'Anna <diogoteles@google.com>

Assets 19

cudart-llama-bin-win-cu11.7.1-x64.zip        293 MB    2024-08-13T08:08:15Z
cudart-llama-bin-win-cu12.2.0-x64.zip        413 MB    2024-08-13T08:08:24Z
llama-b3579-bin-macos-arm64.zip              48 MB     2024-08-13T08:08:36Z
llama-b3579-bin-macos-x64.zip                49.5 MB   2024-08-13T08:08:38Z
llama-b3579-bin-ubuntu-x64.zip               53.6 MB   2024-08-13T08:08:40Z
llama-b3579-bin-win-avx-x64.zip              7.64 MB   2024-08-13T08:08:43Z
llama-b3579-bin-win-avx2-x64.zip             7.63 MB   2024-08-13T08:08:44Z
llama-b3579-bin-win-avx512-x64.zip           7.64 MB   2024-08-13T08:08:45Z
llama-b3579-bin-win-cuda-cu11.7.1-x64.zip    124 MB    2024-08-13T08:08:46Z
llama-b3579-bin-win-cuda-cu12.2.0-x64.zip    123 MB    2024-08-13T08:08:50Z
Source code (zip)                                      2024-08-12T16:28:23Z
Source code (tar.gz)                                   2024-08-12T16:28:23Z

06 Aug 14:07

github-actions

b3531

efda90c

[Vulkan] Fix compilation of `vulkan-shaders-gen` on w64devkit after `e31a4f6` (#8880)

* Fix compilation issue in `vulkan-shaders-gen`

https://github.com/ggerganov/llama.cpp/commit/e31a4f679779220312c165b0f5994c680a610e38 broke compilation on w64devkit. Including `algorithm` seems to fix that.

* Guard it under `#ifdef _WIN32`
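
The two notes above describe a platform-guarded include. A minimal sketch of that pattern follows; the helper `to_native_path` and its use of `std::replace` are hypothetical illustrations, not code from `vulkan-shaders-gen`. Only the guarded `<algorithm>` include reflects what the commit message states.

```cpp
#include <cassert>
#include <string>

#ifdef _WIN32
#include <algorithm>  // std::replace; included explicitly only on Windows builds
#endif

// Hypothetical helper (an assumption for illustration): converts forward
// slashes to backslashes on Windows and returns the path unchanged elsewhere.
std::string to_native_path(std::string path) {
    assert(!path.empty());
#ifdef _WIN32
    // This call is the only user of <algorithm>, so the include above
    // can sit behind the same guard.
    std::replace(path.begin(), path.end(), '/', '\\');
#endif
    return path;
}
```

Guarding the include keeps other platforms' builds unchanged, at the cost of having to keep the include and its users under the same condition.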

Assets 20

11 Jun 08:11

github-actions

b3130

4bfe50f

tests : check the Python version (#7872)

ggml-ci

Assets 20

02 Jun 22:34

github-actions

b3069

1669810

flake.lock: Update (#7686)

Flake lock file updates:

• Updated input 'flake-parts':
    'github:hercules-ci/flake-parts/8dc45382d5206bd292f9c2768b8058a8fd8311d9?narHash=sha256-/GJvTdTpuDjNn84j82cU6bXztE0MSkdnTWClUCRub78%3D' (2024-05-16)
  → 'github:hercules-ci/flake-parts/2a55567fcf15b1b1c7ed712a2c6fadaec7412ea8?narHash=sha256-iKzJcpdXih14qYVcZ9QC9XuZYnPc6T8YImb6dX166kw%3D' (2024-06-01)
• Updated input 'flake-parts/nixpkgs-lib':
    'https://github.com/NixOS/nixpkgs/archive/50eb7ecf4cd0a5756d7275c8ba36790e5bd53e33.tar.gz?narHash=sha256-QBx10%2Bk6JWz6u7VsohfSw8g8hjdBZEf8CFzXH1/1Z94%3D' (2024-05-02)
  → 'https://github.com/NixOS/nixpkgs/archive/eb9ceca17df2ea50a250b6b27f7bf6ab0186f198.tar.gz?narHash=sha256-lIbdfCsf8LMFloheeE6N31%2BBMIeixqyQWbSr2vk79EQ%3D' (2024-06-01)
• Updated input 'nixpkgs':
    'github:NixOS/nixpkgs/bfb7a882678e518398ce9a31a881538679f6f092?narHash=sha256-4zSIhSRRIoEBwjbPm3YiGtbd8HDWzFxJjw5DYSDy1n8%3D' (2024-05-24)
  → 'github:NixOS/nixpkgs/ad57eef4ef0659193044870c731987a6df5cf56b?narHash=sha256-SzDKxseEcHR5KzPXLwsemyTR/kaM9whxeiJohbL04rs%3D' (2024-05-29)

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

Assets 21

20 May 16:41

github-actions

b2950

db10f01

rpc : track allocated buffers (#7411)

* rpc : track allocated buffers

ref: #7407

* rpc : pack rpc_tensor tightly

Assets 21

09 May 13:55

github-actions

b2830

a743d76

CUDA: generalize FP16 fattn vec kernel (#7061)

* CUDA: generalize FP16 fattn vec kernel

* disable unsupported head sizes for AMD in test

* try AMD fix

* fix batch size 2-8

* partially revert changes

Assets 19

09 May 06:13

github-actions

b2824

4426e29

cmake : fix typo (#7151)

Assets 19

06 May 13:25

github-actions

b2795

bcdee0d

minor : fix trailing whitespace

Assets 19

05 May 12:48

github-actions

b2793

8f8acc8

Disable benchmark on forked repo (#7034)

* Disable benchmark on forked repo

* only check owner on schedule event

* check owner on push also

* more readable as multi-line

* ternary won't work

* style++

* test++

* enable actions debug

* test--

* remove debug

* test++

* do debug where we can get logs

* test--

* this is driving me crazy

* correct github.event usage

* remove test condition

* correct github.event usage

* test++

* test--

* event_name is pull_request_target

* test++

* test--

* update ref checks

Assets 19

29 Apr 14:35

github-actions

b2761

f4ab2a4

llama : fix BPE pre-tokenization (#6920)

* merged the changes from deepseeker models to main branch

* Moved regex patterns to unicode.cpp and updated unicode.h

* Moved header files

* Resolved issues

* added and refactored unicode_regex_split and related functions

* Updated/merged the deepseek coder pr

* Refactored code

* Adding unicode regex mappings

* Adding unicode regex function

* Added needed functionality, testing remains

* Fixed issues

* Fixed issue with gpt2 regex custom preprocessor

* unicode : fix? unicode_wstring_to_utf8

* lint : fix whitespaces

* tests : add tokenizer tests for numbers

* unicode : remove redundant headers

* tests : remove and rename tokenizer test scripts

* tests : add sample usage

* gguf-py : reader prints warnings on duplicate keys

* llama : towards llama3 tokenization support (wip)

* unicode : shot in the dark to fix tests on Windows

* unicode : first try custom implementations

* convert : add "tokenizer.ggml.pre" GGUF KV (wip)

* llama : use new pre-tokenizer type

* convert : fix pre-tokenizer type writing

* lint : fix

* make : add test-tokenizer-0-llama-v3

* wip

* models : add llama v3 vocab file

* llama : adapt punctuation regex + add llama 3 regex

* minor

* unicode : set bomb

* unicode : set bomb

* unicode : always use std::wregex

* unicode : support \p{N}, \p{L} and \p{P} natively

* unicode : try fix windows

* unicode : category support via std::regex

* unicode : clean-up

* unicode : simplify

* convert : add convert-hf-to-gguf-update.py

ggml-ci

* lint : update

* convert : add falcon

ggml-ci

* unicode : normalize signatures

* lint : fix

* lint : fix

* convert : remove unused functions

* convert : add comments

* convert : exercise contractions

ggml-ci

* lint : fix

* cmake : refactor test targets

* tests : refactor vocab tests

ggml-ci

* tests : add more vocabs and tests

ggml-ci

* unicode : cleanup

* scripts : ignore new update script in check-requirements.sh

* models : add phi-3, mpt, gpt-2, starcoder

* tests : disable obsolete

ggml-ci

* tests : use faster bpe test

ggml-ci

* llama : more prominent warning for old BPE models

* tests : disable test-tokenizer-1-bpe due to slowness

ggml-ci

---------

Co-authored-by: Jaggzh <jaggz.h@gmail.com>
Co-authored-by: Kazim Abrar Mahi <kazimabrarmahi135@gmail.com>
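
Several entries in the list above mention `\p{N}`, `\p{L}` and `\p{P}`. These are Unicode category escapes, which the ECMAScript grammar used by `std::regex` and `std::wregex` does not accept. The sketch below shows one simplified way to work around that: rewrite the escape into an explicit character class before compiling the pattern. It is an illustration only, not the approach taken in `unicode.cpp`. The names `expand_categories` and `matches_all` are hypothetical, and the `[0-9]` class is an ASCII-only stand-in for the full `\p{N}` category.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper: replaces every "\p{N}" in the pattern with "[0-9]".
// Assumption: ASCII digits only; a real implementation would need the full
// Unicode number category, and would need to handle "\p{N}" appearing
// inside an existing bracket expression, which this sketch does not.
std::wstring expand_categories(std::wstring pattern) {
    const std::wstring from = L"\\p{N}";
    const std::wstring to   = L"[0-9]";
    for (std::size_t pos = 0;
         (pos = pattern.find(from, pos)) != std::wstring::npos;
         pos += to.size()) {
        pattern.replace(pos, from.size(), to);
    }
    return pattern;
}

// Hypothetical helper: true if the whole text matches the expanded pattern.
bool matches_all(const std::wstring & text, const std::wstring & pattern) {
    return std::regex_match(text, std::wregex(expand_categories(pattern)));
}
```

For example, `matches_all(L"123", L"\\p{N}+")` compiles the pattern `[0-9]+` and matches, while the same call with `L"12a"` does not.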

Assets 19
