Releases: dropbox/gemlite
v0.5.1.post1
20 Oct 15:36
Patch to fix A16W8 issues
v0.5.1
22 Sep 14:32
Various fixes related to MXFP.
v0.5.0
18 Aug 10:58
ROCm integration by @mobicham in #35
Improve memory usage by @mobicham in #36
Add MXFP support to gemlite by @mobicham in #37
Faster mxfp8 activation quantization
Improve helper
Bug fixes
v0.4.8
10 Jun 09:37
Improve and clean up helper.py to make it compatible with torchao/vllm.
Restore pointers in autotune for GEMVs, fixing an issue that caused the first forward pass to be incorrect.
Disable fp8 rounding.
Update caches.
v0.4.7
02 Jun 08:01
This release mainly focuses on improving performance:
Faster GEMVs via fp16 acc and output caching.
Better GEMM/Split-K performance with improved autotuning.
Faster autotuning mode to avoid long startup time.
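For readers unfamiliar with the term, Split-K computes a matrix product by dividing the shared K dimension into chunks, computing one partial product per chunk, and summing the partials. The pure-Python sketch below only illustrates that idea; the function name `matmul_split_k` is hypothetical and this is not gemlite's Triton kernel.

```python
def matmul_split_k(a, b, split_k):
    """Compute a @ b by splitting the shared K dimension into `split_k` chunks.

    Illustrative only: `a` is M x K and `b` is K x N, both as lists of lists.
    """
    m, k, n = len(a), len(b), len(b[0])
    chunk = (k + split_k - 1) // split_k  # ceil-divide K across the chunks
    out = [[0.0] * n for _ in range(m)]
    for s in range(split_k):
        k_start, k_end = s * chunk, min((s + 1) * chunk, k)
        # Each chunk yields a partial product. On a GPU the chunks would run
        # in parallel and be combined by an atomic add or a reduction pass.
        for i in range(m):
            for j in range(n):
                partial = 0.0
                for kk in range(k_start, k_end):
                    partial += a[i][kk] * b[kk][j]
                out[i][j] += partial
    return out
```

Splitting K tends to help when M and N are small (as in GEMV-like shapes), because it creates more independent work than a single pass over K would.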
v0.4.6
13 May 11:00
This release mainly focuses on vLLM V1 (torch.compile) support.
v0.4.5
06 May 07:40
Update caches for 48 GB GPUs (Qwen2 VL/Llama3 8B)
Add CPU-side packing
Relax min size to 32
Fix fp16 acc
Add persistent SPLIT_K version
Fix tl.contiguous hint
Make m,n block sizes safe
Add BitNet support in helper
Add custom load_state_dict to allow weight serialization
Update swizzle
v0.4.4
24 Mar 15:07
v0.4.3
17 Mar 15:26
Add faster packing / unpacking utils
Set MIN_SIZE = 64 for Gemma 3
Update caches
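As background on what packing and unpacking mean for low-bit weights, the sketch below stores two 4-bit values per byte and recovers them again. It is a minimal illustration in pure Python; the names `pack_4bit` and `unpack_4bit` are hypothetical, and the bit layout shown is an assumption, not necessarily the one gemlite's utilities use.

```python
def pack_4bit(values):
    """Pack integers in the range 0..15 into bytes, two values per byte."""
    if any(v < 0 or v > 15 for v in values):
        raise ValueError("values must fit in 4 bits")
    # Pad with a zero so an odd-length input still fills whole bytes.
    padded = list(values) + [0] * (len(values) % 2)
    # Even-indexed values go in the low nibble, odd-indexed in the high nibble.
    return bytes(lo | (hi << 4) for lo, hi in zip(padded[0::2], padded[1::2]))


def unpack_4bit(packed, count):
    """Recover the first `count` 4-bit values from bytes made by pack_4bit."""
    out = []
    for byte in packed:
        out.append(byte & 0x0F)  # low nibble
        out.append(byte >> 4)    # high nibble
    return out[:count]
```

The caller passes the original element count to `unpack_4bit` so that any padding nibble added during packing is dropped.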
v0.4.2.post1
21 Feb 13:43
Avoid recompilation when the batch-size M changes: dcc2455
Expose autotune M logic via set_autotune_setting(): 37dab27
Fix bug related to config caching that was ignoring the pre-loaded cache: 3c4ab53