Releases · casper-hansen/AutoAWQ · GitHub

21 Sep 11:51

v0.1.0

What's Changed

Support Falcon 180B by @casper-hansen in #35
[NEW] GEMV kernel implementation by @casper-hansen in #40
Allow user to use custom calibration data for quantization by @boehm-e in #27
Safetensors and model sharding by @casper-hansen in #47
2x faster context processing with GEMV by @casper-hansen in #58
Support kv_heads by @casper-hansen in #60
Refactor quantization code by @casper-hansen in #62
Support Windows by @qwopqwop200 in #53
Improve model loading by @casper-hansen in #66

New Contributors

@boehm-e made their first contribution in #27

Full Changelog: v0.0.2...v0.1.0

Contributors

boehm-e, casper-hansen, and qwopqwop200

06 Sep 20:28

v0.0.2

What's Changed

Refactor fused modules by @casper-hansen in #18
fuse_layers bug fix by @qwopqwop200 in #21
Support speed test to benchmark FP16 model by @wanzhenchn in #25
Implement batch size for speed test by @casper-hansen in #26
[BUG] Fix illegal memory access + Quantized Multi-GPU support by @casper-hansen in #28
YaRN support for LLaMa models by @casper-hansen in #23

New Contributors

@wanzhenchn made their first contribution in #25

Full Changelog: v0.0.1...v0.0.2

Contributors

wanzhenchn, casper-hansen, and qwopqwop200

01 Sep 15:34

v0.0.1

What's Changed

Add GPTJ Support by @jamesdborin in #1
Windows support by @qwopqwop200 in #16
Release PyPI package + Create GitHub workflow by @casper-hansen in #9

New Contributors

@jamesdborin made their first contribution in #1
@qwopqwop200 made their first contribution in #16
@casper-hansen made their first contribution in #9

Full Changelog: https://github.com/casper-hansen/AutoAWQ/commits/v0.0.1

Contributors

casper-hansen, jamesdborin, and qwopqwop200
