Releases: MaggotHATE/Llama_chat

Beta 170

18 Sep 17:56

Moved to Sampling v2 (along with all other relevant commits from llama.cpp); reworked message regeneration.

  • DRY is not reimplemented yet
  • Naming scheme now reflects not only the backend, but also the use of llamafile, OpenMP and OpenBLAS

Beta 169

27 Aug 18:42

Reworked XTC to better align with the original vision, plus updates from ggerganov/llama.cpp#9118 and misc fixes.

XTC implementation should be complete now.

Beta 168.1

25 Aug 14:31

XTC: added the xtc_threshold_max parameter to limit the upper probability. It defaults to 1.0, in which case XTC works as before.

This parameter may be useful for models that have fewer clichéd tokens at the top, but still have intermediate ranges of undesirable tokens. Needs testing.
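
For illustration, here is a minimal C++ sketch of this band idea; the names (Candidate, xtc_band_sketch) are assumed for the example and are not this repo's actual code:

```cpp
#include <random>
#include <vector>

struct Candidate { int id; float p; };  // token id and its probability

// Sketch only: candidates whose probability falls inside
// [threshold, threshold_max] are treated as the penalizable band.
void xtc_band_sketch(std::vector<Candidate>& cands, float threshold,
                     float threshold_max, float probability,
                     std::mt19937& rng) {
    std::uniform_real_distribution<float> coin(0.0f, 1.0f);
    if (coin(rng) >= probability) return;  // sampler fires only sometimes

    std::vector<Candidate> in_band;  // intermediate, undesirable range
    std::vector<Candidate> kept;     // everything outside the band
    for (const Candidate& c : cands) {
        if (c.p >= threshold && c.p <= threshold_max) in_band.push_back(c);
        else kept.push_back(c);
    }
    if (in_band.size() < 2) return;  // nothing worth cutting

    // keep only the least probable in-band token; with threshold_max at
    // its default of 1.0 this reduces to normal XTC behaviour
    // (candidate ordering is simplified in this sketch)
    kept.push_back(in_band.back());
    cands.swap(kept);
}
```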

Beta 168

25 Aug 08:50

XTC improvements:

  • all candidates are scanned now, which ensures that two penalizable tokens are detected even when there are only two tokens in total
  • sorting swaps tokens even if their values are equal, which ensures that with only two tokens in total, the more probable one is cut off

What we have so far for default settings:

"samplers_sequence": "mx",
"xtc_probability": 0.5,
"xtc_threshold": 0.1,
"xtc_probability_once": true,
"xtc_min": 2,
"min_p": 0.02,

xtc_probability_once defines whether the chance to penalize is calculated once at the start (as in the original) or individually for each token above xtc_threshold.

xtc_min defines the minimum number of tokens above xtc_threshold to trigger the effect.
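
To make both settings concrete, here is a minimal C++ sketch of the behaviour described above; the names (Candidate, xtc_sketch) are illustrative, not the actual sampling code:

```cpp
#include <algorithm>
#include <random>
#include <vector>

struct Candidate { int id; float p; };  // token id and its probability

void xtc_sketch(std::vector<Candidate>& cands, float threshold,
                float probability, bool probability_once, int xtc_min,
                std::mt19937& rng) {
    // sort by probability, highest first, scanning ALL candidates
    std::sort(cands.begin(), cands.end(),
              [](const Candidate& a, const Candidate& b) { return a.p > b.p; });

    // count tokens above the threshold
    size_t above = 0;
    while (above < cands.size() && cands[above].p >= threshold) ++above;
    if ((int) above < xtc_min) return;  // xtc_min tokens needed to trigger

    std::uniform_real_distribution<float> coin(0.0f, 1.0f);
    const bool fire_once = coin(rng) < probability;  // single up-front flip

    // every token above the threshold except the least probable one is
    // penalizable; the chance is rolled once or per token
    std::vector<Candidate> kept;
    for (size_t i = 0; i < cands.size(); ++i) {
        const bool penalizable = i + 1 < above;
        const bool fire = probability_once ? fire_once
                                           : coin(rng) < probability;
        if (!(penalizable && fire)) kept.push_back(cands[i]);
    }
    cands.swap(kept);
}
```

With xtc_min = 2 and xtc_probability_once = true, this reduces to the original XTC behaviour plus the two-token guarantee from the list above.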

As before, these settings are not in the UI yet, so you will need to add them to config.json manually.

Beta 167

22 Aug 14:11

XTC: added the boolean parameter xtc_probability_once. If true, the probability is calculated once; if false, it is calculated for each candidate. This should make the current implementation more flexible and better for testing (see the sketch after the config below).

As a reminder, add this into the model's part of config.json to test XTC with the original settings:

"samplers_sequence": "mx",
"xtc_probability": 0.5,
"xtc_threshold": 0.1,
"xtc_probability_once": true,
"min_p": 0.02,

Beta 166 (fix 3)

21 Aug 19:58

Reworked XTC, thanks to @LostRuins

Still not sure if the implementation is optimal, but it works better now.

No Vulkan for now, waiting for a fix.

Beta 166 (actually fixed)

20 Aug 08:19

Fixed randomization not working in the previous "fix"; removed re-normalization.

Beta 166

19 Aug 07:12

Added a very crude implementation of the Exclude Top Choices (XTC) sampler by @p-e-w; will rework it properly later. It seems to work for now, but needs a lot of testing.

Start by adding these settings to config.json (config only for now):

"samplers_sequence": "mx",
"xtc_probability": 0.5,
"xtc_threshold": 0.1,
"min_p": 0.02,

Beta 165

18 Aug 18:34

Small fixes; added SDL2 builds, since SDL2 uses less VRAM for the UI.

Beta 164

16 Aug 19:47

Latest commits from llama.cpp: support for Nemotron and EXAONE models

  • from now on, all CPU-only builds use OpenBLAS (compiled statically, including chatTest)