I get "Unable to restore slot, no available space in KV cache or invalid slot save file" when trying to restore a 200MB cache from disk. Everything works fine with a 100MB cache. I have plenty of memory; I'm running on an M4 Max with 128GB RAM.
I did set -np to 7, which, as reported elsewhere, speeds up speculative decoding dramatically (to 2x the normal speed).
This is how I run my server:
./build/bin/llama-server -m /Users/mattsinalco/.cache/huggingface/hub/models--unsloth--Llama-3.3-70B-Instruct-GGUF/snapshots/0c14ebbedd129fb190c8241facca9a360e81c650/Llama-3.3-70B-Instruct-Q4_K_M.gguf -md /Users/mattsinalco/.cache/huggingface/hub/models--unsloth--Llama-3.2-1B-Instruct-GGUF/snapshots/a5594fb18df5dfc6b43281423fcce6750cd92de5/Llama-3.2-1B-Instruct-Q4_K_M.gguf -ngl 99 -ngld 99 -fa --port 8034 --ctx-size 8192 --ctx-size-draft 8192 --draft-min 0 --draft-max 16 -np 7 --host 0.0.0.0 --slots --slot-save-path /Users/mattsinalco/mathias/caching
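For anyone trying to reproduce this, here is a minimal Python sketch of the two things that seem relevant. It is not a diagnosis. The first helper assumes the server divides `--ctx-size` evenly across the `-np` slots, which may not hold for every llama.cpp build. The second builds a request for the server's `POST /slots/{id}?action=restore` endpoint; the slot id `0` and the file name `slot0.bin` are hypothetical placeholders, and the file is expected to live under `--slot-save-path`.

```python
import json
import urllib.request


def per_slot_ctx(ctx_size: int, n_parallel: int) -> int:
    """Tokens available to one slot.

    Assumption: the total context is split evenly across slots.
    """
    return ctx_size // n_parallel


def slot_request(base_url: str, slot_id: int, action: str,
                 filename: str) -> urllib.request.Request:
    """Build (but do not send) a slot save/restore request."""
    url = f"{base_url}/slots/{slot_id}?action={action}"
    body = json.dumps({"filename": filename}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(req: urllib.request.Request) -> str:
    """Send the request to a running server and return the raw response."""
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")


# With --ctx-size 8192 and -np 7 from the command above:
print(per_slot_ctx(8192, 7))

# Hypothetical slot id and file name; port 8034 is taken from the command.
req = slot_request("http://localhost:8034", 0, "restore", "slot0.bin")
print(req.get_method(), req.full_url)
```

If the even-split assumption holds, each slot gets far fewer tokens than the full 8192, so a saved cache holding more tokens than that would not fit into a single slot. Calling `send(req)` against the running server performs the actual restore.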