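I finished training on my chat logs, downloaded model_output to my local machine, and wanted to run it locally. Running weclone-cli server then fails with this error: You are trying to offload the whole model to the disk. Please use the disk_offload function instead.
The machine is a Mac M1.
I have already tried adding: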
"quantization_bit": 2,
"quantization_type": "nf4",
"double_quantization": true,
"quantization_method": "bitsandbytes"
I also changed enable_clean to false and fp16 to false.