Error during multi-threaded inference #1044
Comments
I am loading the model like this: self.cosyvoice = CosyVoice2( cosy_voice_path + 'pretrained_models/CosyVoice2-0.5B', load_jit=False, load_trt=True, fp16=True ). With load_trt=True (which loads the TensorRT-optimized model), inference fails when run from multiple threads.
To add: the GPU is an A10.
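The thread does not establish what causes the failure. If it turns out to come from several threads calling one shared model instance at the same time, a common mitigation is to serialize calls with a lock. Below is a minimal sketch of that idea only; `SerializedModel` and `fake_infer` are hypothetical names invented for illustration and are not part of CosyVoice.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SerializedModel:
    """Wrap an inference callable so only one thread runs it at a time."""

    def __init__(self, infer_fn):
        self._infer_fn = infer_fn        # hypothetical stand-in for the real model call
        self._lock = threading.Lock()

    def infer(self, text):
        # Only one thread at a time may enter the wrapped call.
        with self._lock:
            return self._infer_fn(text)


def fake_infer(text):
    # Placeholder for the real synthesis call; returns a dummy string.
    return f"audio<{text}>"


model = SerializedModel(fake_infer)
with ThreadPoolExecutor(max_workers=4) as pool:
    # pool.map returns results in the order of the inputs.
    results = list(pool.map(model.infer, ["a", "b", "c"]))
print(results)
```

Serializing calls trades throughput for safety, so it would not by itself raise GPU utilization. If the real inference call is a generator, the lock would need to stay held while the generator is being consumed, not just while it is created.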
The model is probably not very robust in this configuration. Try fp16=False.
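OK, I'll try that. On the same machine I am running 2 Python processes. GPU utilization is already at 100%, but GPU memory usage is only 10021MiB / 23028MiB, so about half is still free. How can I make use of the remaining half? Also, the 8-core CPU is close to its limit. Does that mean I cannot add any more processes? See the two screenshots below.
[Screenshot 1: output of nvidia-smi (watch, every 1.0s) on host iZbp14v9nxa3mr9n1tu44pZ, Thu Mar 6 15:37:56 2025; table not preserved]
[Screenshot 2: top - 15:40:25 up 15 days, 5:02, 6 users, load average: 7.45, 7.24, 5.97]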
Exception in thread Thread-19 (llm_job):
Traceback (most recent call last):
  File "/root/miniconda3/envs/cosyvoice/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/root/miniconda3/envs/cosyvoice/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/root/CosyVoice/cosyvoice/cli/model.py", line 113, in llm_job
    for i in self.llm.inference(text=text.to(self.device),
  File "/root/miniconda3/envs/cosyvoice/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 56, in generator_context
    response = gen.send(request)
  File "/root/CosyVoice/cosyvoice/llm/llm.py", line 326, in inference
    top_ids = self.sampling_ids(logp.squeeze(dim=0), out_tokens, sampling, ignore_eos=True if i < min_len else False).item()
  File "/root/CosyVoice/cosyvoice/llm/llm.py", line 150, in sampling_ids
    top_ids = self.sampling(weighted_scores, decoded_tokens, sampling)
  File "/root/CosyVoice/cosyvoice/utils/common.py", line 110, in ras_sampling
    top_ids = nucleus_sampling(weighted_scores, top_p=top_p, top_k=top_k)
  File "/root/CosyVoice/cosyvoice/utils/common.py", line 131, in nucleus_sampling
    top_ids = indices[prob.multinomial(1, replacement=True)]
RuntimeError: probability tensor contains either `inf`, `nan` or element < 0
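The error message names three conditions that make a probability vector unusable for sampling: `inf`, `nan`, or a negative element. As a debugging aid, one could check which of them occurs just before the sampling step. The following is a minimal sketch on plain Python floats; `diagnose_probs` is a hypothetical helper, not a CosyVoice or PyTorch function. On tensors, the equivalent checks would be `torch.isnan(t).any()`, `torch.isinf(t).any()` and `(t < 0).any()`.

```python
import math


def diagnose_probs(probs):
    """Return which of the conditions from the error message apply to probs."""
    problems = []
    if any(math.isnan(p) for p in probs):
        problems.append("nan")
    if any(math.isinf(p) for p in probs):
        problems.append("inf")
    # Comparisons with NaN are always False, so NaN is not counted as negative.
    if any(p < 0 for p in probs):
        problems.append("negative")
    return problems


print(diagnose_probs([0.2, 0.5, 0.3]))           # []
print(diagnose_probs([0.2, float("nan"), 0.3]))  # ['nan']
print(diagnose_probs([float("inf"), -0.1]))      # ['inf', 'negative']
```

Knowing which condition fires helps narrow the cause. For example, `inf` or `nan` values that disappear with fp16=False would point to a half-precision numerical issue rather than a threading problem.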