Replies: 1 comment
-
Hi @e2eNAK! Can you provide the exact config for reproducing this? Thanks!
-
The exact issue it states is:

/usr/local/lib/python3.10/site-packages/transformers/generation/utils.py:1535: UserWarning: You are calling .generate() with the `input_ids` being on a device type different than your model's device. `input_ids` is on cpu, whereas the model is on cuda. You may experience unexpected behaviors or slower generation. Please make sure that you have put `input_ids` to the correct device by calling for example `input_ids = input_ids.to('cuda')` before running `.generate()`.
warnings.warn(
WARNING:nemoguardrails.actions.action_dispatcher:Error while execution generate_user_intent: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper_CUDA__index_select)
When I tried `messages.to("cuda")`, it gives me:
LlamaForCausalLM(
  (model): LlamaModel(
    (embed_tokens): Embedding(32000, 5120)
    (layers): ModuleList(
      (0-39): 40 x LlamaDecoderLayer(
        (self_attn): LlamaAttention(
          (q_proj): Linear(in_features=5120, out_features=5120, bias=False)
          (k_proj): Linear(in_features=5120, out_features=5120, bias=False)
          (v_proj): Linear(in_features=5120, out_features=5120, bias=False)
          (o_proj): Linear(in_features=5120, out_features=5120, bias=False)
          (rotary_emb): LlamaRotaryEmbedding()
        )
        (mlp): LlamaMLP(
          (gate_proj): Linear(in_features=5120, out_features=13824, bias=False)
          (up_proj): Linear(in_features=5120, out_features=13824, bias=False)
          (down_proj): Linear(in_features=13824, out_features=5120, bias=False)
          (act_fn): SiLUActivation()
        )
        (input_layernorm): LlamaRMSNorm()
        (post_attention_layernorm): LlamaRMSNorm()
      )
    )
    (norm): LlamaRMSNorm()
  )
  (lm_head): Linear(in_features=5120, out_features=32000, bias=False)
)
Fetching 7 files: 100%|██████████| 7/7 [00:00<00:00, 6462.72it/s]
ERROR:nemoguardrails.server.api:'list' object has no attribute 'to'
Traceback (most recent call last):
  File "/nemoguardrails/nemoguardrails/server/api.py", line 337, in chat_completion
    messages = messages.to("cuda")
AttributeError: 'list' object has no attribute 'to'
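A minimal sketch of what is going on here (no GPU or transformers needed to see it): `messages` in the `chat_completion` endpoint is a plain Python list of role/content dicts, and lists have no `.to()` method, hence the `AttributeError`. The warning refers to the *tokenized* `input_ids` tensor instead, so the `.to("cuda")` call belongs on the tensors returned by the tokenizer, not on the messages list. The `tokenizer`/`model` names in the comment are hypothetical placeholders, not the actual nemoguardrails internals.

```python
# `messages` is a plain Python list of dicts, as the server passes it around.
messages = [{"role": "user", "content": "hi"}]

# Lists have no .to() method, so messages.to("cuda") can never work:
print(hasattr(messages, "to"))  # False

try:
    messages.to("cuda")
except AttributeError as e:
    print(e)  # 'list' object has no attribute 'to'

# The transformers warning instead asks for the *tensors* to be moved,
# i.e. somewhere in the code that tokenizes the prompt (hypothetical names):
#
#   inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
#   output = model.generate(**inputs)
#
# The BatchEncoding returned by the tokenizer supports .to(); the raw
# messages list does not.
```

In other words, the fix has to be applied where the prompt is turned into tensors (inside the LLM wrapper or pipeline setup), not in `chat_completion` where only the untokenized messages list is visible.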