Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A40显卡微调Qwen1.5-7B-Chat报错:RuntimeError: mat1 and mat2 must have the same dtype, but got Float and BFloat16 #252

Open
lordk911 opened this issue Apr 2, 2024 · 2 comments

Comments

@lordk911
Copy link

lordk911 commented Apr 2, 2024

CUDA_VISIBLE_DEVICES=0,1 python dbgpt_hub/train/sft_train.py \
    --model_name_or_path $model_name_or_path \
    --do_train \
    --dataset $dataset \
    --max_source_length 2048 \
    --max_target_length 512 \
    --finetuning_type lora \
    --lora_target q_proj,v_proj \
    --template chatml \
    --lora_rank 64 \
    --lora_alpha 32 \
    --output_dir $output_dir \
    --overwrite_cache \
    --overwrite_output_dir \
    --per_device_train_batch_size 1 \
    --gradient_accumulation_steps 16 \
    --lr_scheduler_type cosine_with_restarts \
    --logging_steps 50 \
    --save_steps 2000 \
    --learning_rate 2e-4 \
    --num_train_epochs 8 \
    --plot_loss \
    --bf16  >> ${train_log}
Original Traceback (most recent call last):
  File "/data/miniconda3/envs/dbgpt_hub/lib/python3.10/site-packages/torch/nn/parallel/parallel_apply.py", line 85, in _worker
    output = module(*input, **kwargs)
  File "/data/miniconda3/envs/dbgpt_hub/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/data/miniconda3/envs/dbgpt_hub/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/miniconda3/envs/dbgpt_hub/lib/python3.10/site-packages/peft/peft_model.py", line 922, in forward
    return self.base_model(
  File "/data/miniconda3/envs/dbgpt_hub/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/data/miniconda3/envs/dbgpt_hub/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/miniconda3/envs/dbgpt_hub/lib/python3.10/site-packages/transformers/models/qwen2/modeling_qwen2.py", line 1173, in forward
    outputs = self.model(
  File "/data/miniconda3/envs/dbgpt_hub/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/data/miniconda3/envs/dbgpt_hub/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/miniconda3/envs/dbgpt_hub/lib/python3.10/site-packages/transformers/models/qwen2/modeling_qwen2.py", line 1048, in forward
    layer_outputs = self._gradient_checkpointing_func(
  File "/data/miniconda3/envs/dbgpt_hub/lib/python3.10/site-packages/torch/_compile.py", line 24, in inner
    return torch._dynamo.disable(fn, recursive)(*args, **kwargs)
  File "/data/miniconda3/envs/dbgpt_hub/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 328, in _fn
    return fn(*args, **kwargs)
  File "/data/miniconda3/envs/dbgpt_hub/lib/python3.10/site-packages/torch/_dynamo/external_utils.py", line 17, in inner
    return fn(*args, **kwargs)
  File "/data/miniconda3/envs/dbgpt_hub/lib/python3.10/site-packages/torch/utils/checkpoint.py", line 451, in checkpoint
    return CheckpointFunction.apply(function, preserve, *args)
  File "/data/miniconda3/envs/dbgpt_hub/lib/python3.10/site-packages/torch/autograd/function.py", line 539, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
  File "/data/miniconda3/envs/dbgpt_hub/lib/python3.10/site-packages/torch/utils/checkpoint.py", line 230, in forward
    outputs = run_function(*args)
  File "/data/miniconda3/envs/dbgpt_hub/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/data/miniconda3/envs/dbgpt_hub/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/miniconda3/envs/dbgpt_hub/lib/python3.10/site-packages/transformers/models/qwen2/modeling_qwen2.py", line 773, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
  File "/data/miniconda3/envs/dbgpt_hub/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/data/miniconda3/envs/dbgpt_hub/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/miniconda3/envs/dbgpt_hub/lib/python3.10/site-packages/transformers/models/qwen2/modeling_qwen2.py", line 663, in forward
    query_states = self.q_proj(hidden_states)
  File "/data/miniconda3/envs/dbgpt_hub/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/data/miniconda3/envs/dbgpt_hub/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/miniconda3/envs/dbgpt_hub/lib/python3.10/site-packages/peft/tuners/lora.py", line 817, in forward
    result = F.linear(x, transpose(self.weight, self.fan_in_fan_out), bias=self.bias)
RuntimeError: mat1 and mat2 must have the same dtype, but got Float and BFloat16
@misi0202
Copy link

misi0202 commented May 6, 2024

官方文档里好像还不支持Qwen1.5 baseline只有Qwen

@renlongY
Copy link

renlongY commented May 9, 2024

不支持Qwen1.5

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants