RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! #1378

Open
MengFoong opened this issue Feb 19, 2025 · 0 comments


Filtering the images containing characters which are not in opt.character
Filtering the images whose label is longer than opt.batch_max_length

dataset_root: all_data
opt.select_data: ['all_data']
opt.batch_ratio: ['1']

dataset_root: all_data dataset: all_data
all_data/en_sample
sub-directory: /en_sample num samples: 882
all_data/rec\test
sub-directory: /rec\test num samples: 0
all_data/rec\train
sub-directory: /rec\train num samples: 0
all_data/rec\val
sub-directory: /rec\val num samples: 0
num total samples of all_data: 882 x 1.0 (total_data_usage_ratio) = 882
num samples of all_data per batch: 10 x 1.0 (batch_ratio) = 10

Total_batch_size: 10 = 10

dataset_root: all_data/en_sample dataset: /
all_data/en_sample/
sub-directory: /. num samples: 882

...

continue to train, start_iter: 300000
training time: 11.559250354766846

RuntimeError Traceback (most recent call last)
Cell In[6], line 2
1 opt = get_config("config_files/en_fine_tunning_config.yaml")
----> 2 train(opt, amp=False)

File c:\Users\mengfoong\Desktop\Train_Docling_2\EasyOCR-Trainer\train.py:233, in train(opt, show_number, amp)
230 model.eval()
231 with torch.no_grad():
232 valid_loss, current_accuracy, current_norm_ED, preds, confidence_score, labels,
--> 233 infer_time, length_of_data = validation(model, criterion, valid_loader, converter, opt, device)
234 model.train()
235 print(infer_time, length_of_data)

File c:\Users\mengfoong\Desktop\Train_Docling_2\EasyOCR-Trainer\test_1.py:45, in validation(model, criterion, evaluation_loader, converter, opt, device)
43 preds_size = torch.IntTensor([preds.size(1)] * batch_size)
44 # permute 'preds' to use CTCloss format
---> 45 cost = criterion(preds.log_softmax(2).permute(1, 0, 2), text_for_loss, preds_size, length_for_loss)
47 if opt.decode == 'greedy':
48 # Select max probabilty (greedy decoding) then decode index to character
49 _, preds_index = preds.max(2)

File c:\Users\mengfoong\Desktop\Train_Docling_2\venv\Lib\site-packages\torch\nn\modules\module.py:1739, in Module._wrapped_call_impl(self, *args, **kwargs)
1737 return self._compiled_call_impl(*args, **kwargs) # type: ignore[misc]
1738 else:
...
3085 _Reduction.get_enum(reduction),
3086 zero_infinity,
3087 )

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

I get this error when I run the cell. Can anyone share how they resolved it?
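For reference, the traceback ends in the CTC loss call inside `validation()`, so at least one of its arguments (`preds`, `text_for_loss`, `preds_size`, `length_for_loss`) is probably on a different device than the others. Below is a minimal sketch of moving everything onto the device of `preds` before calling the criterion. It is not the EasyOCR-Trainer code: the helper name `ctc_loss_on_one_device`, the dummy tensor shapes, and the `zero_infinity=True` setting are assumptions for illustration only.

```python
import torch
import torch.nn as nn


def ctc_loss_on_one_device(criterion, preds, text_for_loss, length_for_loss):
    """Hypothetical helper (not part of EasyOCR-Trainer).

    Moves every CTC loss argument to the device that `preds` lives on,
    then computes the loss.
    """
    device = preds.device
    batch_size = preds.size(0)
    # One input length per sample, created directly on the same device as preds.
    preds_size = torch.full(
        (batch_size,), preds.size(1), dtype=torch.long, device=device
    )
    # CTCLoss expects log-probabilities shaped (time, batch, classes).
    log_probs = preds.log_softmax(2).permute(1, 0, 2)
    return criterion(
        log_probs,
        text_for_loss.to(device),
        preds_size,
        length_for_loss.to(device),
    )


# Falls back to CPU when no GPU is available, so the sketch runs anywhere.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
criterion = nn.CTCLoss(zero_infinity=True).to(device)  # assumed settings

# Dummy stand-ins for the model output and the encoded labels.
preds = torch.randn(4, 26, 37, device=device)             # (batch, time, classes)
text_for_loss = torch.randint(1, 37, (4, 10))             # left on CPU on purpose
length_for_loss = torch.full((4,), 10, dtype=torch.long)  # left on CPU on purpose

cost = ctc_loss_on_one_device(criterion, preds, text_for_loss, length_for_loss)
print(cost.device, cost.item())
```

To find which tensor is the odd one out in the real code, printing `preds.device`, `text_for_loss.device`, and `length_for_loss.device` just before line 45 of `test_1.py` should show the mismatch. It is also worth checking that the `device` passed to `validation()` is the same one the model was moved to.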
