RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! #1378

MengFoong · 2025-02-19T10:06:04Z

Filtering the images containing characters which are not in opt.character
Filtering the images whose label is longer than opt.batch_max_length

dataset_root: all_data
opt.select_data: ['all_data']
opt.batch_ratio: ['1']

dataset_root: all_data dataset: all_data
all_data/en_sample
sub-directory: /en_sample num samples: 882
all_data/rec\test
sub-directory: /rec\test num samples: 0
all_data/rec\train
sub-directory: /rec\train num samples: 0
all_data/rec\val
sub-directory: /rec\val num samples: 0
num total samples of all_data: 882 x 1.0 (total_data_usage_ratio) = 882
num samples of all_data per batch: 10 x 1.0 (batch_ratio) = 10

Total_batch_size: 10 = 10

dataset_root: all_data/en_sample dataset: /
all_data/en_sample/
sub-directory: /. num samples: 882

...

continue to train, start_iter: 300000
training time: 11.559250354766846
Output is truncated. View as a scrollable element or open in a text editor. Adjust cell output settings...

RuntimeError Traceback (most recent call last)
Cell In[6], line 2
1 opt = get_config("config_files/en_fine_tunning_config.yaml")
----> 2 train(opt, amp=False)

File c:\Users\mengfoong\Desktop\Train_Docling_2\EasyOCR-Trainer\train.py:233, in train(opt, show_number, amp)
230 model.eval()
231 with torch.no_grad():
232 valid_loss, current_accuracy, current_norm_ED, preds, confidence_score, labels,
--> 233 infer_time, length_of_data = validation(model, criterion, valid_loader, converter, opt, device)
234 model.train()
235 print(infer_time, length_of_data)

File c:\Users\mengfoong\Desktop\Train_Docling_2\EasyOCR-Trainer\test_1.py:45, in validation(model, criterion, evaluation_loader, converter, opt, device)
43 preds_size = torch.IntTensor([preds.size(1)] * batch_size)
44 # permute 'preds' to use CTCloss format
---> 45 cost = criterion(preds.log_softmax(2).permute(1, 0, 2), text_for_loss, preds_size, length_for_loss)
47 if opt.decode == 'greedy':
48 # Select max probabilty (greedy decoding) then decode index to character
49 _, preds_index = preds.max(2)

File c:\Users\mengfoong\Desktop\Train_Docling_2\venv\Lib\site-packages\torch\nn\modules\module.py:1739, in Module._wrapped_call_impl(self, *args, **kwargs)
1737 return self._compiled_call_impl(*args, **kwargs) # type: ignore[misc]
1738 else:
...
3085 _Reduction.get_enum(reduction),
3086 zero_infinity,
3087 )

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

I'm facing this error when I run the cell. Can anyone share how they resolved it?
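For anyone debugging the same traceback: the failing call in `validation` passes four tensors to `criterion` (`preds`, `text_for_loss`, `preds_size`, `length_for_loss`), so a first step is to check which of them is on the CPU. `preds_size` is built with `torch.IntTensor(...)`, which allocates on the CPU, so it is one candidate, though that is not confirmed here. Below is a minimal diagnostic sketch, not a confirmed fix. The helper names `devices_of` and `to_common_device` are hypothetical and not part of EasyOCR or PyTorch; they rely only on the `.device` attribute and `.to()` method that PyTorch tensors provide.

```python
def devices_of(**named):
    """Map each argument name to the string form of its .device ('n/a' if it has none)."""
    return {name: str(getattr(value, "device", "n/a")) for name, value in named.items()}


def to_common_device(device, *values):
    """Return the values as a tuple, calling .to(device) on any value that supports it."""
    return tuple(v.to(device) if hasattr(v, "to") else v for v in values)


# Hypothetical use inside validation(), just before the criterion call
# (variable names taken from the traceback above):
#
#   print(devices_of(preds=preds, text_for_loss=text_for_loss,
#                    preds_size=preds_size, length_for_loss=length_for_loss))
#   text_for_loss, preds_size, length_for_loss = to_common_device(
#       preds.device, text_for_loss, preds_size, length_for_loss)
```

If the printed mapping shows a mix of `cuda:0` and `cpu`, moving the CPU entries to `preds.device` before calling `criterion` is the usual way to make the devices agree.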
