The IndexError interrupts training #11
Comments
Thank you for your interest in our work! The current code base only supports the one-sample-per-GPU setting. Thanks!
How can I use more than one card? I also ran into this problem.
When training, make sure to use 8 GPUs. This works.
Is there a way to train with fewer than 8 GPUs but more than 1 GPU?
I only have 6 GPUs.
Sorry, I have no idea. Maybe the author has a way.
OK, thank you.
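On the 6-GPU question, a possible direction rather than a confirmed fix: the log line tagged `d2.utils.events` suggests the code base builds on detectron2, where `SOLVER.IMS_PER_BATCH` is the total batch size across all GPUs and each GPU receives `IMS_PER_BATCH / num_gpus` samples. Assuming this repository keeps that convention (the authors have not confirmed it in this thread), setting the global batch size equal to the number of GPUs would give one sample per GPU. The helper names below are hypothetical and only illustrate the arithmetic.

```python
def per_gpu_batch_size(ims_per_batch: int, num_gpus: int) -> int:
    """Samples each GPU receives when a global batch is split evenly.

    Assumption: the global batch is divided equally across GPUs, as in
    detectron2's SOLVER.IMS_PER_BATCH convention.
    """
    if num_gpus <= 0:
        raise ValueError("num_gpus must be positive")
    if ims_per_batch % num_gpus != 0:
        raise ValueError(
            f"global batch {ims_per_batch} is not divisible by {num_gpus} GPUs"
        )
    return ims_per_batch // num_gpus


def global_batch_for_one_sample_per_gpu(num_gpus: int) -> int:
    """Global batch size that yields exactly one sample on every GPU."""
    if num_gpus <= 0:
        raise ValueError("num_gpus must be positive")
    return num_gpus


if __name__ == "__main__":
    print(per_gpu_batch_size(8, 8))                # 1: one sample per GPU
    print(per_gpu_batch_size(8, 4))                # 2: more than one sample per GPU
    print(global_batch_for_one_sample_per_gpu(6))  # 6
```

If the assumption holds, a 6-GPU run would need a global batch size of 6 rather than 8. Changing the batch size can also affect the learning-rate schedule, so results may differ from the 8-GPU setting.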
Traceback (most recent call last):
  File "/VisCom-SSD-2/hj/RVOS/DsHmp/dshmp/engine/train_loop.py", line 149, in train
    self.run_step()
  File "/VisCom-SSD-2/hj/RVOS/DsHmp/dshmp/engine/defaults.py", line 494, in run_step
    self._trainer.run_step()
  File "/VisCom-SSD-2/hj/RVOS/DsHmp/dshmp/engine/train_loop.py", line 395, in run_step
    loss_dict = self.model(data, self.iter)
  File "/home/ta/anaconda3/envs/ylf_rvos/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/ta/anaconda3/envs/ylf_rvos/lib/python3.9/site-packages/torch/nn/parallel/distributed.py", line 963, in forward
    output = self.module(*inputs[0], **kwargs[0])
  File "/home/ta/anaconda3/envs/ylf_rvos/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/VisCom-SSD-2/hj/RVOS/DsHmp/dshmp/dshmp_model.py", line 288, in forward
    return self.train_model(batched_inputs, iterations)
  File "/VisCom-SSD-2/hj/RVOS/DsHmp/dshmp/dshmp_model.py", line 321, in train_model
    motion_feat = torch.cat([lang_feat_fusion[motion_map.bool()], lang_feat], dim=0)
IndexError: The shape of the mask [1, 40] at index 0 does not match the shape of the indexed tensor [2, 40, 256] at index 0
[09/09 15:46:29 d2.utils.events]: iter: 0 lr: N/A max_mem: 1355M
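For readers hitting the same error: the message says the boolean mask `motion_map` has shape `[1, 40]` while `lang_feat_fusion` has shape `[2, 40, 256]`. PyTorch boolean-mask indexing requires the mask's shape to equal the leading dimensions of the indexed tensor, and here they differ at index 0 (1 versus 2). The sketch below reproduces that shape rule in plain Python; `first_mask_mismatch` is a hypothetical helper, not part of PyTorch or this repository. Reading the leading dimension as the per-GPU batch size is an assumption.

```python
from typing import Optional, Sequence


def first_mask_mismatch(
    mask_shape: Sequence[int], tensor_shape: Sequence[int]
) -> Optional[int]:
    """Return the first dimension where a boolean mask disagrees with the tensor.

    A boolean mask is valid for indexing only if its shape equals the leading
    dimensions of the tensor. Returns None when the shapes are compatible.
    """
    if len(mask_shape) > len(tensor_shape):
        raise ValueError("mask has more dimensions than the tensor")
    for dim, (mask_size, tensor_size) in enumerate(zip(mask_shape, tensor_shape)):
        if mask_size != tensor_size:
            return dim
    return None


if __name__ == "__main__":
    # Shapes taken from the error message above: mismatch at index 0.
    print(first_mask_mismatch((1, 40), (2, 40, 256)))  # 0
    # A mask whose leading dimension matches the tensor would be accepted.
    print(first_mask_mismatch((2, 40), (2, 40, 256)))  # None
```

Under that reading, the mask was built for a single sample while the feature tensor holds two, which is consistent with the one-sample-per-GPU limitation described in the comments.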