The IndexError interrupts training #11

Open
Ternura111 opened this issue Sep 9, 2024 · 7 comments

Comments

@Ternura111

Traceback (most recent call last):
  File "/VisCom-SSD-2/hj/RVOS/DsHmp/dshmp/engine/train_loop.py", line 149, in train
    self.run_step()
  File "/VisCom-SSD-2/hj/RVOS/DsHmp/dshmp/engine/defaults.py", line 494, in run_step
    self._trainer.run_step()
  File "/VisCom-SSD-2/hj/RVOS/DsHmp/dshmp/engine/train_loop.py", line 395, in run_step
    loss_dict = self.model(data, self.iter)
  File "/home/ta/anaconda3/envs/ylf_rvos/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/ta/anaconda3/envs/ylf_rvos/lib/python3.9/site-packages/torch/nn/parallel/distributed.py", line 963, in forward
    output = self.module(*inputs[0], **kwargs[0])
  File "/home/ta/anaconda3/envs/ylf_rvos/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/VisCom-SSD-2/hj/RVOS/DsHmp/dshmp/dshmp_model.py", line 288, in forward
    return self.train_model(batched_inputs, iterations)
  File "/VisCom-SSD-2/hj/RVOS/DsHmp/dshmp/dshmp_model.py", line 321, in train_model
    motion_feat = torch.cat([lang_feat_fusion[motion_map.bool()], lang_feat], dim=0)
IndexError: The shape of the mask [1, 40] at index 0 does not match the shape of the indexed tensor [2, 40, 256] at index 0
[09/09 15:46:29 d2.utils.events]: iter: 0 lr: N/A max_mem: 1355M

@heshuting555
Owner

Thank you for your interest in our work!

The current code base only supports one sample per GPU. Thanks!
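The shapes in the traceback are consistent with this: the mask has a leading dimension of 1 while the indexed tensor has a leading dimension of 2, which would happen if a GPU received two samples. PyTorch boolean-mask indexing does not broadcast, so the mask shape has to equal the leading dimensions of the tensor. A minimal pure-Python sketch of that rule, where `mask_fits` is a hypothetical helper and not part of the DsHmp code:

```python
def mask_fits(mask_shape, tensor_shape):
    """Return True if a boolean mask of mask_shape can index a tensor of tensor_shape.

    Hypothetical helper: it mirrors the rule that the mask shape must equal
    the leading dimensions of the indexed tensor (no broadcasting).
    """
    mask_shape = tuple(mask_shape)
    tensor_shape = tuple(tensor_shape)
    if len(mask_shape) > len(tensor_shape):
        return False
    return tensor_shape[:len(mask_shape)] == mask_shape


# Shapes taken from the traceback: two samples on one GPU.
print(mask_fits((1, 40), (2, 40, 256)))  # False

# The same mask with one sample on the GPU.
print(mask_fits((1, 40), (1, 40, 256)))  # True
```

A check like this placed before the indexing line could turn the IndexError into a clearer message about the per-GPU batch size.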

@Sjunshu

Sjunshu commented Sep 24, 2024

How can I use more than one GPU? I am running into this problem too.

@Ternura111
Author

How can I use more than one GPU? I am running into this problem too.

When training, make sure to use 8 GPUs. This works.
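One possible explanation, not confirmed by the author in this thread: in detectron2-style training, the configured batch size is a total across all GPUs, so each GPU receives the total divided by the number of GPUs. If that applies here, 8 GPUs would work whenever the total is 8, and any GPU count equal to the total would also give one sample per GPU. The sketch below uses illustrative numbers; `samples_per_gpu` is a hypothetical helper, and the totals are assumptions rather than values read from the DsHmp config.

```python
def samples_per_gpu(total_batch_size: int, num_gpus: int) -> int:
    """Per-GPU batch size when a total batch is split evenly across GPUs.

    Hypothetical helper; the even-split rule is an assumption about how
    the training loop distributes samples.
    """
    if num_gpus <= 0:
        raise ValueError("num_gpus must be positive")
    if total_batch_size % num_gpus != 0:
        raise ValueError("total batch size must be divisible by the number of GPUs")
    return total_batch_size // num_gpus


print(samples_per_gpu(8, 8))  # 1: one sample per GPU
print(samples_per_gpu(8, 4))  # 2: two samples per GPU, the unsupported case
print(samples_per_gpu(6, 6))  # 1: total batch lowered to match 6 GPUs
```

Under this assumption, training on fewer GPUs would mean lowering the total batch size to the GPU count, which may also call for adjusting the learning rate.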

@Sjunshu

Sjunshu commented Sep 24, 2024

Is there a way to train with fewer than 8 GPUs but more than 1 GPU?

@Sjunshu

Sjunshu commented Sep 24, 2024

I just have 6 GPUs

@Ternura111
Author

I just have 6 GPUs

Sorry, I have no idea. Maybe the author has a way.

@Sjunshu

Sjunshu commented Sep 24, 2024

OK, thank you.
