Description of the bug
When we set batch_frame_size to 15, max_iter to 1, and run armory run carla_mot_adversarialpatch_undefended.json,
I see 27 GB of GPU memory usage. However, if I change max_iter to any number greater than 1 while holding batch_frame_size constant at 15, GPU memory usage shoots up to 42.3 GB.
Since we want to explore a background-subtraction-based defense for the tracking scenario, capturing multiple frames simultaneously is essential to our defense, so this memory usage matters to us.
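For reference, here is roughly where those two fields sit in the config. The exact nesting in carla_mot_adversarialpatch_undefended.json may differ; this sketch assumes both values live under the attack's kwargs:

```json
{
  "attack": {
    "kwargs": {
      "batch_frame_size": 15,
      "max_iter": 1
    }
  }
}
```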
Steps To Reproduce
Go to: lib/armory/scenario_configs/eval6/carla_mot
Run 1:
Edit the carla_mot_adversarialpatch_undefended.json file: set batch_frame_size to 15 and max_iter to 1.
Then run armory run carla_mot_adversarialpatch_undefended.json
Track the GPU usage, e.g. with nvidia-smi (see the monitoring snippet after these steps), and note the peak GPU memory usage.
I see that we need 27 GB.
Run 2:
Edit the carla_mot_adversarialpatch_undefended.json file: set batch_frame_size to 15 and max_iter to 5 (or any number greater than 1).
Then run armory run carla_mot_adversarialpatch_undefended.json
Track the GPU usage, e.g. with nvidia-smi, and note the peak GPU memory usage.
I see that we need 42.3 GB.
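For anyone reproducing this, one way to capture the peak is to poll nvidia-smi while the scenario runs (the one-second interval is arbitrary):

```sh
# Report GPU memory in use once per second; note the maximum over the run.
nvidia-smi --query-gpu=memory.used --format=csv -l 1
```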
A similar spike in memory requirements was also observed by Mike Tan during our tag-up in December.
Could you please explain or fix this sudden spike in memory usage once we go beyond one iteration?
Additional Information
No response
I'm able to reproduce the error. I've observed that, using a 24 GB GPU, the highest batch_frame_size I can use without error is 8.
The error occurs in the second iteration at this line, which calls this line in ART.
This is likely due to the high resolution of the dataset.
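A back-of-envelope calculation is consistent with that, assuming frames on the order of 1280×960 RGB (an assumption; the actual CARLA MOT resolution may differ):

```python
# Hypothetical sizing for one batch of 15 fp32 frames at an assumed 1280x960x3.
frames, c, h, w = 15, 3, 960, 1280
batch_bytes = frames * c * h * w * 4   # 4 bytes per fp32 element
print(batch_bytes / 2**20, "MiB")      # ~211 MiB for the raw inputs alone
# Detector activations (and their gradients during the attack's backward
# pass) multiply this by a large constant, so resolution dominates memory.
```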
One thing I tried that did make the error go away was converting the model to fp16 here with estimator._model = estimator._model.half() and the inputs to fp16 here (unfortunately, doing so here, which would be easier, didn't work, since ART recasts the input to fp32). Obviously this has the downsides of lower precision and of needing to modify ART (if you do go this route, it'd likely be better to subclass the ART class and override _get_losses() accordingly).
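To make that concrete, here is a minimal sketch of the fp16 workaround. It uses a wrapper on the model's forward() rather than the ART subclass suggested above, so it avoids copying ART internals; estimator comes from the scenario, and the _model attribute layout is ART-version dependent, so treat this as a pattern, not a verified patch:

```python
import torch

# Hedged sketch: cast the wrapped model to fp16 and intercept forward() so
# inputs are recast to fp16 even after ART's preprocessing recasts to fp32.
model = estimator._model  # the underlying torch.nn.Module, per the comment above
model.half()              # cast weights to fp16 in place

_orig_forward = model.forward

def _to_half(obj):
    # Recursively cast floating-point tensors (and lists/tuples of them) to fp16.
    if torch.is_tensor(obj) and obj.is_floating_point():
        return obj.half()
    if isinstance(obj, (list, tuple)):
        return type(obj)(_to_half(o) for o in obj)
    return obj

def forward_fp16(*args, **kwargs):
    return _orig_forward(*(_to_half(a) for a in args), **kwargs)

model.forward = forward_fp16
```

The subclass route (overriding _get_losses()) gives finer control over exactly which tensors get cast, at the cost of tracking ART's internals across versions.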
You mentioned @yusong-tan ran into this issue too, so I'll tag him in case there's any extra context he can add, but this may be a limitation we need to try to work around.