TPU RESOURCE_EXHAUSTED #14
Unanswered
zhenlan0426 asked this question in Q&A
Replies: 1 comment
-
I am also experiencing a similar issue. I think you could open an issue in the JAX repository and link to this discussion from it.
-
Do you have any tips on how to debug "RESOURCE_EXHAUSTED: Error loading program: Attempting to reserve XXX at the bottom of memory."? I reduced batch_size to 1 and still get this error. I am using pmap for data-parallel training. On a GPU with PyTorch, I can train whisper-large (no data parallelism, 24 GB of VRAM), but on TPU I can only train whisper-small. Any pointers would be helpful!
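One rough check that may help narrow this down: with pmap-style data parallelism, each device normally holds a full copy of the parameters, and in a typical training step also the gradients and optimizer state. That part of the footprint does not shrink when batch_size goes down, which could be why a batch size of 1 still fails. The sketch below is plain Python arithmetic, not a JAX API. The function name, the float32 assumption, the Adam-style two-slot optimizer, and the parameter counts (approximate published sizes of the Whisper checkpoints) are all assumptions for illustration, not details from the setup above.

```python
# Back-of-envelope estimate of the batch-independent memory that each device
# holds when weights are replicated for data-parallel training.
# Every number here is an assumption; adjust to match the actual setup.

GIB = 1024 ** 3


def static_bytes_per_device(n_params, param_bytes=4, optimizer_slots=2):
    """Bytes for weights + gradients + optimizer state on ONE device.

    param_bytes=4 assumes float32 for everything.
    optimizer_slots=2 assumes an Adam-style optimizer (two moment buffers).
    Activations and compiler scratch space are NOT included.
    """
    copies = 1 + 1 + optimizer_slots  # weights, gradients, optimizer state
    return n_params * param_bytes * copies


# Approximate parameter counts (assumed, not measured).
approx_params = {
    "whisper-small": 244_000_000,
    "whisper-large": 1_550_000_000,
}

estimates_gib = {
    name: static_bytes_per_device(n) / GIB for name, n in approx_params.items()
}

for name, gib in estimates_gib.items():
    print(f"{name}: ~{gib:.1f} GiB per device before any activations")
```

If the estimate for the larger model is already close to or above the memory available on a single TPU device (which depends on the TPU generation), lowering batch_size alone is unlikely to help. Options such as lower-precision parameters, a lighter optimizer, or sharding the model across devices rather than replicating it would be the directions to look at.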