[OOM Error] Out of Memory with 32k tokens #5
Comments
You can consider using DeepSpeed-Inference to solve this problem, which may also speed up the inference process. It provides a simple implementation that only requires slight modifications to the evaluation code; you can refer to the corresponding file. Then you may need to add one line in that file, and you can run the evaluation with the DeepSpeed launcher (see the sketch below).
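A minimal sketch of what this could look like, assuming the evaluation script loads a Hugging Face causal LM; the script name and model name below are placeholders, not files from this repository:

```python
# deepspeed_eval_sketch.py -- hypothetical script name, not one from this repo.
import os

import deepspeed
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "your-long-context-model"  # placeholder for whatever the eval script loads

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, torch_dtype=torch.float16)

# The "one line" to add: wrap the model with DeepSpeed-Inference so its weights are
# sharded (tensor parallelism) across the processes spawned by the DeepSpeed launcher.
model = deepspeed.init_inference(
    model,
    mp_size=int(os.getenv("WORLD_SIZE", "1")),  # tensor-parallel degree = number of launched processes
    dtype=torch.float16,
    replace_with_kernel_inject=True,
)

# The rest of the evaluation loop (tokenize, model.generate, decode, score) stays unchanged.
# Launch with, e.g.:  deepspeed --num_gpus 4 deepspeed_eval_sketch.py
```

With `mp_size` equal to the number of GPUs, both the model weights and the per-token KV cache are split across devices, which is what relieves the memory pressure at 32k tokens.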
Hi, for this line:
There is a blog post about DeepSpeed-Inference that may help you understand more clearly how DeepSpeed accelerates inference. For the first problem: DeepSpeed uses tensor parallelism to shard the model and generates results through inter-GPU communication, so each GPU does not encode the same inputs on its own. For more details, you can refer to the issue microsoft/DeepSpeed#4154. For the second problem, …
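As a side note on the launcher-based setup above: the DeepSpeed launcher starts one process per GPU and every tensor-parallel rank runs the same script, so a common pattern (an assumption here, not code from this repository) is to let only rank 0 write the evaluation results to avoid duplicated output files:

```python
import json
import os

# Hypothetical result-saving helper: under the deepspeed launcher every rank executes
# the same script, so guard file writes with a rank check to avoid duplicate outputs.
def save_results(results, path="results.json"):
    rank = int(os.getenv("RANK", "0"))  # RANK is set by the deepspeed launcher
    if rank == 0:
        with open(path, "w") as f:
            json.dump(results, f, indent=2)
```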
Hi @JL-Cheng, how did you get data with 32K tokens? As far as I know, in https://huggingface.co/datasets/abacusai/LongChat-Lines/viewer/default/100 the maximum data length is 26K.
Thank you for your valuable contribution! I have been experimenting with your evaluation code on the LongChat-Lines dataset. However, I encountered an Out of Memory error when the token length reached 32k.
I am fortunate to have multiple 80GB A100 GPUs at my disposal. However, I noticed that your evaluation code does not incorporate parallel processing, and only one GPU is utilized during evaluation.
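For intuition about why a single 80 GB card can run out of memory at 32k tokens, here is a rough back-of-the-envelope estimate. The configuration below is an assumption (a LLaMA-2-7B-style architecture in fp16, batch size 1, no memory-efficient attention kernel), not necessarily the model evaluated in the paper, and it ignores other activation overheads:

```python
# Rough, illustrative memory estimate for long-context inference (assumed config only).
GB = 1024 ** 3

n_layers, n_heads, head_dim, bytes_fp16 = 32, 32, 128, 2   # assumed 7B-style config
hidden = n_heads * head_dim                                 # 4096
n_params = 7e9
seq_len = 32_768

weights = n_params * bytes_fp16 / GB                          # ~13 GB of fp16 weights
kv_cache = 2 * n_layers * seq_len * hidden * bytes_fp16 / GB  # ~16 GB of K/V cache

# Without a memory-efficient attention kernel, a full seq_len x seq_len score matrix
# per head can be materialized during a single layer's forward pass:
attn_scores = n_heads * seq_len * seq_len * bytes_fp16 / GB   # ~64 GB for one layer

print(f"weights ~ {weights:.1f} GB, KV cache ~ {kv_cache:.1f} GB, "
      f"attention scores (one layer) ~ {attn_scores:.1f} GB")
```

Under these assumptions, the weights, the KV cache, and a single layer's materialized attention scores already exceed 80 GB, which is consistent with the observed OOM; sharding the model across GPUs or using a memory-efficient attention implementation avoids this peak.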
I would greatly appreciate it if you could provide more information about the resources used in the experimental section of your paper. Additionally, I am curious whether you implemented any form of parallelization to speed up the evaluation process.
Thank you once again for your assistance!