extremely low GPU utilization #122

Open
RupertDick5415 opened this issue Feb 26, 2025 · 4 comments

Comments

@RupertDick5415

Hello,

I am a beginner in reinforcement learning, and I am seeing extremely low GPU utilization while training with the rl-agent you created in the highway-env environment. The program prints the following:

[INFO] Choosing GPU device: 0, memory used: 992
/home/guo/anaconda3/envs/highway/lib/python3.8/site-packages/gymnasium/core.py:311: UserWarning: WARN: env.config to get variables from other wrappers is deprecated and will be removed in v1.0, to get this variable you can do env.unwrapped.config for environment variables or env.get_wrapper_attr('config') that will search the reminding wrappers.
logger.warn(
[INFO] Episode 0 score: 7.8

However, my GPU utilization is only 5%. When I checked with nvidia-smi, the output was as follows:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.256.02   Driver Version: 470.256.02   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Quadro P2200        Off  | 00000000:01:00.0  On |                  N/A |
| 60%   60C    P0    24W /  75W |   1492MiB /  5050MiB |      5%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
As you can see, the GPU utilization is very low. Could you please advise me on how to resolve this issue?

I would greatly appreciate it if you could take the time to respond.

Thank you!
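
For anyone reproducing this, a single nvidia-smi snapshot can be misleading, so it may help to sample utilization over several seconds while training runs. The following is only a minimal sketch: the helper names `parse_gpu_line` and `sample_gpu` are hypothetical and not part of any library, and it assumes nvidia-smi is on the PATH. The `--query-gpu` and `--format` options are standard nvidia-smi flags.

```python
import subprocess
import time

# Standard nvidia-smi query flags; prints one CSV line per GPU, e.g. "5, 1492, 5050".
QUERY = [
    "nvidia-smi",
    "--query-gpu=utilization.gpu,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_line(line):
    """Parse one CSV line such as '5, 1492, 5050' into a dict (hypothetical helper)."""
    util, used, total = (int(field.strip()) for field in line.split(","))
    return {"util_percent": util, "mem_used_mib": used, "mem_total_mib": total}


def sample_gpu(n_samples=10, interval_s=1.0):
    """Poll the first GPU n_samples times and return the parsed readings."""
    readings = []
    for _ in range(n_samples):
        out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
        readings.append(parse_gpu_line(out.splitlines()[0]))
        time.sleep(interval_s)
    return readings


if __name__ == "__main__":
    samples = sample_gpu()
    mean_util = sum(s["util_percent"] for s in samples) / len(samples)
    print(f"mean GPU utilization over {len(samples)} samples: {mean_util:.1f}%")
```

If the average stays low across the whole run, the GPU is most likely waiting on something else, such as environment simulation on the CPU.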

@RupertDick5415
Author

It seems DQNagent can't use batched training?
I wonder if the problem is on my side? @eleurent

@RupertDick5415
Author

I debugged and found that DQNagent doesn't have the attribute "batched". It seems to be a real problem here? @eleurent

@RupertDick5415
Author

Dear Author,

I noticed that while using your library to train the DQN agent, the program only uses a single CPU to collect samples. Even when the number of processes is set to 8, it still uses only one CPU, which results in low GPU utilization. Could you help resolve this issue?

Thank you! @eleurent @davidwitten @Gamenot @ashishrana160796

@RupertDick5415
Author

How can I use multiple CPUs to train one agent?
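
To make the question concrete, here is a rough sketch of the general idea of collecting experience in several worker processes and merging it into one buffer for a single learner. It is not the library's API: the toy random-walk environment and the function names `collect_episode` and `collect_parallel` are assumptions made purely for illustration, and only Python's standard `multiprocessing.Pool` is used.

```python
import random
from multiprocessing import Pool


def collect_episode(seed, horizon=50):
    """Roll out one episode of a toy random-walk environment with a random policy.

    This stands in for stepping a real simulator; it is an illustrative assumption.
    """
    rng = random.Random(seed)
    state, transitions = 0.0, []
    for _ in range(horizon):
        action = rng.choice([-1, 1])
        next_state = state + action
        reward = -abs(next_state)  # reward for staying near the origin
        transitions.append((state, action, reward, next_state))
        state = next_state
    return transitions


def collect_parallel(seeds, processes=4):
    """Run one episode per seed across worker processes and flatten the results."""
    with Pool(processes=processes) as pool:
        episodes = pool.map(collect_episode, seeds)
    return [t for episode in episodes for t in episode]


if __name__ == "__main__":
    # 8 episodes of 50 steps each, spread over 4 worker processes.
    replay_buffer = collect_parallel(range(8), processes=4)
    print(len(replay_buffer))
```

In this pattern the workers only simulate, and the single agent would then sample minibatches from the merged buffer, so the GPU receives larger and more frequent batches than with one collecting process.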
