rl-tools/flashrl

FlashRL Megakernel

Based on RLtools, a CPU-focused deep RL library for continuous control.

Timing

Full PPO step (rollout, GAE, actor training, critic training)

Step time: 6150 ms

  • Collect: 415 ms
  • Evaluate critic (for GAE): 430 ms
  • Training: 5120 ms
    • Epochs $\times$ batches $= 32$
    • Actor forward: 24 ms
    • Actor backward: 57 ms
    • Train critic: 81 ms
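
The timings above can be cross-checked with a little arithmetic. The sketch below is not part of the repository. It assumes that the three nested timings (actor forward, actor backward, train critic) are measured per minibatch update and that 32 such updates run per step; the variable names are made up for illustration.

```python
# Timings in milliseconds, copied from the list above.
step_ms = 6150
collect_ms = 415
evaluate_critic_ms = 430
training_ms = 5120

# Assumption: these three timings are per minibatch update,
# and epochs x batches = 32 updates run in each PPO step.
num_updates = 32
actor_forward_ms = 24
actor_backward_ms = 57
train_critic_ms = 81

# Cost of one update and the training time it would imply.
per_update_ms = actor_forward_ms + actor_backward_ms + train_critic_ms
estimated_training_ms = num_updates * per_update_ms

# Sum of the three reported phases versus the reported step time.
phase_sum_ms = collect_ms + evaluate_critic_ms + training_ms
unaccounted_ms = step_ms - phase_sum_ms

# Fraction of the step spent in training.
training_share = training_ms / step_ms

print(f"per update:         {per_update_ms} ms")
print(f"estimated training: {estimated_training_ms} ms (reported: {training_ms} ms)")
print(f"sum of phases:      {phase_sum_ms} ms (reported step: {step_ms} ms)")
print(f"unaccounted:        {unaccounted_ms} ms")
print(f"training share:     {training_share:.1%}")
```

Under this reading, 32 updates at 162 ms each give 5184 ms, close to the reported 5120 ms of training. The three phases sum to 5965 ms, so about 185 ms of the 6150 ms step is not attributed to any listed phase. Training accounts for roughly 83% of the step.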
