🏠
Working from home
PhD Student @UMass Amherst
Sequential Decision Making, Deep Reinforcement Learning
- University of Massachusetts, Amherst
- Amherst, MA, USA
- https://abhinavbhatia.me
Pinned
-
A high-quality, truly single-file implementation of PPO -- simple to use, transparent, and dependency-light (only torch and gymnasium). Includes a Lagrange penalty-based constrained-MDP solver and supports both continuous and discrete action spaces. Compatible with RNN policies. Designed for clarity, reproducibility, and research-grade performance.
# -----------------------------------------------------------------------------
# PPO (Proximal Policy Optimization) - High-Quality Single-File Implementation
# Author: Abhinav Bhatia
# Source: https://gist.github.com/bhatiaabhinav/edb07949471c0ae9e71811146cd46311
Metareasoning.jl (Public)
Decision-theoretic metareasoning to control the hyperparameters and stopping point of anytime algorithms using deep reinforcement learning.
Julia 1
-
AnytimeWeightedAStar.jl (Public)
Julia implementation of the Anytime Weighted A* (AWA*) and Randomized Weighted A* (RWA*) algorithms.
