3. [UI Render function string](https://github.com/rl-tools/example/blob/39acaa5b5402eacf5c2cab7b2e96db71f2ea110f/include/my_pendulum/operations_cpu.h#L16): This function uses the HTML5 Canvas rendering API and can be easily created using [https://studio.rl.tools](https://studio.rl.tools). Note that, because the HTML5 Canvas drawing API is so widely used, ChatGPT is also really good at creating render functions for different environments if you give it an example like the ones provided on [https://studio.rl.tools](https://studio.rl.tools).
-The experiment tracking and save-trajectories step will periodically record trajectories and store them as `.json` files. After/while running the training you can run `./serve.sh` which should start a local webserver on [http://localhost:8080](http://localhost:8080) where you can see the recorded trajectories based on the render function you provided.
+The experiment tracking and save-trajectories step will periodically record trajectories and store them as `.json` files. After/while running the training you can run `python3 -m http.server` which should start a local webserver on [http://localhost:8080](http://localhost:8080) where you can see the recorded trajectories based on the render function you provided.
using LOOP_CORE_CONFIG = rlt::rl::algorithms::ppo::loop::core::Config<T, TI, RNG, ENVIRONMENT, LOOP_CORE_PARAMETERS>;
#ifndef BENCHMARK
using LOOP_EXTRACK_CONFIG = rlt::rl::loop::steps::extrack::Config<LOOP_CORE_CONFIG>; // Sets up the experiment tracking structure (https://docs.rl.tools/10-Experiment%20Tracking.html)
    static constexpr TI EVALUATION_INTERVAL = LOOP_CORE_CONFIG::CORE_PARAMETERS::STEP_LIMIT / 5;
    static constexpr TI NUM_EVALUATION_EPISODES = 10;
    static constexpr TI N_EVALUATIONS = NEXT::CORE_PARAMETERS::STEP_LIMIT / EVALUATION_INTERVAL;
};
using LOOP_EVALUATION_CONFIG = rlt::rl::loop::steps::evaluation::Config<LOOP_EXTRACK_CONFIG, LOOP_EVAL_PARAMETERS<LOOP_EXTRACK_CONFIG>>; // Evaluates the policy in a fixed interval and logs the return
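The evaluation parameters in this hunk are plain compile-time integer arithmetic on `STEP_LIMIT`. The following is a minimal sketch of that arithmetic, not the rl-tools implementation: `CoreParameters`, `CoreConfig`, `EvalParameters`, the `TI` alias and the `STEP_LIMIT` value of 1000 are all hypothetical stand-ins chosen for illustration.

```cpp
#include <cassert>
#include <cstdint>

using TI = std::uint64_t; // assumed index type (stand-in for the example's TI)

// Hypothetical stand-in for the core loop parameters; the value is an assumption.
struct CoreParameters {
    static constexpr TI STEP_LIMIT = 1000;
};

// Hypothetical stand-in for a loop config that exposes CORE_PARAMETERS.
struct CoreConfig {
    using CORE_PARAMETERS = CoreParameters;
};

// Mirrors the arithmetic of the evaluation parameters shown in the diff.
template <typename NEXT>
struct EvalParameters {
    // Evaluate every fifth of the training run (integer division truncates).
    static constexpr TI EVALUATION_INTERVAL = NEXT::CORE_PARAMETERS::STEP_LIMIT / 5;
    static constexpr TI NUM_EVALUATION_EPISODES = 10;
    // Guard added for this sketch: a STEP_LIMIT below 5 would make the interval 0.
    static_assert(EVALUATION_INTERVAL > 0, "STEP_LIMIT must be at least 5");
    // Number of evaluations that fit into the run at that interval.
    static constexpr TI N_EVALUATIONS = NEXT::CORE_PARAMETERS::STEP_LIMIT / EVALUATION_INTERVAL;
};

// Compile-time checks for the assumed STEP_LIMIT of 1000.
static_assert(EvalParameters<CoreConfig>::EVALUATION_INTERVAL == 200, "interval");
static_assert(EvalParameters<CoreConfig>::N_EVALUATIONS == 5, "evaluation count");
```

Because both divisions truncate, `N_EVALUATIONS` is exactly 5 only when `STEP_LIMIT` is large enough. For example, a `STEP_LIMIT` of 9 would give an interval of 1 and therefore 9 evaluations.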