@spirosperos spirosperos commented Aug 6, 2025

What this does

This PR fixes two critical bugs in the HIL-SERL simulation training framework and adds comprehensive documentation:

Bug Fixes:

  • 🐛 Bug Fix: Fixed the control_time_s parameter being ignored during training: episodes always lasted 10 seconds regardless of the configured value
  • 🐛 Bug Fix: Fixed the random_block_position flag not being passed to the gym environment, which prevented cube randomization
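
To illustrate the second fix, here is a minimal sketch of forwarding a configuration flag to the environment constructor. Only the field names control_time_s and random_block_position come from this PR; EnvConfig, ToyBlockEnv, make_env, and the coordinate ranges are hypothetical stand-ins and are not the lerobot or gym_hil API.

```python
import random
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class EnvConfig:
    # Hypothetical config object; only the two field names come from the PR text.
    control_time_s: float = 10.0
    random_block_position: bool = False


class ToyBlockEnv:
    """Stand-in for the simulated environment (NOT the real gym environment)."""

    DEFAULT_BLOCK_XY: Tuple[float, float] = (0.5, 0.0)

    def __init__(self, random_block_position: bool = False, seed: Optional[int] = None):
        self.random_block_position = random_block_position
        self._rng = random.Random(seed)
        self.block_xy = self.DEFAULT_BLOCK_XY

    def reset(self) -> Tuple[float, float]:
        # Sample a new cube position only when the flag was actually forwarded.
        if self.random_block_position:
            self.block_xy = (
                self._rng.uniform(0.4, 0.6),   # assumed x range, for illustration
                self._rng.uniform(-0.1, 0.1),  # assumed y range, for illustration
            )
        else:
            self.block_xy = self.DEFAULT_BLOCK_XY
        return self.block_xy


def make_env(cfg: EnvConfig, seed: Optional[int] = None) -> ToyBlockEnv:
    # The class of bug described above is constructing the environment without
    # this keyword argument, so the default (False) silently wins.
    return ToyBlockEnv(random_block_position=cfg.random_block_position, seed=seed)
```

With the flag left at False every reset returns the default position; with it set to True, successive resets return different positions.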

Improvements:

  • 📚 Documentation: Added comprehensive training guide (hil_serl_simulation_training_guide_README.md) for HIL-SERL simulation training
  • 🔧 Enhancement: Added extensive debug logging for easier training monitoring and troubleshooting

Key Changes:

  • Implemented TimeLimitWrapper class to properly enforce episode time limits based on control_time_s configuration
  • Added random_block_position parameter to environment configuration and properly passed it to gym environment
  • Enhanced logging throughout training process with detailed episode progress, time tracking, and environment state information
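
As a rough sketch of what a time-limit wrapper can look like, the example below truncates an episode after control_time_s * fps steps. The PR does not say whether the actual TimeLimitWrapper counts steps or measures wall-clock time, so the step-based approach, the fps parameter, and the toy CountingEnv are assumptions made for illustration. The five-value step return follows the Gymnasium convention, but the code has no Gymnasium dependency.

```python
class CountingEnv:
    """Toy environment that never ends an episode on its own."""

    def reset(self):
        self.t = 0
        return self.t, {}

    def step(self, action):
        self.t += 1
        # (observation, reward, terminated, truncated, info)
        return self.t, 0.0, False, False, {}


class TimeLimitWrapper:
    """Illustrative wrapper: truncate after control_time_s seconds of control.

    Assumption: the time budget is converted to a step budget using a fixed
    control frequency `fps`.
    """

    def __init__(self, env, control_time_s: float, fps: int):
        self.env = env
        self.max_steps = int(round(control_time_s * fps))
        self.elapsed_steps = 0

    def reset(self):
        # Restart the step counter at the beginning of every episode.
        self.elapsed_steps = 0
        return self.env.reset()

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        self.elapsed_steps += 1
        if self.elapsed_steps >= self.max_steps:
            truncated = True  # time budget exhausted
        return obs, reward, terminated, truncated, info


def run_episode(env) -> int:
    """Step until the episode ends and return the number of steps taken."""
    env.reset()
    steps = 0
    while True:
        _, _, terminated, truncated, _ = env.step(None)
        steps += 1
        if terminated or truncated:
            return steps
```

Under these assumptions, a 40-second limit at 10 steps per second ends the episode after 400 steps, and the counter restarts on the next reset.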

How it was tested

  • Time Limit Fix: Verified that setting "control_time_s": 40.0 in the configuration now limits episodes to 40 seconds instead of the default 10 seconds
  • Cube Randomization Fix: Confirmed that "random_block_position": true now properly randomizes cube positions between episodes
  • Debug Logging: Tested debug outputs during training to ensure proper monitoring of episode progress, time remaining, and environment state
  • Documentation: Verified all commands and configurations in the new README work correctly with the fixed implementation

Test Commands:

# Test recording with new time limit
python -m lerobot.scripts.rl.gym_manipulator --config_path examples/hil_serl_simulation_training/hi_rl_test_gamepad.json

# Test training with both fixes
python -m lerobot.scripts.rl.learner --config_path examples/hil_serl_simulation_training/train_gym_hil_env_gamepad.json
python -m lerobot.scripts.rl.actor --config_path examples/hil_serl_simulation_training/train_gym_hil_env_gamepad.json

How to checkout & try? (for the reviewer)

  1. Test Time Limit Fix:
# Modify control_time_s in hi_rl_test_gamepad.json to 20.0 and verify episodes last 20 seconds
python -m lerobot.scripts.rl.gym_manipulator --config_path examples/hil_serl_simulation_training/hi_rl_test_gamepad.json
  2. Test Cube Randomization:
# Set random_block_position to false in config and verify cube stays in same position
# Set to true and verify cube randomizes between episodes
python -m lerobot.scripts.rl.gym_manipulator --config_path examples/hil_serl_simulation_training/hi_rl_test_gamepad.json
  3. Test Training with Both Fixes:
# Terminal 1: Start learner
python -m lerobot.scripts.rl.learner --config_path examples/hil_serl_simulation_training/train_gym_hil_env_gamepad.json

# Terminal 2: Start actor
python -m lerobot.scripts.rl.actor --config_path examples/hil_serl_simulation_training/train_gym_hil_env_gamepad.json
  4. Review Documentation:
# Check the new training guide
cat examples/hil_serl_simulation_training/hil_serl_simulation_training_guide_README.md
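
The checks above involve editing control_time_s and random_block_position in the JSON configs. For reviewers who prefer to script that, here is a small sketch that updates a key wherever it already appears in nested JSON data. The nesting of the sample config is invented for illustration, since the PR description does not show where these keys live in the real files; set_key is a hypothetical helper, not part of lerobot.

```python
import json


def set_key(node, key, value) -> int:
    """Set `key` to `value` wherever it already appears in nested JSON data.

    Returns the number of places that were updated, so a result of 0 signals
    that the key was not found and nothing changed.
    """
    updated = 0
    if isinstance(node, dict):
        for k in node:
            if k == key:
                node[k] = value
                updated += 1
            else:
                updated += set_key(node[k], key, value)
    elif isinstance(node, list):
        for item in node:
            updated += set_key(item, key, value)
    return updated


# Hypothetical config layout, used only to demonstrate the helper.
sample = json.loads(
    '{"env": {"wrapper": {"control_time_s": 10.0}, "random_block_position": false}}'
)
changed_time = set_key(sample, "control_time_s", 20.0)
changed_flag = set_key(sample, "random_block_position", True)
```

For a real file, the same helper could be applied to the result of json.load and the data written back with json.dump; checking the returned count guards against a misspelled key.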

Expected Behavior:

  • Episodes should respect the control_time_s setting (30s in training config, 40s in recording config)
  • Cube should randomize position when random_block_position: true
  • Debug logs should show detailed episode progress and time tracking

CarolinePascal and others added 30 commits July 28, 2025 11:09
…ngface#1609)

* fix(policies): remove action from batch for offline evaluation in diffusion, tdmpc, and vqbet policies

* style(diffusion): correct comment capitalization for clarity in modeling_diffusion.py
* fix bug about sampling t from beta distribution

* fix: address review comments

---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Michel Aractingi <michel.aractingi@huggingface.co>
…ould crash with exception, fix environment state docs (huggingface#1617)

* Fix bug in diffusion config validation when not using image features

* Fix DiffusionPolicy docstring about shape of env state
…both OpenCV and RealSense camera implementations
Co-authored-by: Michel Aractingi <michel.aractingi@huggingface.co>
Signed-off-by: Francesco Capuano <74058581+fracapuano@users.noreply.github.com>
…face#1643)

* chore(ci): add some release stuff

* chore(ci): add requirements-macos

* chore(ci): added lockfiles for future reference

* feat(ci): add draft & prerelease option to release workflow tag
* Cleanup badges

* Remove comment

* Remove profiling section

* Move acknowledgment

* Move citations

* Fix badge display

* Move build your robot section

* Fix nightly badge

* Revert be13b3f

* Update README.md

Co-authored-by: HUANG TZU-CHUN <tzu.chun.huang.tw@gmail.com>
Signed-off-by: Simon Alibert <75076266+aliberts@users.noreply.github.com>

* chore(docs): optimize readme for PyPI rendering

* chore(docs): move policy readme to docs folder + symlink in policy dirs

* fix(docs): max width of lerobot logo + url in citation block

---------

Signed-off-by: Simon Alibert <75076266+aliberts@users.noreply.github.com>
Co-authored-by: HUANG TZU-CHUN <tzu.chun.huang.tw@gmail.com>
Co-authored-by: Steven Palma <steven.palma@huggingface.co>
* add: test to check proper construction with multiple features with STATE/ACTION type

* fix: robot and action state should match policy's expectations

* fix minor

Signed-off-by: Francesco Capuano <74058581+fracapuano@users.noreply.github.com>

---------

Signed-off-by: Francesco Capuano <74058581+fracapuano@users.noreply.github.com>
…te URLs (huggingface#1313)

* Update links to use absolute URLs. 

* Update dataset upload example link to use HF_USER variable and match the correct syntax.
@lukicdarkoo
Member

The branches got mixed up; fixed in #2

@lukicdarkoo lukicdarkoo closed this Aug 8, 2025