
@spirosperos

What this does

This PR fixes two critical bugs in the HIL-SERL simulation training framework and adds comprehensive documentation:

Bug Fixes:

  • 🐛 Bug Fix: The control_time_s parameter was not respected during training; episodes always lasted 10 seconds regardless of configuration
  • 🐛 Bug Fix: The random_block_position flag was not passed to the gym environment, which prevented cube randomization

Improvements:

  • 📚 Documentation: Added comprehensive training guide (hil_serl_simulation_training_guide_README.md) for HIL-SERL simulation training
  • 🔧 Enhancement: Added extensive debug logging for easier training monitoring and troubleshooting

Key Changes:

  • Implemented a TimeLimitWrapper class that enforces episode time limits based on the control_time_s configuration
  • Added a random_block_position parameter to the environment configuration and passed it through to the gym environment
  • Added logging throughout the training process covering episode progress, time tracking, and environment state
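
The time limit enforcement listed above can be sketched as a thin wrapper. This is a minimal illustration, not the PR's actual TimeLimitWrapper: the class name `TimeLimitSketch`, the injectable `clock` argument, and the `time_remaining_s` info key are assumptions. It follows the Gymnasium-style five-value `step` return without importing the library, so it works with any object exposing `reset` and `step`. It measures wall-clock time; the real implementation may count control steps instead.

```python
import time
from typing import Any, Callable, Optional


class TimeLimitSketch:
    """Hypothetical stand-in for a time-limit wrapper driven by control_time_s."""

    def __init__(
        self,
        env: Any,
        control_time_s: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.env = env
        self.control_time_s = control_time_s
        self._clock = clock  # injectable so the limit can be tested without sleeping
        self._start: Optional[float] = None

    def reset(self, **kwargs: Any):
        # Restart the episode timer on every reset.
        self._start = self._clock()
        return self.env.reset(**kwargs)

    def step(self, action: Any):
        if self._start is None:
            raise RuntimeError("call reset() before step()")
        obs, reward, terminated, truncated, info = self.env.step(action)
        elapsed = self._clock() - self._start
        if elapsed >= self.control_time_s:
            # Time ran out: end the episode via truncation, not termination.
            truncated = True
        info = dict(info)
        info["time_remaining_s"] = max(0.0, self.control_time_s - elapsed)  # assumed key
        return obs, reward, terminated, truncated, info
```

Signalling the timeout through `truncated` rather than `terminated` keeps a time-out distinguishable from task success or failure, which matters when bootstrapping value targets.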

How it was tested

  • Time Limit Fix: Verified that setting "control_time_s": 40.0 in the configuration now limits episodes to 40 seconds instead of the default 10 seconds
  • Cube Randomization Fix: Confirmed that "random_block_position": true now properly randomizes cube positions between episodes
  • Debug Logging: Tested debug outputs during training to ensure proper monitoring of episode progress, time remaining, and environment state
  • Documentation: Verified all commands and configurations in the new README work correctly with the fixed implementation
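
The cube randomization check above can also be scripted. The sketch below uses a toy environment, `ToyCubeEnv`, as a stand-in for the real simulated environment. Only the `random_block_position` flag comes from this PR; the class, the `block_pos` key, the default position, and the sampling range are assumptions made for illustration.

```python
import random


class ToyCubeEnv:
    """Toy stand-in for the simulated environment (hypothetical, not the real env)."""

    DEFAULT_POS = (0.5, 0.0)  # assumed default cube (x, y)

    def __init__(self, random_block_position: bool = False, seed: int = 0) -> None:
        self.random_block_position = random_block_position
        self._rng = random.Random(seed)

    def reset(self) -> dict:
        if self.random_block_position:
            # Sample a new (x, y) cube position around the default.
            pos = (
                self.DEFAULT_POS[0] + self._rng.uniform(-0.1, 0.1),
                self.DEFAULT_POS[1] + self._rng.uniform(-0.1, 0.1),
            )
        else:
            pos = self.DEFAULT_POS
        return {"block_pos": pos}


def make_env(env_cfg: dict) -> ToyCubeEnv:
    # Forward the flag to the constructor; dropping it here would leave the
    # environment on its non-random default no matter what the config says.
    return ToyCubeEnv(random_block_position=env_cfg.get("random_block_position", False))


def distinct_reset_positions(env: ToyCubeEnv, episodes: int = 5) -> int:
    """Count how many distinct cube positions appear across several resets."""
    return len({env.reset()["block_pos"] for _ in range(episodes)})
```

With the flag off, `distinct_reset_positions` should report a single position; with it on, more than one.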

Test Commands:

# Test recording with new time limit
python -m lerobot.scripts.rl.gym_manipulator --config_path examples/hil_serl_simulation_training/hi_rl_test_gamepad.json

# Test training with both fixes
python -m lerobot.scripts.rl.learner --config_path examples/hil_serl_simulation_training/train_gym_hil_env_gamepad.json
python -m lerobot.scripts.rl.actor --config_path examples/hil_serl_simulation_training/train_gym_hil_env_gamepad.json

How to check out & try? (for the reviewer)

  1. Test Time Limit Fix:
# Modify control_time_s in hi_rl_test_gamepad.json to 20.0 and verify episodes last 20 seconds
python -m lerobot.scripts.rl.gym_manipulator --config_path examples/hil_serl_simulation_training/hi_rl_test_gamepad.json
  2. Test Cube Randomization:
# Set random_block_position to false in config and verify cube stays in same position
# Set to true and verify cube randomizes between episodes
python -m lerobot.scripts.rl.gym_manipulator --config_path examples/hil_serl_simulation_training/hi_rl_test_gamepad.json
  3. Test Training with Both Fixes:
# Terminal 1: Start learner
python -m lerobot.scripts.rl.learner --config_path examples/hil_serl_simulation_training/train_gym_hil_env_gamepad.json

# Terminal 2: Start actor
python -m lerobot.scripts.rl.actor --config_path examples/hil_serl_simulation_training/train_gym_hil_env_gamepad.json
  4. Review Documentation:
# Check the new training guide
cat examples/hil_serl_simulation_training/hil_serl_simulation_training_guide_README.md
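
For the config changes in the reviewer steps, a small helper can write a modified copy of a config rather than editing the file by hand. This is a sketch under assumptions: the function names are hypothetical, and because the nesting of `control_time_s` and `random_block_position` inside the JSON is not shown in this PR, the helper searches for the key at any depth.

```python
import json
from pathlib import Path
from typing import Any


def set_key(node: Any, key: str, value: Any) -> int:
    """Set every occurrence of `key` in a nested JSON structure; return how many were set."""
    count = 0
    if isinstance(node, dict):
        for k in node:
            if k == key:
                node[k] = value
                count += 1
            else:
                count += set_key(node[k], key, value)
    elif isinstance(node, list):
        for item in node:
            count += set_key(item, key, value)
    return count


def write_variant(src: str, dst: str, **overrides: Any) -> dict:
    """Copy the JSON config at `src` to `dst` with the given keys overridden."""
    cfg = json.loads(Path(src).read_text())
    changed = {k: set_key(cfg, k, v) for k, v in overrides.items()}
    Path(dst).write_text(json.dumps(cfg, indent=2))
    return changed  # a count of 0 means the key was not found anywhere
```

For example, `write_variant("examples/hil_serl_simulation_training/hi_rl_test_gamepad.json", "hi_rl_test_20s.json", control_time_s=20.0)` writes a 20-second variant (the output filename is arbitrary) that can then be passed to `--config_path`.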

Expected Behavior:

  • Episodes should respect the control_time_s setting (30s in training config, 40s in recording config)
  • Cube should randomize position when random_block_position: true
  • Debug logs should show detailed episode progress and time tracking

@spirosperos changed the title from "Initial commit" to "HIL-SERL tutorial for simulation fixed and expanded" on Aug 7, 2025
