Save observation history config with checkpoint and use it on play #901
Open
saikishor wants to merge 2 commits into mujocolab:main from
louislelay (Collaborator) reviewed on Apr 12, 2026 and left a comment:
Hi @saikishor, drive-by comment: do we actually need to persist the critic's observation history here? At eval time we only roll out the actor, don't we?
saikishor (Contributor, Author):
Yes, we only roll out the actor. I updated both for consistency. In any case, I'm now working on a different approach that persists not only the history length but also other parts.
This PR saves the observation history settings, defined either at the actor/critic group level or at the term level, in the checkpoint's configuration, so that they can be reused later to set up the observations when playing the checkpoint.
Right now, the user has to edit the files to set the history length before playing the checkpoint. Otherwise, the following error appears:
With the proposed change, we save the checkpoint with the following information:
Saving checkpoint to logs/rsl_rl/g1_velocity/2026-04-12_10-05-00_mjlab_1204_g1_lin_vel_obs_hist/model_550.pt with obs_history_cfg: {'actor': {'history_length': None, 'flatten_history_dim': True, 'terms': {'base_lin_vel': {'history_length': 5, 'flatten_history_dim': True}, 'base_ang_vel': {'history_length': 0, 'flatten_history_dim': True}, 'projected_gravity': {'history_length': 0, 'flatten_history_dim': True}, 'joint_pos': {'history_length': 0, 'flatten_history_dim': True}, 'joint_vel': {'history_length': 0, 'flatten_history_dim': True}, 'actions': {'history_length': 0, 'flatten_history_dim': True}, 'command': {'history_length': 0, 'flatten_history_dim': True}}}, 'critic': {'history_length': None, 'flatten_history_dim': True, 'terms': {'base_lin_vel': {'history_length': 5, 'flatten_history_dim': True}, 'base_ang_vel': {'history_length': 0, 'flatten_history_dim': True}, 'projected_gravity': {'history_length': 0, 'flatten_history_dim': True}, 'joint_pos': {'history_length': 0, 'flatten_history_dim': True}, 'joint_vel': {'history_length': 0, 'flatten_history_dim': True}, 'actions': {'history_length': 0, 'flatten_history_dim': True}, 'command': {'history_length': 0, 'flatten_history_dim': True}, 'foot_height': {'history_length': 0, 'flatten_history_dim': True}, 'foot_air_time': {'history_length': 0, 'flatten_history_dim': True}, 'foot_contact': {'history_length': 0, 'flatten_history_dim': True}, 'foot_contact_forces': {'history_length': 0, 'flatten_history_dim': True}}}}

I observed the issue when I launched a training run with the history length of one of my observation terms set to 5 and then tried to play the checkpoint. It failed until I explicitly set the history length of that particular term to the appropriate value.
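For readers who want the idea in code, here is a minimal sketch of the save-then-restore round trip, not the actual diff of this PR. `TermCfg`, `GroupCfg`, `collect_obs_history_cfg`, and `apply_obs_history_cfg` are hypothetical names used only for illustration; the only thing taken from the log above is the layout of the `obs_history_cfg` dictionary.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class TermCfg:
    """Hypothetical stand-in for an observation term config."""
    history_length: int = 0
    flatten_history_dim: bool = True


@dataclass
class GroupCfg:
    """Hypothetical stand-in for an observation group config (actor, critic)."""
    history_length: Optional[int] = None
    flatten_history_dim: bool = True
    terms: Dict[str, TermCfg] = field(default_factory=dict)


def collect_obs_history_cfg(groups: Dict[str, GroupCfg]) -> Dict[str, Any]:
    """Build a plain dict with the same layout as the logged obs_history_cfg."""
    return {
        g_name: {
            "history_length": group.history_length,
            "flatten_history_dim": group.flatten_history_dim,
            "terms": {
                t_name: {
                    "history_length": term.history_length,
                    "flatten_history_dim": term.flatten_history_dim,
                }
                for t_name, term in group.terms.items()
            },
        }
        for g_name, group in groups.items()
    }


def apply_obs_history_cfg(groups: Dict[str, GroupCfg], saved: Dict[str, Any]) -> None:
    """Overwrite history settings in place; unknown groups or terms are skipped."""
    for g_name, g_saved in saved.items():
        group = groups.get(g_name)
        if group is None:
            continue
        group.history_length = g_saved["history_length"]
        group.flatten_history_dim = g_saved["flatten_history_dim"]
        for t_name, t_saved in g_saved["terms"].items():
            term = group.terms.get(t_name)
            if term is None:
                continue
            term.history_length = t_saved["history_length"]
            term.flatten_history_dim = t_saved["flatten_history_dim"]


# Training side: one term was trained with a history of 5 steps.
train_groups = {
    "actor": GroupCfg(terms={"base_lin_vel": TermCfg(history_length=5),
                             "joint_pos": TermCfg()}),
}
saved_cfg = collect_obs_history_cfg(train_groups)

# Play side: the default config has no history until the saved one is applied.
play_groups = {
    "actor": GroupCfg(terms={"base_lin_vel": TermCfg(), "joint_pos": TermCfg()}),
}
apply_obs_history_cfg(play_groups, saved_cfg)
```

Because the saved structure is a plain dictionary of built-in types, it could be stored next to the model weights in the checkpoint and read back on play without importing any config classes. How the checkpoint actually stores it is left out here on purpose.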