On resetting a state to a given state / Extracting image observation from past state? #439

Closed
famishedrover opened this issue Aug 10, 2023 · 12 comments

@famishedrover

Suppose I sample several past states and store them. Is it possible to quickly reset the environment object to a given past state? I am flexible about which state information I store to make the restore possible, but the solution must have low space complexity. For example, storing the observation (which is a few floats) is okay, but storing the MuJoCo sim state can be very expensive.

Motivation:

I care about obtaining the image observation from a past state. Since storing images on the go is not feasible for my setup, I want to explore solutions where I store less data-intensive information, such as the observation, that is enough to exactly restore the environment state and get the image (or other, faster methods to get the image back).

Thank you!

@famishedrover
Author

Another solution that works is to store the sequence of actions taken after calling reset() up to the state I wish to save. Replaying the same action sequence then gets me the correct image representation.

When I try the following:

from metaworld.envs import (ALL_V2_ENVIRONMENTS_GOAL_OBSERVABLE,
                            ALL_V2_ENVIRONMENTS_GOAL_HIDDEN)
                            # these are ordered dicts where the key : value
                            # is env_name : env_constructor

import numpy as np

door_open_goal_observable_cls = ALL_V2_ENVIRONMENTS_GOAL_OBSERVABLE["door-open-v2-goal-observable"]
door_open_goal_hidden_cls = ALL_V2_ENVIRONMENTS_GOAL_HIDDEN["door-open-v2-goal-hidden"]

env1 = door_open_goal_observable_cls(seed=5)
env2 = door_open_goal_observable_cls(seed=5)

env1.reset()
env2.reset()
env1.render_mode = 'rgb_array'
env2.render_mode = 'rgb_array'

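# Take 10 random actions in env1 and record them.
acs = []
for _ in range(10):
    acs.append(env1.action_space.sample())
    res = env1.step(acs[-1])

obs = env1.render()
state = env1.get_env_state()

# Replay the recorded actions in env2.
for ac in acs:
    res2 = env2.step(ac)

obs2 = env2.render()

# Alternatively, restore env2 directly from env1's saved state.
env2.reset()
env2.set_env_state(state)
obs3 = env2.render()

print((obs == obs2).all())   # original vs. action replay
print((obs == obs3).all())   # original vs. set_env_state
print((obs2 == obs3).all())  # action replay vs. set_env_state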

I get

True 
False 
False

@reginald-mclean
Collaborator

What version of Meta-World are you using?

@famishedrover
Author

This is the commit I installed from (it should be the latest one, as I did it 4 days ago):

metaworld @ git+https://github.com/Farama-Foundation/Metaworld.git@d155d0051630bb365ea6a824e02c66c068947439

I have had similar issues with v1 metaworld.

@reginald-mclean
Collaborator

It looks like you are using the MuJoCo-based (not mujoco-py) Meta-World, where get_state and set_state don't function properly because of the change in bindings. As of right now, the easiest thing to do would probably be something similar to the code you posted above: seed the environment with a specific seed, store the actions, recreate the environment with that seed, and apply the stored actions.
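
The seed-and-replay approach described above can be sketched in isolation as follows. This is a minimal illustration, not Meta-World code: `ToyEnv` is a hypothetical stand-in for any simulator whose transitions are deterministic given the seed, and `restore_by_replay` is an illustrative helper name. The only things stored per saved state are the seed and the list of actions.

```python
import random


class ToyEnv:
    """Hypothetical stand-in for a seeded, deterministic simulator (not Meta-World)."""

    def __init__(self, seed):
        self._rng = random.Random(seed)
        self.state = 0.0

    def reset(self):
        # The initial state depends only on the seed.
        self.state = self._rng.random()
        return self.state

    def step(self, action):
        # Deterministic transition: same state and action give the same next state.
        self.state = 0.9 * self.state + action
        return self.state


def restore_by_replay(make_env, seed, actions):
    """Rebuild an environment at a past state by replaying the stored actions."""
    env = make_env(seed)
    env.reset()
    for action in actions:
        env.step(action)
    return env


# Record a short rollout, keeping only the seed and the actions.
seed = 5
env1 = ToyEnv(seed)
env1.reset()
action_rng = random.Random(0)
actions = []
for _ in range(10):
    a = action_rng.uniform(-1.0, 1.0)
    actions.append(a)
    env1.step(a)

# Recreate the same state in a fresh environment.
env2 = restore_by_replay(ToyEnv, seed, actions)
print(env1.state == env2.state)
```

The trade-off is time for space: restoring a state that is k steps into an episode costs k simulator steps, so this works best when episodes are short or when restores are rare.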

@famishedrover
Author

I can switch to the mujoco-py based one if it works there and you can share how to initialize envs through it.

By tinkering with the code I could save the MuJoCo internal state (env.unwrapped.wrapped_env.__getstate__(), __setstate__()) and then recover the state, but it's very large (on par with images, so not usable for my use case).

@reginald-mclean
Collaborator

If you go back to the most recent commit before the bindings were changed, or use the v2.0.0 release zip, the code you posted above will work. There are no API changes.

@famishedrover
Author

famishedrover commented Aug 14, 2023

Ok version 2.0.0 ftw!
I had to change the render line to obs = env1.render('rgb_array') but as you said this works!

Do you have a timeline for when get_state() etc. will be fixed in the most recent version? I would be willing to pitch in if you want!

Quick solution for other readers :
I am using :
pip install git+https://github.com/Farama-Foundation/Metaworld.git@b2a4cbb98e20081412cb4cc7ae3d4afc456a732a
and fixing some mujoco version issue with this solution.

@reginald-mclean
Collaborator

If you want to try to tackle it, create a PR for it when it's complete. We also have an issue with using EZPickle (#426) that could be related. We just don't have time to look into it.

@famishedrover
Author

famishedrover commented Aug 24, 2023

@reginald-mclean it seems that there is a bug in either the MuJoCo binaries or Meta-World. The code above works fine for me on macOS (Intel) but fails on Ubuntu (20). I can confirm that I'm running the same Python version and the same Meta-World version in both cases, but on Ubuntu I keep getting all False.

Do you have any idea why?

Additional:
When I reset and obtain rgb_array, the image obs is different for two env objects initialized with the same seed.
Two subsequent calls to env.render give different images for the same env object. [This problem exists for mujoco210, i.e. mujoco_py 2.1, and mujoco200, i.e. mujoco_py 2.0.]

@famishedrover
Author

Until a proper fix comes along:
the issue happens because glfw is not being used in headless OpenGL. The fix is:

  1. Start a fake screen using
    export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libGLEW.so
    xvfb-run -a -s "-screen 0 1400x900x24" bash

  2. Get the image observation using
    a = mujoco_py.MjRenderContextOffscreen(self.sim, 0)
    a.render(*resolution)
    x = a.read_pixels(*resolution, depth=False)
    return x

@reginald-mclean
Collaborator

reginald-mclean commented Aug 25, 2023

If it fixes the bug, it's not actually a bug. I think what happens is that your Mac has a frame buffer that it can use to render, while the headless Ubuntu machine you're using doesn't. The "xvfb-run -a" command does exactly that: it creates a virtual frame buffer you can use. You can also use xvfb-run -a to run Python scripts by adding it to your command (i.e., xvfb-run -a python myFile.py).
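
For launching scripts from Python, the same idea can be sketched as a small helper that prepends xvfb-run -a only when no display appears to be available. This is an assumption-laden sketch: `headless_command` is a hypothetical helper name, and an unset DISPLAY variable is used as a rough Linux-only heuristic for "headless", which is not a reliable test on every system.

```python
import os
import shutil
import sys


def headless_command(script, environ=None, xvfb_path=None):
    """Return the argv for running `script`, wrapped in xvfb-run if no display is set."""
    environ = os.environ if environ is None else environ
    if xvfb_path is None:
        # Look for xvfb-run on PATH; None means it is not installed.
        xvfb_path = shutil.which("xvfb-run")
    cmd = [sys.executable, script]
    if not environ.get("DISPLAY") and xvfb_path:
        # -a lets xvfb-run pick a free display number automatically.
        cmd = [xvfb_path, "-a"] + cmd
    return cmd


# myFile.py is the script name used in the comment above.
print(headless_command("myFile.py"))
```

The returned list can be passed to `subprocess.run`. When DISPLAY is set or xvfb-run is not installed, the script is run directly.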

@famishedrover
Author

The key issue is that the sim.render() method in mujoco-py gives two different images even when run consecutively for the same sim state. This is unexpected, afaik. It is resolved when I instead use the mjviewer and grab the image from it (using xvfb) instead of rendering in offscreen mode, which is the default in mujoco-py.
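
A quick way to check for this symptom on a given machine is to render several times without stepping and compare the raw bytes of the frames. The sketch below is renderer-agnostic: `frame_bytes` and `render_is_repeatable` are hypothetical helper names, and the two lambdas are fake renderers used only to demonstrate the check.

```python
import itertools


def frame_bytes(frame):
    """Raw bytes of a frame; handles NumPy arrays (tobytes) and bytes-like objects."""
    return frame.tobytes() if hasattr(frame, "tobytes") else bytes(frame)


def render_is_repeatable(render_fn, repeats=3):
    """Call render_fn `repeats` times without stepping and check all frames match."""
    first = frame_bytes(render_fn())
    return all(frame_bytes(render_fn()) == first for _ in range(repeats - 1))


# Fake renderers for demonstration only.
stable = lambda: bytes([1, 2, 3])                 # always the same frame
counter = itertools.count()
flaky = lambda: bytes([next(counter) % 256])      # a different frame on every call

print(render_is_repeatable(stable))  # True
print(render_is_repeatable(flaky))   # False
```

With a real environment, the callable would wrap the render call, for example `lambda: env.render('rgb_array')` on the v2.0.0 release mentioned earlier in this thread.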
