Time limits rework, bug fixes
Time limits rework
We have long felt that the original design of the time limit was too restrictive: the limit could not be changed after the environment was created, because it was computed by a method that depended only on the environment parameters. For example:
def time_limit(self, params: EnvParamsT) -> int:
    return 3 * params.height * params.width
What if someone wanted to choose a custom time limit? The only way was to subclass the environment and override the time limit method (or use a wrapper, which is basically equivalent).
In this release we made this easier: the time limit is now determined by a single env parameter, max_steps (similar to MiniGrid):
from typing import Optional

from flax import struct

class EnvParams(struct.PyTreeNode):
    height: int = struct.field(pytree_node=False, default=9)
    width: int = struct.field(pytree_node=False, default=9)
    view_size: int = struct.field(pytree_node=False, default=7)
    # None means "use the environment's default time limit"
    max_steps: Optional[int] = struct.field(pytree_node=False, default=None)  # NEW!
    render_mode: str = struct.field(pytree_node=False, default="rgb_array")
Default time limit handling (all other environments were changed in a similar manner):
def default_params(self, **kwargs) -> XLandEnvParams:
    params = XLandEnvParams(view_size=5)
    params = params.replace(**kwargs)
    if params.max_steps is None:
        params = params.replace(max_steps=3 * (params.height * params.width))
    return params
Now max_steps can be changed after initialization, although it is not a pytree node and cannot be vmapped over.
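To make the new behaviour concrete, here is a minimal sketch of the same pattern using only the standard library. SketchParams and the module-level default_params below are hypothetical stand-ins, not the library's classes: a frozen dataclass with dataclasses.replace plays the role of struct.PyTreeNode and its replace method, and only the fields relevant to the time limit are kept.

```python
from dataclasses import dataclass, replace
from typing import Optional


# Hypothetical stand-in for the library's params class (not the real one):
# a frozen dataclass that is updated by creating a modified copy.
@dataclass(frozen=True)
class SketchParams:
    height: int = 9
    width: int = 9
    max_steps: Optional[int] = None  # None means "derive from the grid size"


def default_params(**kwargs) -> SketchParams:
    # Same order as in the release notes: apply overrides first, then
    # fill in max_steps only if the caller left it unset.
    params = replace(SketchParams(), **kwargs)
    if params.max_steps is None:
        params = replace(params, max_steps=3 * (params.height * params.width))
    return params


derived = default_params()                    # limit derived from the 9x9 grid
custom = default_params(max_steps=100)        # explicit limit is kept as given
resized = default_params(height=5, width=5)   # limit derived from the 5x5 grid
changed = replace(derived, max_steps=50)      # limit changed after creation
stale = replace(derived, height=5, width=5)   # limit is NOT recomputed here
```

One consequence worth noting in this sketch: the default is resolved once, inside default_params. Changing height or width afterwards leaves the previously derived max_steps untouched, so it has to be updated explicitly if the limit should follow the new grid size.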
What's Changed
- Allow XLand benchmarks as a single task envs in single-task ppo by @Howuhh in #13
- fix evaluation in standalone by @Howuhh in #14
- Change tile grid cell conversion in render function by @afspies in #16
- small improvements by @Howuhh in #19
- Time limits rework by @Howuhh in #20
New Contributors
Full Changelog: v0.7.0...v0.8.0