Torch deterministic #17

Open
HaoxiangYou opened this issue Oct 14, 2024 · 0 comments

HaoxiangYou commented Oct 14, 2024

Thank you for providing this awesome repo!

I am trying to make results consistent across runs by calling seeding(seed, torch_deterministic=True).

PyTorch has a known broadcasting issue with deterministic algorithms: pytorch/pytorch#79987

So I manually fixed the broadcasting in each environment. For example, in envs/ant, lines 204-206, I changed the code to:

    self.state.joint_q.view(self.num_envs, -1)[env_ids, 3:7] = self.start_rotation.clone().unsqueeze(0).expand(len(env_ids), -1)
    self.state.joint_q.view(self.num_envs, -1)[env_ids, 7:] = self.start_joint_q.clone().unsqueeze(0).expand(len(env_ids), -1)
    self.state.joint_qd.view(self.num_envs, -1)[env_ids, :] = torch.zeros(size=(len(env_ids), self.num_joint_qd), device=self.device)
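For anyone who wants to try the pattern in isolation, here is a minimal standalone sketch of the explicit-expand assignment. The tensor names mirror the ones in the environment, but the sizes and values are made up for illustration. The deterministic flag is left off so the snippet runs on any CPU install.

```python
import torch

# Hypothetical sizes, chosen only for illustration.
num_envs, num_joint_q = 4, 10

# Flat state buffer, analogous to self.state.joint_q in the environment.
joint_q = torch.zeros(num_envs * num_joint_q)

# Stand-in for self.start_rotation (a quaternion-sized vector).
start_rotation = torch.tensor([0.0, 0.0, 0.0, 1.0])

# Environments to reset.
env_ids = torch.tensor([0, 2])

# Expand the right-hand side to shape (len(env_ids), 4) so the indexed
# assignment does not rely on implicit broadcasting.
values = start_rotation.clone().unsqueeze(0).expand(len(env_ids), -1)
joint_q.view(num_envs, -1)[env_ids, 3:7] = values

# The view shares storage with joint_q, so the write lands in the flat buffer.
print(joint_q.view(num_envs, -1))
```

Only rows 0 and 2 are modified; rows 1 and 3 stay at zero.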

After these changes, I ran the experiments with and without torch_deterministic=True. Below is the ant test, where the blue curve is without torch_deterministic=True and the orange curve is with torch_deterministic=True.

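[Image: reward curves for the ant test, blue = without torch_deterministic=True, orange = with torch_deterministic=True]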

The non-deterministic run is similar to the paper's results. In the deterministic setting, however, the rewards remain unchanged.

Does anyone have ideas about what other issues torch_deterministic=True may introduce? Thank you very much!
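One way to narrow this down might be to check, on a toy problem, whether the parameters actually move under deterministic mode and whether two identically seeded runs agree. The sketch below is not the repo's training loop. `run_once` is a hypothetical helper, and the linear model and random data are placeholders. It uses only `torch.manual_seed` and `torch.use_deterministic_algorithms`, and runs on CPU.

```python
import torch


def run_once(seed: int, deterministic: bool):
    """Train a tiny model for a few steps; return the losses and the largest parameter change."""
    torch.manual_seed(seed)
    torch.use_deterministic_algorithms(deterministic)
    try:
        model = torch.nn.Linear(8, 1)
        before = torch.cat([p.detach().clone().flatten() for p in model.parameters()])
        optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

        # Placeholder data, drawn from the seeded generator.
        x = torch.randn(32, 8)
        y = torch.randn(32, 1)

        losses = []
        for _ in range(5):
            optimizer.zero_grad()
            loss = torch.nn.functional.mse_loss(model(x), y)
            loss.backward()
            optimizer.step()
            losses.append(loss.item())

        after = torch.cat([p.detach().clone().flatten() for p in model.parameters()])
        max_delta = (after - before).abs().max().item()
        return losses, max_delta
    finally:
        # Restore the default so later code is not affected.
        torch.use_deterministic_algorithms(False)


losses_a, delta_a = run_once(seed=0, deterministic=True)
losses_b, delta_b = run_once(seed=0, deterministic=True)

print("identical losses across runs:", losses_a == losses_b)
print("parameters changed during training:", delta_a > 0)
```

Running the same two checks on the actual policy parameters might show whether the flat reward curve comes from updates that never reach the weights, or from something else in the deterministic path.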
