Add masked_index_benchmark #2989
✅ Deploy Preview for pytorch-fbgemm-docs ready!
This pull request was exported from Phabricator. Differential Revision: D61284671
Summary:
X-link: facebookresearch/FBGEMM#82
Pull Request resolved: pytorch#2989

This diff adds a benchmark for measuring host-to-device copy performance using `torch.ops.fbgemm.masked_index_put`. The host buffer is a UVM buffer (by default it is `malloc+cudaHostRegister`).

Differential Revision: D61284671
45ddc7e to b8422e8
b8422e8 to 2caedf3
2caedf3 to 4e318db
Summary:
X-link: facebookresearch/FBGEMM#82
Pull Request resolved: pytorch#2989

This diff adds a benchmark for measuring host-to-device copy performance using `torch.ops.fbgemm.masked_index_put`. The host buffer is a UVM buffer (by default it is `malloc+cudaHostRegister`).

Reviewed By: jianyuh
Differential Revision: D61284671
4e318db to 1f9bfb1
This pull request has been merged in a83b65c.
Summary:
This diff adds a benchmark for measuring host-to-device copy performance using `torch.ops.fbgemm.masked_index_put`. The host buffer is a UVM buffer (by default it is `malloc+cudaHostRegister`).

Differential Revision: D61284671
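
For readers unfamiliar with the operator, here is a minimal CPU-only sketch of what such a benchmark measures. It is not the code added by this PR. The row-copy semantics are an assumption about `masked_index_put`: it copies the first `count` rows of `values` into the rows of the destination selected by `indices`, skipping negative indices. The helper names (`masked_index_put_reference`, `copy_bandwidth_gb_per_s`, `average_seconds`) are hypothetical, and plain Python lists stand in for the UVM host buffer and the device tensor.

```python
import time


def masked_index_put_reference(dest, indices, values, count):
    """Reference semantics (assumed, not taken from the PR):
    for the first `count` entries, copy values[i] into dest[indices[i]],
    skipping entries whose index is negative."""
    for i in range(count):
        row = indices[i]
        if row >= 0:
            dest[row] = list(values[i])
    return dest


def copy_bandwidth_gb_per_s(num_rows, row_bytes, elapsed_s):
    """Effective copy bandwidth in GB/s (1 GB = 1e9 bytes)."""
    return num_rows * row_bytes / elapsed_s / 1e9


def average_seconds(fn, iters=10, warmup=2):
    """Average wall-clock time of fn() over `iters` runs after `warmup` runs.
    A real GPU benchmark would also synchronize the device before reading
    the clock; this sketch is CPU-only, so that step is omitted."""
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    return (time.perf_counter() - start) / iters


if __name__ == "__main__":
    num_rows, dim = 1024, 64
    dest = [[0.0] * dim for _ in range(num_rows)]    # stand-in for the device tensor
    values = [[1.0] * dim for _ in range(num_rows)]  # stand-in for the host buffer
    indices = list(range(num_rows))
    elapsed = average_seconds(
        lambda: masked_index_put_reference(dest, indices, values, num_rows)
    )
    # Assume 4 bytes per element (float32) when converting rows to bytes.
    print(f"{copy_bandwidth_gb_per_s(num_rows, dim * 4, elapsed):.3f} GB/s")
```

The reported figure is the number of bytes moved divided by the average elapsed time, which is the usual way to express host-to-device copy performance.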