
Better pack/unpack kernels #184

Merged
boeschf merged 5 commits into ghex-org:master from msimberg:better-pack-unpack
Dec 22, 2025


@msimberg (Contributor) commented Oct 24, 2025

This parallelizes the packing/unpacking over levels in addition to indices. This also introduces two separate kernels for levels first/levels last as the access patterns are different.

This is on top of #182, not because it depends on it, but simply because it's easier to test both changes in one branch.
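To illustrate why the two layouts call for separate kernels, here is a minimal host-side C++ sketch of packing over both indices and levels. It is not the GHEX implementation: the function names, the `double` element type, and the buffer layouts are assumptions made for illustration. In a GPU kernel, the two nested loops would presumably map onto the two dimensions of the thread grid rather than run sequentially.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model, not the GHEX kernels.
// "Levels first": the level is the fastest-varying dimension,
// so element (idx, level) lives at field[idx * levels + level].
std::vector<double> pack_levels_first(const std::vector<double>& field,
                                      const std::vector<std::size_t>& local_indices,
                                      std::size_t levels)
{
    assert(levels > 0 && field.size() % levels == 0);
    std::vector<double> buffer(local_indices.size() * levels);
    for (std::size_t i = 0; i < local_indices.size(); ++i) // one grid dimension on a GPU
        for (std::size_t l = 0; l < levels; ++l)           // the other grid dimension
            buffer[i * levels + l] = field[local_indices[i] * levels + l];
    return buffer;
}

// "Levels last": the index is the fastest-varying dimension,
// so element (idx, level) lives at field[level * n_points + idx].
std::vector<double> pack_levels_last(const std::vector<double>& field,
                                     const std::vector<std::size_t>& local_indices,
                                     std::size_t levels)
{
    assert(levels > 0 && field.size() % levels == 0);
    const std::size_t n_points = field.size() / levels;
    const std::size_t n = local_indices.size();
    std::vector<double> buffer(n * levels);
    for (std::size_t l = 0; l < levels; ++l)
        for (std::size_t i = 0; i < n; ++i)
            buffer[l * n + i] = field[l * n_points + local_indices[i]];
    return buffer;
}
```

In the first layout consecutive levels of one index are contiguous in memory, while in the second consecutive indices of one level are contiguous, so the dimension that should vary fastest across neighbouring threads differs between the two cases.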


#define GHEX_UNSTRUCTURED_SERIALIZATION_THREADS_PER_BLOCK 32
#define GHEX_UNSTRUCTURED_SERIALIZATION_THREADS_PER_BLOCK_X 32
#define GHEX_UNSTRUCTURED_SERIALIZATION_THREADS_PER_BLOCK_Y 8

@msimberg (Contributor Author)

I did a quick test with Y=1, and didn't see a major change in performance. I'd keep it at 8. In theory it should allow slightly better reuse of reading local_indices[idx], but the difference may be negligible.
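For readers unfamiliar with 2D launch configurations, the following host-side C++ sketch shows how a grid size could be derived from the two block-size macros above. It assumes, purely for illustration, that the x dimension covers indices and the y dimension covers levels; the actual mapping in the PR may differ, and `launch_dims`, `ceil_div`, and `make_launch_dims` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

#define GHEX_UNSTRUCTURED_SERIALIZATION_THREADS_PER_BLOCK_X 32
#define GHEX_UNSTRUCTURED_SERIALIZATION_THREADS_PER_BLOCK_Y 8

// Hypothetical helper type: number of blocks along each grid dimension.
struct launch_dims
{
    std::size_t blocks_x;
    std::size_t blocks_y;
};

// Round-up integer division, so that a partially filled last block is still launched.
inline std::size_t ceil_div(std::size_t n, std::size_t d)
{
    assert(d > 0);
    return (n + d - 1) / d;
}

// Assumption: x threads cover indices, y threads cover levels.
inline launch_dims make_launch_dims(std::size_t n_indices, std::size_t n_levels)
{
    return {ceil_div(n_indices, GHEX_UNSTRUCTURED_SERIALIZATION_THREADS_PER_BLOCK_X),
            ceil_div(n_levels, GHEX_UNSTRUCTURED_SERIALIZATION_THREADS_PER_BLOCK_Y)};
}
```

Under this assumed mapping, a 32x8 block handles 256 elements while touching only 32 distinct entries of `local_indices`, which is the kind of reuse the comment refers to. With Y=1 each block would handle 32 elements for the same 32 entries.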

@boeschf (Collaborator) left a comment

Thank you, nice improvement. I am a bit wary about the submodule changes - maybe merge with master first to be sure.

@msimberg (Contributor Author)

@boeschf I've merged latest master now.

I think the test failures are the same as on master?

@msimberg (Contributor Author)

I've merged main after the formatting changes. This should again be ready to go.

@boeschf merged commit 6f961b2 into ghex-org:master on Dec 22, 2025
10 of 12 checks passed
@msimberg deleted the better-pack-unpack branch on December 22, 2025 15:00
