This repository has been archived by the owner on Mar 20, 2024. It is now read-only.

Provided vdecompress example may not work with SEW=8, VLMAX>256 #893

Open
zingaburga opened this issue Jun 22, 2023 · 0 comments

Comments

@zingaburga

The example sequence provided to emulate a 'vdecompress' instruction looks like it would produce incorrect results when SEW=8 and VLMAX>256.

From what I can gather, with SEW=8, the viota results wrap around after 255 (only the low 8 bits are kept), and vrgather uses 8-bit indices, so it can only access the first 256 elements of the source vector. Thus the provided sequence will only work for SEW>8, or if VLMAX has been explicitly checked and found to be <=256.
A length-agnostic approach for SEW=8 might be: change SEW to 16-bit and double LMUL, run viota, change back to SEW=8, then use vrgatherei16. This only works for LMUL<=4, since the 16-bit index vector needs twice the LMUL, but LMUL is user-controllable.
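
To illustrate the failure mode, here is a small Python model of the two instructions. It is a sketch and not RVV code: the helper names `viota`, `vrgather_masked` and `decompress_reference` are made up for this example, VLMAX=512 is an assumed configuration (for instance VLEN=512 with LMUL=8 at SEW=8), and the masked gather is assumed to leave inactive elements undisturbed. The `wide` path stands in for the proposed workaround of computing 16-bit indices and gathering with them.

```python
def viota(mask, sew):
    """Model of viota.m: count of set mask bits before each element,
    keeping only the low SEW bits of the count."""
    out, count = [], 0
    for bit in mask:
        out.append(count & ((1 << sew) - 1))
        count += bit
    return out


def vrgather_masked(dest, src, idx, mask):
    """Model of a masked gather (assumes mask-undisturbed): active elements
    read src[idx[i]], out-of-range indices read 0, inactive keep dest."""
    vlmax = len(src)
    return [(src[j] if j < vlmax else 0) if m else d
            for d, j, m in zip(dest, idx, mask)]


def decompress_reference(dest, packed, mask):
    """What a 'vdecompress' is meant to do: spread packed elements
    over the positions whose mask bit is set."""
    out, k = list(dest), 0
    for i, m in enumerate(mask):
        if m:
            out[i] = packed[k]
            k += 1
    return out


VLMAX = 512                                        # assumed: SEW=8, VLMAX > 256
mask = [1 if i % 4 else 0 for i in range(VLMAX)]   # 384 set bits, more than 256
packed = [i % 251 for i in range(VLMAX)]           # byte-sized test data
dest = [0xAA] * VLMAX

expected = decompress_reference(dest, packed, mask)
narrow = vrgather_masked(dest, packed, viota(mask, 8), mask)   # SEW=8 indices
wide = vrgather_masked(dest, packed, viota(mask, 16), mask)    # 16-bit indices

mismatches = [i for i in range(VLMAX) if narrow[i] != expected[i]]
print("SEW=8 indices correct: ", narrow == expected)
print("16-bit indices correct:", wide == expected)
print("mismatching elements with SEW=8 indices:", len(mismatches))
```

In this model the 8-bit path goes wrong as soon as more than 256 mask bits have been counted, because the index wraps back to 0, while the 16-bit path matches the reference.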

If my understanding is correct, I recommend clarifying this in the documentation.
