FourSimdBlocks for 16-byte archs #19
Open
Labels
enhancement (New feature or request), triage (Waiting for owner's input)
Description
Is your feature request related to a problem? Please describe.
On a theoretical 64-bit target with SSE2 support it might be useful to process four SIMD blocks at a time. SSE2 vectors are 16 bytes wide, so a movemask on one of them yields 16 bits, and the combined movemasks of four of them fit exactly into a single 64-bit integer.
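To make the motivation concrete, here is a rough sketch of how four 16-bit movemasks could be packed into one u64. The names `movemask16` and `movemask64` are made up for illustration and are not part of the crate. Plain byte arrays stand in for the real block types, and the lowest-addressed block is assumed to map to the lowest 16 bits.

```rust
/// Movemask of one 16-byte block: bit i is the most significant bit of byte i.
/// Hypothetical helper, not part of the crate.
#[cfg(target_arch = "x86_64")]
fn movemask16(block: &[u8; 16]) -> u16 {
    use std::arch::x86_64::{__m128i, _mm_loadu_si128, _mm_movemask_epi8};
    // SAFETY: SSE2 is part of the x86_64 baseline, and the unaligned load
    // reads exactly the 16 bytes that `block` refers to.
    unsafe {
        let v = _mm_loadu_si128(block.as_ptr() as *const __m128i);
        // Only the low 16 bits of the result are populated.
        _mm_movemask_epi8(v) as u16
    }
}

/// Scalar fallback with the same semantics for other targets.
#[cfg(not(target_arch = "x86_64"))]
fn movemask16(block: &[u8; 16]) -> u16 {
    block
        .iter()
        .enumerate()
        .fold(0, |mask, (i, &byte)| mask | (u16::from(byte >> 7) << i))
}

/// Combines the movemasks of four consecutive 16-byte blocks into one u64.
/// Block 0 occupies bits 0..16, block 1 bits 16..32, and so on (an assumption).
fn movemask64(blocks: &[[u8; 16]; 4]) -> u64 {
    blocks
        .iter()
        .enumerate()
        .fold(0u64, |acc, (i, block)| {
            acc | (u64::from(movemask16(block)) << (16 * i))
        })
}
```

With this layout, a set high bit in byte j of block i ends up at bit 16 * i + j of the result, so a single trailing_zeros call on the u64 locates the first match across all 64 bytes.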
Describe the solution you'd like
There should be a FourSimdBlocks type that is guaranteed to be four times larger than SimdBlock. It should implement a function
pub fn blocks(&self) -> (&AlignedBlock<SimdBlock>, &AlignedBlock<SimdBlock>, &AlignedBlock<SimdBlock>, &AlignedBlock<SimdBlock>)
that returns all four individual blocks.
It might also be beneficial to have a function that returns two TwoSimdBlock-aligned blocks.
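A minimal sketch of the shape this could take, using stand-in types rather than the crate's actual SimdBlock and AlignedBlock. `Block16`, `TwoBlocks`, `FourBlocks` and `halves` are hypothetical names, and the 16/32/64-byte alignments assume the SSE2 case described above.

```rust
use std::mem::{align_of, size_of};

/// Stand-in for a 16-byte SIMD-aligned block (hypothetical).
#[repr(C, align(16))]
pub struct Block16(pub [u8; 16]);

/// Stand-in for two consecutive blocks, aligned to 32 bytes (hypothetical).
#[repr(C, align(32))]
pub struct TwoBlocks(pub [Block16; 2]);

/// Stand-in for four consecutive blocks, aligned to 64 bytes (hypothetical).
#[repr(C, align(64))]
pub struct FourBlocks(pub [TwoBlocks; 2]);

// Compile-time checks of the size guarantee requested in this issue.
const _: () = assert!(size_of::<FourBlocks>() == 4 * size_of::<Block16>());
const _: () = assert!(align_of::<FourBlocks>() == 64);

impl FourBlocks {
    /// Returns the four individual 16-byte blocks in memory order.
    pub fn blocks(&self) -> (&Block16, &Block16, &Block16, &Block16) {
        let [lo, hi] = &self.0;
        (&lo.0[0], &lo.0[1], &hi.0[0], &hi.0[1])
    }

    /// Returns the two 32-byte-aligned halves in memory order.
    pub fn halves(&self) -> (&TwoBlocks, &TwoBlocks) {
        (&self.0[0], &self.0[1])
    }
}
```

Nesting the types lets the compiler enforce both the size relation and the alignment of each half, so neither accessor needs unsafe code. Whether the real implementation should nest types like this or reinterpret a single buffer is an open design choice.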