The purpose of this PR is to re-order the operations in `*`. The primary motivator was to ensure that `q' * q == q * q' == abs2(q)` exactly, even for float operations. The previous arithmetic was dependent on lucky rounding, so these equalities were only approximate. I made sure that `Octonion` matches `Quaternion` for these operations when the back half is zero, so that promotion doesn't risk changing arithmetic. I also added methods for scalar multiplication (the compiler wasn't able to compile away the zeros on its own) and some basic tests to ensure that `*` doesn't get broken by accident if somebody tries to improve it in the future.

There is also some performance improvement due to spending some time fiddling with the ordering of operations.
`*(::QuaternionF64, ::QuaternionF64)` goes from 4ns down to 3ns on my machine (and with #75 it looks like it might drop further to the 2.5ns ballpark). In the process I tried to get `abs2` to be as fast as possible, since #75 might make use of it. I changed `abs` to track these changes.

There are alternative orderings that could permit some `fma` instructions without breaking the target equalities and presumably be more accurate (and slightly faster on machines with native `fma`), but when I experimented with `fma` versions they weren't generating faster code. Also, it's not easy to add "fma only if native" operations right now (`muladd` can re-associate adjacent operations, not just the arguments). Maybe somebody will revisit this in the future, but the gains won't be huge.
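
For readers who want to see why operation ordering matters for the exactness goal, here is a minimal standalone sketch. It is not the ordering used in this PR: the `Quat` type, its field names, and the particular pairing of terms are assumptions made up for illustration. It only shows one way to group terms so that the cross terms cancel pairwise and the real part is summed in the same order as `abs2`.

```julia
# Hypothetical stand-in type, not the package's actual Quaternion definition.
struct Quat
    s::Float64
    v1::Float64
    v2::Float64
    v3::Float64
end

Base.conj(q::Quat) = Quat(q.s, -q.v1, -q.v2, -q.v3)

# Sum of squares with a fixed pairing of terms.
Base.abs2(q::Quat) = (q.s * q.s + q.v1 * q.v1) + (q.v2 * q.v2 + q.v3 * q.v3)

# Hamilton product with an assumed grouping. For conj(q) * q and q * conj(q),
# each parenthesized pair in the imaginary parts is x - x for identically
# rounded products, and the real part reduces to the same sums as abs2.
function Base.:*(p::Quat, q::Quat)
    s  = (p.s * q.s  - p.v1 * q.v1) - (p.v2 * q.v2 + p.v3 * q.v3)
    v1 = (p.s * q.v1 + p.v1 * q.s)  + (p.v2 * q.v3 - p.v3 * q.v2)
    v2 = (p.s * q.v2 + p.v2 * q.s)  + (p.v3 * q.v1 - p.v1 * q.v3)
    v3 = (p.s * q.v3 + p.v3 * q.s)  + (p.v1 * q.v2 - p.v2 * q.v1)
    return Quat(s, v1, v2, v3)
end

q = Quat(0.1, 0.2, 0.3, 0.7)
for r in (conj(q) * q, q * conj(q))
    @assert r.s == abs2(q)                        # exact, not approximate
    @assert r.v1 == 0 && r.v2 == 0 && r.v3 == 0   # imaginary parts vanish
end
```

The cancellation relies on floating-point negation being exact and multiplication being commutative, so each pair subtracts a rounded product from itself. It would no longer be guaranteed if the compiler were allowed to fuse or re-associate these operations, which is the concern with `muladd` mentioned above.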