Add runtime optimizations for math operations #1733

Merged: 7 commits merged into mozilla:master on Dec 3, 2024

@gbrail gbrail commented Nov 26, 2024

Add type-optimized linkers for invokedynamic operations, focusing on math operations.
Many of these operations are implemented as complex trees of if...then statements in
ScriptRuntime. With a suite of type-specific linkers, we can skip right
to the correct branch most of the time, while still being able to fall back
to generic operations that work on every type.
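
To illustrate the idea of a type-specific fast path with a generic fallback, here is a minimal sketch using plain `java.lang.invoke`. It is not the linker code from this PR: the class `GuardedAddSketch` and the methods `bothIntegers`, `addIntegers`, and `genericAdd` are hypothetical names, and `genericAdd` is only a stand-in for the branching logic in ScriptRuntime.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class GuardedAddSketch {
    // Hypothetical generic slow path: handles every type, but has to branch to find out which.
    static Object genericAdd(Object a, Object b) {
        if (a instanceof Number && b instanceof Number) {
            return ((Number) a).doubleValue() + ((Number) b).doubleValue();
        }
        return String.valueOf(a) + b;
    }

    // Guard: true only when the specialized path is applicable.
    static boolean bothIntegers(Object a, Object b) {
        return a instanceof Integer && b instanceof Integer;
    }

    // Specialized fast path: assumes the guard has already passed.
    static Object addIntegers(Object a, Object b) {
        long r = ((Integer) a).longValue() + ((Integer) b).longValue();
        if (r > Integer.MAX_VALUE || r < Integer.MIN_VALUE) {
            return Double.valueOf(r);
        }
        return Integer.valueOf((int) r);
    }

    // Combine guard, fast path, and fallback into a single handle.
    static MethodHandle buildGuardedAdd() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType binary = MethodType.methodType(Object.class, Object.class, Object.class);
            MethodHandle test = lookup.findStatic(GuardedAddSketch.class, "bothIntegers",
                    MethodType.methodType(boolean.class, Object.class, Object.class));
            MethodHandle fast = lookup.findStatic(GuardedAddSketch.class, "addIntegers", binary);
            MethodHandle slow = lookup.findStatic(GuardedAddSketch.class, "genericAdd", binary);
            return MethodHandles.guardWithTest(test, fast, slow);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    static final MethodHandle ADD = buildGuardedAdd();

    // Convenience wrapper so callers do not have to deal with Throwable.
    static Object add(Object a, Object b) {
        try {
            return ADD.invoke(a, b);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

With this sketch, `GuardedAddSketch.add(2, 2)` takes the Integer path, while `GuardedAddSketch.add(1.5, 2)` or `GuardedAddSketch.add("a", 1)` falls through to the generic method.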

This set of optimizations improves a few of the V8 benchmarks by 5% to 30%.
Since lots of Rhino code uses math operations such as comparisons even when it is not
"doing lots of math", this should help in lots of places.

return Integer.valueOf((int) r);
try {
return Math.addExact(i1, i2);
} catch (ArithmeticException ae) {
Contributor

@rPraml rPraml Nov 26, 2024
I am wondering whether we should inline parts of "addExact" here and do the fallback directly instead of throwing an exception.

Definition of Math.addExact:

    @IntrinsicCandidate
    public static int addExact(int x, int y) {
        int r = x + y;
        // HD 2-12 Overflow iff both arguments have the opposite sign of the result
        if (((x ^ r) & (y ^ r)) < 0) {
            throw new ArithmeticException("integer overflow");
        }
        return r;
    }
  • The overflow case is imho rare, but I think it is now some orders of magnitude slower than before.
  • Overflow detection is a "bit" of magic with only one "if" branch.
  • The Math operations are marked @IntrinsicCandidate, so they might be faster than inlining the same Java code by hand.

Do you think it is worth calling Math.addExact (and hoping that it is faster thanks to @IntrinsicCandidate), or should we take only the idea of the overflow detection?
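
As a worked example of the overflow test quoted above, the sketch below pulls the sign-bit check out into a predicate and compares it with a straightforward 64-bit range check. The class name `OverflowCheckSketch` and its method names are made up for illustration; only the expression `((x ^ r) & (y ^ r)) < 0` comes from the `Math.addExact` definition.

```java
class OverflowCheckSketch {
    // Same test as in Math.addExact: the 32-bit sum overflowed
    // iff both operands have the opposite sign of the result.
    static boolean addOverflows(int x, int y) {
        int r = x + y;
        return ((x ^ r) & (y ^ r)) < 0;
    }

    // Reference check: do the addition in 64 bits and test the range.
    static boolean addOverflowsViaLong(int x, int y) {
        long r = (long) x + (long) y;
        return r > Integer.MAX_VALUE || r < Integer.MIN_VALUE;
    }

    // Compare both predicates on a handful of boundary values.
    static boolean agreesOnEdgeCases() {
        int[] samples = {Integer.MIN_VALUE, Integer.MIN_VALUE + 1, -1, 0, 1,
                Integer.MAX_VALUE - 1, Integer.MAX_VALUE};
        for (int x : samples) {
            for (int y : samples) {
                if (addOverflows(x, y) != addOverflowsViaLong(x, y)) {
                    return false;
                }
            }
        }
        return true;
    }
}
```

Because the predicate is a single branch with no exception involved, the fallback to a double can be taken directly when it returns true.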

Collaborator Author
Like you, I was assuming that, first, using a function of the Math class directly would be good because it allows JVM implementers in the future to implement it more efficiently, and second, that the overflow case would be very rare and therefore the exception path wouldn't be so bad.

Previously, I had been doing the arithmetic as a long and checking later, and I thought that "addExact" would be faster, but it wasn't. I hadn't benchmarked an overflow, though -- I might try that over the next few days.

Collaborator Author
Holy crud, that exception behavior is much worse than I thought by two orders of magnitude!

Here are three functions:

    // Variant 1: add as 64-bit longs, then range-check the result.
    private Object addIntsAsLongs(Integer a, Integer b) {
        long r = a.longValue() + b.longValue();
        if ((r > Integer.MAX_VALUE) || (r < Integer.MIN_VALUE)) {
            return Double.valueOf(r);
        }
        return Integer.valueOf((int) r);
    }

    // Variant 2: add as 32-bit ints and detect overflow with the sign-bit test.
    private Object addIntsTrickily(Integer a, Integer b) {
        int r = a + b;
        if (((a ^ r) & (b ^ r)) < 0) {
            return Double.valueOf(a.longValue() + b.longValue());
        }
        return r;
    }

    // Variant 3: let Math.addExact throw on overflow and recover in the catch block.
    private Object addIntsExact(Integer a, Integer b) {
        try {
            return Math.addExact(a, b);
        } catch (ArithmeticException ae) {
            long r = a.longValue() + b.longValue();
            return Double.valueOf((double) r);
        }
    }

On my Intel box with Java 21, here are the results of a little benchmark.

With no overflow (2 + 2), all three run in about 0.203 nanoseconds (the differences are well within the standard deviation). (In other words, doing 32-bit addition versus 64-bit addition doesn't really matter.)

With overflow ((1<<31) + (1<<31)):

  • addIntsAsLongs: 1.648 nanoseconds
  • addIntsTrickily: 1.681 nanoseconds
  • addIntsExact: 7173 nanoseconds (not a typo -- 7.173 microseconds)

So I think I will revert to the code that we had before!
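
For anyone who wants to try this themselves, here is a rough sketch of how the overflow case could be timed. It is not the benchmark used for the numbers above: the class `OverflowTimingSketch`, the `IntAdder` interface, and the iteration count are assumptions, and a plain `System.nanoTime()` loop is far less trustworthy than a proper harness such as JMH, so treat the output only as a rough indication.

```java
class OverflowTimingSketch {
    // Hypothetical functional interface matching the variants under test.
    interface IntAdder {
        Object add(Integer a, Integer b);
    }

    static Object addIntsAsLongs(Integer a, Integer b) {
        long r = a.longValue() + b.longValue();
        if ((r > Integer.MAX_VALUE) || (r < Integer.MIN_VALUE)) {
            return Double.valueOf(r);
        }
        return Integer.valueOf((int) r);
    }

    static Object addIntsExact(Integer a, Integer b) {
        try {
            return Math.addExact(a, b);
        } catch (ArithmeticException ae) {
            return Double.valueOf((double) (a.longValue() + b.longValue()));
        }
    }

    // Crude timing: one warm-up pass, one measured pass, average nanoseconds per call.
    static double averageNanos(IntAdder adder, Integer a, Integer b, int iterations) {
        double sink = 0;
        for (int i = 0; i < iterations; i++) {
            sink += ((Number) adder.add(a, b)).doubleValue();
        }
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += ((Number) adder.add(a, b)).doubleValue();
        }
        long elapsed = System.nanoTime() - start;
        if (Double.isNaN(sink)) {
            // Uses the accumulated value so the loops cannot be discarded as dead code.
            throw new IllegalStateException("unexpected NaN");
        }
        return (double) elapsed / iterations;
    }

    public static void main(String[] args) {
        Integer big = Integer.MAX_VALUE; // big + big overflows a 32-bit int
        int iterations = 20_000;         // arbitrary choice for this sketch
        System.out.println("addIntsAsLongs: "
                + averageNanos(OverflowTimingSketch::addIntsAsLongs, big, big, iterations) + " ns");
        System.out.println("addIntsExact:   "
                + averageNanos(OverflowTimingSketch::addIntsExact, big, big, iterations) + " ns");
    }
}
```

The absolute numbers will vary by machine and JVM; the point of the sketch is only to make the exception path visible next to the non-throwing one.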

gbrail commented Nov 30, 2024

I'm not hearing too many comments here, so unless someone else wants to try this themselves, I'm probably going to merge it in the next day or so. On to new things!

@gbrail gbrail merged commit 822b5c7 into mozilla:master Dec 3, 2024
3 checks passed
@gbrail gbrail deleted the indy-11-faster-math branch December 3, 2024 00:51
2 participants