Potential missed optimization in C2 JIT compiler #10609

zijian-yi · 2025-01-31T19:45:14Z

Describe the issue

Potential missed optimization in GraalVM C2 JIT compiler

Steps to reproduce the issue

Here is the program:

public final class Sum {
    private double sum;
    private double comp;

    public Sum(final double initialValue) {
        sum = initialValue;
        comp = 0.1;
    }
    static double twoSumLow(double a, double b, double sum) {
        final double bVirtual = sum - a;
        return (a - (sum - bVirtual)) + (b - bVirtual);
    }
    public void add(final double t) {
        final double newSum = (sum % comp);
        comp += twoSumLow(t, comp, newSum);
        sum += comp;
    }
    public static void main(String[] args) {
        int N = 50000000;
        Sum s = new Sum(1.0);
        for (int i = 0; i < N; ++i) {
            s.add(0.1);
        }
        // System.out.println(s.sum);
    }
}

Run the program with C1 and C2 respectively:

# Set $JAVA_HOME to corresponding JDKs before running
javac Sum.java
time java -XX:TieredStopAtLevel=1 Sum
time java -XX:TieredStopAtLevel=4 Sum

Below are the results I got on my machine. The exact numbers vary by machine, but the performance difference should be noticeable; try increasing N if it is not.

Oracle 21:
java -XX:TieredStopAtLevel=1 Sum  5.52s user 0.02s system 100% cpu 5.535 total
java -XX:TieredStopAtLevel=4 Sum  5.58s user 0.01s system 100% cpu 5.573 total

Oracle 23:
java -XX:TieredStopAtLevel=1 Sum  0.68s user 0.01s system 100% cpu 0.692 total
java -XX:TieredStopAtLevel=4 Sum  0.72s user 0.02s system 100% cpu 0.737 total

Graal 25:
java -XX:TieredStopAtLevel=1 Sum  0.71s user 0.03s system 102% cpu 0.714 total
java -XX:TieredStopAtLevel=4 Sum  5.82s user 0.02s system 101% cpu 5.774 total

It looks like Oracle 23 (HotSpot JIT compiler) adds some new optimization(s), making the program run much faster. Such optimization(s) are not present in GraalVM yet.

Describe GraalVM and your environment:

GraalVM version: CE 25.0.0-dev-20250122_1329, 23.0.1+11.1
JDK major version: 25, 23
OS: Ubuntu 20.04
Architecture: AMD64

davleopo · 2025-02-03T08:43:40Z

Thanks @zijian-yi for the report. We will have a look.

A minor side note for future performance reports: for Java benchmarking, especially microbenchmarks, there is the JMH harness (https://openjdk.org/projects/code-tools/jmh/), which helps you write benchmarks like the one you shared. JMH has too many advantages to enumerate in a simple comment, but the most important one is that it makes benchmarking very small programs more reliable. The reproducer you shared is very small, a microbenchmark, and such programs sometimes behave very non-intuitively on JVMs. If you have not used JMH yet, consider a short tutorial: https://www.baeldung.com/java-microbenchmark-harness

Do you maybe have free cycles to port your reproducer to a JMH microbenchmark before we have a look?

zijian-yi · 2025-02-03T17:20:27Z

Thanks for the advice, @davleopo. I have heard of the tool but haven't used it much.
Here is a reproducer using JMH:

package org.sample;

import org.openjdk.jmh.annotations.*;
import java.util.concurrent.TimeUnit;

@State(Scope.Thread)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Fork(3)
public class Sum {
    private static double sum = 1.0;
    private static double comp = 0.1;

    static double twoSumLow(double a, double b, double sum) {
        final double bVirtual = sum - a;
        return (a - (sum - bVirtual)) + (b - bVirtual);
    }

    @Benchmark
    public void add() {
        final double newSum = (sum % comp);
        comp += twoSumLow(0.1, comp, newSum);
        sum += comp;
    }
}

Results:

graalvm-community-openjdk-25+5.1:
Benchmark  Mode  Cnt    Score   Error  Units
Sum.add    avgt   15  102.889 ± 0.098  ns/op

HotSpot build 23.0.2+7-58:
Benchmark  Mode  Cnt   Score   Error  Units
Sum.add    avgt   15  13.122 ± 0.091  ns/op

davleopo · 2025-02-04T09:54:13Z

@zijian-yi thanks for porting this to JMH. We will have a look.

rmosaner · 2025-02-05T13:09:00Z

Thank you for the reproducer. We found that the floating point modulo operation causes the slowdown.

This is tracked internally as [GR-61951]
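
One way to see the effect of the modulo operation in isolation is to time the reproducer's loop body next to a control variant that has no `%`. The following is only a minimal sketch, not code from this thread: the class name `ModuloIsolation` and the `withoutRemainder` control are hypothetical, and the control swaps `%` for a subtraction, so it computes different values and is meaningful only as a timing comparison. It uses plain `System.nanoTime()` timing to stay self-contained; for numbers you intend to report, the JMH setup above is the better tool.

```java
final class ModuloIsolation {
    // Same helper as in the reproducer.
    static double twoSumLow(double a, double b, double sum) {
        final double bVirtual = sum - a;
        return (a - (sum - bVirtual)) + (b - bVirtual);
    }

    // Loop body from the reproducer, including the floating-point remainder.
    static double withRemainder(int n) {
        double sum = 1.0;
        double comp = 0.1;
        for (int i = 0; i < n; i++) {
            final double newSum = sum % comp;
            comp += twoSumLow(0.1, comp, newSum);
            sum += comp;
        }
        return sum;
    }

    // Hypothetical control (an assumption, not from the thread): subtraction
    // replaces %, so the values differ; only the timing is of interest.
    static double withoutRemainder(int n) {
        double sum = 1.0;
        double comp = 0.1;
        for (int i = 0; i < n; i++) {
            final double newSum = sum - comp;
            comp += twoSumLow(0.1, comp, newSum);
            sum += comp;
        }
        return sum;
    }

    public static void main(String[] args) {
        final int n = 5_000_000;
        // Warm-up calls give the JIT a chance to compile both methods.
        double sink = withRemainder(n) + withoutRemainder(n);

        long t0 = System.nanoTime();
        sink += withRemainder(n);
        long t1 = System.nanoTime();
        sink += withoutRemainder(n);
        long t2 = System.nanoTime();

        // Printing the sink keeps the computed results live.
        System.out.printf("with %%:    %d ms%n", (t1 - t0) / 1_000_000);
        System.out.printf("without %%: %d ms%n", (t2 - t1) / 1_000_000);
        System.out.println("sink = " + sink);
    }
}
```

Compiling with `javac` and running under each JDK, as in the original steps, shows whether the gap between the two variants tracks the gap reported above. If it does, that is consistent with the `%` operation being the slow part, though a hand-rolled timing loop like this cannot prove it.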

zijian-yi added the bug label Jan 31, 2025

zijian-yi changed the title ~~Potential missed optimization in C2 JIT compiler.~~ Potential missed optimization in C2 JIT compiler Jan 31, 2025

davleopo added the compiler label Feb 3, 2025

rmosaner self-assigned this Feb 3, 2025
