Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sugmanue/rpcv2 improve cbor performance 04 #5564

Merged

Conversation

sugmanue
Copy link
Contributor

@sugmanue sugmanue commented Sep 5, 2024

Motivation and Context

Improvement to the lookup performance of marshallers and traits.

Both traits and marshalling types are open classes but with well known, finite, and, statically defined instances. We can improve the performance of the lookups of these values by enumerating them using an enum and associating the enum values to the concrete types, and then using a two level lookup: the fast level 1 uses the an enum map and the associate enumeration value as key, while the l2 uses a regular map with the values as keys as they are. This change improves not only RPCv2, but all the other protocols as well.

This change aims to reduce, or remove the hump highlighted in the flamegraph below

before-highligthed

Before

Benchmark                              (protocol)  (size)  Mode  Cnt      Score     Error  Units
JsonMarshallerBenchmark.unmarshall  smithy-rpc-v2   small  avgt    5   3415.127 ±  64.585  ns/op
JsonMarshallerBenchmark.unmarshall  smithy-rpc-v2  medium  avgt    5   5668.994 ±  16.272  ns/op
JsonMarshallerBenchmark.unmarshall  smithy-rpc-v2     big  avgt    5  19475.973 ± 133.802  ns/op
JsonMarshallerBenchmark.unmarshall       aws-json   small  avgt    5  10150.744 ±  28.942  ns/op
JsonMarshallerBenchmark.unmarshall       aws-json  medium  avgt    5  21037.573 ± 139.545  ns/op
JsonMarshallerBenchmark.unmarshall       aws-json     big  avgt    5  86466.945 ± 709.876  ns/op

And zoomed in flamegraph

before-enum-map

After

Benchmark                              (protocol)  (size)  Mode  Cnt      Score      Error  Units
JsonMarshallerBenchmark.unmarshall  smithy-rpc-v2   small  avgt    5   3166.243 ±  151.799  ns/op
JsonMarshallerBenchmark.unmarshall  smithy-rpc-v2  medium  avgt    5   5349.115 ±  172.858  ns/op
JsonMarshallerBenchmark.unmarshall  smithy-rpc-v2     big  avgt    5  17517.889 ±   41.898  ns/op
JsonMarshallerBenchmark.unmarshall       aws-json   small  avgt    5  10178.752 ±  167.762  ns/op
JsonMarshallerBenchmark.unmarshall       aws-json  medium  avgt    5  20440.322 ±  223.031  ns/op
JsonMarshallerBenchmark.unmarshall       aws-json     big  avgt    5  83252.862 ± 1159.517  ns/op

And zoomed in flamegraph

after-enum-map

Full flamegraph after this change

after-full

Modifications

Testing

Screenshots (if appropriate)

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)

Checklist

  • I have read the CONTRIBUTING document
  • Local run of mvn install succeeds
  • My code follows the code style of this project
  • My change requires a change to the Javadoc documentation
  • I have updated the Javadoc documentation accordingly
  • I have added tests to cover my changes
  • All new and existing tests passed
  • I have added a changelog entry. Adding a new entry must be accomplished by running the scripts/new-change script and following the instructions. Commit the new file created by the script in .changes/next-release with your changes.
  • My change is to implement 1.11 parity feature and I have updated LaunchChangelog

License

  • I confirm that this pull request can be released under the Apache 2 license

@sugmanue sugmanue force-pushed the sugmanue/rpcv2-improve-cbor-performance-04 branch from 0d997df to 46530cb Compare September 5, 2024 17:12
@sugmanue sugmanue force-pushed the sugmanue/rpcv2-improve-cbor-performance-04 branch from 46530cb to ca22f22 Compare September 5, 2024 17:44
Copy link

sonarqubecloud bot commented Sep 5, 2024

Quality Gate Failed Quality Gate failed

Failed conditions
49.3% Coverage on New Code (required ≥ 80%)
C Reliability Rating on New Code (required ≥ A)

See analysis details on SonarCloud

Catch issues before they fail your Quality Gate with our IDE extension SonarLint

@sugmanue sugmanue marked this pull request as ready for review September 5, 2024 23:47
@sugmanue sugmanue requested a review from a team as a code owner September 5, 2024 23:47
/**
* Creates an L1 traits map. This map is for fast lookup of known traits.
*/
private static Map<TraitType, Trait> createL1Traits(Map<Class<? extends Trait>, Trait> traits) {
Copy link
Contributor

@joviegas joviegas Sep 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be nice if we rename it with something more meaningful like createTraitsWithType and createTraitsWithNoType or something which tells the difference

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like level 1 and level 2 (similar wording as used to describe caches, CPU and others) is more meaningful. Type vs NoType does not say much about why those are different.

*/
private static Map<Class<? extends Trait>, Trait> createL2Traits(Map<Class<? extends Trait>, Trait> traits) {
Map<Class<? extends Trait>, Trait> result = new HashMap<>();
for (Map.Entry<Class<? extends Trait>, Trait> kvp : traits.entrySet()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional : We can concise this as below ,

private static Map<Class<? extends Trait>, Trait> createL2Traits(Map<Class<? extends Trait>, Trait> traits) {
    return traits.entrySet().stream()
        .filter(entry -> entry.getValue().type() == null) 
        .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm purposefully avoiding streams, in the same way that I removed some code using them before, those tend to create garbage and pressure on the garbage collection which is reflected in the benchmarks. For critical code paths I think is better to just use plain old for loops.

DOCUMENT,
;

public static MarshallingKnownType from(Class<?> clazz) {
Copy link
Contributor

@joviegas joviegas Sep 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

import java.math.BigDecimal;
import java.time.Instant;
import java.util.*;

public enum MarshallingKnownType {
    NULL,
    STRING,
    INTEGER,
    LONG,
    FLOAT,
    DOUBLE,
    BIG_DECIMAL,
    BOOLEAN,
    INSTANT,
    SDK_BYTES,
    SDK_POJO,
    LIST,
    MAP,
    SHORT,
    BYTE,
    DOCUMENT;


    private static final Map<Class<?>, MarshallingKnownType> TYPE_MAP = new HashMap<>();

    static {
        // Populate the map with class-to-known-type mappings
        TYPE_MAP.put(Void.class, NULL);
        TYPE_MAP.put(String.class, STRING);
        TYPE_MAP.put(Integer.class, INTEGER);
        TYPE_MAP.put(Long.class, LONG);
        TYPE_MAP.put(Float.class, FLOAT);
        TYPE_MAP.put(Double.class, DOUBLE);
        TYPE_MAP.put(BigDecimal.class, BIG_DECIMAL);
        TYPE_MAP.put(Boolean.class, BOOLEAN);
        TYPE_MAP.put(Instant.class, INSTANT);
        TYPE_MAP.put(SdkBytes.class, SDK_BYTES);
        TYPE_MAP.put(SdkPojo.class, SDK_POJO);
        TYPE_MAP.put(List.class, LIST);
        TYPE_MAP.put(Map.class, MAP);
        TYPE_MAP.put(Short.class, SHORT);
        TYPE_MAP.put(Byte.class, BYTE);
        TYPE_MAP.put(Document.class, DOCUMENT);
    }


    public static MarshallingKnownType from(Class<?> clazz) {
        return TYPE_MAP.getOrDefault(clazz, null); // Returns null if class is not found
    }
}

We can avoid the if-else checks by using Hashmap for constant time look up, same for the other Enum

WDYT ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see much difference, using a map would consume memory which will be used only once (when the SdkPojo class is loaded and not longer after that), in that sense using an multi-branch if feels better. What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The one I suggested also enhances the readability , thus recommended it.

if (clazz == Document.class) {
return DOCUMENT;
}
return null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When will we enternull case ?
Why aren't we throwing Exception in this case ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately throwing is not an option, this was released as open ended class, throwing would break backwards compatibility, and, there are uses of this class outside of the known types. For instance, this one.

@@ -69,20 +69,28 @@ public interface MarshallingType<T> {

Class<? super T> getTargetClass();

default MarshallingKnownType getKnownType() {
Copy link
Contributor

@joviegas joviegas Sep 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be missing something , while doing a get we do below

        public Object get(MarshallingType<?> marshallingType) {
            MarshallingKnownType knownType = marshallingType.getKnownType();
            if (knownType != null) {
                return l1registry.get(knownType);
            }
            return l2registry.get(marshallingType);
        }

since predefined (line 38 to 68) MarshallingType are also predefined in MarshallingKnownType , what cases we will end up in predefined type not present in l1registry but present in l2registry ?
I was thinking if we can remove (line 38 to 68) is redendant

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be nice, unfortunately this class is already released as open, so customers can define their own which would go into the L2 registry. See also this other example in our code base.

@sugmanue sugmanue merged commit 7855c93 into feature/master/rpcv2-cbor Sep 10, 2024
16 of 17 checks passed
@sugmanue sugmanue deleted the sugmanue/rpcv2-improve-cbor-performance-04 branch September 10, 2024 16:23
sugmanue added a commit to sugmanue/aws-sdk-java-v2 that referenced this pull request Sep 25, 2024
* Add RPCv2 module

* Add RPCv2 module (aws#5445)

* Comment out empty rpcv2 dependency

* Sync version to 2.26.31-SNAPSHOT

* Sugmanue/add byte support (aws#5477)

* Add support to serialize byte values

* Add tests for byte support

* Address PR comments

* Add rpcv2 protocol core (aws#5496)

* Add support to serialize byte values

* Add RPCv2 protocol core marshalling/unmarshalling

* Address PR comments

* Address PR comments 2

* Address PR comments 3

* Support for operation without input defined (aws#5512)

* Support for operation without input defined

* Fix a checkstyle issue

* Code clean up

* Code clean up 2

* Rewrite the condition to conjunctive normal form

* Add codgen tests (aws#5517)

* Add codgen tests

* Address PR comments

* Address PR comments 2

* Add missing class rename

* Add missing AWS_JSON protocol facts

* Account for null protocol case

* Add RPCv2 benchmark tests (aws#5526)

* Add RPCv2 benchmark tests

* Give the constants name a meaningful name

* Avoid parsing numbers when using RPCv2 protocol (aws#5539)

* Avoid parsing numbers when using RPCv2 protocol

* Refactor to avoid impacting JSON with RPCv2 logic (aws#5544)

* Refactor to avoid impacting JSON with RPCv2 logic

* Avoid making the unmarshallers depend on timestamp formats

* Avoid streams while unmarshalling

* Fix build failures

* Fix build failures 2

* Avoid growing copies of collections of known size (aws#5551)

* Add the new Smithy RPCv2 package

* Sugmanue/rpcv2 improve cbor performance 04 (aws#5564)

* Improve lookup by marshalling type

* Improve trait lookup using TraitType

* Add support for Smithy RPCv2 to the new service scripts (aws#5613)

* Add changelog for the release

* Fix typo in changelog

* Update to next SNAPSHOT version
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants