storage: adapt Netty Reactor HTTP client as GCS storage client #410
Conversation
Notes:
* Uses direct memory buffers. Recommend running Diskless with `-Dio.netty.maxDirectMemory=0` so that the Netty cleaner runs.
* Has a static pool with a maximum of 96 connections.
* Has a static pool of 32 worker threads.
* `SO_KEEPALIVE` is set on sockets, and the keep-alive header is set for HTTP.
* Compression is disabled; producer compression is recommended, and compressing again is unlikely to be beneficial.
* The GCS client handles redirects, so redirect following is disabled in the Netty Reactor client.
* Can use the static BoringSSL library to offload SSL to OpenSSL.
* Zero-copy until response handling, where direct memory buffer bytes are copied to a heap-managed byte array.
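The settings listed above could be sketched with the Reactor Netty API roughly as follows. This is a hedged sketch, not the PR's actual code; the BoringSSL offload assumes a `netty-tcnative-boringssl-static` artifact on the classpath, and the names `gcs`/`gcs-worker` are illustrative only.

```java
import io.netty.channel.ChannelOption;
import io.netty.handler.ssl.OpenSsl;
import io.netty.handler.ssl.SslContext;
import io.netty.handler.ssl.SslContextBuilder;
import io.netty.handler.ssl.SslProvider;
import reactor.netty.http.client.HttpClient;
import reactor.netty.resources.ConnectionProvider;
import reactor.netty.resources.LoopResources;

public class GcsNettyClientSketch {
    public static HttpClient create() throws Exception {
        // Static pool: at most 96 connections.
        ConnectionProvider pool = ConnectionProvider.builder("gcs")
                .maxConnections(96)
                .build();

        // Static event-loop group with 32 daemon worker threads.
        LoopResources loops = LoopResources.create("gcs-worker", 32, true);

        // Offload TLS to OpenSSL/BoringSSL when the native library is available.
        SslContext sslContext = SslContextBuilder.forClient()
                .sslProvider(OpenSsl.isAvailable() ? SslProvider.OPENSSL : SslProvider.JDK)
                .build();

        return HttpClient.create(pool)
                .runOn(loops)
                .option(ChannelOption.SO_KEEPALIVE, true) // TCP keep-alive on sockets
                .keepAlive(true)                          // HTTP keep-alive header
                .compress(false)                          // compression disabled
                .followRedirect(false)                    // GCS client handles redirects itself
                .secure(spec -> spec.sslContext(sslContext));
    }
}
```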
agrawal-siddharth
left a comment
Here's some feedback and a suggested approach from the Google GCS SDK team.
GCS has a gRPC-based API[1], which the GCS Java SDK supports out of the box. My recommendation would be to use that if they want an async transport core: we've already done the work there to bridge async to streaming (and sometimes non-blocking) channels, and it uses less CPU than the default NetHttpTransport.
I'm curious what they hope to gain by using Netty as the lower-level transport layer. Netty can be a great HTTP client, but the GCS Java SDK is not async in the vast majority of operations and will block the invoking thread as necessary. Additionally, we've already done a notable amount of work over the past few years to reduce memory usage and unnecessary heap allocations.
That said, if I were to attempt something like this, I would try to get it working in the test suite[2] we already have for the GCS Java SDK: create a branch of the repo, then modify HttpStorageOptions.HttpStorageDefault#getDefaultTransportOptions()[3] to return the Netty-based transport, then run the integration test suite with `mvn -Dmaven.test.skip.exec=true -Penable-integration-tests clean verify`.
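Concretely, that override could look roughly like the sketch below. `NettyReactorHttpTransport` is a hypothetical name standing in for the PR's transport; the real extension point in google-http-client is subclassing `HttpTransport` and implementing `buildRequest`.

```java
import java.io.IOException;

import com.google.api.client.http.HttpTransport;
import com.google.api.client.http.LowLevelHttpRequest;
import com.google.cloud.http.HttpTransportOptions;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageOptions;

// Hypothetical adapter bridging google-http-client to the Netty Reactor client.
class NettyReactorHttpTransport extends HttpTransport {
    @Override
    protected LowLevelHttpRequest buildRequest(String method, String url) throws IOException {
        // Delegate to the Reactor Netty HttpClient here (omitted in this sketch).
        throw new UnsupportedOperationException("sketch only");
    }
}

public class TransportOverrideSketch {
    public static Storage storage() {
        HttpTransportOptions transportOptions = HttpTransportOptions.newBuilder()
                .setHttpTransportFactory(NettyReactorHttpTransport::new)
                .build();
        return StorageOptions.newBuilder()
                .setTransportOptions(transportOptions)
                .build()
                .getService();
    }
}
```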
From a superficial standpoint, one challenge they will likely run into is the need to stream large amounts of bytes. The GCS Java SDK has no client-side limits on how large objects or their streams can be, which can be a challenge for Netty, since Netty operates primarily on ByteBufs. I know of users who upload multi-gigabyte objects on a single stream, and similarly read many gigabytes in streaming mode, often with application backpressure.
[1] https://cloud.google.com/storage/docs/enable-grpc-api
[2] https://github.com/googleapis/java-storage/tree/main/google-cloud-storage/src/test/java/com/google/cloud/storage
[3] https://github.com/googleapis/java-storage/blob/main/google-cloud-storage/src/main/java/com/google/cloud/storage/HttpStorageOptions.java#L341-L343
@agrawal-siddharth Thank you for the comments. The intent here is to reduce CPU usage and byte copying when loading from GCS, so the main change is the ability to offload SSL handling to OpenSSL; SSL handling is dominant in the CPU usage graphs. Also, the Java HTTP client used by the GCS client by default is not easy to control, so this provides better control. I'll definitely look into the gRPC option, which seems to be generally available now and would provide a single socket carrying multiple streams. The files are not very large: in the benchmarks I have run they vary between 2 MiB and 8 MiB, with an upper limit of 16 MiB. But I agree that for the general use case this must be able to handle large files.
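For reference, switching to the gRPC transport mentioned above is a small change in the Java SDK. A sketch, assuming the google-cloud-storage gRPC artifacts are on the classpath:

```java
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageOptions;

public class GrpcStorageSketch {
    public static Storage storage() {
        // gRPC transport: a single HTTP/2 connection multiplexing many streams,
        // instead of a pool of HTTP/1.1 connections.
        return StorageOptions.grpc().build().getService();
    }
}
```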
This PR is being marked as stale since it has not had any activity in 90 days. If you are having difficulty finding a reviewer, please reach out on the [mailing list](https://kafka.apache.org/contact). If this PR is no longer valid or desired, please feel free to close it. If no activity occurs in the next 30 days, it will be automatically closed.