ensure WinSyncHttpClient aws-chunk size is at least 8 KB #2893
Issue #, if available: N/A
Description of changes:
Documentation for aws-chunked encoding is pretty scarce; the best article I could find - https://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-streaming.html - says the chunk size must be at least 8 KB.
In WinSyncHttpClient, the chunk size is effectively defined by bytesToRead. Before the fix, the client subtracted 8 bytes of chunk metadata from an already allocated 8 KB buffer, so the resulting chunk size fell below the 8 KB minimum (8192 - 8 = 8184 bytes).
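For reference, the 8 bytes of metadata correspond, as far as I understand the unsigned-payload framing described in the linked doc, to the per-chunk length prefix and trailing delimiter:

```
2000\r\n               <- hex chunk size ("2000" = 8192) plus CRLF: 6 bytes
<8192 payload bytes>
\r\n                   <- trailing CRLF: 2 bytes
```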
And indeed, the AWS S3 service endpoint returned an error when the SDK's PutObject() performed a multi-chunk upload (with flexible checksums enabled and an object size > 8184 bytes):
trace log snippet
The proposed fix adds those 8 bytes at buffer-allocation time.
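Roughly, the change looks like this (a minimal sketch of the fragment inside the streaming loop; the constant names and exact surrounding code here are assumptions, not the literal diff):

```cpp
// Assumed constants for illustration only.
static const size_t AWS_CHUNK_PAYLOAD_SIZE = 8 * 1024; // minimum chunk size per the S3 sigv4-streaming doc
static const size_t AWS_CHUNK_METADATA_SIZE = 8;       // "2000\r\n" prefix + trailing "\r\n"

// Before: metadata was carved out of the 8 KB buffer itself,
// shrinking the payload below the minimum.
// char streamBuffer[AWS_CHUNK_PAYLOAD_SIZE];
// size_t bytesToRead = AWS_CHUNK_PAYLOAD_SIZE - AWS_CHUNK_METADATA_SIZE; // 8184 < 8 KB

// After: the buffer grows by the metadata size, so the chunk payload
// keeps the full 8 KB.
char streamBuffer[AWS_CHUNK_PAYLOAD_SIZE + AWS_CHUNK_METADATA_SIZE];
size_t bytesToRead = AWS_CHUNK_PAYLOAD_SIZE; // full 8192-byte chunk payload
```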
Because WinSyncHttpClient::StreamPayloadToRequest() currently handles aws-chunked encoding, its streamBuffer has to accommodate both chunked and non-chunked payloads. The tradeoff for the non-chunked case is an extra 8 bytes allocated on the stack; addressing that with alloca() or a vector would create more problems than it solves.
A possible alternative is to issue multiple DoWriteData() calls, or even to move aws-chunked content-encoding handling into DoWriteData(), as is already done for chunked transfer encoding; but that would be a bigger change.
Another alternative, easier to implement but probably harder to justify, is simply:
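A minimal sketch of that variant, with the same assumed names as above:

```cpp
// Larger-buffer variant: no careful accounting, just leave plenty of headroom.
char streamBuffer[64 * 1024];                                          // 64 KB
size_t bytesToRead = sizeof(streamBuffer) - AWS_CHUNK_METADATA_SIZE;   // still well above the 8 KB minimum
```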
Basically, use any buffer length that is bigger than 8 KB plus the chunk metadata length. This is how the CURL client works despite having the same logic; I just took 64 KB because it is the size recommended in the linked AWS doc.
I think this can only be tested with integration tests. Personally, I tested with a modified BucketAndObjectOperationTest.TestFlexibleChecksums; if that is OK, I can add it to this PR as well.
Check all that apply:
Check which platforms you have built the SDK on to verify the correctness of this PR.
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.