-
Notifications
You must be signed in to change notification settings - Fork 1
IA-4893: don't materialize the entire response #62
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
when(targetHttpResponse.getStatusCode()).thenReturn(200); | ||
when(context.getResponse()).thenReturn(listenerResponse); | ||
when(targetHttpResponse.getCallerResponseOutputStream()).thenReturn(responseStream); | ||
|
||
processor.writeTargetResponseOnCaller(targetHttpResponse); | ||
|
||
verify(responseStream).close(); | ||
verify(mockBody).close(); | ||
verify(spyBody).close(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what was going on with this test. It previous used mockBody
, which was a @Mock InputStream
… but at no point was that mock set up to contain any content or have its .read()
etc. methods give any responses. This is also the only test that used mockBody
.
So, I changed it to use a spy with some actual content, and made the spy local to the test.
EDIT: I should note that before the spy-related changes, this test failed in the presence of the runtime StreamUtils.copy
change
|
||
assertThat(new String(responseData.getValue()), equalTo(BODY_CONTENT)); | ||
// StreamUtils.copy buffers its writes in arrays of 4096 bytes (by default). Since this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Naive question, but would it be possible / useful to set up a test response that is on the order of size of what triggered the OOM?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added new unit tests for this.
This approach looks good to me. |
|
logging: | ||
level: | ||
com.microsoft.azure.relay.RelayLogger: WARN | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewers: how do you feel about this change?
Before this PR, downloading my test TSV generated ~24 lines of logs:
-05-14 15:06:54.496 INFO 1 --- [t@1995042520-31] com.microsoft.azure.relay.RelayLogger : ClientWebSocket(TrackingId:f815e59d-af78-4777-979c-06d0a0651e6e, Address:sb://lz44191250ee6bd7010acb08e426ed275d637ed9148c147a2e.servicebus.windows.net): received text from remote. Total length: 2900
2024-05-14 15:06:54.496 INFO 1 --- [t@1995042520-31] com.microsoft.azure.relay.RelayLogger : HybridHttpConnection(TrackingId:384de61a-5f6a-42ab-a838-dba2ed2da761_G3, Address:https://lz44191250ee6bd7010acb08e426ed275d637ed9148c147a2e.servicebus.windows.net): request initializing.
2024-05-14 15:06:54.498 INFO 1 --- [ Thread-35] com.microsoft.azure.relay.RelayLogger : HybridHttpConnection(TrackingId:384de61a-5f6a-42ab-a838-dba2ed2da761_G3, Address:https://lz44191250ee6bd7010acb08e426ed275d637ed9148c147a2e.servicebus.windows.net): Request method: GET.
2024-05-14 15:06:54.498 INFO 1 --- [ Thread-35] com.microsoft.azure.relay.RelayLogger : HybridHttpConnection(TrackingId:384de61a-5f6a-42ab-a838-dba2ed2da761_G3, Address:https://lz44191250ee6bd7010acb08e426ed275d637ed9148c147a2e.servicebus.windows.net): Invoking user RequestHandler.
2024-05-14 15:06:54.499 INFO 1 --- [oundedElastic-2] o.b.l.r.inspectors.InspectorsProcessor : Inspection result for the HTTP request. Result: true, URI:sb://lz44191250ee6bd7010acb08e426ed275d637ed9148c147a2e.servicebus.windows.net/wds-fb5608c8-66ee-47f6-9ce2-e663fec12442-fb5608c8-66ee-47f6-9ce2-e663fec12442/fb5608c8-66ee-47f6-9ce2-e663fec12442/tsv/v0.2/sample
2024-05-14 15:06:54.542 INFO 1 --- [oundedElastic-5] o.b.l.relay.inspectors.RequestLogger : RELAY_REQUEST_RESPONSE - 70.16.76.24 "9ec01dd3-ddec-4273-8e34-39fdd2a480ab" "-" "davidan.dev@gmail.com" "-" [2024-05-14T15:06:54.540322994Z] "GET sb://lz44191250ee6bd7010acb08e426ed275d637ed9148c147a2e.servicebus.windows.net/wds-fb5608c8-66ee-47f6-9ce2-e663fec12442-fb5608c8-66ee-47f6-9ce2-e663fec12442/fb5608c8-66ee-47f6-9ce2-e663fec12442/tsv/v0.2/sample" 200 "https://bvdp-saturn-dev.appspot.com/" "https://bvdp-saturn-dev.appspot.com" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
2024-05-14 15:06:55.007 INFO 1 --- [oundedElastic-5] com.microsoft.azure.relay.RelayLogger : HybridHttpConnection(TrackingId:384de61a-5f6a-42ab-a838-dba2ed2da761_G3, Address:https://lz44191250ee6bd7010acb08e426ed275d637ed9148c147a2e.servicebus.windows.net)+ResponseStream+ResponseStream: WriteAsync(count=1963561)
2024-05-14 15:06:55.007 INFO 1 --- [oundedElastic-5] com.microsoft.azure.relay.RelayLogger : HybridHttpConnection(TrackingId:384de61a-5f6a-42ab-a838-dba2ed2da761_G3, Address:https://lz44191250ee6bd7010acb08e426ed275d637ed9148c147a2e.servicebus.windows.net)+ResponseStream+ResponseStream: FlushCoreAsync(reason=BUFFER_FULL)
2024-05-14 15:06:55.007 INFO 1 --- [oundedElastic-5] com.microsoft.azure.relay.RelayLogger : HybridHttpConnection(TrackingId:384de61a-5f6a-42ab-a838-dba2ed2da761_G3, Address:https://lz44191250ee6bd7010acb08e426ed275d637ed9148c147a2e.servicebus.windows.net): Creating the rendezvous connection.
2024-05-14 15:06:55.009 INFO 1 --- [ool-1-thread-16] com.microsoft.azure.relay.RelayLogger : ClientWebSocket(TrackingId:384de61a-5f6a-42ab-a838-dba2ed2da761_G3, Address:https://lz44191250ee6bd7010acb08e426ed275d637ed9148c147a2e.servicebus.windows.net) is connecting.
2024-05-14 15:06:55.406 INFO 1 --- [t@263943838-114] com.microsoft.azure.relay.RelayLogger : ClientWebSocket(TrackingId:384de61a-5f6a-42ab-a838-dba2ed2da761_G3, Address:https://lz44191250ee6bd7010acb08e426ed275d637ed9148c147a2e.servicebus.windows.net) is connected.
2024-05-14 15:06:55.407 INFO 1 --- [ Thread-44] com.microsoft.azure.relay.RelayLogger : HybridHttpConnection(TrackingId:384de61a-5f6a-42ab-a838-dba2ed2da761_G3, Address:https://lz44191250ee6bd7010acb08e426ed275d637ed9148c147a2e.servicebus.windows.net): Sending the response command on the rendezvous connection, status: 200.
2024-05-14 15:06:55.407 INFO 1 --- [ Thread-44] com.microsoft.azure.relay.RelayLogger : ClientWebSocket(TrackingId:384de61a-5f6a-42ab-a838-dba2ed2da761_G3, Address:https://lz44191250ee6bd7010acb08e426ed275d637ed9148c147a2e.servicebus.windows.net): starting to write to remote. Writemode: TEXT
2024-05-14 15:06:55.408 INFO 1 --- [ool-1-thread-18] com.microsoft.azure.relay.RelayLogger : ClientWebSocket(TrackingId:384de61a-5f6a-42ab-a838-dba2ed2da761_G3, Address:https://lz44191250ee6bd7010acb08e426ed275d637ed9148c147a2e.servicebus.windows.net): finished writing 774 bytes to remote.
2024-05-14 15:06:55.504 INFO 1 --- [ool-1-thread-18] com.microsoft.azure.relay.RelayLogger : HybridHttpConnection(TrackingId:384de61a-5f6a-42ab-a838-dba2ed2da761_G3, Address:https://lz44191250ee6bd7010acb08e426ed275d637ed9148c147a2e.servicebus.windows.net): Finished sending the response command on the rendezvous connection, status: 200.
2024-05-14 15:06:55.504 INFO 1 --- [ool-1-thread-18] com.microsoft.azure.relay.RelayLogger : ClientWebSocket(TrackingId:384de61a-5f6a-42ab-a838-dba2ed2da761_G3, Address:https://lz44191250ee6bd7010acb08e426ed275d637ed9148c147a2e.servicebus.windows.net): starting to write to remote. Writemode: BINARY
2024-05-14 15:06:56.010 INFO 1 --- [ool-1-thread-18] com.microsoft.azure.relay.RelayLogger : ClientWebSocket(TrackingId:384de61a-5f6a-42ab-a838-dba2ed2da761_G3, Address:https://lz44191250ee6bd7010acb08e426ed275d637ed9148c147a2e.servicebus.windows.net): finished writing 1963561 bytes to remote.
2024-05-14 15:06:56.010 INFO 1 --- [ool-1-thread-18] com.microsoft.azure.relay.RelayLogger : HybridHttpConnection(TrackingId:384de61a-5f6a-42ab-a838-dba2ed2da761_G3, Address:https://lz44191250ee6bd7010acb08e426ed275d637ed9148c147a2e.servicebus.windows.net): Sending 0 bytes on the rendezvous connection
2024-05-14 15:06:56.010 INFO 1 --- [oundedElastic-5] com.microsoft.azure.relay.RelayLogger : HybridHttpConnection(TrackingId:384de61a-5f6a-42ab-a838-dba2ed2da761_G3, Address:https://lz44191250ee6bd7010acb08e426ed275d637ed9148c147a2e.servicebus.windows.net)+ResponseStream is closing.
2024-05-14 15:06:56.010 INFO 1 --- [oundedElastic-5] com.microsoft.azure.relay.RelayLogger : HybridHttpConnection(TrackingId:384de61a-5f6a-42ab-a838-dba2ed2da761_G3, Address:https://lz44191250ee6bd7010acb08e426ed275d637ed9148c147a2e.servicebus.windows.net) is closing.
2024-05-14 15:06:56.010 INFO 1 --- [oundedElastic-5] com.microsoft.azure.relay.RelayLogger : ClientWebSocket(TrackingId:384de61a-5f6a-42ab-a838-dba2ed2da761_G3, Address:https://lz44191250ee6bd7010acb08e426ed275d637ed9148c147a2e.servicebus.windows.net): is closing. Close reason: NormalClosure
2024-05-14 15:06:57.425 INFO 1 --- [t@263943838-112] com.microsoft.azure.relay.RelayLogger : ClientWebSocket(TrackingId:384de61a-5f6a-42ab-a838-dba2ed2da761_G3, Address:https://lz44191250ee6bd7010acb08e426ed275d637ed9148c147a2e.servicebus.windows.net): is closed. Close reason: Normal Closure
2024-05-14 15:06:57.425 INFO 1 --- [t@263943838-112] com.microsoft.azure.relay.RelayLogger : HybridHttpConnection(TrackingId:384de61a-5f6a-42ab-a838-dba2ed2da761_G3, Address:https://lz44191250ee6bd7010acb08e426ed275d637ed9148c147a2e.servicebus.windows.net) is closed.
2024-05-14 15:06:57.426 INFO 1 --- [oundedElastic-5] o.b.l.r.t.RelayedRequestPipeline : Processed request with the following result: SUCCESS
After this PR, since we buffer writes to the caller, my test TSV exploded into 8359 lines of logs, with a ton of repetition like:
2024-05-14 15:03:27.710 INFO 1 --- [pool-1-thread-8] com.microsoft.azure.relay.RelayLogger : HybridHttpConnection(TrackingId:1e926a8c-88ad-443f-ac2d-9d13828cda7f_G26, Address:https://lz44191250ee6bd7010acb08e426ed275d637ed9148c147a2e.servicebus.windows.net): Sending 0 bytes on the rendezvous connection
2024-05-14 15:03:27.710 INFO 1 --- [oundedElastic-3] com.microsoft.azure.relay.RelayLogger : HybridHttpConnection(TrackingId:1e926a8c-88ad-443f-ac2d-9d13828cda7f_G26, Address:https://lz44191250ee6bd7010acb08e426ed275d637ed9148c147a2e.servicebus.windows.net)+ResponseStream+ResponseStream: WriteAsync(count=975)
2024-05-14 15:03:27.710 INFO 1 --- [oundedElastic-3] com.microsoft.azure.relay.RelayLogger : ClientWebSocket(TrackingId:1e926a8c-88ad-443f-ac2d-9d13828cda7f_G26, Address:https://lz44191250ee6bd7010acb08e426ed275d637ed9148c147a2e.servicebus.windows.net): starting to write to remote. Writemode: BINARY
2024-05-14 15:03:27.710 INFO 1 --- [pool-1-thread-7] com.microsoft.azure.relay.RelayLogger : ClientWebSocket(TrackingId:1e926a8c-88ad-443f-ac2d-9d13828cda7f_G26, Address:https://lz44191250ee6bd7010acb08e426ed275d637ed9148c147a2e.servicebus.windows.net): finished writing 975 bytes to remote.
2024-05-14 15:03:27.710 INFO 1 --- [pool-1-thread-7] com.microsoft.azure.relay.RelayLogger : HybridHttpConnection(TrackingId:1e926a8c-88ad-443f-ac2d-9d13828cda7f_G26, Address:https://lz44191250ee6bd7010acb08e426ed275d637ed9148c147a2e.servicebus.windows.net): Sending 0 bytes on the rendezvous connection
2024-05-14 15:03:27.710 INFO 1 --- [oundedElastic-3] com.microsoft.azure.relay.RelayLogger : HybridHttpConnection(TrackingId:1e926a8c-88ad-443f-ac2d-9d13828cda7f_G26, Address:https://lz44191250ee6bd7010acb08e426ed275d637ed9148c147a2e.servicebus.windows.net)+ResponseStream+ResponseStream: WriteAsync(count=973)
2024-05-14 15:03:27.710 INFO 1 --- [oundedElastic-3] com.microsoft.azure.relay.RelayLogger : ClientWebSocket(TrackingId:1e926a8c-88ad-443f-ac2d-9d13828cda7f_G26, Address:https://lz44191250ee6bd7010acb08e426ed275d637ed9148c147a2e.servicebus.windows.net): starting to write to remote. Writemode: BINARY
2024-05-14 15:03:27.710 INFO 1 --- [pool-1-thread-8] com.microsoft.azure.relay.RelayLogger : ClientWebSocket(TrackingId:1e926a8c-88ad-443f-ac2d-9d13828cda7f_G26, Address:https://lz44191250ee6bd7010acb08e426ed275d637ed9148c147a2e.servicebus.windows.net): finished writing 973 bytes to remote.
2024-05-14 15:03:27.710 INFO 1 --- [pool-1-thread-8] com.microsoft.azure.relay.RelayLogger : HybridHttpConnection(TrackingId:1e926a8c-88ad-443f-ac2d-9d13828cda7f_G26, Address:https://lz44191250ee6bd7010acb08e426ed275d637ed9148c147a2e.servicebus.windows.net): Sending 0 bytes on the rendezvous connection
2024-05-14 15:03:27.710 INFO 1 --- [oundedElastic-3] com.microsoft.azure.relay.RelayLogger : HybridHttpConnection(TrackingId:1e926a8c-88ad-443f-ac2d-9d13828cda7f_G26, Address:https://lz44191250ee6bd7010acb08e426ed275d637ed9148c147a2e.servicebus.windows.net)+ResponseStream+ResponseStream: WriteAsync(count=975)
2024-05-14 15:03:27.710 INFO 1 --- [oundedElastic-3] com.microsoft.azure.relay.RelayLogger : ClientWebSocket(TrackingId:1e926a8c-88ad-443f-ac2d-9d13828cda7f_G26, Address:https://lz44191250ee6bd7010acb08e426ed275d637ed9148c147a2e.servicebus.windows.net): starting to write to remote. Writemode: BINARY
2024-05-14 15:03:27.804 INFO 1 --- [pool-1-thread-7] com.microsoft.azure.relay.RelayLogger : ClientWebSocket(TrackingId:1e926a8c-88ad-443f-ac2d-9d13828cda7f_G26, Address:https://lz44191250ee6bd7010acb08e426ed275d637ed9148c147a2e.servicebus.windows.net): finished writing 975 bytes to remote.
2024-05-14 15:03:27.804 INFO 1 --- [pool-1-thread-7] com.microsoft.azure.relay.RelayLogger : HybridHttpConnection(TrackingId:1e926a8c-88ad-443f-ac2d-9d13828cda7f_G26, Address:https://lz44191250ee6bd7010acb08e426ed275d637ed9148c147a2e.servicebus.windows.net): Sending 0 bytes on the rendezvous connection
2024-05-14 15:03:27.804 INFO 1 --- [oundedElastic-3] com.microsoft.azure.relay.RelayLogger : HybridHttpConnection(TrackingId:1e926a8c-88ad-443f-ac2d-9d13828cda7f_G26, Address:https://lz44191250ee6bd7010acb08e426ed275d637ed9148c147a2e.servicebus.windows.net)+ResponseStream+ResponseStream: WriteAsync(count=452)
2024-05-14 15:03:27.804 INFO 1 --- [oundedElastic-3] com.microsoft.azure.relay.RelayLogger : ClientWebSocket(TrackingId:1e926a8c-88ad-443f-ac2d-9d13828cda7f_G26, Address:https://lz44191250ee6bd7010acb08e426ed275d637ed9148c147a2e.servicebus.windows.net): starting to write to remote. Writemode: BINARY
But with this change to logging level, it is now 3 lines:
2024-05-14 16:46:32.015 INFO 1 --- [oundedElastic-2] o.b.l.r.inspectors.InspectorsProcessor : Inspection result for the HTTP request. Result: true, URI:sb://lz44191250ee6bd7010acb08e426ed275d637ed9148c147a2e.servicebus.windows.net/wds-fb5608c8-66ee-47f6-9ce2-e663fec12442-fb5608c8-66ee-47f6-9ce2-e663fec12442/fb5608c8-66ee-47f6-9ce2-e663fec12442/tsv/v0.2/sample
2024-05-14 16:46:32.049 INFO 1 --- [oundedElastic-4] o.b.l.relay.inspectors.RequestLogger : RELAY_REQUEST_RESPONSE - 70.16.76.24 "9ec01dd3-ddec-4273-8e34-39fdd2a480ab" "-" "davidan.dev@gmail.com" "-" [2024-05-14T16:46:32.047737387Z] "GET sb://lz44191250ee6bd7010acb08e426ed275d637ed9148c147a2e.servicebus.windows.net/wds-fb5608c8-66ee-47f6-9ce2-e663fec12442-fb5608c8-66ee-47f6-9ce2-e663fec12442/fb5608c8-66ee-47f6-9ce2-e663fec12442/tsv/v0.2/sample" 200 "https://bvdp-saturn-dev.appspot.com/" "https://bvdp-saturn-dev.appspot.com" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
2024-05-14 16:46:38.367 INFO 1 --- [oundedElastic-4] o.b.l.r.t.RelayedRequestPipeline : Processed request with the following result: SUCCESS
We have less info for debugging, but also we reduce log ingestion cost to the end user and keep our logs less cluttered.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the other recent issues around log costs, this seems like a good idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense to me, there is no way that 8000+ lines of repetitive logs will be useful for debugging anyways
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks for adding the couple of test cases based on the size of the request, and the change to only log warnings make sense to me too.
Jira Ticket: https://broadworkbench.atlassian.net/browse/IA-4893
Summary of changes:
Eliminate code that materializes the entire HTTP response body into a Java variable; hopefully this eliminates OutOfMemory errors in the listener.
What
When writing the HTTP response to the caller, the previous code used
InputStream.readAllBytes()
, which creates a single byte array containing the entire HTTP response. In this PR, we useStreamUtils.copy
instead, which works in 4096-byte buffers. Each buffer is read from the response, written to the caller, and then discarded.Why
We have seen multiple instances of the listener crashing with OOMs, especially when WDS has a large response such as a TSV file. This code change is a hopeful fix for that.
Testing strategy
kubectl edit
-ing these changes into a running WDS, downloading a largeish TSV, and watching listener logs