Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix connection timeout when retries are set to none #1650

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

afeick
Copy link

@afeick afeick commented Sep 20, 2023

Motivation:

Connection timeout is ignored on last retry or when retries are set to none (issue #1649)

Modifications:

Use the minimumConnectionTimeout value from the ConnectionBackoff directly instead of the ConnectionBackoffIterator

Results:

Connection timeout is honored across all connection attempts.

@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Sep 20, 2023

CLA Signed

The committers listed above are authorized under a signed CLA.

Copy link
Collaborator

@glbrntt glbrntt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for the delay in taking a look at this.

This is slightly more awkward than I first realised. As you pointed out there are two cases we need to handle:

  • never retrying, and
  • the final retry

Dealing with never retrying is fine: we can do as you've proposed by getting the minimumConnectionTimeout in the idle case and using that instead of the value from the iterator. That works because the value from the iterator will be the same as minimumConnectionTimeout.

For the final retry it's a bit difficult: we want the iterator to provide the connect timeout but not the backoff. The API for the iterator will always return both to us unfortunately and we can't change that without breaking API. Two options come to mind:

  1. Accept that the connect timeout for the last retry is still wrong and that we're only fixing the case where we never retry. (This is okay: strictly an improvement on what we have now.)
  2. Return a sentinel value from the iterator and pick that up in startConnecting and set the Reconnect to none in that case.

I'm happy with either.

@@ -985,7 +992,7 @@ extension ConnectionManager {
let channel: EventLoopFuture<Channel> = self.channelProvider.makeChannel(
managedBy: self,
onEventLoop: self.eventLoop,
connectTimeout: timeoutAndBackoff.map { .seconds(timeInterval: $0.timeout) },
connectTimeout: connectTimeout,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't have a connectTimeout we should fallback to using timeoutAndBackoff.map { .seconds(timeInterval: $0.timeout) }

)

case let .transientFailure(pending):
self.startConnecting(
backoffIterator: pending.backoffIterator,
muxPromise: pending.readyChannelMuxPromise
muxPromise: pending.readyChannelMuxPromise,
connectTimeout: timeout
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to do this for the transientFailure case: transientFailure means we tried to connect and fail, in which case we'll already have the backoffIterator and we should use the values from that instead.

@afeick
Copy link
Author

afeick commented Sep 22, 2023

This is slightly more awkward than I first realised.

Same. I missed that the timeout can increase with the backoff. Not sure how I didn't see that because I was looking for where it might change, given the variable name minimumConnectionTimeout.

@glbrntt
Copy link
Collaborator

glbrntt commented Sep 25, 2023

This is slightly more awkward than I first realised.

Same. I missed that the timeout can increase with the backoff. Not sure how I didn't see that because I was looking for where it might change, given the variable name minimumConnectionTimeout.

Yeah, it's only really an issue for the final value though; the solution I outlined above should be sufficient here.

Comment on lines +138 to +141
// limit is reached, return an element with only a timeout.
case let .limited(limit) where limit == 0:
self.connectionBackoff.retries.limit = .limited(limit - 1)
return self.makeElement(backoff: nil)
Copy link
Author

@afeick afeick Oct 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@glbrntt What do you think about using a nil Element.backoff as the sentinel for the iterator?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It changes the Element type which is an API breaking change so we can't do it unfortunately. Otherwise it'd be a good solution.

@@ -79,19 +79,19 @@ class ConnectionBackoffTests: GRPCTestCase {
for limit in [1, 3, 5] {
let backoff = ConnectionBackoff(retries: .upTo(limit))
let values = Array(backoff)
XCTAssertEqual(values.count, limit)
XCTAssertEqual(values.count, limit+1)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The value count would make more sense if the ConnectionBackoff took a value called tries or attempts instead of retries, but I didn't think it was a good idea to make a breaking change like that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants