Add timeout to SSE #8565
/**
 * Seconds elapsed between 2 events until connection failed. Doesn't timeout if null
 */
open var timeout: Long? = null
From Call, I think there is a convention on timeout names, such as readTimeoutMillis. So perhaps receiveTimeoutMillis?
I wonder if we should prefer Kotlin Duration here also?
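To make the naming and Duration suggestion concrete, here is a minimal sketch. TimedEventListener, receiveTimeout, and the receiveTimeoutMillis() helper are hypothetical names used for illustration only, not the API of this PR or of OkHttp. It assumes the OkHttp convention that a millisecond value of 0 means "no timeout".

```kotlin
import kotlin.time.Duration
import kotlin.time.Duration.Companion.seconds

/** Hypothetical listener shape; the property name follows the one floated in review. */
abstract class TimedEventListener {
  /** Maximum gap allowed between two events; null disables the timeout. */
  open var receiveTimeout: Duration? = null
}

/**
 * Converts the optional Duration into a millisecond count for `*TimeoutMillis` style
 * code paths. Assumption: 0 is used to mean "no timeout".
 */
fun TimedEventListener.receiveTimeoutMillis(): Long =
  receiveTimeout?.inWholeMilliseconds ?: 0L

/** Example: a listener that tolerates at most 30 seconds of silence. */
fun thirtySecondListener(): TimedEventListener =
  object : TimedEventListener() {
    override var receiveTimeout: Duration? = 30.seconds
  }
```

With a Duration-typed property the unit lives in the type, so the doc comment no longer has to say whether the value is seconds or milliseconds.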
Internally we use callTimeout in RealEventSource to track establishment, and it's obviously an existing top-level thing in OkHttp that extends to the end of a request.
"read" feels too focused on socket I/O.
@@ -40,6 +44,18 @@ internal class RealEventSource(
    }
  }

  private fun updateTimeout(call: Call?, duration: Duration) {
    if (call?.timeout() is AsyncTimeout) {
I'm a bit nervous about reusing call.timeout() here, for two reasons:
- It isn't clear that we own it; some other component or an interceptor may be using it.
- The cast seems unsafe, whereas if we create the timeout ourselves, we know its type and can set the duration directly.
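One way to read "create it ourselves" is an idle timer owned by the event source and restarted on every event. The sketch below is only an illustration using the JDK scheduler; IdleWatchdog and demoIdleTimeout are hypothetical names, and this is not the approach taken in the PR nor an OkHttp API.

```kotlin
import java.util.concurrent.CountDownLatch
import java.util.concurrent.Executors
import java.util.concurrent.ScheduledFuture
import java.util.concurrent.TimeUnit
import kotlin.time.Duration
import kotlin.time.Duration.Companion.milliseconds

/** Hypothetical helper: runs [onIdle] if [reset] is not called again within [idle]. */
class IdleWatchdog(
  private val idle: Duration,
  private val onIdle: () -> Unit,
) {
  // Daemon thread so a forgotten watchdog never keeps the JVM alive.
  private val scheduler = Executors.newSingleThreadScheduledExecutor { r ->
    Thread(r, "idle-watchdog").apply { isDaemon = true }
  }
  private var pending: ScheduledFuture<*>? = null

  /** Call when the stream opens and after each event to restart the countdown. */
  @Synchronized
  fun reset() {
    pending?.cancel(false)
    pending = scheduler.schedule(
      Runnable { onIdle() },
      idle.inWholeMilliseconds,
      TimeUnit.MILLISECONDS,
    )
  }

  /** Call when the stream closes; the watchdog must not be reset afterwards. */
  @Synchronized
  fun cancel() {
    pending?.cancel(false)
    scheduler.shutdownNow()
  }
}

/** Demo: with no further resets, the callback fires after the 50 ms window. */
fun demoIdleTimeout(): Boolean {
  val fired = CountDownLatch(1)
  val watchdog = IdleWatchdog(50.milliseconds) { fired.countDown() }
  watchdog.reset()
  val timedOut = fired.await(2, TimeUnit.SECONDS)
  watchdog.cancel()
  return timedOut
}
```

In an event source, the callback would presumably cancel the call and report a failure to the listener. Because the timer is created locally, no cast is needed and nothing else can be sharing it. A task that has already started running when reset() is called can still fire, so a real implementation would need to handle that race.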
What do you suggest, then?
On the first point: currently (without the PR), we already cancel this exact timeout (L69), so the concern applies equally to the current state.
On the second: the check prevents an unsafe cast.
Gentle ping on the question above; I'm using this PR in a project.
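Purpose of the change
The main purpose of this PR is to add a timeout mechanism to the EventSourceListener class, letting users set a maximum interval between two events. If no new event arrives within that interval, the connection is treated as failed. The PR also improves some code comments, adding descriptions of the onEvent and onClosed methods to make them clearer.
Issues found
- The timeout implementation may be incomplete: in the RealEventSource class, the timeout relies on AsyncTimeout and is only updated in onResponse and processNextEvent. If event handling is complex or slow, the timeout mechanism may not behave as intended.
- The timeout unit is unclear: in EventSourceListener, the timeout property is typed as Long, with no explicit indication of whether the value is in seconds or milliseconds, which may cause confusion for callers.
- The timeout cancellation logic may be flawed: in onResponse, if listener.timeout is null, call?.timeout()?.cancel() is called to cancel the timeout. However, if timeout is later set to a non-null value, the timeout mechanism may not work correctly.
Suggestions
- Make the timeout unit explicit: state in the doc comment of the timeout property of EventSourceListener that the unit is seconds, or use the Duration type in the code to avoid confusion.
- Improve the timeout implementation: add more thorough checks in the RealEventSource class so that the timeout is updated correctly on every event-handling path. Consider adding a timeout check in processNextEvent so that the mechanism still works when event handling is slow.
- Improve the cancellation logic: in onResponse, check the timeout property so that the timeout is cancelled correctly when it is null and later changes to it are handled properly. Consider resetting the timeout whenever the timeout property changes.
These changes would help ensure the reliability of the timeout mechanism and the maintainability of the code.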
This PR adds the ability to time out an SSE connection. The connection can now fail as soon as no event arrives within the configured timeout, making it easier to detect a lost connection.
A real-world example: I have an app connected to a server that sends a ping at a keepalive interval. A job periodically checks that everything works as expected (it uses a worker, so the period is 15 min, the minimum). If the connection is lost at the beginning of that interval, users stay disconnected for several minutes.
With this PR we can catch a lost connection directly: https://codeberg.org/NextPush/nextpush-android/commit/f6edb178ddee8255e9de6e548ea75f5d3eb6b32a
PS: I've also added some comments, but I can remove that commit from the branch.