-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: an edge case of kill connctions casued by unnecessary shared_ptr ref count #2904
fix: an edge case of kill connctions casued by unnecessary shared_ptr ref count #2904
Conversation
WalkthroughThe changes involve the modification of the Changes
Sequence Diagram(s)sequenceDiagram
participant WorkerThread
participant NetConn
WorkerThread->>NetConn: Create instance (previous flow)
Note over WorkerThread: `in_conn` is accessible throughout the function
WorkerThread->>NetConn: Create instance (new flow)
Note over WorkerThread: `in_conn` is only accessible within the `else` block
Poem
Recent review detailsConfiguration used: CodeRabbit UI Files selected for processing (1)
Additional comments not posted (1)
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
观察到一个edge case:
最后发生活动那个conn,在发生timeout kill,或者被client kill命令杀连接之后,client确实无法继续通信了,但是server端也没有及时close fd,需要等到后面有新的网络收发包发生,前面这个conn的fd才会被关闭,进一步观察发现,最后发生活动的这个conn其会拥有一个额外的shared_ptr引用计数, 导致pika无法及时将其关闭(目前pika判断shared_ptr的ref count为1后才会close connction.fd)
原因:
发现罪魁祸首是epoll循环中的std::shared_ptr in_conn,该对象用于每次有事件就绪时,临时去指向就绪的conn对象,但处理完事件之后又不置空,所以哪怕离开了事件处理分支(处理完了本次事件),他也会指向最后发生事件的那个conn对象,导致其引用计数加1。
解决:
更改in_conn定义的位置,该变量只在事件处理分支中会被用到,那就在该分支内定义它即可,确保离开该分支后in_conn对象也会析构,释放引用计数。
Observation of an edge case: After a connection (conn) that was last active experiences a timeout or is killed by a client command, the client indeed cannot continue communication. However, the server side does not immediately close the file descriptor (fd); it has to wait until there is a new connection or network packet transfer, at which point the fd for the previous conn will be closed. Further observation reveals that the last active conn has an additional
shared_ptr
reference count, which prevents Pika from closing it promptly (currently, Pika only closes the fd when the ref count is 1).Cause: The culprit was identified as the
std::shared_ptr<NetConn> in_conn
in the epoll loop. This object is used to temporarily point to the ready conn object whenever an event is ready. However, it is not reset after processing the event, so even after leaving the event handling branch (after processing the current event), it continues to point to the last active conn, increasing its reference count by 1.Solution: Change the definition of
in_conn
to be within the event handling branch. This way, it will only be used in that branch, ensuring that once it exits the branch, thein_conn
object will be destructed, releasing the reference count.