-
Notifications
You must be signed in to change notification settings - Fork 62
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid accessing a null ClientConnection instance #317
Merged
BewareMyPower
merged 1 commit into
apache:main
from
BewareMyPower:bewaremypower/refactor-cnx-ref
Sep 20, 2023
Merged
Avoid accessing a null ClientConnection instance #317
BewareMyPower
merged 1 commit into
apache:main
from
BewareMyPower:bewaremypower/refactor-cnx-ref
Sep 20, 2023
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
BewareMyPower
requested review from
merlimat,
shibd and
RobertIndie
and removed request for
merlimat and
shibd
September 18, 2023 13:45
### Motivation We observed server null `ClientConnection` accesses in test environment. See the `this=0x0` outputs in the following two typical stacks. ``` #8 bytesWritten (this=0xb8, size=371) at lib/SharedBuffer.h:166 #9 pulsar::ClientConnection::handleRead (this=0x0, err=..., bytesTransferred=371, minReadSize=4) at lib/ClientConnection.cc:609 ``` ``` #12 0x00007f33202933d2 in unique_lock (__m=..., this=0x7f3311c82800) at /opt/rh/devtoolset-7/root/usr/include/c++/7/bits/std_mutex.h:197 #13 pulsar::ClientConnection::sendPendingCommands (this=0x0) at lib/ClientConnection.cc:1071 #14 0x00007f3320293d2d in pulsar::ClientConnection::handleSendPair (this=0x0, err=...) at lib/ClientConnection.cc:1066 ``` Though `shared_from_this()` is always passed to the `std::bind` function, when the method of `ClientConnection` is called, the pointer is still `null`. ### Modifications First, replace all `std::bind` calls with the lambda expression that catches `std::weak_ptr<ClientConnection>` and perform null checks explicitly on the value returned by the `lock()` method. Since now all asio callbacks don't hold a `shared_ptr`, the owner of the `ClientConnection` object should be `ConnectionPool`, i.e. the pool maintains some connections, while all asio callbacks use `weak_ptr` to test if the connection is present. Second, make `ClientConnection::getConnection` return `shared_ptr` rather than `weak_ptr` so that the caller side does not need to check if `lock()` returns null in the callback of this future. We cannot make `ConnectionPool::getConnectionAsync` return `shared_ptr` because it could return the future of `connectPromise_`, which is hold by `ClientConnection` itself. We should avoid holding a `shared_ptr` of `ClientConnection` because its owner is `ConnectionPool`.
BewareMyPower
force-pushed
the
bewaremypower/refactor-cnx-ref
branch
from
September 19, 2023 07:08
6407c6e
to
52e1531
Compare
shibd
reviewed
Sep 19, 2023
shibd
approved these changes
Sep 19, 2023
merlimat
approved these changes
Sep 19, 2023
2 tasks
BewareMyPower
added a commit
to BewareMyPower/pulsar-client-cpp
that referenced
this pull request
Oct 8, 2023
Fixes apache#325 ### Motivation apache#317 introduces a bug that might cause segmentation fault when sending messages after receiving an error, see apache#325 (comment) for the detailed explanation. ### Modifications When calling `asyncWrite`, capture the `shared_ptr` instead of the `weak_ptr` to extend the lifetime of the `socket_` or `tlsSocket_` field in `ClientConnection`. Since the lifetime is extended, in some callbacks, check `isClosed()` before other logic. Add a `ChunkDedupTest` to reproduce this issue based on Pulsar 3.1.0. Run the test for 10 times to ensure it won't crash after this patch.
BewareMyPower
added a commit
to streamnative/pulsar-client-cpp
that referenced
this pull request
Oct 8, 2023
…apache#326) Fixes apache#325 ### Motivation apache#317 introduces a bug that might cause segmentation fault when sending messages after receiving an error, see apache#325 (comment) for the detailed explanation. ### Modifications When calling `asyncWrite`, capture the `shared_ptr` instead of the `weak_ptr` to extend the lifetime of the `socket_` or `tlsSocket_` field in `ClientConnection`. Since the lifetime is extended, in some callbacks, check `isClosed()` before other logic. Add a `ChunkDedupTest` to reproduce this issue based on Pulsar 3.1.0. Run the test for 10 times to ensure it won't crash after this patch.
BewareMyPower
added a commit
that referenced
this pull request
Oct 9, 2023
…#326) Fixes #325 ### Motivation #317 introduces a bug that might cause segmentation fault when sending messages after receiving an error, see #325 (comment) for the detailed explanation. ### Modifications When calling `asyncWrite`, capture the `shared_ptr` instead of the `weak_ptr` to extend the lifetime of the `socket_` or `tlsSocket_` field in `ClientConnection`. Since the lifetime is extended, in some callbacks, check `isClosed()` before other logic. Add a `ChunkDedupTest` to reproduce this issue based on Pulsar 3.1.0. Run the test for 10 times to ensure it won't crash after this patch.
BewareMyPower
added a commit
to BewareMyPower/pulsar-client-cpp
that referenced
this pull request
Nov 16, 2023
Fixes apache#348 Fixes apache#349 ### Motivation When `close` is called in `ClientConnection`'s constructor, `shared_from_this()` will be called, which results in a `std::bad_weak_ptr` error. This error does not happen before apache#317 because `shared_from_this()` could only be called when the `producers` or `consumers` field is not empty. ### Modifications Modify the 2nd parameter of `ClientConnection::close` to represent if the construction completes. If not, just set the state to `Disconnected` and complete the future to the result. Then `ConnectionPool::getConnectionAsync` will return a future that completes with the failed result. In addition, check `authentication_` even for non-TLS URLs. Otherwise, the null authentication will be used to construct `CommandConnect`. Add `testInvalidPlugin` and `testTlsConfigError` to verify the changes.
BewareMyPower
added a commit
to BewareMyPower/pulsar-client-cpp
that referenced
this pull request
Nov 17, 2023
Fixes apache#348 Fixes apache#349 ### Motivation When `close` is called in `ClientConnection`'s constructor, `shared_from_this()` will be called, which results in a `std::bad_weak_ptr` error. This error does not happen before apache#317 because `shared_from_this()` could only be called when the `producers` or `consumers` field is not empty. ### Modifications Modify the 2nd parameter of `ClientConnection::close` to represent if the construction completes. If not, just set the state to `Disconnected` and complete the future to the result. Then `ConnectionPool::getConnectionAsync` will return a future that completes with the failed result. In addition, check `authentication_` even for non-TLS URLs. Otherwise, the null authentication will be used to construct `CommandConnect`. Add `testInvalidPlugin` and `testTlsConfigError` to verify the changes.
BewareMyPower
added a commit
to BewareMyPower/pulsar-client-cpp
that referenced
this pull request
Nov 17, 2023
Fixes apache#348 Fixes apache#349 ### Motivation When `close` is called in `ClientConnection`'s constructor, `shared_from_this()` will be called, which results in a `std::bad_weak_ptr` error. This error does not happen before apache#317 because `shared_from_this()` could only be called when the `producers` or `consumers` field is not empty. ### Modifications Add a `ResultException` and throw it if there is a failure in `ClientConnection`'s constructor and catch it in `ConnectionPool::getConnectionAsync`. Then retrieve the result and return the failed future. Since `close` is now always called on a constructed `ClientConnection` object, remove the 2nd parameter. In addition, check `authentication_` even for non-TLS URLs. Otherwise, the null authentication will be used to construct `CommandConnect`. Add `testInvalidPlugin` and `testTlsConfigError` to verify the changes.
BewareMyPower
added a commit
to BewareMyPower/pulsar-client-cpp
that referenced
this pull request
Nov 17, 2023
Fixes apache#348 Fixes apache#349 ### Motivation When `close` is called in `ClientConnection`'s constructor, `shared_from_this()` will be called, which results in a `std::bad_weak_ptr` error. This error does not happen before apache#317 because `shared_from_this()` could only be called when the `producers` or `consumers` field is not empty. ### Modifications Throw `ResultAuthenticationError` when there is anything wrong with authentication in `ClientConnection`'s constructor. Then catch the result in `ConnectionPool::getConnectionAsync` and returned the failed future. In addition, check `authentication_` even for non-TLS URLs. Otherwise, the null authentication will be used to construct `CommandConnect`. Add `testInvalidPlugin` and `testTlsConfigError` to verify the changes.
BewareMyPower
added a commit
that referenced
this pull request
Nov 17, 2023
) Fixes #348 Fixes #349 ### Motivation When `close` is called in `ClientConnection`'s constructor, `shared_from_this()` will be called, which results in a `std::bad_weak_ptr` error. This error does not happen before #317 because `shared_from_this()` could only be called when the `producers` or `consumers` field is not empty. ### Modifications Throw `ResultAuthenticationError` when there is anything wrong with authentication in `ClientConnection`'s constructor. Then catch the result in `ConnectionPool::getConnectionAsync` and returned the failed future. In addition, check `authentication_` even for non-TLS URLs. Otherwise, the null authentication will be used to construct `CommandConnect`. Add `testInvalidPlugin` and `testTlsConfigError` to verify the changes.
BewareMyPower
added a commit
that referenced
this pull request
Nov 17, 2023
) Fixes #348 Fixes #349 ### Motivation When `close` is called in `ClientConnection`'s constructor, `shared_from_this()` will be called, which results in a `std::bad_weak_ptr` error. This error does not happen before #317 because `shared_from_this()` could only be called when the `producers` or `consumers` field is not empty. ### Modifications Throw `ResultAuthenticationError` when there is anything wrong with authentication in `ClientConnection`'s constructor. Then catch the result in `ConnectionPool::getConnectionAsync` and returned the failed future. In addition, check `authentication_` even for non-TLS URLs. Otherwise, the null authentication will be used to construct `CommandConnect`. Add `testInvalidPlugin` and `testTlsConfigError` to verify the changes. (cherry picked from commit 7bb94f4)
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Motivation
We observed server null
ClientConnection
accesses in test environment. See thethis=0x0
outputs in the following two typical stacks.Though
shared_from_this()
is always passed to thestd::bind
function, when the method ofClientConnection
is called, the pointer is stillnull
.Modifications
First, replace all
std::bind
calls with the lambda expression that catchesstd::weak_ptr<ClientConnection>
and perform null checks explicitly on the value returned by thelock()
method.Since now all asio callbacks don't hold a
shared_ptr
, the owner of theClientConnection
object should beConnectionPool
, i.e. the pool maintains some connections, while all asio callbacks useweak_ptr
to test if the connection is present.Second, make
ClientConnection::getConnection
returnshared_ptr
rather thanweak_ptr
so that the caller side does not need to check iflock()
returns null in the callback of this future.We cannot make
ConnectionPool::getConnectionAsync
returnshared_ptr
because it could return the future ofconnectPromise_
, which is hold byClientConnection
itself. We should avoid holding ashared_ptr
ofClientConnection
because its owner isConnectionPool
.