-
Notifications
You must be signed in to change notification settings - Fork 102
No way to detect a connection loss or recover a connection #188
Comments
There are both TCP and AMQP level timeouts that will be reported. Can you please provide some more details about what you're experiencing? |
Thank you for the quick response. How do you set those timeouts? Are there defaults? |
The AMQP timeout is configured via https://godoc.org/pack.ag/amqp#ConnIdleTimeout, default of 1 minute. TCP is more complex and is the default set by |
Great, thank you for that. How do you listen for the timeouts? |
Currently you have to do some sort of operation. This typically isn't a problem with receivers, but may not be ideal for senders. I feel there's a high bar to justify exposing channels in an external API, but perhaps it's warranted here. The alternative would be to add a blocking @alanconway You've thought a lot about these APIs, do you have an opinion? |
The simplest solution is an identified exception like: Electron has the Done()/Err() API on all endpoints: Connection, Session, Sender and Receiver. I'd avoid it until you're sure you need it, it risks giving a false sense of completion - Done() closing means the AMQP thing has closed but it does not tell you anything about the state of the goroutines that were calling Send(), Receive() etc. at the time - which may still be executing. So if you are using multiple goroutines you really should use your own sync.Group or errgroup.Group to wait till they all see the ErrConnAborted or are otherwise in a safe place. If you're not using multiple goroutines, you don't need Done() |
Great ideas. What I'm struggling with at the moment is:
The operations return an error of "EOF". Not very clear what is happening without lots of digging. Some sort of mechanism to clearly convey from the API that a connection is lost would be desired. Second, why is this not a problem with receivers? Do they try to reconnect or something? Lastly, it appears that the keep-alive just swallows the error if a keep-alive is unsuccessful. I would think that is a great opportunity to convey to the user of the API that the connection has gone stale, or attempt to reconnect. Thank you both for exploring this and giving me more insight. |
@alanconway Thanks! @borlinp I should have been more clear. Since send operations may happen infrequently you may not find out that the connection has been disconnected until one is tried. Receiving tends to happen constantly and will block until a message is received or an error occurs. I agree that more context should be added to errors where relevant.
It should not be ignored. Have you observed this and can you describe the details and/or share code? |
Line 600 of conn.go gets the error from the keepalive frame. |
Once This is intended to be a low-level library, reconnection is up to the user. |
Understandable that you wouldn't want to reconnect; that makes sense. I would love to see something like Streadway's AMQP 0.9 library where you can subscribe and be notified of a connection closure: https://godoc.org/github.com/streadway/amqp#Channel.NotifyClose We have a work around that is functioning for us. I'll leave this issue up to you to close or not, depending if you think the effort/feature would be valuable. Thanks for the conversation about it. |
I'm happy to keep this open since I think there are some things to be done to allow detecting the errors more proactively and include more information about where the error originated from. Though I don't know when I'll get around to making any changes. (I try to respond to issues promptly and fix bugs, but as it's a personal side project for me I don't dedicate a lot of time to other improvements.) To add a little more detail as to why I defer to users to handle reconnection: Doing a connection reconnect probably wouldn't be all that difficult (even that may be involved depending on the auth method), but recreating the sessions and links is non-trivial since reusing the initial parameters may or may not be appropriate depending on the broker semantics. |
If I get some time, I might be willing to come up with a patch to contribute to the project. Thanks. |
Contributions are welcome! If you do decide to do that please share what you have in mind before putting in too much effort. The streadway API takes a very different approach from this library and would not fit in well. I'd like to see something more along the lines of #188 (comment) |
When a connection is lost to the server, there is no way for an application to know the connection is lost. The client, conn, and session sit idle as if nothing is wrong. Even publishing gives no error.
Would recommend a way to detect a closed connection internally, and providing a channel to a user of the amp library which notifies of a connection loss to the user.
The text was updated successfully, but these errors were encountered: