Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug in amqp-0.5 #17

Open
Gabriella439 opened this issue Sep 3, 2013 · 13 comments
Open

Bug in amqp-0.5 #17

Gabriella439 opened this issue Sep 3, 2013 · 13 comments

Comments

@Gabriella439
Copy link

When I upgrade my code from 0.4.2 to 0.5 I get the following mysterious error when connecting to a message queue:

Data.Binary.Get.runGet at position 0: demandInput: not enough bytes

I'm limited for time at the moment, so I haven't had time to narrow down this test case from within the application that this occurred in, but my best guess from studying the diff between 0.4.2 and 0.5 is that this exception probably originated in your internal readFrame function.

@hreinhardt
Copy link
Owner

I think it might be the call to peekFrameSize in readFrame. Could you check if we're passing 7 bytes into peekFrameSize?

@Gabriella439
Copy link
Author

Your intuition was right: the dat value returned by BL.hGet handle 7 ends up being Empty, so peekFrameSize receives an empty ByteString/

@hreinhardt
Copy link
Owner

According to the documentation 0 should only be returned on EOF. Can we rule out that the socket has been closed? You may want to look at the RabbitMQ log file.

@Gabriella439
Copy link
Author

Yes, there is an error in the RabbitMQ log file each time this happens:

=INFO REPORT==== 3-Sep-2013::12:56:08 ===
starting TCP connection <0.14297.3> from <removed>

=ERROR REPORT==== 3-Sep-2013::12:56:18 ===
exception on TCP connection <0.14297.3> from <removed>
{handshake_timeout,handshake}

=INFO REPORT==== 3-Sep-2013::12:56:18 ===
closing TCP connection <0.14297.3> from <removed>

I also forgot to mention that there was in fact a long delay before the exception occurred (~10 seconds, as you can see from the log). When I use amqp-0.4.2 the connection succeeds instantly. So it seems that something about amqp-0.5 is triggering a timeout in the hand shake which in turn causes hGet to return an empty string.

@hreinhardt
Copy link
Owner

That's very strange. You may want to add some print statements to openConnection' to figure out where it fails.

@hreinhardt
Copy link
Owner

The most likely explanation is that we need to call hFlush after every write to the handle. But that makes me wonder why it works on my computer.

@michaelklishin
Copy link
Contributor

It may be necessary to use flush, at least some clients do but I also suspect the issue may be with lazy evaluation.

@Gabriel439 the error message means that RabbitMQ did not receive initial protocol handshake (which is 7 bytes, AMQP\x00\x00\x09\x01). Please try recording what's coming from the client to the server using Wireshark.

@Gabriella439
Copy link
Author

I will do that. Just give me until Thursday as I'm busy through tomorrow.

@gseitz
Copy link
Contributor

gseitz commented Sep 10, 2013

FWIW, I can reproduce the exception by supplying a non-existing user/password combination. Sticking

closed <- hIsEOF handle
when closed $ error "connection is closed"

at the start of readFrame yields the "connection is closed" exception, but still only after a timeout of roughly 10 seconds.

@hreinhardt
Copy link
Owner

I added some calls to hFlush in a patch to the master branch. Can you try if that removes the 10 second delay?

@gseitz
Copy link
Contributor

gseitz commented Sep 10, 2013

Hm, just to be sure, I did a couple of runs with and without the call to hFlush and it's now more or less consistently around 3-4 seconds. The other attempt might have been an outlier.

Looking at the server log, the connection is obviously properly closed with the wrong, compared to @Gabriel439's handshake_timeout, so in hindsight, my claim to being able to reproduce his error is wrong. Apologies.

@gseitz
Copy link
Contributor

gseitz commented Sep 10, 2013

In any case, I think it'd be better if a closed connection wouldn't manifest in error at runGet.

@hreinhardt
Copy link
Owner

I agree. The exception handling needs a lot of improvement.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants