Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TCPSocket specs hang if client doesn't connect #925

Open
headius opened this issue Mar 24, 2022 · 4 comments
Open

TCPSocket specs hang if client doesn't connect #925

headius opened this issue Mar 24, 2022 · 4 comments

Comments

@headius
Copy link
Contributor

headius commented Mar 24, 2022

The TCPSocket#initialize specs will hang if the client socket does not connect to the server, since the shutdown for the server expects that it will have handled a request and be wrapping up here:

https://github.com/ruby/spec/blob/master/library/socket/fixtures/classes.rb#L125-L129

I found this while adding support for the new connect_timeout keyword; we do not support hash arguments on TCPSocket#initialize at the moment, so the socket never connects. This leaves the server waiting for an incoming connection, and the Thread#join above will hang the suite.

Perhaps we should actively try to close the server socket before we join the thread, and ignore any errors on the thread from a double-close? It would allow these specs to be a bit more robust when there are errors setting up the TCPSocket before connection.

@eregon
Copy link
Member

eregon commented Mar 24, 2022

Good find, PR welcome.
Ideally we'd find a way which is not racy and doesn't involve Thread#kill (tends to be messy with open IO).

@headius
Copy link
Contributor Author

headius commented Mar 24, 2022

I think simply reversing the order of these and ignoring shutdown errors would be fine (or don't have the server's normal loop do its own close).

headius added a commit to headius/jruby that referenced this issue Mar 25, 2022
If the server never accepts a client, joining on this thread will
hang the spec and never be terminated because it happens in an
"after" section. Instead, we try to close the server if it has not
already been closed, so it will terminate gracefully.

See ruby/spec#925
@eregon
Copy link
Member

eregon commented Mar 28, 2022

jruby/jruby@14945c6 doesn't quite work correctly.
There was a trivial typo at jruby/jruby@14945c6#diff-6c84b4cfeb3db4be2729c9368e853f9d13f61533d241defc6e68ae28bb011e88R26 which I fixed locally.
But even after that there are transient failures like:

/bin/zsh -c chruby 2.6.9 && ../mspec/bin/mspec -j
ruby 2.6.9p207 (2021-11-24 revision 67954) [x86_64-linux]
[- | ==================86%=============       | 00:00:00]      0F      0E #<Thread:0x000000000156fb70@/home/eregon/code/rubyspec/library/socket/fixtures/classes.rb:103 run> terminated with exception (report_on_exception is true):
/home/eregon/code/rubyspec/library/socket/fixtures/classes.rb:104:in `accept': closed stream (IOError)
	from /home/eregon/code/rubyspec/library/socket/fixtures/classes.rb:104:in `block in initialize'
[\ | ==================87%=============       | 00:00:00]      0F      0E #<Thread:0x00000000021a4698@/home/eregon/code/rubyspec/library/socket/fixtures/classes.rb:103 run> terminated with exception (report_on_exception is true):
/home/eregon/code/rubyspec/library/socket/fixtures/classes.rb:104:in `accept': stream closed in another thread (IOError)
	from /home/eregon/code/rubyspec/library/socket/fixtures/classes.rb:104:in `block in initialize'
[| | ==================87%=============       | 00:00:00]      0F      0E #<Thread:0x00000000021a87c0@/home/eregon/code/rubyspec/library/socket/fixtures/classes.rb:103 run> terminated with exception (report_on_exception is true):
/home/eregon/code/rubyspec/library/socket/fixtures/classes.rb:104:in `accept': stream closed in another thread (IOError)
[/ | ==================87%=============       | 00:00:00]      0F      0E 	from /home/eregon/code/rubyspec/library/socket/fixtures/classes.rb:104:in `block in initialize'
[| | ==================100%================== | 00:00:00]      0F      0E 

1)
An exception occurred during: after :each
TCPSocket#setsockopt using symbols without prefix sets the TCP nodelay to 1 ERROR
IOError: stream closed in another thread
/home/eregon/code/rubyspec/library/socket/fixtures/classes.rb:104:in `accept'
/home/eregon/code/rubyspec/library/socket/fixtures/classes.rb:104:in `block in initialize'

2)
An exception occurred during: after :each
TCPSocket#setsockopt using strings without prefix sets the TCP nodelay to 1 ERROR
IOError: stream closed in another thread
/home/eregon/code/rubyspec/library/socket/fixtures/classes.rb:104:in `accept'
/home/eregon/code/rubyspec/library/socket/fixtures/classes.rb:104:in `block in initialize'

3)
An exception occurred during: after :each
TCPSocket#initialize with a running server connects to a listening server with host and port ERROR
IOError: closed stream
/home/eregon/code/rubyspec/library/socket/fixtures/classes.rb:104:in `accept'
/home/eregon/code/rubyspec/library/socket/fixtures/classes.rb:104:in `block in initialize'

Finished in 6.969854 seconds

and

mspec library/socket
$ ruby /home/eregon/code/mspec/bin/mspec-run library/socket
ruby 3.0.3p157 (2021-11-24 revision 3fb7d2cadc) [x86_64-linux]
[- | ==================77%=========           | 00:00:00]      0F      0E #<Thread:0x0000000000e4d7e8 /home/eregon/code/rubyspec/library/socket/fixtures/classes.rb:103 run> terminated with exception (report_on_exception is true):
/home/eregon/code/rubyspec/library/socket/fixtures/classes.rb:104:in `accept': stream closed in another thread (IOError)
	from /home/eregon/code/rubyspec/library/socket/fixtures/classes.rb:104:in `block in initialize'
                                                                                             
1)
An exception occurred during: after :each
TCPSocket#initialize with a running server silently ignores 'nil' as the third parameter ERROR
IOError: stream closed in another thread
/home/eregon/code/rubyspec/library/socket/fixtures/classes.rb:104:in `accept'
/home/eregon/code/rubyspec/library/socket/fixtures/classes.rb:104:in `block in initialize'
[\ | ==================81%===========         | 00:00:00]      0F      1E #<Thread:0x0000000000b93f98 /home/eregon/code/rubyspec/library/socket/fixtures/classes.rb:103 run> terminated with exception (report_on_exception is true):
/home/eregon/code/rubyspec/library/socket/fixtures/classes.rb:104:in `accept': stream closed in another thread (IOError)
	from /home/eregon/code/rubyspec/library/socket/fixtures/classes.rb:104:in `block in initialize'
                                                                                             
2)
An exception occurred during: after :each
TCPSocket#setsockopt using symbols sets the TCP nodelay to 1 ERROR
IOError: stream closed in another thread
/home/eregon/code/rubyspec/library/socket/fixtures/classes.rb:104:in `accept'
/home/eregon/code/rubyspec/library/socket/fixtures/classes.rb:104:in `block in initialize'
[/ | ==================81%===========         | 00:00:00]      0F      2E #<Thread:0x0000000000b83eb8 /home/eregon/code/rubyspec/library/socket/fixtures/classes.rb:103 run> terminated with exception (report_on_exception is true):
/home/eregon/code/rubyspec/library/socket/fixtures/classes.rb:104:in `accept': stream closed in another thread (IOError)
	from /home/eregon/code/rubyspec/library/socket/fixtures/classes.rb:104:in `block in initialize'
                                                                                             
3)
An exception occurred during: after :each
TCPSocket#setsockopt using symbols without prefix sets the TCP nodelay to 1 ERROR
IOError: stream closed in another thread
/home/eregon/code/rubyspec/library/socket/fixtures/classes.rb:104:in `accept'
/home/eregon/code/rubyspec/library/socket/fixtures/classes.rb:104:in `block in initialize'
[\ | ==================100%================== | 00:00:00]      0F      3E 

Finished in 0.379546 seconds

188 files, 1775 examples, 2596 expectations, 0 failures, 3 errors, 0 tagged

So I will revert that commit when sync'ing to ruby/spec.

I think in general we should not try too hard to avoid hangs in specs (when the Ruby impl misbehaves), because the --timeout MSpec is a more general and reliable way to deal with that.

eregon pushed a commit that referenced this issue Mar 28, 2022
If the server never accepts a client, joining on this thread will
hang the spec and never be terminated because it happens in an
"after" section. Instead, we try to close the server if it has not
already been closed, so it will terminate gracefully.

See #925
eregon added a commit that referenced this issue Mar 28, 2022
* This reverts commit c96dc4f.
* It causes transient errors: #925
@headius
Copy link
Contributor Author

headius commented Mar 31, 2022

Thanks for fixing the typos. I had hoped to avoid swallowing errors during the accept or close calls but given the races involved I think that may be unavoidable.

I believe it should be expected of all Ruby specs that they can set up and tear down cleanly without any expectation that the spec body runs successfully. That is clearly not the case here, since numerous kinds of failures in the spec body will lead to the teardown logic hanging. That is not a timeout due to a bug, it is a timeout due to a badly written spec teardown.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

2 participants