Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WEBrick has an unsafe shutdown process it tries to concurrently write and close the @shutdown_pipe #102

Open
eregon opened this issue Jan 27, 2023 · 4 comments

Comments

@eregon
Copy link
Member

eregon commented Jan 27, 2023

When WEBrick shutdowns, it tries to concurrently write and close a file descriptor, and even tries to close it from multiple threads:

closing it from the main webrick thread:

cleanup_shutdown_pipe(shutdown_pipe)

closing it from an arbitrary thread:

alarm_shutdown_pipe(&:close)

writing to it (from an arbitrary thread):

alarm_shutdown_pipe {|f| f.write_nonblock("\0")}

A hack:

rescue IOError # closed by another thread.

The problem is if the write_nonblock which calls write(2) ends up happening once the fd is close(2)d then it's EBADF, or worse writing to the wrong file descriptor.
This became such an issue that ruby/spec stopped using WEBrick and rewrote to make its own HTTP server to avoid this issue. Also the commit message of ruby/spec@d8ead5d may be interesting.

Only one thread (e.g. the main webrick thread) should close it, and it should wait all sub-threads before closing it so there are concurrent writes to the close.

CRuby has some very complex logic in IO#close which avoids the issue in most cases but it's not clear if it's fully reliable: https://ruby.slack.com/archives/C02A3SL0S/p1636604027275700?thread_ts=1636592668.266300&cid=C02A3SL0S
IIRC I've seen it fail for ruby/spec too on CRuby.

cc @ioquatix

@ioquatix
Copy link
Member

Ah, okay, thanks, I might have a go at fixing this.

@eregon
Copy link
Member Author

eregon commented Jan 27, 2023

Also see ruby/spec#843 which has some notes about this

@eregon
Copy link
Member Author

eregon commented Jan 27, 2023

Ah and I should mention, on TruffleRuby which doesn't have that complex logic in IO#close, it seems so far this issue only happened with WEBrick. In other words other gems seem to use IO more responsibly in terms of concurrent read/write+close :)

@headius
Copy link
Contributor

headius commented Nov 8, 2023

I am investigating intermittent failures in some of the net/http tests on JRuby and this looks suspiciously similar. In my case the tests appear to pass, but the teardown of the server occasionally will raise EPIPE or EBADF.

I have been excluding such tests for now but it would be best to fix this so they can run safely again.

headius added a commit to headius/jruby that referenced this issue Nov 8, 2023
There are concurrent shutdown issues with WEBrick that have never
been fixed (ruby/webrick#102 which may explain why we get these
sporadic failures in several of the net/http tests:

```
  1) Error:
TestNetHTTP_v1_2#test_s_start:
Errno::EPIPE: Broken pipe - No message available
    org/jruby/RubyIO.java:1391:in `write_nonblock'
    /home/runner/work/jruby/jruby/test/tool/lib/webrick/server.rb:227:in `block in stop'
    /home/runner/work/jruby/jruby/test/tool/lib/webrick/server.rb:352:in `alarm_shutdown_pipe'
    /home/runner/work/jruby/jruby/test/tool/lib/webrick/server.rb:227:in `stop'
    /home/runner/work/jruby/jruby/test/tool/lib/webrick/server.rb:235:in `shutdown'
    /home/runner/work/jruby/jruby/test/mri/net/http/utils.rb:37:in `teardown'
```

I have commented on the issue and this commit adds more exceptions
to ignore in an attempt to prevent unsafe shutdown from causing
any tests to fail.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

3 participants