* shutdown not affecting connection?
From: Glauber Costa @ 2020-02-08 13:55 UTC
To: io-uring, Jens Axboe, Avi Kivity

Hi

I've been trying to make sense of some weird behavior with the seastar
implementation of io_uring, and started to suspect a bug in io_uring's
connect.

The situation is as follows:

- A connect() call is issued (and in the backend I can choose whether I
  use uring or not)
- The connection is supposed to take a while to establish.
- I call shutdown() on the file descriptor

If io_uring is not used:
- connect() starts by returning EINPROGRESS as expected, and after the
  shutdown the file descriptor is finally made ready for epoll. I call
  getsockopt(SOL_SOCKET, SO_ERROR) and see the error (104, ECONNRESET).

If io_uring is used:
- if the SQE has the IOSQE_ASYNC flag on, connect() never returns.
- if the SQE *does not* have the IOSQE_ASYNC flag on, then most of the
  time the test works as intended and connect() returns 104, but
  occasionally it hangs too. Note that seastar may choose not to call
  io_uring_enter immediately, batching sqes instead.

Sounds like some kind of race?

I know C++ probably stinks like the devil for you guys, but if you are
curious to see the code, this fails one of our unit tests:

https://github.com/scylladb/seastar/blob/master/tests/unit/connect_test.cc
See test_connection_attempt_is_shutdown
(the above is the master seastar tree, not including the io_uring
implementation)

Please let me know if this rings a bell and if there is anything I
should be verifying here.

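For context, the non-io_uring sequence described above boils down to roughly
the following standalone sketch. This is illustrative only: the address and
port are placeholders, and seastar's actual code path is its epoll-driven
reactor, not this loop.

    #include <arpa/inet.h>
    #include <errno.h>
    #include <fcntl.h>
    #include <netinet/in.h>
    #include <poll.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void)
    {
        struct sockaddr_in addr;
        int fd, err;
        socklen_t len = sizeof(err);

        fd = socket(AF_INET, SOCK_STREAM, 0);
        fcntl(fd, F_SETFL, O_NONBLOCK);

        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_port = htons(9999);                  /* placeholder port */
        addr.sin_addr.s_addr = inet_addr("10.0.0.1"); /* placeholder, slow/unreachable host */

        /* non-blocking connect: returns -1 with errno == EINPROGRESS */
        if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 &&
            errno != EINPROGRESS)
            perror("connect");

        /* tearing the socket down while the connect is still pending... */
        shutdown(fd, SHUT_RDWR);

        /* ...makes the fd ready, and SO_ERROR reports the failure
         * (104, ECONNRESET, in the seastar test) */
        struct pollfd p = { .fd = fd, .events = POLLOUT };
        poll(&p, 1, -1);
        getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len);
        printf("SO_ERROR = %d\n", err);

        close(fd);
        return 0;
    }
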
* Re: shutdown not affecting connection?
From: Pavel Begunkov @ 2020-02-08 14:26 UTC
To: Glauber Costa, io-uring, Jens Axboe, Avi Kivity

Hi

On 2/8/2020 4:55 PM, Glauber Costa wrote:
> [...]
>
> I know C++ probably stinks like the devil for you guys, but if you are
> curious to see the code, this fails one of our unit tests:
>
> https://github.com/scylladb/seastar/blob/master/tests/unit/connect_test.cc
> See test_connection_attempt_is_shutdown
> (the above is the master seastar tree, not including the io_uring
> implementation)

Is this chaining with connect().then_wrapped() asynchronous? Like some
kind of future/promise stuff? I wonder if connect() and shutdown() there
may be executed in the reverse order.

The hang with IOSQE_ASYNC sounds strange anyway.

> Please let me know if this rings a bell and if there is anything I
> should be verifying here

--
Pavel Begunkov

* Re: shutdown not affecting connection?
From: Glauber Costa @ 2020-02-08 18:42 UTC
To: Pavel Begunkov; +Cc: io-uring, Jens Axboe, Avi Kivity

Hi

BTW, my apologies, but I should have specified the kernel I am running:
90206ac99c1f25b7f7a4c2c40a0b9d4561ffa9bf

On Sat, Feb 8, 2020 at 9:26 AM Pavel Begunkov <[email protected]> wrote:
>
> [...]
>
> Is this chaining with connect().then_wrapped() asynchronous? Like some
> kind of future/promise stuff?

Correct.
then_wrapped executes eventually, when connect returns either success or
failure.

> I wonder if connect() and shutdown() there may be executed in the
> reverse order.

The methods connect and shutdown will execute in this order.
But connect will just queue something that will later be sent down to
the kernel.

I initially suspected an ordering issue on my side. What made me start
suspecting a bug are two things:
- I can force the code to grab an sqe and call io_uring_enter at the
  moment the connect() call happens: I see no change.
- IOSQE_ASYNC changes this behavior, as you acknowledged yourself.

It seems to me that if shutdown happens while the sqe is sitting on a
kernel queue somewhere, the connection will hang forever instead of
failing right away, as I would expect (and as happens if shutdown comes
after the call to io_uring_enter).

> The hang with IOSQE_ASYNC sounds strange anyway.

* Re: shutdown not affecting connection?
From: Avi Kivity @ 2020-02-08 18:48 UTC
To: Glauber Costa, Pavel Begunkov; +Cc: io-uring, Jens Axboe

On 2/8/20 8:42 PM, Glauber Costa wrote:
> [...]
>
> I initially suspected an ordering issue on my side. What made me start
> suspecting a bug are two things:
> - I can force the code to grab an sqe and call io_uring_enter at the
>   moment the connect() call happens: I see no change.
> - IOSQE_ASYNC changes this behavior, as you acknowledged yourself.
>
> It seems to me that if shutdown happens while the sqe is sitting on a
> kernel queue somewhere, the connection will hang forever instead of
> failing right away, as I would expect (and as happens if shutdown comes
> after the call to io_uring_enter).

You can try to cancel the sqe before you shutdown the socket. This will
flush the queue (even if the cancellation fails).

However, if you io_uring_enter before calling shutdown and connect does
not return, I'd consider that a kernel bug. Perhaps you can reduce the
problem to a small C reproducer?

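A minimal sketch of the "cancel before shutdown" idea, assuming a liburing
with io_uring_prep_cancel / IORING_OP_ASYNC_CANCEL support (kernel 5.5+);
connect_tag here stands for whatever user_data the connect sqe was submitted
with, and is purely illustrative:

    #include <sys/socket.h>
    #include "liburing.h"

    static void cancel_then_shutdown(struct io_uring *ring, int fd,
                                     void *connect_tag)
    {
        struct io_uring_sqe *sqe = io_uring_get_sqe(ring);

        if (sqe) {
            /* ask the kernel to cancel the still-pending connect request */
            io_uring_prep_cancel(sqe, connect_tag, 0);
            sqe->user_data = 0xdead; /* arbitrary tag for the cancel cqe */
            io_uring_submit(ring);
        }

        /* even if the cancellation itself fails, the queue has been flushed */
        shutdown(fd, SHUT_RDWR);

        /* the connect cqe should now complete, either with -ECANCELED or
         * with the actual connect result */
    }
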
* Re: shutdown not affecting connection?
From: Glauber Costa @ 2020-02-08 18:57 UTC
To: Avi Kivity; +Cc: Pavel Begunkov, io-uring, Jens Axboe

On Sat, Feb 8, 2020 at 1:48 PM Avi Kivity <[email protected]> wrote:
>
> [...]
>
> You can try to cancel the sqe before you shutdown the socket. This will
> flush the queue (even if the cancellation fails).
>
> However, if you io_uring_enter before calling shutdown and connect does
> not return, I'd consider that a kernel bug.

That is definitely what happens: I've changed the code to do it
synchronously (with our flush() implementation, which will loop until an
sqe can be acquired and io_uring_enter returns success), so at this point
I am sure this is in the kernel.

> Perhaps you can reduce the problem to a small C reproducer?

That was my intended next step, yes.

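For what it's worth, a loop of that shape might look roughly like the
following - a sketch of one reading of "loop until an sqe can be acquired
and io_uring_enter returns success", not seastar's actual flush():

    #include "liburing.h"

    /* Grab an sqe, submitting whatever is already queued if the SQ ring
     * is full. */
    static struct io_uring_sqe *get_sqe_blocking(struct io_uring *ring)
    {
        struct io_uring_sqe *sqe;

        while ((sqe = io_uring_get_sqe(ring)) == NULL)
            io_uring_submit(ring); /* free up a slot */
        return sqe;
    }

    /* Submit synchronously: retry until io_uring_enter() (via
     * io_uring_submit()) does not fail. */
    static void submit_sync(struct io_uring *ring)
    {
        while (io_uring_submit(ring) < 0)
            ;
    }
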
* Re: shutdown not affecting connection?
From: Glauber Costa @ 2020-02-08 20:20 UTC
To: Avi Kivity; +Cc: Pavel Begunkov, io-uring, Jens Axboe

[-- Attachment #1: Type: text/plain, Size: 766 bytes --]

> > > Perhaps you can reduce the problem to a small C reproducer?
> >
> > That was my intended next step, yes

s***, I didn't resist and I had to explain to my wife that no, I don't
like io_uring more than I like her.

But here it is.

This is a modification of test/connect.c.
I added a pthread comparison example that should achieve the same
sequence of events:
- try to sync connect
- wait a bit
- shutdown

I added a fixed wait for pthread to make sure that shutdown is not
called before connect.

For io_uring, the shutdown is configurable with the program argument.
This works just fine if I sleep before shutdown (as I would expect from
a race). This hangs every time if I don't.

Unless I am missing something, I don't think this is the expected
behavior.

[-- Attachment #2: connect.c --]
[-- Type: text/x-csrc, Size: 3491 bytes --]

/*
 * Check that IORING_OP_CONNECT works, with and without other side
 * being open.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>
#include <poll.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include "liburing.h"
#include <pthread.h>

static int create_socket(void)
{
	int fd;

	fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
	if (fd == -1) {
		perror("socket()");
		return -1;
	}

	return fd;
}

static int submit_and_wait(struct io_uring *ring, int fd, int *res, int sleep_for)
{
	struct io_uring_cqe *cqe;
	int ret;

	ret = io_uring_submit(ring);
	if (ret != 1) {
		fprintf(stderr, "io_using_submit: got %d\n", ret);
		return 1;
	}

	if (sleep_for) {
		sleep(sleep_for);
	}
	shutdown(fd, SHUT_RDWR);

	while (!io_uring_cq_ready(ring)) {}

	ret = io_uring_peek_cqe(ring, &cqe);
	if (ret < 0) {
		printf("peek cqe?\n");
		return -1;
	}

	*res = cqe->res;
	io_uring_cqe_seen(ring, cqe);
	return 0;
}

static void *thread_start(void *arg)
{
	int fd = (int)(uint64_t)arg;

	sleep(1);
	shutdown(fd, SHUT_RDWR);
	return NULL;
}

static int pthread_connect_socket(int fd, int *code)
{
	struct sockaddr_in addr;
	int ret;

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_port = 0;
	addr.sin_addr.s_addr = 0x10010ac;

	pthread_attr_t attr;
	int s = pthread_attr_init(&attr);
	if (s != 0) {
		perror("pthread_attr_init");
		return -1;
	}

	pthread_t thr;
	s = pthread_create(&thr, &attr, &thread_start, (void*)(uint64_t)fd);
	if (s != 0) {
		perror("pthread_create");
		return -1;
	}

	ret = connect(fd, (struct sockaddr*)&addr, sizeof(addr));
	return (ret < 0);
}

static int connect_socket(struct io_uring *ring, int fd, int *code, int sleep_for)
{
	struct io_uring_sqe *sqe;
	struct sockaddr_in addr;
	int res;

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_port = 0;
	addr.sin_addr.s_addr = 0x10010ac;

	sqe = io_uring_get_sqe(ring);
	if (!sqe) {
		fprintf(stderr, "unable to get sqe\n");
		return -1;
	}

	io_uring_prep_connect(sqe, fd, (struct sockaddr*)&addr, sizeof(addr));
	sqe->user_data = 1;
	io_uring_sqe_set_flags(sqe, IOSQE_ASYNC);

	return submit_and_wait(ring, fd, &res, sleep_for);
}

static int test_connect(struct io_uring *ring, int use_uring, int sleep_for)
{
	int connect_fd;
	int ret, code = 0;

	connect_fd = create_socket();
	if (connect_fd == -1)
		return -1;

	if (use_uring) {
		ret = connect_socket(ring, connect_fd, &code, sleep_for);
	} else {
		ret = pthread_connect_socket(connect_fd, &code);
	}
	if (ret == -1)
		goto err;

	if (code != 0) {
		fprintf(stderr, "connect failed with %d\n", code);
		goto err;
	}

	close(connect_fd);
	return 0;

err:
	close(connect_fd);
	return -1;
}

int main(int argc, char *argv[])
{
	struct io_uring ring;
	int ret;
	int sleep_for;

	if (argc == 1) {
		sleep_for = 0;
	} else {
		sleep_for = atoi(argv[1]);
	}

	ret = io_uring_queue_init(8, &ring, 0);
	if (ret == -1) {
		perror("io_uring_queue_setup()");
		return 1;
	}

	ret = test_connect(&ring, 0, sleep_for);
	if (ret == 0) {
		printf("shutting down a socket trying to connect works with pthread\n");
	} else {
		return -1;
	}

	ret = test_connect(&ring, 1, sleep_for);
	if (ret == 0) {
		printf("shutting down a socket trying to connect with uring works too, waited %d s before shutdown\n", sleep_for);
	} else {
		return -1;
	}

	io_uring_queue_exit(&ring);
	return 0;
}

* Re: shutdown not affecting connection?
From: Avi Kivity @ 2020-02-08 20:28 UTC
To: Glauber Costa; +Cc: Pavel Begunkov, io-uring, Jens Axboe

On 2/8/20 10:20 PM, Glauber Costa wrote:
> [...]
>
> This is a modification of test/connect.c.
> I added a pthread comparison example that should achieve the same
> sequence of events:
> - try to sync connect
> - wait a bit
> - shutdown
>
> [...]
>
> For io_uring, the shutdown is configurable with the program argument.
> This works just fine if I sleep before shutdown (as I would expect from
> a race). This hangs every time if I don't.
>
> Unless I am missing something, I don't think this is the expected
> behavior.

I think it is understandable. Since the socket is blocking, uring moves
the work to a workqueue, and the shutdown() happens before the workqueue
has had a chance to process the connection attempt. So we'll have to
cancel the sqe.

Jens, does the blocking connect consume a kernel thread while it's
waiting for the connection? Or does it just set things up and move on?

* Re: shutdown not affecting connection?
From: Glauber Costa @ 2020-02-08 20:43 UTC
To: Avi Kivity; +Cc: Pavel Begunkov, io-uring, Jens Axboe

On Sat, Feb 8, 2020 at 3:29 PM Avi Kivity <[email protected]> wrote:
>
> [...]
>
> I think it is understandable. Since the socket is blocking, uring moves
> the work to a workqueue, and the shutdown() happens before the workqueue
> has had a chance to process the connection attempt. So we'll have to
> cancel the sqe.

It does seem to me that this implies that every shutdown must be
accompanied by a cancel of the pending connect.

From the user's perspective, this still feels like a bug to me: the fact
that we had to move this to a work queue is an implementation detail:

1) we asked the kernel to do something
2) the kernel returned
3) we called shutdown() expecting that pending connect to go away, and it
   never returned.

If cancel-after-connect to avoid these races is the intended behavior, it
would be nice to get this documented somehow in io_uring's fantastic
documentation. In hindsight, cancel-on-shutdown is quite obvious and
natural. But I just spent two days getting to this obvious and natural
conclusion.

> Jens, does the blocking connect consume a kernel thread while it's
> waiting for the connection? Or does it just set things up and move on?

* Re: shutdown not affecting connection?
From: Andres Freund @ 2020-02-08 18:48 UTC
To: Glauber Costa; +Cc: io-uring, Jens Axboe, Avi Kivity

Hi,

On 2020-02-08 08:55:25 -0500, Glauber Costa wrote:
> - A connect() call is issued (and in the backend I can choose whether I
>   use uring or not)
> - The connection is supposed to take a while to establish.
> - I call shutdown() on the file descriptor
>
> If io_uring is not used:
> - connect() starts by returning EINPROGRESS as expected, and after the
>   shutdown the file descriptor is finally made ready for epoll. I call
>   getsockopt(SOL_SOCKET, SO_ERROR) and see the error (104, ECONNRESET).
>
> If io_uring is used:
> - if the SQE has the IOSQE_ASYNC flag on, connect() never returns.

That should be easy enough to reproduce without seastar, as it sounds
deterministic - how about modifying liburing's test/connect.c test to
behave this way?

Hm, any chance you set O_NONBLOCK on the fd before calling the async
connect? Wonder if io_connect()

	file_flags = force_nonblock ? O_NONBLOCK : 0;

	ret = __sys_connect_file(req->file, &io->connect.address,
				 req->connect.addr_len, file_flags);
	if ((ret == -EAGAIN || ret == -EINPROGRESS) && force_nonblock) {

fully takes into account that __sys_connect_file

	err = sock->ops->connect(sock, (struct sockaddr *)address, addrlen,
				 sock->file->f_flags | file_flags);

appears to leave O_NONBLOCK set on the file in place, which'd then not
block in the wq?

Greetings,

Andres Freund

* Re: shutdown not affecting connection?
From: Glauber Costa @ 2020-02-08 18:54 UTC
To: Andres Freund; +Cc: io-uring, Jens Axboe, Avi Kivity

On Sat, Feb 8, 2020 at 1:48 PM Andres Freund <[email protected]> wrote:
>
> [...]
>
> That should be easy enough to reproduce without seastar, as it sounds
> deterministic - how about modifying liburing's test/connect.c test to
> behave this way?

My plan was to work on that on Monday, but I wanted to get the message
out earlier in case it was a known issue or rang an obvious bell. It
seems like it's not, so I'll stick to my plan.

> Hm, any chance you set O_NONBLOCK on the fd before calling the async
> connect?

In fact I do the opposite, and I force-remove the O_NONBLOCK flag.
But I actually played around with it while chasing this, and I did, at
some point, set O_NONBLOCK.

This is what the seastar code for connect (without uring) looks like:

    // socket is non-block here
    pfd->get_file_desc().connect(sa.u.sa, sa.length());
    return pfd->writeable().then([pfd]() mutable {
        auto err = pfd->get_file_desc().getsockopt<int>(SOL_SOCKET, SO_ERROR);
        if (err != 0) {
            throw std::system_error(err, std::system_category());
        }
        return make_ready_future<>();
    });

So it essentially issues a nonblocking connect, waits for the fd to be
writeable, and then uses getsockopt to figure out what happened.

With io_uring, what I see on a nonblocking socket is:
- it returns EINPROGRESS as I would expect
- it is never made writeable.

> Wonder if io_connect()
>
>	file_flags = force_nonblock ? O_NONBLOCK : 0;
>
>	ret = __sys_connect_file(req->file, &io->connect.address,
>				 req->connect.addr_len, file_flags);
>	if ((ret == -EAGAIN || ret == -EINPROGRESS) && force_nonblock) {
>
> fully takes into account that __sys_connect_file
>
>	err = sock->ops->connect(sock, (struct sockaddr *)address, addrlen,
>				 sock->file->f_flags | file_flags);
>
> appears to leave O_NONBLOCK set on the file in place, which'd then not
> block in the wq?

Isn't not blocking the exact opposite of what I am seeing? If this was
really not blocking, I'd expect it to return me something immediately,
even if it was the wrong thing.

> Greetings,
>
> Andres Freund

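For completeness, "force-removing" the flag as described above amounts to a
couple of fcntl calls - a sketch only, with no claim that this is how
seastar actually does it:

    #include <fcntl.h>

    /* Clear O_NONBLOCK on a socket fd before handing it to
     * IORING_OP_CONNECT. Returns 0 on success, -1 on error (errno set
     * by fcntl). */
    static int force_blocking(int fd)
    {
        int flags = fcntl(fd, F_GETFL, 0);

        if (flags < 0)
            return -1;
        if (flags & O_NONBLOCK)
            return fcntl(fd, F_SETFL, flags & ~O_NONBLOCK);
        return 0;
    }
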