* Spurious/undocumented EINTR from io_uring_enter
@ 2020-04-07 20:36 Joseph Christopher Sible
2020-04-07 21:41 ` Jens Axboe
0 siblings, 1 reply; 5+ messages in thread
From: Joseph Christopher Sible @ 2020-04-07 20:36 UTC (permalink / raw)
To: '[email protected]'
When a process is blocking in io_uring_enter, and a signal stops it for
any reason, it returns -EINTR to userspace. Two comments about this:
1. https://github.com/axboe/liburing/blob/master/man/io_uring_enter.2
doesn't mention EINTR as a possible error that it can return.
2. When there's no signal handler, and a signal stopped the syscall for
some other reason (e.g., SIGSTOP, SIGTSTP, or any signal when the
process is being traced), other syscalls (e.g., read) will be
restarted transparently, but this one will return to userspace
with -EINTR just as if there were a signal handler.
Point 1 seems like a no-brainer. I'm not sure if point 2 is possible
to fix, though, especially since some other syscalls (e.g., epoll_wait)
have the same problem as this one.
Joseph C. Sible
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: Spurious/undocumented EINTR from io_uring_enter
2020-04-07 20:36 Spurious/undocumented EINTR from io_uring_enter Joseph Christopher Sible
@ 2020-04-07 21:41 ` Jens Axboe
2020-04-08 16:41 ` Joseph Christopher Sible
0 siblings, 1 reply; 5+ messages in thread
From: Jens Axboe @ 2020-04-07 21:41 UTC (permalink / raw)
To: Joseph Christopher Sible, '[email protected]'

On 4/7/20 1:36 PM, Joseph Christopher Sible wrote:
> When a process is blocking in io_uring_enter, and a signal stops it for
> any reason, it returns -EINTR to userspace. Two comments about this:
>
> 1. https://github.com/axboe/liburing/blob/master/man/io_uring_enter.2
> doesn't mention EINTR as a possible error that it can return.

I'll add it to the man page.

> 2. When there's no signal handler, and a signal stopped the syscall for
> some other reason (e.g., SIGSTOP, SIGTSTP, or any signal when the
> process is being traced), other syscalls (e.g., read) will be
> restarted transparently, but this one will return to userspace
> with -EINTR just as if there were a signal handler.
>
> Point 1 seems like a no-brainer. I'm not sure if point 2 is possible
> to fix, though, especially since some other syscalls (e.g., epoll_wait)
> have the same problem as this one.

Lots of system calls return -EINTR if interrupted by a signal, don't
think there's anything worth fixing there. For the wait part, the
application may want to handle the signal before we can wait again.
We can't go to sleep with a pending signal.

--
Jens Axboe
* RE: Spurious/undocumented EINTR from io_uring_enter
2020-04-07 21:41 ` Jens Axboe
@ 2020-04-08 16:41 ` Joseph Christopher Sible
2020-04-08 17:49 ` Jens Axboe
0 siblings, 1 reply; 5+ messages in thread
From: Joseph Christopher Sible @ 2020-04-08 16:41 UTC (permalink / raw)
To: 'Jens Axboe', '[email protected]'

On 4/7/20 5:42 PM, Jens Axboe wrote:
> Lots of system calls return -EINTR if interrupted by a signal, don't
> think there's anything worth fixing there. For the wait part, the
> application may want to handle the signal before we can wait again.
> We can't go to sleep with a pending signal.

This seems to be an unambiguous bug, at least according to the BUGS
section of the ptrace man page. The behavior of epoll_wait is explicitly
called out as being buggy/wrong, and we're emulating its behavior. As
for the application wanting to handle the signal, in those cases, it
would choose to install a signal handler, in which case I absolutely
agree that returning -EINTR is the right thing to do. I'm only talking
about the case where the application didn't choose to install a signal
handler (and the signal would have been completely invisible to the
process had it not been being traced).

Joseph C. Sible
* Re: Spurious/undocumented EINTR from io_uring_enter
2020-04-08 16:41 ` Joseph Christopher Sible
@ 2020-04-08 17:49 ` Jens Axboe
2020-04-08 18:57 ` Joseph Christopher Sible
0 siblings, 1 reply; 5+ messages in thread
From: Jens Axboe @ 2020-04-08 17:49 UTC (permalink / raw)
To: Joseph Christopher Sible, '[email protected]'

On 4/8/20 9:41 AM, Joseph Christopher Sible wrote:
> On 4/7/20 5:42 PM, Jens Axboe wrote:
>> Lots of system calls return -EINTR if interrupted by a signal, don't
>> think there's anything worth fixing there. For the wait part, the
>> application may want to handle the signal before we can wait again.
>> We can't go to sleep with a pending signal.
>
> This seems to be an unambiguous bug, at least according to the BUGS
> section of the ptrace man page. The behavior of epoll_wait is explicitly
> called out as being buggy/wrong, and we're emulating its behavior. As
> for the application wanting to handle the signal, in those cases, it
> would choose to install a signal handler, in which case I absolutely
> agree that returning -EINTR is the right thing to do. I'm only talking
> about the case where the application didn't choose to install a signal
> handler (and the signal would have been completely invisible to the
> process had it not been being traced).

So what do you suggest? The only recourse the kernel has is to flush
signals, which would just delete the signal completely. It's a wait
operation, and you cannot wait with signals pending. The only way to
retry is to return the number of events we already got, or -EINTR if we
got none, and return to userspace. That'll ensure the signal gets
handled, and the app must then call wait again if it wants to wait for
more.

There's no "emulating behavior" here, you make it sound like we're
trying to be bug compatible with some random other system call. That's
not the case at all.

--
Jens Axboe
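For the userspace side of what Jens describes above (call wait again
after -EINTR), a minimal sketch might look like the following. It
assumes a wait function that reports failure as a negative errno value,
which is the convention liburing's helpers use. The names wait_once_fn,
wait_retry_eintr and fake_wait are hypothetical, made up for this
illustration, and are not part of liburing or the kernel. The retry cap
is likewise an assumption, not something proposed in the thread.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical callback type: one blocking wait attempt. Assumed to
 * return a negative errno on failure and an event count otherwise. */
typedef int (*wait_once_fn)(void *ctx);

/*
 * Repeat a wait that may fail with -EINTR. At most max_retries extra
 * attempts are made after the first one, so a steady stream of signals
 * cannot keep the caller looping forever. Any other result, success or
 * error, is returned to the caller unchanged.
 */
static int wait_retry_eintr(wait_once_fn wait_once, void *ctx,
                            unsigned int max_retries)
{
    unsigned int retries = 0;
    int ret;

    assert(wait_once != NULL);

    do {
        ret = wait_once(ctx);
    } while (ret == -EINTR && retries++ < max_retries);

    return ret;
}

/* Hypothetical stand-in for a real wait: fails with -EINTR as many
 * times as the counter behind ctx says, then reports 3 events. */
static int fake_wait(void *ctx)
{
    int *interrupts_left = ctx;

    if (*interrupts_left > 0) {
        (*interrupts_left)--;
        return -EINTR;
    }
    return 3;
}
```

In a real program the callback would wrap the actual wait, for example
liburing's io_uring_wait_cqe(). Note that this only hides the -EINTR
from the caller; the syscall still returns to userspace each time,
which is the behavior under discussion in this thread.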
* RE: Spurious/undocumented EINTR from io_uring_enter
2020-04-08 17:49 ` Jens Axboe
@ 2020-04-08 18:57 ` Joseph Christopher Sible
0 siblings, 0 replies; 5+ messages in thread
From: Joseph Christopher Sible @ 2020-04-08 18:57 UTC (permalink / raw)
To: 'Jens Axboe', '[email protected]'

On 4/8/20 1:49 PM, Jens Axboe wrote:
> On 4/8/20 9:41 AM, Joseph Christopher Sible wrote:
> > On 4/7/20 5:42 PM, Jens Axboe wrote:
> >> Lots of system calls return -EINTR if interrupted by a signal, don't
> >> think there's anything worth fixing there. For the wait part, the
> >> application may want to handle the signal before we can wait again.
> >> We can't go to sleep with a pending signal.
> >
> > This seems to be an unambiguous bug, at least according to the BUGS
> > section of the ptrace man page. The behavior of epoll_wait is
> > explicitly called out as being buggy/wrong, and we're emulating its
> > behavior. As for the application wanting to handle the signal, in
> > those cases, it would choose to install a signal handler, in which
> > case I absolutely agree that returning -EINTR is the right thing to
> > do. I'm only talking about the case where the application didn't
> > choose to install a signal handler (and the signal would have been
> > completely invisible to the process had it not been being traced).
>
> So what do you suggest? The only recourse the kernel has is to flush signals,
> which would just delete the signal completely. It's a wait operation, and you
> cannot wait with signals pending. The only way to retry is to return the
> number of events we already got, or -EINTR if we got none, and return to
> userspace. That'll ensure the signal gets handled, and the app must then call
> wait again if it wants to wait for more.
>
> There's no "emulating behavior" here, you make it sound like we're trying to
> be bug compatible with some random other system call.
> That's not the case at all.

Sorry, I used "emulating" in the informal sense. I just meant that we
happen to have the same bug that epoll_wait does, that most other
syscalls don't. Anyway, I'd like it to work like the select syscall
works. Consider this program:

#include <signal.h>
#include <stdio.h>
#include <sys/select.h>
#include <unistd.h>

void handle(int s) {
    write(1, "In signal handler\n", 18);
}

int main(void) {
    struct sigaction act = { .sa_handler = handle };
    struct timeval t = { .tv_sec = 10 };
    fd_set set;
    FD_ZERO(&set);
    sigaction(SIGUSR1, &act, NULL);
    select(0, NULL, NULL, NULL, &t);
    perror("select");
    return 0;
}

You can do any of the following to that program and it will still
finish its full 10 seconds and output "Success":

* Stop it with Ctrl+Z then resume it with fg
* Attach to it with strace or gdb while it's already running
* Start it under strace or gdb, then resize the terminal window while
  it's running

The only thing that will make it output "Interrupted system call" is
if you "kill -USR1" it, resulting in its signal handler being called.
It looks like what's happening is that the syscall really is getting
stopped by the other signals, but once the kernel determines that the
process isn't going to see that particular signal, it restarts the
syscall from where it left off automatically. I think this is how
almost all of the syscalls in the kernel work.

However, if you run a similar program to that one, but that uses
io_uring_enter instead of select, then doing any of the 3 things on
that list will make it output "Interrupted system call", even though no
signal handler ran. This is the behavior that I'm saying is buggy and
similar to epoll_wait, and that I'd like to see changed.

Joseph C. Sible