Subject: Feature request: Please implement IORING_OP_TEE
From: Clay Harris
Date: 2020-04-27 15:40 UTC
To: io-uring

I was excited to see IORING_OP_SPLICE go in, but disappointed that tee
didn't go in at the same time. It would be very useful to copy pipe
buffers in an async program.
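For context, a minimal sketch of what such an async tee could look like
from userspace, assuming a liburing-style prep helper along the lines of
what eventually shipped as io_uring_prep_tee() (neither the opcode nor the
helper existed when this request was written, so the names are
forward-looking):

#include <liburing.h>

int queue_tee(struct io_uring *ring, int pipe_in_rd, int pipe_out_wr)
{
	struct io_uring_sqe *sqe = io_uring_get_sqe(ring);

	if (!sqe)
		return -1;	/* submission queue full */

	/* Copy (not move) up to 4096 bytes of buffered pipe data between
	 * the two pipes; 0 = no SPLICE_F_* flags. */
	io_uring_prep_tee(sqe, pipe_in_rd, pipe_out_wr, 4096, 0);

	return io_uring_submit(ring);
}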
Subject: Re: Feature request: Please implement IORING_OP_TEE
From: Jens Axboe
Date: 2020-04-27 15:55 UTC
To: Clay Harris
Cc: io-uring, Pavel Begunkov

On 4/27/20 9:40 AM, Clay Harris wrote:
> I was excited to see IORING_OP_SPLICE go in, but disappointed that tee
> didn't go in at the same time. It would be very useful to copy pipe
> buffers in an async program.

Pavel, care to wire up tee? From a quick look, looks like just exposing
do_tee() and calling that, so should be trivial.

-- 
Jens Axboe
Subject: Re: Feature request: Please implement IORING_OP_TEE
From: Pavel Begunkov
Date: 2020-04-27 18:03 UTC
To: Jens Axboe, Clay Harris
Cc: io-uring

On 27/04/2020 18:55, Jens Axboe wrote:
> On 4/27/20 9:40 AM, Clay Harris wrote:
>> I was excited to see IORING_OP_SPLICE go in, but disappointed that tee
>> didn't go in at the same time. It would be very useful to copy pipe
>> buffers in an async program.
>
> Pavel, care to wire up tee? From a quick look, looks like just exposing
> do_tee() and calling that, so should be trivial.

Yes, should be, I'll add it

-- 
Pavel Begunkov
Subject: Re: Feature request: Please implement IORING_OP_TEE
From: Jens Axboe
Date: 2020-04-27 18:11 UTC
To: Pavel Begunkov, Clay Harris
Cc: io-uring

On 4/27/20 12:03 PM, Pavel Begunkov wrote:
> On 27/04/2020 18:55, Jens Axboe wrote:
>> On 4/27/20 9:40 AM, Clay Harris wrote:
>>> I was excited to see IORING_OP_SPLICE go in, but disappointed that tee
>>> didn't go in at the same time. It would be very useful to copy pipe
>>> buffers in an async program.
>>
>> Pavel, care to wire up tee? From a quick look, looks like just exposing
>> do_tee() and calling that, so should be trivial.
>
> Yes, should be, I'll add it

Only other thing I spotted is making the inode lock / double lock honor
nowait, which is separate from the SPLICE_F_NONBLOCK flag.

-- 
Jens Axboe
Subject: Re: Feature request: Please implement IORING_OP_TEE
From: Jann Horn
Date: 2020-04-27 18:22 UTC
To: Jens Axboe
Cc: Clay Harris, io-uring, Pavel Begunkov

On Mon, Apr 27, 2020 at 5:56 PM Jens Axboe <[email protected]> wrote:
> On 4/27/20 9:40 AM, Clay Harris wrote:
> > I was excited to see IORING_OP_SPLICE go in, but disappointed that tee
> > didn't go in at the same time. It would be very useful to copy pipe
> > buffers in an async program.
>
> Pavel, care to wire up tee? From a quick look, looks like just exposing
> do_tee() and calling that, so should be trivial.

Just out of curiosity:

What's the purpose of doing that via io_uring? Non-blocking sys_tee()
just shoves around some metadata, it doesn't do any I/O, right? Is
this purely for syscall-batching reasons? (And does that mean that you
would also add syscalls like epoll_wait() and futex() to io_uring?) Or
is this because you're worried about blocking on the pipe mutex?
Subject: Re: Feature request: Please implement IORING_OP_TEE
From: Jens Axboe
Date: 2020-04-27 20:02 UTC
To: Jann Horn
Cc: Clay Harris, io-uring, Pavel Begunkov

On 4/27/20 12:22 PM, Jann Horn wrote:
> On Mon, Apr 27, 2020 at 5:56 PM Jens Axboe <[email protected]> wrote:
>> On 4/27/20 9:40 AM, Clay Harris wrote:
>>> I was excited to see IORING_OP_SPLICE go in, but disappointed that tee
>>> didn't go in at the same time. It would be very useful to copy pipe
>>> buffers in an async program.
>>
>> Pavel, care to wire up tee? From a quick look, looks like just exposing
>> do_tee() and calling that, so should be trivial.
>
> Just out of curiosity:
>
> What's the purpose of doing that via io_uring? Non-blocking sys_tee()
> just shoves around some metadata, it doesn't do any I/O, right? Is
> this purely for syscall-batching reasons? (And does that mean that you
> would also add syscalls like epoll_wait() and futex() to io_uring?) Or
> is this because you're worried about blocking on the pipe mutex?

Right, it doesn't do any IO. It does potentially block on the inode
mutex, but that's about it. I think the reasons are mainly:

- Keep the interfaces the same, instead of using both sync and async
  calls.
- Bundling/batch reasons, either in same submission, or chained.

Some folks have talked about futex, and epoll_wait would also be a
natural extension as well, since we already have the ctl part.

-- 
Jens Axboe
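As an illustration of the bundling/chaining point above, a hedged sketch
of linking a tee to a follow-up splice in a single submission. It assumes
the same liburing-style helpers as the earlier sketch plus
io_uring_prep_splice(); the descriptor names are purely illustrative:

#include <liburing.h>

/* Tee pipe data into a second pipe and, linked behind it, splice the copy
 * out to a file: two SQEs, one submission syscall. */
int queue_tee_then_splice(struct io_uring *ring,
			  int src_rd, int copy_wr, int copy_rd, int out_fd)
{
	struct io_uring_sqe *sqe;

	sqe = io_uring_get_sqe(ring);
	if (!sqe)
		return -1;
	io_uring_prep_tee(sqe, src_rd, copy_wr, 4096, 0);
	sqe->flags |= IOSQE_IO_LINK;	/* next SQE runs only if this succeeds */

	sqe = io_uring_get_sqe(ring);
	if (!sqe)
		return -1;
	/* -1 offsets: use the pipe's data and the file's current position */
	io_uring_prep_splice(sqe, copy_rd, -1, out_fd, -1, 4096, 0);

	return io_uring_submit(ring);	/* both ops submitted together */
}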
Subject: Re: Feature request: Please implement IORING_OP_TEE
From: Pavel Begunkov
Date: 2020-04-29 15:57 UTC
To: Jens Axboe, Jann Horn
Cc: Clay Harris, io-uring

On 27/04/2020 23:02, Jens Axboe wrote:
> On 4/27/20 12:22 PM, Jann Horn wrote:
>> On Mon, Apr 27, 2020 at 5:56 PM Jens Axboe <[email protected]> wrote:
>>> On 4/27/20 9:40 AM, Clay Harris wrote:
>>>> I was excited to see IORING_OP_SPLICE go in, but disappointed that tee
>>>> didn't go in at the same time. It would be very useful to copy pipe
>>>> buffers in an async program.
>>>
>>> Pavel, care to wire up tee? From a quick look, looks like just exposing
>>> do_tee() and calling that, so should be trivial.
>>
>> Just out of curiosity:
>>
>> What's the purpose of doing that via io_uring? Non-blocking sys_tee()
>> just shoves around some metadata, it doesn't do any I/O, right? Is
>> this purely for syscall-batching reasons? (And does that mean that you
>> would also add syscalls like epoll_wait() and futex() to io_uring?) Or
>> is this because you're worried about blocking on the pipe mutex?
>
> Right, it doesn't do any IO. It does potentially block on the inode
> mutex, but that's about it. I think the reasons are mainly:

Good catch, the waiting probably can happen with splice as well. I need
to read it through, but it looks strange that it just ignores O_NONBLOCK;
is there some upper bound for holding it, or something?

>
> - Keep the interfaces the same, instead of using both sync and async
>   calls.
> - Bundling/batch reasons, either in same submission, or chained.
>
> Some folks have talked about futex, and epoll_wait would also be a
> natural extension as well, since we already have the ctl part.

-- 
Pavel Begunkov
Subject: Re: Feature request: Please implement IORING_OP_TEE
From: Clay Harris
Date: 2020-04-27 20:17 UTC
To: Jann Horn
Cc: Jens Axboe, io-uring, Pavel Begunkov

On Mon, Apr 27 2020 at 20:22:18 +0200, Jann Horn quoth thus:
> On Mon, Apr 27, 2020 at 5:56 PM Jens Axboe <[email protected]> wrote:
> > On 4/27/20 9:40 AM, Clay Harris wrote:
> > > I was excited to see IORING_OP_SPLICE go in, but disappointed that tee
> > > didn't go in at the same time. It would be very useful to copy pipe
> > > buffers in an async program.
> >
> > Pavel, care to wire up tee? From a quick look, looks like just exposing
> > do_tee() and calling that, so should be trivial.
>
> Just out of curiosity:
>
> What's the purpose of doing that via io_uring? Non-blocking sys_tee()
> just shoves around some metadata, it doesn't do any I/O, right? Is
> this purely for syscall-batching reasons? (And does that mean that you
> would also add syscalls like epoll_wait() and futex() to io_uring?) Or
> is this because you're worried about blocking on the pipe mutex?

From my perspective -- syscall-batching.

But, if you're going to be working with a very large number of file
descriptors, you'll need epoll(). You could do this by building
epoll_wait into io_uring and/or by having a separate uring used only for
I/O, never waiting for completions there, and instead calling epoll()
when there are no ready cqe's. I had assumed that this was already being
looked at because of the definition of IORING_OP_EPOLL_CTL.

----

So, I'd like to take this opportunity to bounce a related thought off of
all of you.

Even with the advent of io_uring, I think the approach of handling a
bunch of IO by marking all of the fds non-blocking and using epoll() in
edge-triggered mode is still valuable. But there is an impedance
mismatch between splice() / tee() and using epoll() this way. (In fact,
this applies to all requests that take both an input and an output fd.)
The request works on two fds, but returns only one status.

In the IO loop, we want to do IO until we receive an EAGAIN and then
mark the fd as blocked. We unblock it when epoll() says we can do IO on
it again. This doesn't work well when we don't know which fd the EAGAIN
was for, so we have to issue a separate poll() request on the involved
fds to find out. Logically, we'd like to get the status of both fds back
from the initial request, but that's not practical because once an error
is detected on one, the other is not examined further.

So, the idea is to introduce a new flag which could be passed to any
request that takes both an input and an output fd. If the flag is clear,
errors are returned exactly as they are now. If the flag is set, and the
error occurred on the output fd, 1 << 30 is added to the error number.
As it would be very rare for errors to be present on both fds
concurrently, this would be practically as good as getting the status of
both fds back simultaneously.
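To make the proposal concrete, a sketch of how a completion loop might
decode results under that scheme. The flag itself does not exist; the
constant and callback names below are invented for illustration, and the
only assumption carried over from the proposal is that an error on the
output fd comes back with 1 << 30 added to the errno value:

#include <errno.h>
#include <stdbool.h>
#include <liburing.h>

#define OUT_FD_ERR_BIT (1 << 30)	/* proposed marker for output-fd errors */

/* block_in_fd()/block_out_fd() stand in for whatever the event loop does
 * to park an fd until epoll() reports it ready again; both hypothetical. */
static void handle_cqe(struct io_uring_cqe *cqe,
		       void (*block_in_fd)(void *req),
		       void (*block_out_fd)(void *req))
{
	void *req = io_uring_cqe_get_data(cqe);

	if (cqe->res >= 0)
		return;			/* success: res is the byte count */

	int err = -cqe->res;
	bool on_output = err & OUT_FD_ERR_BIT;

	err &= ~OUT_FD_ERR_BIT;
	if (err == EAGAIN) {
		/* Park exactly the fd that stalled; no extra poll() needed. */
		if (on_output)
			block_out_fd(req);
		else
			block_in_fd(req);
	}
	/* other errors: report/handle as usual */
}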