* Re: [PATCH] io_uring: store back buffer in case of failure
       [not found] <[email protected]>
@ 2021-06-15 10:01 ` Pavel Begunkov
  2021-06-15 21:51   ` Jens Axboe
  0 siblings, 1 reply; 7+ messages in thread
From: Pavel Begunkov @ 2021-06-15 10:01 UTC (permalink / raw)
  To: Olivier Langlois, Jens Axboe, io-uring, linux-kernel

On 6/11/21 8:46 PM, Olivier Langlois wrote:
> the man page says the following:
>
> If successful, the resulting CQE will have IORING_CQE_F_BUFFER set
> in the flags part of the struct, and the upper IORING_CQE_BUFFER_SHIFT
> bits will contain the ID of the selected buffers.
>
> in order to respect this contract, the buffer is stored back in case of
> an error. There are several reasons to do that:
>
> 1. doing otherwise is counter-intuitive and error-prone (I cannot think
> of a single example of a syscall failing and still requiring the user to
> free the allocated resources), especially when the documentation
> explicitly mentions that this is the behavior to expect.
>
> 2. it is inefficient because the buffer is unneeded: there is no data
> to transfer back to the user, and the buffer would have to be returned
> to io_uring anyway to avoid a leak.
>
> Signed-off-by: Olivier Langlois <[email protected]>
> ---
>  fs/io_uring.c | 97 +++++++++++++++++++++++++++++++++------------------
>  1 file changed, 64 insertions(+), 33 deletions(-)
>
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 42380ed563c4..502d7cd81a8c 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c

[...]

> +static unsigned int io_put_kbuf(struct io_kiocb *req, struct io_buffer *kbuf,
> +				u16 bgid, long res, unsigned int issue_flags)
>  {
>  	unsigned int cflags;
> +	struct io_ring_ctx *ctx = req->ctx;
> +	struct io_buffer *head;
> +	bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
>
>  	cflags = kbuf->bid << IORING_CQE_BUFFER_SHIFT;
>  	cflags |= IORING_CQE_F_BUFFER;
>  	req->flags &= ~REQ_F_BUFFER_SELECTED;
> -	kfree(kbuf);
> +
> +	/*
> +	 * Theoretically, res == 0 could be included as well, but that would
> +	 * break the contract established in the man page saying that
> +	 * a buffer is returned if the operation is successful.
> +	 */
> +	if (unlikely(res < 0)) {
> +		io_ring_submit_lock(ctx, !force_nonblock);

io_complete_rw() is called from an IRQ context, so it can't sleep/wait.

> +
> +		lockdep_assert_held(&ctx->uring_lock);
> +
> +		head = xa_load(&ctx->io_buffers, bgid);
> +		if (head) {
> +			list_add_tail(&kbuf->list, &head->list);
> +			cflags = 0;
> +		} else {
> +			INIT_LIST_HEAD(&kbuf->list);
> +			if (!xa_insert(&ctx->io_buffers, bgid, kbuf, GFP_KERNEL))
> +				cflags = 0;
> +		}
> +		io_ring_submit_unlock(ctx, !force_nonblock);
> +	}
> +	if (cflags)
> +		kfree(kbuf);
>  	return cflags;
>  }

[...]

> -static inline void io_rw_done(struct kiocb *kiocb, ssize_t ret)
> +static inline void io_rw_done(struct kiocb *kiocb, ssize_t ret,
> +			      unsigned int issue_flags)
>  {
>  	switch (ret) {
>  	case -EIOCBQUEUED:
> @@ -2728,7 +2775,7 @@ static inline void io_rw_done(struct kiocb *kiocb, ssize_t ret)
>  		ret = -EINTR;
>  		fallthrough;
>  	default:
> -		kiocb->ki_complete(kiocb, ret, 0);
> +		kiocb->ki_complete(kiocb, ret, issue_flags);

I don't remember what the second argument of .ki_complete is for, but it
definitely should not be used for issue_flags, e.g. because it would then
have two different meanings depending on the context -- either a result
from the block layer or issue_flags.

-- 
Pavel Begunkov

^ permalink raw reply	[flat|nested] 7+ messages in thread
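For context on that last point: in kernels of that era the completion
hook was declared roughly as below. This is a sketch reproduced from
memory, not an authoritative copy of the header:

	/* include/linux/fs.h, circa 5.13 -- approximate sketch */
	struct kiocb {
		struct file	*ki_filp;
		loff_t		ki_pos;
		void (*ki_complete)(struct kiocb *iocb, long ret, long ret2);
		/* ... */
	};

Other callers of ->ki_complete() (e.g. the block layer's completion
paths) pass their own second result there, often just 0, so overloading
the ret2 slot with issue_flags would give the same parameter different
meanings depending on who invoked it -- which is the objection above.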
* Re: [PATCH] io_uring: store back buffer in case of failure
  2021-06-15 10:01 ` [PATCH] io_uring: store back buffer in case of failure Pavel Begunkov
@ 2021-06-15 21:51   ` Jens Axboe
  2021-06-16 13:42     ` Olivier Langlois
  0 siblings, 1 reply; 7+ messages in thread
From: Jens Axboe @ 2021-06-15 21:51 UTC (permalink / raw)
  To: Pavel Begunkov, Olivier Langlois, io-uring, linux-kernel

Ditto for this one, don't see it in my email nor on the list.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 7+ messages in thread
* Re: [PATCH] io_uring: store back buffer in case of failure
  2021-06-15 21:51 ` Jens Axboe
@ 2021-06-16 13:42   ` Olivier Langlois
  2021-06-16 14:01     ` Pavel Begunkov
  0 siblings, 1 reply; 7+ messages in thread
From: Olivier Langlois @ 2021-06-16 13:42 UTC (permalink / raw)
  To: Jens Axboe, Pavel Begunkov, io-uring, linux-kernel

On Tue, 2021-06-15 at 15:51 -0600, Jens Axboe wrote:
> Ditto for this one, don't see it in my email nor on the list.
>
I can resend you a private copy of this one, but as Pavel pointed out,
it contains fatal flaws.

So unless someone can tell me that the idea is interesting and has
potential, and can give me a hint or 2 about how to address the
challenges and fix the current flaws, it is pretty much a show stopper
to me and I think that I am going to let it go...

^ permalink raw reply	[flat|nested] 7+ messages in thread
* Re: [PATCH] io_uring: store back buffer in case of failure
  2021-06-16 13:42 ` Olivier Langlois
@ 2021-06-16 14:01   ` Pavel Begunkov
  2021-06-16 14:44     ` Jens Axboe
  0 siblings, 1 reply; 7+ messages in thread
From: Pavel Begunkov @ 2021-06-16 14:01 UTC (permalink / raw)
  To: Olivier Langlois, Jens Axboe, io-uring, linux-kernel

On 6/16/21 2:42 PM, Olivier Langlois wrote:
> On Tue, 2021-06-15 at 15:51 -0600, Jens Axboe wrote:
>> Ditto for this one, don't see it in my email nor on the list.
>>
> I can resend you a private copy of this one, but as Pavel pointed out,
> it contains fatal flaws.
>
> So unless someone can tell me that the idea is interesting and has
> potential, and can give me a hint or 2 about how to address the
> challenges and fix the current flaws, it is pretty much a show stopper
> to me and I think that I am going to let it go...

It'd need to go through some other context, e.g. a task context.
task_work_add() + a custom handler would work, or the buf-select
synchronisation could be reworked, but both would rather be bulky
and not great.

I'm more curious whether we ever hit the REQ_F_BUFFER_SELECTED branch
in io_clean_op(), because it would just drop the id on the floor... It
might use a WARN_ONCE.

-- 
Pavel Begunkov

^ permalink raw reply	[flat|nested] 7+ messages in thread
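To illustrate the WARN_ONCE idea above, a minimal sketch of what it
could look like in io_clean_op() -- hypothetical placement against the
code of that time, not an actual patch:

	static void io_clean_op(struct io_kiocb *req)
	{
		/* ... existing cleanup ... */
		if (req->flags & REQ_F_BUFFER_SELECTED) {
			/*
			 * A selected buffer id is being dropped here without
			 * being returned to its group; make that visible once.
			 */
			WARN_ONCE(1, "io_uring: dropping a selected buffer\n");
			/* existing per-opcode buffer release would follow */
		}
		/* ... */
	}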
* Re: [PATCH] io_uring: store back buffer in case of failure
  2021-06-16 14:01 ` Pavel Begunkov
@ 2021-06-16 14:44   ` Jens Axboe
  2021-06-16 15:37     ` Pavel Begunkov
  0 siblings, 1 reply; 7+ messages in thread
From: Jens Axboe @ 2021-06-16 14:44 UTC (permalink / raw)
  To: Pavel Begunkov, Olivier Langlois, io-uring, linux-kernel

On 6/16/21 8:01 AM, Pavel Begunkov wrote:
> On 6/16/21 2:42 PM, Olivier Langlois wrote:
>> On Tue, 2021-06-15 at 15:51 -0600, Jens Axboe wrote:
>>> Ditto for this one, don't see it in my email nor on the list.
>>>
>> I can resend you a private copy of this one, but as Pavel pointed
>> out, it contains fatal flaws.
>>
>> So unless someone can tell me that the idea is interesting and has
>> potential, and can give me a hint or 2 about how to address the
>> challenges and fix the current flaws, it is pretty much a show
>> stopper to me and I think that I am going to let it go...
>
> It'd need to go through some other context, e.g. a task context.
> task_work_add() + a custom handler would work, or the buf-select
> synchronisation could be reworked, but both would rather be bulky
> and not great.

Indeed - that'd both solve the passing around of locking state, which
I really don't like, and make it much simpler. Just use task work for
the re-insert, and then you can grab the ring lock unconditionally from
there.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 7+ messages in thread
* Re: [PATCH] io_uring: store back buffer in case of failure
  2021-06-16 14:44 ` Jens Axboe
@ 2021-06-16 15:37   ` Pavel Begunkov
  2021-06-18 21:26     ` Olivier Langlois
  0 siblings, 1 reply; 7+ messages in thread
From: Pavel Begunkov @ 2021-06-16 15:37 UTC (permalink / raw)
  To: Jens Axboe, Olivier Langlois, io-uring, linux-kernel

On 6/16/21 3:44 PM, Jens Axboe wrote:
> On 6/16/21 8:01 AM, Pavel Begunkov wrote:
>> On 6/16/21 2:42 PM, Olivier Langlois wrote:
>>> On Tue, 2021-06-15 at 15:51 -0600, Jens Axboe wrote:
>>>> Ditto for this one, don't see it in my email nor on the list.
>>>>
>>> I can resend you a private copy of this one, but as Pavel pointed
>>> out, it contains fatal flaws.
>>>
>>> So unless someone can tell me that the idea is interesting and has
>>> potential, and can give me a hint or 2 about how to address the
>>> challenges and fix the current flaws, it is pretty much a show
>>> stopper to me and I think that I am going to let it go...
>>
>> It'd need to go through some other context, e.g. a task context.
>> task_work_add() + a custom handler would work, or the buf-select
>> synchronisation could be reworked, but both would rather be bulky
>> and not great.
>
> Indeed - that'd both solve the passing around of locking state, which
> I really don't like, and make it much simpler. Just use task work for
> the re-insert, and then you can grab the ring lock unconditionally
> from there.

Hmm, it might be much simpler than I thought: allocate a separate
struct callback_head (i.e. a task_work item), queue it with plain
task_work_add() rather than io_req_task_work_add(), and continue with
the request handler.

-- 
Pavel Begunkov

^ permalink raw reply	[flat|nested] 7+ messages in thread
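A rough sketch of the callback_head approach being discussed, with
hypothetical names and only minimal error handling -- a starting point
under the assumptions above, not the actual implementation:

	struct io_buf_putback {
		struct callback_head	cb;
		struct io_ring_ctx	*ctx;
		struct io_buffer	*kbuf;
		u16			bgid;
	};

	/* Runs in task context, so taking the ring lock is safe here. */
	static void io_putback_kbuf_tw(struct callback_head *cb)
	{
		struct io_buf_putback *p =
			container_of(cb, struct io_buf_putback, cb);

		mutex_lock(&p->ctx->uring_lock);
		/* re-insert p->kbuf into the p->bgid group, as in the patch */
		mutex_unlock(&p->ctx->uring_lock);
		kfree(p);
	}

	/* Queueing side, callable from the completion path. */
	static int io_queue_kbuf_putback(struct io_kiocb *req,
					 struct io_buffer *kbuf, u16 bgid)
	{
		struct io_buf_putback *p = kmalloc(sizeof(*p), GFP_ATOMIC);

		if (!p)
			return -ENOMEM;
		p->ctx = req->ctx;
		p->kbuf = kbuf;
		p->bgid = bgid;
		init_task_work(&p->cb, io_putback_kbuf_tw);
		/* plain task_work_add(), not io_req_task_work_add() */
		return task_work_add(req->task, &p->cb, TWA_SIGNAL);
	}

If task_work_add() fails (e.g. the task is exiting), the buffer would
still have to be freed or re-inserted by other means; that is the part
a real version would need to get right.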
* Re: [PATCH] io_uring: store back buffer in case of failure
  2021-06-16 15:37 ` Pavel Begunkov
@ 2021-06-18 21:26   ` Olivier Langlois
  0 siblings, 0 replies; 7+ messages in thread
From: Olivier Langlois @ 2021-06-18 21:26 UTC (permalink / raw)
  To: Pavel Begunkov, Jens Axboe, io-uring, linux-kernel

On Wed, 2021-06-16 at 16:37 +0100, Pavel Begunkov wrote:
> On 6/16/21 3:44 PM, Jens Axboe wrote:
> > On 6/16/21 8:01 AM, Pavel Begunkov wrote:
> > > On 6/16/21 2:42 PM, Olivier Langlois wrote:
> > > > On Tue, 2021-06-15 at 15:51 -0600, Jens Axboe wrote:
> > > > > Ditto for this one, don't see it in my email nor on the list.
> > > > >
> > > > I can resend you a private copy of this one, but as Pavel
> > > > pointed out, it contains fatal flaws.
> > > >
> > > > So unless someone can tell me that the idea is interesting and
> > > > has potential, and can give me a hint or 2 about how to address
> > > > the challenges and fix the current flaws, it is pretty much a
> > > > show stopper to me and I think that I am going to let it go...
> > >
> > > It'd need to go through some other context, e.g. a task context.
> > > task_work_add() + a custom handler would work, or the buf-select
> > > synchronisation could be reworked, but both would rather be
> > > bulky and not great.
> >
> > Indeed - that'd both solve the passing around of locking state,
> > which I really don't like, and make it much simpler. Just use task
> > work for the re-insert, and then you can grab the ring lock
> > unconditionally from there.
>
> Hmm, it might be much simpler than I thought: allocate a separate
> struct callback_head (i.e. a task_work item), queue it with plain
> task_work_add() rather than io_req_task_work_add(), and continue
> with the request handler.
>
ok, thx a lot for the excellent suggestions! I think that you have
provided me with everything that I need to give a second version of
this patch a shot.

^ permalink raw reply	[flat|nested] 7+ messages in thread
end of thread, other threads: [~2021-06-18 21:26 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <[email protected]>
2021-06-15 10:01 ` [PATCH] io_uring: store back buffer in case of failure Pavel Begunkov
2021-06-15 21:51   ` Jens Axboe
2021-06-16 13:42     ` Olivier Langlois
2021-06-16 14:01       ` Pavel Begunkov
2021-06-16 14:44         ` Jens Axboe
2021-06-16 15:37           ` Pavel Begunkov
2021-06-18 21:26             ` Olivier Langlois
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox