From: Ming Lei <[email protected]>
To: Pavel Begunkov <[email protected]>
Cc: Keith Busch <[email protected]>, Keith Busch <[email protected]>,
	[email protected], [email protected],
	[email protected], [email protected]
Subject: Re: [PATCHv2 0/6] ublk zero-copy support
Date: Thu, 13 Feb 2025 09:52:19 +0800
Message-ID: <Z61QU-qxgYhFGQwl@fedora>
In-Reply-To: <[email protected]>

On Wed, Feb 12, 2025 at 04:06:58PM +0000, Pavel Begunkov wrote:
> On 2/12/25 15:28, Keith Busch wrote:
> > On Wed, Feb 12, 2025 at 10:29:32AM +0800, Ming Lei wrote:
> > > It is explained in the following links:
> > > 
> > > https://lore.kernel.org/linux-block/[email protected]/
> > > 
> > > - node kbuffer is registered in ublk uring_cmd's ->issue(), but lookup
> > >    in RW_FIXED OP's ->prep(), and ->prep() is always called before calling
> > >    ->issue() when the two are submitted in same io_uring_enter(), so you
> > >    need to move io_rsrc_node_lookup() & buffer importing from RW_FIXED's ->prep()
> > >    to ->issue() first.
> > 
> > I don't think that's accurate, at least in practice. In a normal flow,
> > we'll have this sequence:
> > 
> >   io_submit_sqes
> >     io_submit_sqe (uring_cmd ublk register)
> >       io_init_req
> >         ->prep()
> >       io_queue_sqe
> >         ->issue()
> >     io_submit_sqe (read/write_fixed)
> >       io_init_req
> >         ->prep()
> >       io_queue_sqe
> >        ->issue()
> > 
> > The first SQE is handled in its entirety before even looking at the
> > subsequent SQE. Since the register is first, then the read/write_fixed's
> > prep will have a valid index. Testing this patch series appears to show
> > this reliably works.
> 
> Ming describes how it works for links. This one is indeed how
> non links are normally executed. Though I'd repeat it's an
> implementation detail and not a part of the uapi. Interestingly,
> Keith, you sent some patches changing the ordering here quite a
> while ago, just as an example of how it can change.

My fault, I should have made clear that I was describing the link or async case.
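
To make the difference between the two flows concrete, here is a toy model
in plain C. It is only a sketch of the ordering described in this thread,
not kernel code: every name in it (toy_req, toy_prep, toy_issue, toy_submit)
is hypothetical. It assumes what was stated above: the register command
installs the buffer in ->issue(), READ_FIXED looks it up in ->prep(),
non-linked SQEs are prepped and issued one at a time, and a link chain is
fully prepped before its head is issued.

```c
#include <stdbool.h>

/* Toy model only: none of these names are kernel symbols. */
enum toy_op { TOY_REGISTER_BUF, TOY_READ_FIXED };

struct toy_req {
	enum toy_op op;
	bool lookup_ok;	/* did the prep-time lookup find the buffer? */
};

/* Stands in for one slot of the ring's buffer table. */
static bool toy_buf_registered;

/* Models READ_FIXED doing the node lookup at prep time. */
static void toy_prep(struct toy_req *req)
{
	if (req->op == TOY_READ_FIXED)
		req->lookup_ok = toy_buf_registered;
}

/* Models the ublk uring_cmd registering the buffer at issue time. */
static void toy_issue(struct toy_req *req)
{
	if (req->op == TOY_REGISTER_BUF)
		toy_buf_registered = true;
}

/*
 * Submit "register, then read_fixed" in one batch and report whether
 * READ_FIXED saw a registered buffer when it was prepped.
 */
static bool toy_submit(bool linked)
{
	struct toy_req reqs[2] = {
		{ TOY_REGISTER_BUF, false },
		{ TOY_READ_FIXED, false },
	};
	int i;

	toy_buf_registered = false;

	if (!linked) {
		/* Non-link: each SQE is prepped and issued before the next. */
		for (i = 0; i < 2; i++) {
			toy_prep(&reqs[i]);
			toy_issue(&reqs[i]);
		}
	} else {
		/* Link: the whole chain is prepped, then issued in order. */
		for (i = 0; i < 2; i++)
			toy_prep(&reqs[i]);
		for (i = 0; i < 2; i++)
			toy_issue(&reqs[i]);
	}

	return reqs[1].lookup_ok;
}
```

In this model toy_submit(false) reports a successful lookup and
toy_submit(true) reports a failed one. Moving the lookup from toy_prep()
into toy_issue() would make both cases succeed, which is the change
requested for RW_FIXED.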

> 
> 
> > > - secondly, ->issue() order is only respected by IO_LINK, and io_uring
> > >    can't provide such guarantee without using IO_LINK:
> > > 
> > >    Pavel explained it in the following link:
> > > 
> > >    https://lore.kernel.org/linux-block/[email protected]/
> > > 
> > >    There are also other examples, such as, register buffer stays in one
> > >    link chain, and the consumer OP isn't in this chain, the consumer OP
> > >    can still be issued before issuing register_buffer.
> > 
> > Yep, I got that. Linking is just something I was hoping to avoid. I
> > understand there are conditions that can break the normal flow I'm
> > relying on regarding  the ordering. This hasn't appeared to be a problem
> > in practice, but I agree this needs to be handled.

LINK and ASYNC both need to be supported; each is useful in some cases.

- IO_LINK is the only way to enforce IO order

  io_uring only supports non-link, or linking everything in one batch

- ASYNC can sometimes avoid calling ->issue() twice unnecessarily, if you
  know in advance that the OP can't be handled asynchronously. That may
  not be a problem for ublk uring_cmd, but it helps for some FS writes
  (writes to unallocated space)
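
For reference, a userspace sketch of how the two flags are set on raw SQEs,
using only the uapi header <linux/io_uring.h>. This is an illustration under
assumptions, not code from the series: fill_linked_pair() is a hypothetical
helper, IORING_OP_NOP is only a placeholder for the ublk register-buffer
uring_cmd (whose opcode and command layout are defined by the patches, not
here), and the addr field is left at 0 because its meaning for a kernel
registered buffer is also defined by the series.

```c
#include <linux/io_uring.h>
#include <string.h>

/*
 * Fill two SQEs that form one link chain: sqes[0] stands in for the
 * register-buffer command, sqes[1] consumes the buffer at buf_index.
 * Nothing is submitted; the caller owns the sqes[] array.
 */
static void fill_linked_pair(struct io_uring_sqe sqes[2], int fd,
			     unsigned short buf_index, unsigned int len,
			     int force_async)
{
	memset(sqes, 0, 2 * sizeof(sqes[0]));

	/* Placeholder opcode: a real user would put the uring_cmd here. */
	sqes[0].opcode = IORING_OP_NOP;
	/* IOSQE_IO_LINK: do not start the next SQE until this one is done. */
	sqes[0].flags = IOSQE_IO_LINK;

	sqes[1].opcode = IORING_OP_WRITE_FIXED;
	sqes[1].fd = fd;
	sqes[1].len = len;
	sqes[1].buf_index = buf_index;
	/* The chain ends at the first SQE without IOSQE_IO_LINK. */
	if (force_async)
		sqes[1].flags |= IOSQE_ASYNC;	/* skip the inline issue attempt */
}
```

With the link flag on the first SQE, the consumer is only issued after the
register command completes, so a lookup done in ->issue() is guaranteed to
see the buffer. A consumer submitted outside this chain gets no such
guarantee.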


Thanks,
Ming

