public inbox for [email protected]
From: Ming Lei <[email protected]>
To: Pavel Begunkov <[email protected]>
Cc: Jens Axboe <[email protected]>,
	[email protected], [email protected],
	Miklos Szeredi <[email protected]>,
	ZiyangZhang <[email protected]>,
	Xiaoguang Wang <[email protected]>,
	Bernd Schubert <[email protected]>,
	[email protected]
Subject: Re: [PATCH V2 00/17] io_uring/ublk: add IORING_OP_FUSED_CMD
Date: Thu, 9 Mar 2023 10:05:37 +0800
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>

On Wed, Mar 08, 2023 at 04:22:15PM +0000, Pavel Begunkov wrote:
> On 3/8/23 01:08, Ming Lei wrote:
> > On Tue, Mar 07, 2023 at 03:37:21PM +0000, Pavel Begunkov wrote:
> > > On 3/7/23 14:15, Ming Lei wrote:
> > > > Hello,
> > > > 
> > > > Add IORING_OP_FUSED_CMD, it is one special URING_CMD, which has to
> > > > be SQE128. The 1st SQE(master) is one 64byte URING_CMD, and the 2nd
> > > > 64byte SQE(slave) is another normal 64byte OP. For any OP which needs
> > > > to support slave OP, io_issue_defs[op].fused_slave needs to be set as 1,
> > > > and its ->issue() can retrieve/import buffer from master request's
> > > > fused_cmd_kbuf. The slave OP is actually submitted from kernel, part of
> > > > this idea is from Xiaoguang's ublk ebpf patchset, but this patchset
> > > > submits slave OP just like normal OP issued from userspace, that said,
> > > > SQE order is kept, and batching handling is done too.
> > > 
> > >  From a quick look through patches it all looks a bit complicated
> > > and intrusive, all over generic hot paths. I think instead we
> > 
> > Really? The main change to generic hot paths are adding one 'true/false'
> > parameter to io_init_req(). For others, the change is just check on
> > req->flags or issue_flags, which is basically zero cost.
> 
> Extra flag in io_init_req() but also exporting it, which is an
> internal function, to non-core code. Additionally it un-inlines it

We can make it inline for core code only.

> and even looks recurse calls it (max depth 2). From a quick look,

The recursive call is only made for fused commands, so it won't be an
issue for normal OPs.

> there is some hand coded ->cached_refs manipulations, it takes extra
> space in generic sections of io_kiocb.

Yeah, but that is still done for fused commands only. I think people
are happy to pay the cost for the benefit, and we do not cause trouble
for others.

> It makes all cmd users to
> check for IO_URING_F_FUSED. There is also a two-way dependency b/w

The check is zero cost, and it is only there to avoid adding a
->fused_cmd() callback; with such a callback, the check can be killed.
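
As a minimal sketch of the kind of check being discussed, assuming a
command handler that does not support fused commands simply rejects
them: the names, the flag bit value and the error code below are all
hypothetical stand-ins and are not taken from the patchset.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for IO_URING_F_FUSED; the real bit value is
 * defined by the patchset and is not reproduced here. */
#define SKETCH_F_FUSED (1u << 5)

/* Sketch of a command handler that does not support fused commands.
 * The single flag test is the whole cost on the non-fused path.
 * Returning -EOPNOTSUPP is an assumption made for illustration. */
static int sketch_plain_cmd(unsigned int issue_flags)
{
	if (issue_flags & SKETCH_F_FUSED)
		return -EOPNOTSUPP;

	/* ... normal command handling would go here ... */
	return 0;
}
```

A dedicated callback would move this test out of every handler, at the
price of one more hook in the command interface.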

> requests, which never plays out well, e.g. I still hate how linked
> timeouts stick out in generic paths.

I would appreciate it if you could explain that in detail.

Yeah, part of the fused command's job is to submit one new IO and wait
for its completion. The slave request is actually invisible in the
linked list; only the fused command can be in the linked list.

> 
> Depending on SQE128 also doesn't seem right, though it can be dealt
> with, e.g. sth like how it's done with links requests.

I thought about handling it with linked requests, but we need the fused
command to complete after the slave request is done, and that becomes a
deadlock if the two are linked together.

SQE128 is a per-context feature. When we need to submit a uring SQE128
command, the same ring is required to handle IO, so IMO it is perfect
for this case, at least for ublk.
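
To make the layout concrete, here is a minimal sketch of one SQE128
slot as the cover letter describes it: the first 64 bytes hold the
master URING_CMD and the second 64 bytes hold the slave OP. The struct
names and opcode values are hypothetical stand-ins; the real layout is
struct io_uring_sqe, and only its leading opcode byte is modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one 64-byte SQE. Only the leading opcode
 * byte is modelled; the remaining bytes are opaque padding here. */
struct sketch_sqe {
	uint8_t opcode;
	uint8_t rest[63];
};

/* One SQE128 slot: master command first, slave OP second. */
struct sketch_sqe128 {
	struct sketch_sqe master;
	struct sketch_sqe slave;
};

_Static_assert(sizeof(struct sketch_sqe) == 64, "one SQE is 64 bytes");
_Static_assert(sizeof(struct sketch_sqe128) == 128, "slot is 128 bytes");

/* Made-up opcode numbers, for illustration only. */
enum { SKETCH_OP_FUSED_CMD = 1, SKETCH_OP_READ = 2 };

/* Fill one slot: the master is always the fused command, and the
 * slave is whatever normal OP the caller picks. */
static void sketch_fill_slot(struct sketch_sqe128 *slot, uint8_t slave_op)
{
	memset(slot, 0, sizeof(*slot));
	slot->master.opcode = SKETCH_OP_FUSED_CMD;
	slot->slave.opcode = slave_op;
}

/* Byte offset of the slave SQE inside the 128-byte slot. */
static size_t sketch_slave_offset(void)
{
	return offsetof(struct sketch_sqe128, slave);
}
```

Since both halves live in one slot of the same ring, the master and the
slave are always submitted together.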

> 
> > > should be able to use registered buffer table as intermediary and
> > > reuse splicing. Let me try it out
> > 
> > I will take a look at you patch, but last time, Linus has pointed out that
> > splice isn't one good way, in which buffer ownership transferring is one big
> > issue for writing data to page retrieved from pipe.
> 
> There are no real pipes, better to say io_uring replaces a pipe,
> and splice bits are used to get pages from a file. Though, there
> will be some common problems. Thanks for the link, I'll need to
> get through it first, thanks for the link

Yeah, here the only value of the pipe is to reuse the ->splice_read()
interface, which is why I came up with the fused command for this job.
I am open to other approaches, if the problem can be solved (reliably
and efficiently).

Thanks, 
Ming

