From: Pavel Begunkov <[email protected]>
To: Ming Lei <[email protected]>, Jens Axboe <[email protected]>,
	[email protected]
Cc: [email protected], Miklos Szeredi <[email protected]>,
	ZiyangZhang <[email protected]>,
	Xiaoguang Wang <[email protected]>,
	Bernd Schubert <[email protected]>
Subject: Re: [PATCH V2 00/17] io_uring/ublk: add IORING_OP_FUSED_CMD
Date: Tue, 7 Mar 2023 15:37:21 +0000
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>

On 3/7/23 14:15, Ming Lei wrote:
> Hello,
> 
> Add IORING_OP_FUSED_CMD, a special URING_CMD that has to use SQE128.
> The 1st SQE (master) is a 64-byte URING_CMD, and the 2nd 64-byte SQE
> (slave) is another normal 64-byte OP. For any OP that needs to support
> being a slave OP, io_issue_defs[op].fused_slave needs to be set to 1,
> and its ->issue() can retrieve/import the buffer from the master
> request's fused_cmd_kbuf. The slave OP is actually submitted from the
> kernel. Part of this idea is from Xiaoguang's ublk ebpf patchset, but
> this patchset submits the slave OP just like a normal OP issued from
> userspace; that is, SQE order is kept and batch handling is done too.

From a quick look through the patches, it all looks a bit complicated
and intrusive, all over generic hot paths. I think instead we
should be able to use the registered buffer table as an intermediary
and reuse splicing. Let me try it out.


> Please see the detailed design in the commit log of the 3rd patch; one
> big point is how to handle buffer ownership.
> 
> This way, it is easy to support zero copy for ublk/fuse devices.
> 
> Basically, userspace can specify any sub-buffer of the ublk block request
> buffer from the fused command just by setting 'offset/len'
> in the slave SQE for running the slave OP. This makes it flexible to
> implement IO mappings: mirror, striped, ...
> 
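To make the 'offset/len' sub-buffer idea quoted above concrete, here is a
minimal sketch, not taken from the patchset, of how a striped target might
split one block request into per-device pieces. The names stripe_split and
struct sub_buf, the chunk size, and the RAID0-style layout are all
assumptions made for illustration only; each (buf_off, len) pair is the kind
of value that would presumably end up as 'offset/len' in a slave SQE.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration type; not a structure from the patchset. */
struct sub_buf {
	unsigned dev;      /* index of the backing device */
	uint64_t dev_off;  /* byte offset on that backing device */
	uint32_t buf_off;  /* 'offset' into the block request buffer */
	uint32_t len;      /* 'len' of this sub-buffer */
};

/*
 * Split the request [req_off, req_off + req_len) of a RAID0-style striped
 * device into per-device pieces. Returns the number of pieces stored in
 * out[], or 0 if the arguments are invalid or out[] is too small.
 */
size_t stripe_split(uint64_t req_off, uint32_t req_len,
		    uint32_t chunk, unsigned ndevs,
		    struct sub_buf *out, size_t max_out)
{
	size_t n = 0;
	uint32_t done = 0;

	if (chunk == 0 || ndevs == 0)
		return 0;

	while (done < req_len) {
		uint64_t pos = req_off + done;
		uint64_t chunk_idx = pos / chunk;
		uint32_t in_chunk = (uint32_t)(pos % chunk);
		uint32_t len = chunk - in_chunk;

		/* Never run past the end of the request. */
		if (len > req_len - done)
			len = req_len - done;
		if (n == max_out)
			return 0;

		out[n].dev = (unsigned)(chunk_idx % ndevs);
		out[n].dev_off = (chunk_idx / ndevs) * chunk + in_chunk;
		out[n].buf_off = done;
		out[n].len = len;
		n++;
		done += len;
	}
	return n;
}
```

With a 64K chunk and two devices, a 128K request starting at offset 32K
splits into three pieces: 32K on device 0, 64K on device 1, and 32K on
device 0 again. A mirror target would instead reuse the same (0, req_len)
sub-buffer once per replica.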
> The 4th & 5th patches enable fused slave support for the following OPs:
> 
> 	OP_READ/OP_WRITE
> 	OP_SEND/OP_RECV/OP_SEND_ZC
> 
> The other ublk patches clean up the ublk driver and implement the fused
> command to support zero copy.
> 
> The userspace code follows:
> 
> https://github.com/ming1/ubdsrv/tree/fused-cmd-zc-v2
> 
> All three ublk targets (loop, nbd and qcow2) support zero copy by passing:
> 
> 	ublk add -t [loop|nbd|qcow2] -z ....
> 
> Basic fs mount, kernel build and builtin tests have been run.
> 
> A liburing test case covering the fused command, based on miniublk
> of blktest, is also added:
> 
> https://github.com/ming1/liburing/commits/fused_cmd_miniublk
> 
> The performance improvement is obvious on memory bandwidth
> related workloads, e.g. a 1~2X improvement in 64K/512K BS
> IO tests on loop with a ramfs backing file.
> 
> Any comments are welcome!
> 
> V2:
> 	- don't reuse io_mapped_ubuf (io_uring)
> 	- remove REQ_F_FUSED_MASTER_BIT (io_uring)
> 	- fix compile warning (io_uring)
> 	- rebase on v6.3-rc1 (io_uring)
> 	- grab io request reference when handling fused command
> 	- simplify ublk_copy_user_pages() by iov iterator
> 	- add read()/write() for userspace to read/write ublk io buffer, so
> 	that some corner cases (read zero, passthrough request (report zones))
> 	can be handled easily in case of zero copy; this way also helps to
> 	switch to zero copy completely
> 	- misc cleanup
> 
> Ming Lei (17):
>    io_uring: add IO_URING_F_FUSED and prepare for supporting OP_FUSED_CMD
>    io_uring: increase io_kiocb->flags into 64bit
>    io_uring: add IORING_OP_FUSED_CMD
>    io_uring: support OP_READ/OP_WRITE for fused slave request
>    io_uring: support OP_SEND_ZC/OP_RECV for fused slave request
>    block: ublk_drv: mark device as LIVE before adding disk
>    block: ublk_drv: add common exit handling
>    block: ublk_drv: don't consider flush request in map/unmap io
>    block: ublk_drv: add two helpers to clean up map/unmap request
>    block: ublk_drv: clean up several helpers
>    block: ublk_drv: cleanup 'struct ublk_map_data'
>    block: ublk_drv: cleanup ublk_copy_user_pages
>    block: ublk_drv: grab request reference when the request is handled by
>      userspace
>    block: ublk_drv: support to copy any part of request pages
>    block: ublk_drv: add read()/write() support for ublk char device
>    block: ublk_drv: don't check buffer in case of zero copy
>    block: ublk_drv: apply io_uring FUSED_CMD for supporting zero copy
> 
>   drivers/block/ublk_drv.c       | 605 ++++++++++++++++++++++++++-------
>   drivers/char/mem.c             |   4 +
>   drivers/nvme/host/ioctl.c      |   9 +
>   include/linux/io_uring.h       |  49 ++-
>   include/linux/io_uring_types.h |  18 +-
>   include/uapi/linux/io_uring.h  |   1 +
>   include/uapi/linux/ublk_cmd.h  |  37 +-
>   io_uring/Makefile              |   2 +-
>   io_uring/fused_cmd.c           | 232 +++++++++++++
>   io_uring/fused_cmd.h           |  11 +
>   io_uring/io_uring.c            |  22 +-
>   io_uring/io_uring.h            |   3 +
>   io_uring/net.c                 |  23 +-
>   io_uring/opdef.c               |  17 +
>   io_uring/opdef.h               |   2 +
>   io_uring/rw.c                  |  20 ++
>   16 files changed, 926 insertions(+), 129 deletions(-)
>   create mode 100644 io_uring/fused_cmd.c
>   create mode 100644 io_uring/fused_cmd.h
> 

-- 
Pavel Begunkov
