From: Ming Lei <[email protected]>
To: Jens Axboe <[email protected]>, [email protected]
Cc: [email protected],
Pavel Begunkov <[email protected]>,
Kevin Wolf <[email protected]>,
[email protected]
Subject: Re: [PATCH V3 5/9] io_uring: support SQE group
Date: Tue, 21 May 2024 10:58:15 +0800
Message-ID: <ZkwNxxUM7jqzpqgg@fedora>
In-Reply-To: <[email protected]>
On Sat, May 11, 2024 at 08:12:08AM +0800, Ming Lei wrote:
> SQE group is defined as one chain of SQEs starting with the first SQE that
> has IOSQE_SQE_GROUP set, and ending with the first subsequent SQE that
> doesn't have it set; it is similar to a chain of linked SQEs.
>
> Unlike linked SQEs, where each SQE is issued only after the previous one has
> completed, all SQEs in one group are submitted in parallel, so there isn't
> any dependency among SQEs in one group.
>
> The 1st SQE is the group leader, and the other SQEs are group members. The
> whole group shares a single IOSQE_IO_LINK and IOSQE_IO_DRAIN from the group
> leader, and the two flags are ignored for group members.
>
> When the group is in one link chain, this group isn't submitted until the
> previous SQE or group is completed. And the following SQE or group can't
> be started if this group isn't completed. Failure from any group member will
> fail the group leader, then the link chain can be terminated.
>
> When IOSQE_IO_DRAIN is set for group leader, all requests in this group and
> previous requests submitted are drained. Given IOSQE_IO_DRAIN can be set for
> group leader only, we respect IO_DRAIN by always completing group leader as
> the last one in the group.
>
> Working together with IOSQE_IO_LINK, SQE group provides a flexible way to
> support N:M dependency, such as:
>
> - group A is chained with group B
> - group A has N SQEs
> - group B has M SQEs
>
> then M SQEs in group B depend on N SQEs in group A.
>
> N:M dependency can support some interesting use cases in an efficient way:
>
> 1) read from multiple files, then write the read data into single file
>
> 2) read from single file, and write the read data into multiple files
>
> 3) write the same data into multiple files, then read the data back from
> those files and compare whether the correct data was written
>
> Also IOSQE_SQE_GROUP takes the last bit in sqe->flags, but we can still
> extend sqe->flags with one uring context flag, such as using __pad3 for
> non-uring_cmd OPs and part of uring_cmd_flags for uring_cmd OP.
>
> Suggested-by: Kevin Wolf <[email protected]>
> Signed-off-by: Ming Lei <[email protected]>
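
A minimal sketch of the group definition quoted above, written as plain C with no liburing dependency. It walks an array of per-SQE flag bytes and reports how many SQEs fall into each group: a group starts at an SQE with the group flag set and ends, inclusively, at the first following SQE without it. The names `SKETCH_SQE_GROUP` and `sketch_split_groups` are hypothetical, and the bit position (bit 7) is an assumption based on the statement that the flag takes the last bit in sqe->flags. This is an illustration, not the kernel's parsing code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the group flag occupies the last bit (bit 7) of sqe->flags. */
#define SKETCH_SQE_GROUP (1U << 7)

/*
 * Split n flag bytes into units and store each unit's size in sizes[].
 * A unit is either a group (a run of flagged SQEs plus the first unflagged
 * SQE that closes it) or a lone SQE without the flag (size 1).
 * Returns the number of units written, at most max_units.
 */
size_t sketch_split_groups(const uint8_t *flags, size_t n,
                           size_t *sizes, size_t max_units)
{
    size_t units = 0;
    size_t i = 0;

    assert(flags != NULL || n == 0);

    while (i < n && units < max_units) {
        size_t start = i;

        /* Consume every SQE that carries the group flag. */
        while (i < n && (flags[i] & SKETCH_SQE_GROUP))
            i++;

        /* The first SQE without the flag closes the group (or stands alone). */
        if (i < n)
            i++;

        sizes[units++] = i - start;
    }
    return units;
}
```

For four READs followed by a linked WRITE, the flag bytes would be group+link on the leader, group on the next two READs, and no group flag on the fourth READ or the WRITE; the function then reports one group of 4 followed by a single SQE.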
BTW, I wrote a liburing example, link-grp-cp.c [1], which is based on sqe
group. It keeps the queue depth (QD) unchanged and only re-organizes the IOs
in the following way:
- each group has 4 READ IOs, linked to one single WRITE IO that writes
the data read by the sqe group to the destination file
- the first 12 groups have (4 + 1) IOs, and the last group has (3 + 1)
IOs
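
The layout above can be sketched as simple arithmetic. The snippet below assumes a queue depth of 64, which is not stated in this mail; it is merely the value for which 12 * (4 + 1) + (3 + 1) adds up. The names `sketch_layout` and `sketch_plan` are hypothetical and do not come from link-grp-cp.c.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_layout {
    size_t full_groups; /* groups with reads_per_group READs + 1 WRITE */
    size_t tail_reads;  /* READs in the final short group, 0 if none */
};

/*
 * Divide qd request slots into units of (reads_per_group READs + 1 WRITE).
 * Any leftover slots form one shorter group, provided there is room for
 * at least one READ and its WRITE.
 */
struct sketch_layout sketch_plan(size_t qd, size_t reads_per_group)
{
    struct sketch_layout l;
    size_t unit;
    size_t rest;

    assert(reads_per_group > 0);

    unit = reads_per_group + 1;
    rest = qd % unit;

    l.full_groups = qd / unit;
    l.tail_reads = rest >= 2 ? rest - 1 : 0;
    return l;
}
```

With `sketch_plan(64, 4)` this yields 12 full groups and a tail group of 3 READs, matching the split described above.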
I ran the example to copy between two block devices (from virtio-blk to
virtio-scsi in my test VM):
1) buffered copy:
- performance is improved by 5%
2) direct IO mode:
- performance is improved by 27%
[1] link-grp-cp.c example
https://github.com/ming1/liburing/commits/sqe_group_v2/
[2] one bug fix (top commit) against V3
https://github.com/ming1/linux/commits/io_uring_sqe_group_v3/
Thanks,
Ming