From: Ming Lei <[email protected]>
To: Pavel Begunkov <[email protected]>
Cc: Jens Axboe <[email protected]>,
[email protected], [email protected],
Kevin Wolf <[email protected]>,
[email protected]
Subject: Re: [PATCH V6 4/8] io_uring: support SQE group
Date: Sun, 6 Oct 2024 11:54:08 +0800 [thread overview]
Message-ID: <ZwIJ4Hn52-tm22Z8@fedora> (raw)
In-Reply-To: <[email protected]>
On Fri, Oct 04, 2024 at 02:12:28PM +0100, Pavel Begunkov wrote:
> On 9/12/24 11:49, Ming Lei wrote:
> ...
> > --- a/io_uring/io_uring.c
> > +++ b/io_uring/io_uring.c
> > @@ -111,13 +111,15 @@
> ...
> > +static void io_complete_group_member(struct io_kiocb *req)
> > +{
> > + struct io_kiocb *lead = get_group_leader(req);
> > +
> > + if (WARN_ON_ONCE(!(req->flags & REQ_F_SQE_GROUP) ||
> > + lead->grp_refs <= 0))
> > + return;
> > +
> > + /* member CQE needs to be posted first */
> > + if (!(req->flags & REQ_F_CQE_SKIP))
> > + io_req_commit_cqe(req->ctx, req);
> > +
> > + req->flags &= ~REQ_F_SQE_GROUP;
>
> I can't say I like this implicit state machine too much,
> but let's add a comment why we need to clear it. i.e.
> it seems it wouldn't be needed if not for the
> mark_last_group_member() below that puts it back to tunnel
> the leader to io_free_batch_list().
Yeah, the main purpose is for reusing the flag for marking last
member, will add comment for this usage.
>
> > +
> > + /* Set leader as failed in case of any member failed */
> > + if (unlikely((req->flags & REQ_F_FAIL)))
> > + req_set_fail(lead);
> > +
> > + if (!--lead->grp_refs) {
> > + mark_last_group_member(req);
> > + if (!(lead->flags & REQ_F_CQE_SKIP))
> > + io_req_commit_cqe(lead->ctx, lead);
> > + } else if (lead->grp_refs == 1 && (lead->flags & REQ_F_SQE_GROUP)) {
> > + /*
> > + * The single uncompleted leader will degenerate to plain
> > + * request, so group leader can be always freed via the
> > + * last completed member.
> > + */
> > + lead->flags &= ~REQ_F_SQE_GROUP_LEADER;
>
> What does this try to handle? A group with a leader but no
> members? If that's the case, io_group_sqe() and io_submit_state_end()
> just need to fail such groups (and clear REQ_F_SQE_GROUP before
> that).
The code block allows to issue leader and members concurrently, but
we have changed to always issue members after leader is completed, so
the above code can be removed now.
>
> > + }
> > +}
> > +
> > +static void io_complete_group_leader(struct io_kiocb *req)
> > +{
> > + WARN_ON_ONCE(req->grp_refs <= 1);
> > + req->flags &= ~REQ_F_SQE_GROUP;
> > + req->grp_refs -= 1;
> > +}
> > +
> > +static void io_complete_group_req(struct io_kiocb *req)
> > +{
> > + if (req_is_group_leader(req))
> > + io_complete_group_leader(req);
> > + else
> > + io_complete_group_member(req);
> > +}
> > +
> > static void io_req_complete_post(struct io_kiocb *req, unsigned issue_flags)
> > {
> > struct io_ring_ctx *ctx = req->ctx;
> > @@ -890,7 +1005,8 @@ static void io_req_complete_post(struct io_kiocb *req, unsigned issue_flags)
> > * Handle special CQ sync cases via task_work. DEFER_TASKRUN requires
> > * the submitter task context, IOPOLL protects with uring_lock.
> > */
> > - if (ctx->task_complete || (ctx->flags & IORING_SETUP_IOPOLL)) {
> > + if (ctx->task_complete || (ctx->flags & IORING_SETUP_IOPOLL) ||
> > + req_is_group_leader(req)) {
>
> We're better to push all group requests to io_req_task_complete(),
> not just a group leader. While seems to be correct, that just
> overcomplicates the request's flow, it can post a CQE here, but then
> still expect to do group stuff in the CQE posting loop
> (flush_completions -> io_complete_group_req), which might post another
> cqe for the leader, and then do yet another post processing loop in
> io_free_batch_list().
OK, it is simpler to complete all group reqs via tw.
>
>
> > req->io_task_work.func = io_req_task_complete;
> > io_req_task_work_add(req);
> > return;
> > @@ -1388,11 +1504,43 @@ static void io_free_batch_list(struct io_ring_ctx *ctx,
> > comp_list);
> > if (unlikely(req->flags & IO_REQ_CLEAN_SLOW_FLAGS)) {
> > + if (req_is_last_group_member(req) ||
> > + req_is_group_leader(req)) {
> > + struct io_kiocb *leader;
> > +
> > + /* Leader is freed via the last member */
> > + if (req_is_group_leader(req)) {
> > + if (req->grp_link)
> > + io_queue_group_members(req);
> > + node = req->comp_list.next;
> > + continue;
> > + }
> > +
> > + /*
> > + * Prepare for freeing leader since we are the
> > + * last group member
> > + */
> > + leader = get_group_leader(req);
> > + leader->flags &= ~REQ_F_SQE_GROUP_LEADER;
> > + req->flags &= ~REQ_F_SQE_GROUP;
> > + /*
> > + * Link leader to current request's next,
> > + * this way works because the iterator
> > + * always check the next node only.
> > + *
> > + * Be careful when you change the iterator
> > + * in future
> > + */
> > + wq_stack_add_head(&leader->comp_list,
> > + &req->comp_list);
> > + }
> > +
> > if (req->flags & REQ_F_REFCOUNT) {
> > node = req->comp_list.next;
> > if (!req_ref_put_and_test(req))
> > continue;
> > }
> > +
> > if ((req->flags & REQ_F_POLLED) && req->apoll) {
> > struct async_poll *apoll = req->apoll;
> > @@ -1427,8 +1575,16 @@ void __io_submit_flush_completions(struct io_ring_ctx *ctx)
> > struct io_kiocb *req = container_of(node, struct io_kiocb,
> > comp_list);
> > - if (!(req->flags & REQ_F_CQE_SKIP))
> > - io_req_commit_cqe(ctx, req);
> > + if (unlikely(req->flags & (REQ_F_CQE_SKIP | REQ_F_SQE_GROUP))) {
> > + if (req->flags & REQ_F_SQE_GROUP) {
> > + io_complete_group_req(req);
> > + continue;
> > + }
> > +
> > + if (req->flags & REQ_F_CQE_SKIP)
> > + continue;
> > + }
> > + io_req_commit_cqe(ctx, req);
> > }
> > __io_cq_unlock_post(ctx);
> > @@ -1638,8 +1794,12 @@ static u32 io_get_sequence(struct io_kiocb *req)
> > struct io_kiocb *cur;
> > /* need original cached_sq_head, but it was increased for each req */
> > - io_for_each_link(cur, req)
> > - seq--;
> > + io_for_each_link(cur, req) {
> > + if (req_is_group_leader(cur))
> > + seq -= cur->grp_refs;
> > + else
> > + seq--;
> > + }
> > return seq;
> > }
> ...
> > @@ -2217,8 +2470,22 @@ static void io_submit_state_end(struct io_ring_ctx *ctx)
> > {
> > struct io_submit_state *state = &ctx->submit_state;
> > - if (unlikely(state->link.head))
> > - io_queue_sqe_fallback(state->link.head);
> > + if (unlikely(state->group.head || state->link.head)) {
> > + /* the last member must set REQ_F_SQE_GROUP */
> > + if (state->group.head) {
> > + struct io_kiocb *lead = state->group.head;
> > +
> > + state->group.last->grp_link = NULL;
> > + if (lead->flags & IO_REQ_LINK_FLAGS)
> > + io_link_sqe(&state->link, lead);
> > + else
> > + io_queue_sqe_fallback(lead);
>
> req1(F_LINK), req2(F_GROUP), req3
>
> is supposed to be turned into
>
> req1 -> {group: req2 (lead), req3 }
>
> but note that req2 here doesn't have F_LINK set.
> I think it should be like this instead:
>
> if (state->link.head)
> io_link_sqe();
> else
> io_queue_sqe_fallback(lead);
Indeed, the above change is correct.
Thanks,
Ming
next prev parent reply other threads:[~2024-10-06 3:54 UTC|newest]
Thread overview: 47+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-09-12 10:49 [PATCH V6 0/8] io_uring: support sqe group and provide group kbuf Ming Lei
2024-09-12 10:49 ` [PATCH V6 1/8] io_uring: add io_link_req() helper Ming Lei
2024-09-12 10:49 ` [PATCH V6 2/8] io_uring: add io_submit_fail_link() helper Ming Lei
2024-09-12 10:49 ` [PATCH V6 3/8] io_uring: add helper of io_req_commit_cqe() Ming Lei
2024-09-12 10:49 ` [PATCH V6 4/8] io_uring: support SQE group Ming Lei
2024-10-04 13:12 ` Pavel Begunkov
2024-10-06 3:54 ` Ming Lei [this message]
2024-10-09 11:53 ` Pavel Begunkov
2024-10-09 12:14 ` Ming Lei
2024-09-12 10:49 ` [PATCH V6 5/8] io_uring: support sqe group with members depending on leader Ming Lei
2024-10-04 13:18 ` Pavel Begunkov
2024-10-06 3:54 ` Ming Lei
2024-09-12 10:49 ` [PATCH V6 6/8] io_uring: support providing sqe group buffer Ming Lei
2024-10-04 15:32 ` Pavel Begunkov
2024-10-06 8:20 ` Ming Lei
2024-10-09 14:25 ` Pavel Begunkov
2024-10-10 3:00 ` Ming Lei
2024-10-10 18:51 ` Pavel Begunkov
2024-10-11 2:00 ` Ming Lei
2024-10-11 4:06 ` Ming Lei
2024-10-06 9:47 ` Ming Lei
2024-10-09 11:57 ` Pavel Begunkov
2024-10-09 12:21 ` Ming Lei
2024-09-12 10:49 ` [PATCH V6 7/8] io_uring/uring_cmd: support provide group kernel buffer Ming Lei
2024-10-04 15:44 ` Pavel Begunkov
2024-10-06 8:46 ` Ming Lei
2024-10-09 15:14 ` Pavel Begunkov
2024-10-10 3:28 ` Ming Lei
2024-10-10 15:48 ` Pavel Begunkov
2024-10-10 19:31 ` Jens Axboe
2024-10-11 2:30 ` Ming Lei
2024-10-11 2:39 ` Jens Axboe
2024-10-11 3:07 ` Ming Lei
2024-10-11 13:24 ` Jens Axboe
2024-10-11 14:20 ` Ming Lei
2024-10-11 14:41 ` Jens Axboe
2024-10-11 15:45 ` Ming Lei
2024-10-11 16:49 ` Jens Axboe
2024-10-12 3:35 ` Ming Lei
2024-10-14 18:40 ` Pavel Begunkov
2024-10-15 11:05 ` Ming Lei
2024-09-12 10:49 ` [PATCH V6 8/8] ublk: support provide io buffer Ming Lei
2024-10-17 22:31 ` Uday Shankar
2024-10-18 0:45 ` Ming Lei
2024-09-26 10:27 ` [PATCH V6 0/8] io_uring: support sqe group and provide group kbuf Ming Lei
2024-09-26 12:18 ` Jens Axboe
2024-09-26 19:46 ` Pavel Begunkov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZwIJ4Hn52-tm22Z8@fedora \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox