From: Dylan Yudaken <[email protected]>
To: Jens Axboe <[email protected]>, Pavel Begunkov <[email protected]>
Cc: <[email protected]>, <[email protected]>,
	Dylan Yudaken <[email protected]>
Subject: [PATCH for-next 08/10] io_uring: allow defer completion for aux posted cqes
Date: Mon, 21 Nov 2022 02:03:51 -0800
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>

Multishot ops cannot use the compl_reqs list because the request must stay
in the poll list. As a result, each completion has to be posted
individually, without benefiting from batching.

Introduce batching infrastructure for small (i.e. 16-byte) CQEs only.
This restriction is fine because no current use case posts 32-byte
CQEs.

Keep a batch of up to 16 posted results in the ring's submit state, and
flush it in the same way as compl_reqs.
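
As an illustration only, the batching idea can be modelled in userspace
roughly as below. This is a sketch, not the kernel code: every model_*
name and RING_ENTRIES are hypothetical, and flushing when the batch is
full is an assumption of the model (this patch only adds the buffer and
the flush path).

```c
#include <stdint.h>
#include <string.h>

#define BATCH_MAX	16	/* mirrors cqes[16] in the patch */
#define RING_ENTRIES	64	/* hypothetical ring size for this model */

/* 16-byte CQE layout: user_data, res, flags */
struct model_cqe {
	uint64_t user_data;
	int32_t res;
	uint32_t flags;
};

struct model_ring {
	struct model_cqe cq[RING_ENTRIES];	/* stand-in for the CQ ring */
	unsigned int tail;			/* CQEs made visible so far */
	unsigned int flushes;			/* number of flush passes */
	struct model_cqe batch[BATCH_MAX];	/* deferred CQEs */
	unsigned int batch_count;
};

/* Copy all deferred CQEs to the ring in order, then reset the batch. */
static void model_flush(struct model_ring *r)
{
	unsigned int i;

	if (!r->batch_count)
		return;
	for (i = 0; i < r->batch_count; i++)
		r->cq[r->tail++ % RING_ENTRIES] = r->batch[i];
	r->batch_count = 0;
	r->flushes++;
}

/* Defer one CQE. Flushing when full is an assumption of this model. */
static void model_post(struct model_ring *r, uint64_t user_data,
		       int32_t res, uint32_t flags)
{
	if (r->batch_count == BATCH_MAX)
		model_flush(r);
	r->batch[r->batch_count].user_data = user_data;
	r->batch[r->batch_count].res = res;
	r->batch[r->batch_count].flags = flags;
	r->batch_count++;
}

/* Post n CQEs, flush the remainder, return the number of flush passes. */
static unsigned int model_run(struct model_ring *r, unsigned int n)
{
	unsigned int i;

	memset(r, 0, sizeof(*r));
	for (i = 0; i < n; i++)
		model_post(r, i, 0, 0);
	model_flush(r);
	return r->flushes;
}
```

In this model, model_run(&r, 40) makes 40 CQEs visible in 3 flush passes
rather than 40 individual posts, which is the saving batching is after.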

The batch size of 16 was chosen through experimentation with a
microbenchmark ([1]), while trying not to grow the ring too much. This
increases the size of the ring from 1216 to 1472 bytes.
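
The size delta can be checked with simple arithmetic. The sketch below
assumes the growth comes entirely from the 16-entry CQE array and that
the new counter fits into existing padding; struct cqe16 is a
hypothetical mirror of the 16-byte CQE layout, not the kernel type.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of a small (16-byte) CQE. */
struct cqe16 {
	uint64_t user_data;
	int32_t res;
	uint32_t flags;
};

enum {
	BATCH_ENTRIES = 16,	/* cqes[16] in the patch */
	OLD_CTX_SIZE = 1216,	/* size quoted before the change */
	NEW_CTX_SIZE = 1472,	/* size quoted after the change */
};

/* Bytes added by the batch array alone: 16 entries * 16 bytes = 256. */
static size_t batch_bytes(void)
{
	return BATCH_ENTRIES * sizeof(struct cqe16);
}

/* Compile-time check that the quoted sizes match the array size. */
_Static_assert(sizeof(struct cqe16) == 16, "small CQE is 16 bytes");
_Static_assert(OLD_CTX_SIZE + BATCH_ENTRIES * sizeof(struct cqe16) ==
	       NEW_CTX_SIZE, "1216 + 256 == 1472");
```

Under these assumptions the 256-byte difference is exactly the batch
array.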

[1]: https://github.com/DylanZA/liburing/commit/9ac66b36bcf4477bfafeff1c5f107896b7ae31cf
Run with:
  $ make -j && ./benchmark/reg.b -s 1 -t 2000 -r 10
Results by batch size:
baseline	8309 k/s
8		18807 k/s
16		19338 k/s
32		20134 k/s

Signed-off-by: Dylan Yudaken <[email protected]>
---
 include/linux/io_uring_types.h |  2 ++
 io_uring/io_uring.c            | 49 +++++++++++++++++++++++++++++++---
 2 files changed, 48 insertions(+), 3 deletions(-)

diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
index f5b687a787a3..accdfecee953 100644
--- a/include/linux/io_uring_types.h
+++ b/include/linux/io_uring_types.h
@@ -174,7 +174,9 @@ struct io_submit_state {
 	bool			plug_started;
 	bool			need_plug;
 	unsigned short		submit_nr;
+	unsigned int		cqes_count;
 	struct blk_plug		plug;
+	struct io_uring_cqe	cqes[16];
 };
 
 struct io_ev_fd {
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 715ded749110..c797f9a75dfe 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -167,7 +167,8 @@ EXPORT_SYMBOL(io_uring_get_socket);
 
 static inline void io_submit_flush_completions(struct io_ring_ctx *ctx)
 {
-	if (!wq_list_empty(&ctx->submit_state.compl_reqs))
+	if (!wq_list_empty(&ctx->submit_state.compl_reqs) ||
+	    ctx->submit_state.cqes_count)
 		__io_submit_flush_completions(ctx);
 }
 
@@ -807,6 +808,43 @@ bool io_fill_cqe_aux(struct io_ring_ctx *ctx, u64 user_data, s32 res, u32 cflags
 	return io_cqring_event_overflow(ctx, user_data, res, cflags, 0, 0);
 }
 
+static bool __io_fill_cqe_small(struct io_ring_ctx *ctx,
+				 struct io_uring_cqe *cqe)
+{
+	struct io_uring_cqe *cqe_out;
+
+	cqe_out = io_get_cqe(ctx);
+	if (unlikely(!cqe_out)) {
+		return io_cqring_event_overflow(ctx, cqe->user_data,
+						cqe->res, cqe->flags,
+						0, 0);
+	}
+
+	trace_io_uring_complete(ctx, NULL, cqe->user_data,
+				cqe->res, cqe->flags,
+				0, 0);
+
+	memcpy(cqe_out, cqe, sizeof(*cqe_out));
+
+	if (ctx->flags & IORING_SETUP_CQE32) {
+		WRITE_ONCE(cqe_out->big_cqe[0], 0);
+		WRITE_ONCE(cqe_out->big_cqe[1], 0);
+	}
+	return true;
+}
+
+static void __io_flush_post_cqes(struct io_ring_ctx *ctx)
+	__must_hold(&ctx->uring_lock)
+{
+	struct io_submit_state *state = &ctx->submit_state;
+	unsigned int i;
+
+	lockdep_assert_held(&ctx->uring_lock);
+	for (i = 0; i < state->cqes_count; i++)
+		__io_fill_cqe_small(ctx, state->cqes + i);
+	state->cqes_count = 0;
+}
+
 bool io_post_aux_cqe(struct io_ring_ctx *ctx,
 		     u64 user_data, s32 res, u32 cflags)
 {
@@ -1352,6 +1390,9 @@ static void __io_submit_flush_completions(struct io_ring_ctx *ctx)
 	struct io_submit_state *state = &ctx->submit_state;
 
 	io_cq_lock(ctx);
+	/* post must come first to preserve CQE ordering */
+	if (state->cqes_count)
+		__io_flush_post_cqes(ctx);
 	wq_list_for_each(node, prev, &state->compl_reqs) {
 		struct io_kiocb *req = container_of(node, struct io_kiocb,
 					    comp_list);
@@ -1361,8 +1402,10 @@ static void __io_submit_flush_completions(struct io_ring_ctx *ctx)
 	}
 	__io_cq_unlock_post(ctx);
 
-	io_free_batch_list(ctx, state->compl_reqs.first);
-	INIT_WQ_LIST(&state->compl_reqs);
+	if (!wq_list_empty(&ctx->submit_state.compl_reqs)) {
+		io_free_batch_list(ctx, state->compl_reqs.first);
+		INIT_WQ_LIST(&state->compl_reqs);
+	}
 }
 
 /*
-- 
2.30.2


Thread overview: 13+ messages
2022-11-21 10:03 [PATCH for-next 00/10] io_uring: batch multishot completions Dylan Yudaken
2022-11-21 10:03 ` [PATCH for-next 01/10] io_uring: merge io_req_tw_post and io_req_task_complete Dylan Yudaken
2022-11-21 10:03 ` [PATCH for-next 02/10] io_uring: __io_req_complete should defer if available Dylan Yudaken
2022-11-21 10:03 ` [PATCH for-next 03/10] io_uring: split io_req_complete_failed into post/defer Dylan Yudaken
2022-11-21 10:03 ` [PATCH for-next 04/10] io_uring: lock on remove in io_apoll_task_func Dylan Yudaken
2022-11-21 10:03 ` [PATCH for-next 05/10] io_uring: timeout should use io_req_task_complete Dylan Yudaken
2022-11-21 10:03 ` [PATCH for-next 06/10] io_uring: simplify io_issue_sqe Dylan Yudaken
2022-11-21 10:03 ` [PATCH for-next 07/10] io_uring: make io_req_complete_post static Dylan Yudaken
2022-11-21 10:03 ` Dylan Yudaken [this message]
2022-11-21 10:03 ` [PATCH for-next 09/10] io_uring: allow io_post_aux_cqe to defer completion Dylan Yudaken
2022-11-21 16:55   ` Jens Axboe
2022-11-21 17:31   ` Jens Axboe
2022-11-21 10:03 ` [PATCH for-next 10/10] io_uring: allow multishot polled reqs " Dylan Yudaken
