public inbox for [email protected]
* [PATCH 0/6] completion path optimisations
@ 2022-03-21 22:02 Pavel Begunkov
  2022-03-21 22:02 ` [PATCH 1/6] io_uring: small optimisation of tctx_task_work Pavel Begunkov
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: Pavel Begunkov @ 2022-03-21 22:02 UTC (permalink / raw)
  To: io-uring; +Cc: Jens Axboe, asml.silence

Small optimisations and clean-ups. 5/6 in particular removes some overhead
from io_free_batch_list(), which is hot enough to care about.

Pavel Begunkov (6):
  io_uring: small optimisation of tctx_task_work
  io_uring: remove extra ifs around io_commit_cqring
  io_uring: refactor io_req_find_next
  io_uring: optimise io_free_batch_list
  io_uring: move poll recycling later in compl flushing
  io_uring: clean up io_queue_next()

 fs/io_uring.c | 64 +++++++++++++++++++++++++++------------------------
 1 file changed, 34 insertions(+), 30 deletions(-)

-- 
2.35.1


* [PATCH 1/6] io_uring: small optimisation of tctx_task_work
  2022-03-21 22:02 [PATCH 0/6] completion path optimisations Pavel Begunkov
@ 2022-03-21 22:02 ` Pavel Begunkov
  2022-03-21 22:02 ` [PATCH 2/6] io_uring: remove extra ifs around io_commit_cqring Pavel Begunkov
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Pavel Begunkov @ 2022-03-21 22:02 UTC (permalink / raw)
  To: io-uring; +Cc: Jens Axboe, asml.silence

There should be no completions stashed when we first get into
tctx_task_work(), so move the completion flushing check a bit later,
after we have had a chance to execute some task work.
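
Illustratively, the loop goes from checking before any work has run to
checking after it (a simplified sketch, not the exact code):

	while (1) {
		/* old spot: flush check here -- can't fire on the first pass */

		/* ... pick up and run both task work lists ... */
		cond_resched();

		/* new spot: the work above may have stashed completions */
		if (!tctx->task_list.first &&
		    !tctx->prior_task_list.first && uring_locked)
			io_submit_flush_completions(ctx);
	}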

Signed-off-by: Pavel Begunkov <[email protected]>
---
 fs/io_uring.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 1a65d7880440..bb8a1362cb5e 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2492,10 +2492,6 @@ static void tctx_task_work(struct callback_head *cb)
 	while (1) {
 		struct io_wq_work_node *node1, *node2;
 
-		if (!tctx->task_list.first &&
-		    !tctx->prior_task_list.first && uring_locked)
-			io_submit_flush_completions(ctx);
-
 		spin_lock_irq(&tctx->task_lock);
 		node1 = tctx->prior_task_list.first;
 		node2 = tctx->task_list.first;
@@ -2509,10 +2505,13 @@ static void tctx_task_work(struct callback_head *cb)
 
 		if (node1)
 			handle_prev_tw_list(node1, &ctx, &uring_locked);
-
 		if (node2)
 			handle_tw_list(node2, &ctx, &uring_locked);
 		cond_resched();
+
+		if (!tctx->task_list.first &&
+		    !tctx->prior_task_list.first && uring_locked)
+			io_submit_flush_completions(ctx);
 	}
 
 	ctx_flush_and_put(ctx, &uring_locked);
-- 
2.35.1


* [PATCH 2/6] io_uring: remove extra ifs around io_commit_cqring
  2022-03-21 22:02 [PATCH 0/6] completion path optimisations Pavel Begunkov
  2022-03-21 22:02 ` [PATCH 1/6] io_uring: small optimisation of tctx_task_work Pavel Begunkov
@ 2022-03-21 22:02 ` Pavel Begunkov
  2022-03-21 22:02 ` [PATCH 3/6] io_uring: refactor io_req_find_next Pavel Begunkov
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Pavel Begunkov @ 2022-03-21 22:02 UTC (permalink / raw)
  To: io-uring; +Cc: Jens Axboe, asml.silence

Now that io_commit_cqring() is simple and tolerates being called
without a new CQE filled, kill a bunch of guards that are no longer
needed.
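
For context, io_commit_cqring() at this point boils down to publishing
the cached tail; roughly (a simplified sketch, not the verbatim
function):

	static inline void io_commit_cqring(struct io_ring_ctx *ctx)
	{
		/* order CQE stores with the tail update */
		smp_store_release(&ctx->rings->cq.tail, ctx->cached_cq_tail);
	}

With no new CQE filled, ctx->cached_cq_tail is unchanged, so the
release store just rewrites the current tail value and the
unconditional call is harmless.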

Signed-off-by: Pavel Begunkov <[email protected]>
---
 fs/io_uring.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index bb8a1362cb5e..71f3bea34e66 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1947,8 +1947,7 @@ static bool __io_cqring_overflow_flush(struct io_ring_ctx *ctx, bool force)
 			   ctx->rings->sq_flags & ~IORING_SQ_CQ_OVERFLOW);
 	}
 
-	if (posted)
-		io_commit_cqring(ctx);
+	io_commit_cqring(ctx);
 	spin_unlock(&ctx->completion_lock);
 	if (posted)
 		io_cqring_ev_posted(ctx);
@@ -2382,8 +2381,7 @@ static void __io_req_find_next_prep(struct io_kiocb *req)
 
 	spin_lock(&ctx->completion_lock);
 	posted = io_disarm_next(req);
-	if (posted)
-		io_commit_cqring(ctx);
+	io_commit_cqring(ctx);
 	spin_unlock(&ctx->completion_lock);
 	if (posted)
 		io_cqring_ev_posted(ctx);
@@ -10248,8 +10246,7 @@ static __cold bool io_kill_timeouts(struct io_ring_ctx *ctx,
 		}
 	}
 	spin_unlock_irq(&ctx->timeout_lock);
-	if (canceled != 0)
-		io_commit_cqring(ctx);
+	io_commit_cqring(ctx);
 	spin_unlock(&ctx->completion_lock);
 	if (canceled != 0)
 		io_cqring_ev_posted(ctx);
-- 
2.35.1


* [PATCH 3/6] io_uring: refactor io_req_find_next
  2022-03-21 22:02 [PATCH 0/6] completion path optimisations Pavel Begunkov
  2022-03-21 22:02 ` [PATCH 1/6] io_uring: small optimisation of tctx_task_work Pavel Begunkov
  2022-03-21 22:02 ` [PATCH 2/6] io_uring: remove extra ifs around io_commit_cqring Pavel Begunkov
@ 2022-03-21 22:02 ` Pavel Begunkov
  2022-03-21 22:02 ` [PATCH 4/6] io_uring: optimise io_free_batch_list Pavel Begunkov
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Pavel Begunkov @ 2022-03-21 22:02 UTC (permalink / raw)
  To: io-uring; +Cc: Jens Axboe, asml.silence

Move the fast path, i.e. the (REQ_F_LINK|REQ_F_HARDLINK) early-out,
from io_req_find_next() into its callers. This prepares us for further
changes.

Signed-off-by: Pavel Begunkov <[email protected]>
---
 fs/io_uring.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 71f3bea34e66..4539461ee7b3 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2391,8 +2391,6 @@ static inline struct io_kiocb *io_req_find_next(struct io_kiocb *req)
 {
 	struct io_kiocb *nxt;
 
-	if (likely(!(req->flags & (REQ_F_LINK|REQ_F_HARDLINK))))
-		return NULL;
 	/*
 	 * If LINK is set, we have dependent requests in this chain. If we
 	 * didn't fail this request, queue the first one up, moving any other
@@ -2613,10 +2611,12 @@ static void io_req_task_queue_reissue(struct io_kiocb *req)
 
 static inline void io_queue_next(struct io_kiocb *req)
 {
-	struct io_kiocb *nxt = io_req_find_next(req);
+	if (unlikely(req->flags & (REQ_F_LINK|REQ_F_HARDLINK))) {
+		struct io_kiocb *nxt = io_req_find_next(req);
 
-	if (nxt)
-		io_req_task_queue(nxt);
+		if (nxt)
+			io_req_task_queue(nxt);
+	}
 }
 
 static void io_free_req(struct io_kiocb *req)
@@ -2710,7 +2710,8 @@ static inline struct io_kiocb *io_put_req_find_next(struct io_kiocb *req)
 	struct io_kiocb *nxt = NULL;
 
 	if (req_ref_put_and_test(req)) {
-		nxt = io_req_find_next(req);
+		if (unlikely(req->flags & (REQ_F_LINK|REQ_F_HARDLINK)))
+			nxt = io_req_find_next(req);
 		__io_free_req(req);
 	}
 	return nxt;
-- 
2.35.1


* [PATCH 4/6] io_uring: optimise io_free_batch_list
  2022-03-21 22:02 [PATCH 0/6] completion path optimisations Pavel Begunkov
                   ` (2 preceding siblings ...)
  2022-03-21 22:02 ` [PATCH 3/6] io_uring: refactor io_req_find_next Pavel Begunkov
@ 2022-03-21 22:02 ` Pavel Begunkov
  2022-03-21 22:02 ` [PATCH 5/6] io_uring: move poll recycling later in compl flushing Pavel Begunkov
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Pavel Begunkov @ 2022-03-21 22:02 UTC (permalink / raw)
  To: io-uring; +Cc: Jens Axboe, asml.silence

We do several req->flags checks in the fast path of
io_free_batch_list(): one explicit check of REQ_F_REFCOUNT, and two
more hidden in io_queue_next() and io_dismantle_req(). Moreover, there
is an io_req_put_rsrc_locked() call in between, so there is no hope
req->flags will be preserved in registers.

All those flags are, if not slow path, then definitely slower path, so
put them all under a single flags mask check and save several memory
reloads and ifs.
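
The pattern is to fold all the rarely-set bits into one mask so the hot
path pays a single predicted branch; an illustrative sketch with
hypothetical flag and helper names (not the real io_uring ones):

	#define FLAG_REFCOUNT	(1u << 0)
	#define FLAG_LINK	(1u << 1)
	#define FLAG_CLEANUP	(1u << 2)
	#define SLOW_PUT_FLAGS	(FLAG_REFCOUNT | FLAG_LINK | FLAG_CLEANUP)

	static void slow_ref_put(void);
	static void slow_queue_link(void);
	static void slow_cleanup(void);

	static void put_req(unsigned int flags)
	{
		/* one test covers every rare case; flags is read once */
		if (flags & SLOW_PUT_FLAGS) {
			if (flags & FLAG_REFCOUNT)
				slow_ref_put();
			if (flags & FLAG_LINK)
				slow_queue_link();
			if (flags & FLAG_CLEANUP)
				slow_cleanup();
		}
		/* common case falls through with a single not-taken branch */
	}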

Signed-off-by: Pavel Begunkov <[email protected]>
---
 fs/io_uring.c | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 4539461ee7b3..36bcfcc23193 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -116,6 +116,9 @@
 				REQ_F_POLLED | REQ_F_INFLIGHT | REQ_F_CREDS | \
 				REQ_F_ASYNC_DATA)
 
+#define IO_REQ_CLEAN_SLOW_FLAGS (REQ_F_REFCOUNT | REQ_F_LINK | REQ_F_HARDLINK |\
+				 IO_REQ_CLEAN_FLAGS)
+
 #define IO_TCTX_REFS_CACHE_NR	(1U << 10)
 
 struct io_uring {
@@ -2641,15 +2644,20 @@ static void io_free_batch_list(struct io_ring_ctx *ctx,
 		struct io_kiocb *req = container_of(node, struct io_kiocb,
 						    comp_list);
 
-		if (unlikely(req->flags & REQ_F_REFCOUNT)) {
-			node = req->comp_list.next;
-			if (!req_ref_put_and_test(req))
-				continue;
+		if (unlikely(req->flags & IO_REQ_CLEAN_SLOW_FLAGS)) {
+			if (req->flags & REQ_F_REFCOUNT) {
+				node = req->comp_list.next;
+				if (!req_ref_put_and_test(req))
+					continue;
+			}
+			io_queue_next(req);
+			if (unlikely(req->flags & IO_REQ_CLEAN_FLAGS))
+				io_clean_op(req);
 		}
+		if (!(req->flags & REQ_F_FIXED_FILE))
+			io_put_file(req->file);
 
 		io_req_put_rsrc_locked(req, ctx);
-		io_queue_next(req);
-		io_dismantle_req(req);
 
 		if (req->task != task) {
 			if (task)
-- 
2.35.1


* [PATCH 5/6] io_uring: move poll recycling later in compl flushing
  2022-03-21 22:02 [PATCH 0/6] completion path optimisations Pavel Begunkov
                   ` (3 preceding siblings ...)
  2022-03-21 22:02 ` [PATCH 4/6] io_uring: optimise io_free_batch_list Pavel Begunkov
@ 2022-03-21 22:02 ` Pavel Begunkov
  2022-03-21 22:02 ` [PATCH 6/6] io_uring: clean up io_queue_next() Pavel Begunkov
  2022-03-22  2:52 ` [PATCH 0/6] completion path optimisations Jens Axboe
  6 siblings, 0 replies; 8+ messages in thread
From: Pavel Begunkov @ 2022-03-21 22:02 UTC (permalink / raw)
  To: io-uring; +Cc: Jens Axboe, asml.silence

There is a new (req->flags & REQ_F_POLLED) check in
__io_submit_flush_completions() for poll recycling, but
io_free_batch_list() is a much better place for it. First, we prefer it
after putting the last req ref, just to avoid potential problems in the
future. It also enables the recycling for IOPOLL and places it closer
to all the other req->flags bit clean-ups.
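
For reference, the cached entries are picked back up when the next poll
request is armed; the allocation side looks roughly like the following
(a simplified sketch of the consumer, details may differ):

	if (!list_empty(&ctx->apoll_cache)) {
		apoll = list_first_entry(&ctx->apoll_cache,
					 struct async_poll, poll.wait.entry);
		list_del_init(&apoll->poll.wait.entry);
	} else {
		apoll = kmalloc(sizeof(*apoll), GFP_ATOMIC);
	}

Doing the recycling only after the final ref is put also means nothing
can still be poking at req->apoll by the time the entry is handed back
to the cache.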

Signed-off-by: Pavel Begunkov <[email protected]>
---
 fs/io_uring.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 36bcfcc23193..79294a7455d6 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2650,6 +2650,15 @@ static void io_free_batch_list(struct io_ring_ctx *ctx,
 				if (!req_ref_put_and_test(req))
 					continue;
 			}
+			if ((req->flags & REQ_F_POLLED) && req->apoll) {
+				struct async_poll *apoll = req->apoll;
+
+				if (apoll->double_poll)
+					kfree(apoll->double_poll);
+				list_add(&apoll->poll.wait.entry,
+						&ctx->apoll_cache);
+				req->flags &= ~REQ_F_POLLED;
+			}
 			io_queue_next(req);
 			if (unlikely(req->flags & IO_REQ_CLEAN_FLAGS))
 				io_clean_op(req);
@@ -2688,15 +2697,6 @@ static void __io_submit_flush_completions(struct io_ring_ctx *ctx)
 
 			if (!(req->flags & REQ_F_CQE_SKIP))
 				__io_fill_cqe_req(req, req->result, req->cflags);
-			if ((req->flags & REQ_F_POLLED) && req->apoll) {
-				struct async_poll *apoll = req->apoll;
-
-				if (apoll->double_poll)
-					kfree(apoll->double_poll);
-				list_add(&apoll->poll.wait.entry,
-						&ctx->apoll_cache);
-				req->flags &= ~REQ_F_POLLED;
-			}
 		}
 
 		io_commit_cqring(ctx);
-- 
2.35.1


* [PATCH 6/6] io_uring: clean up io_queue_next()
  2022-03-21 22:02 [PATCH 0/6] completion path optimisations Pavel Begunkov
                   ` (4 preceding siblings ...)
  2022-03-21 22:02 ` [PATCH 5/6] io_uring: move poll recycling later in compl flushing Pavel Begunkov
@ 2022-03-21 22:02 ` Pavel Begunkov
  2022-03-22  2:52 ` [PATCH 0/6] completion path optimisations Jens Axboe
  6 siblings, 0 replies; 8+ messages in thread
From: Pavel Begunkov @ 2022-03-21 22:02 UTC (permalink / raw)
  To: io-uring; +Cc: Jens Axboe, asml.silence

Move the fast-path check out of io_queue_next(): it makes the
req->flags checks in io_free_batch_list() a bit clearer and gives us
better control, e.g. we can now drop the no-longer-justified unlikely()
there. Also, we don't care about having this check in io_free_req(), as
that function is a slow path and io_req_find_next() handles it
correctly.
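
Putting the series together, the slow branch of io_free_batch_list()
ends up shaped roughly as below (abbreviated, assembled from the hunks
in patches 4-6):

	if (unlikely(req->flags & IO_REQ_CLEAN_SLOW_FLAGS)) {
		if (req->flags & REQ_F_REFCOUNT) {
			node = req->comp_list.next;
			if (!req_ref_put_and_test(req))
				continue;
		}
		if ((req->flags & REQ_F_POLLED) && req->apoll) {
			/* recycle the async poll entry, see 5/6 */
		}
		if (req->flags & (REQ_F_LINK|REQ_F_HARDLINK))
			io_queue_next(req);
		if (unlikely(req->flags & IO_REQ_CLEAN_FLAGS))
			io_clean_op(req);
	}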

Signed-off-by: Pavel Begunkov <[email protected]>
---
 fs/io_uring.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 79294a7455d6..9baa120a96f9 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2612,14 +2612,12 @@ static void io_req_task_queue_reissue(struct io_kiocb *req)
 	io_req_task_work_add(req, false);
 }
 
-static inline void io_queue_next(struct io_kiocb *req)
+static void io_queue_next(struct io_kiocb *req)
 {
-	if (unlikely(req->flags & (REQ_F_LINK|REQ_F_HARDLINK))) {
-		struct io_kiocb *nxt = io_req_find_next(req);
+	struct io_kiocb *nxt = io_req_find_next(req);
 
-		if (nxt)
-			io_req_task_queue(nxt);
-	}
+	if (nxt)
+		io_req_task_queue(nxt);
 }
 
 static void io_free_req(struct io_kiocb *req)
@@ -2659,7 +2657,8 @@ static void io_free_batch_list(struct io_ring_ctx *ctx,
 						&ctx->apoll_cache);
 				req->flags &= ~REQ_F_POLLED;
 			}
-			io_queue_next(req);
+			if (req->flags & (REQ_F_LINK|REQ_F_HARDLINK))
+				io_queue_next(req);
 			if (unlikely(req->flags & IO_REQ_CLEAN_FLAGS))
 				io_clean_op(req);
 		}
-- 
2.35.1


* Re: [PATCH 0/6] completion path optimisations
  2022-03-21 22:02 [PATCH 0/6] completion path optimisations Pavel Begunkov
                   ` (5 preceding siblings ...)
  2022-03-21 22:02 ` [PATCH 6/6] io_uring: clean up io_queue_next() Pavel Begunkov
@ 2022-03-22  2:52 ` Jens Axboe
  6 siblings, 0 replies; 8+ messages in thread
From: Jens Axboe @ 2022-03-22  2:52 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring

On Mon, 21 Mar 2022 22:02:18 +0000, Pavel Begunkov wrote:
> Small optimisations and clean-ups. 5/6 in particular removes some overhead
> from io_free_batch_list(), which is hot enough to care about.
> 
> Pavel Begunkov (6):
>   io_uring: small optimisation of tctx_task_work
>   io_uring: remove extra ifs around io_commit_cqring
>   io_uring: refactor io_req_find_next
>   io_uring: optimise io_free_batch_list
>   io_uring: move poll recycling later in compl flushing
>   io_uring: clean up io_queue_next()
> 
> [...]

Applied, thanks!

[1/6] io_uring: small optimisation of tctx_task_work
      commit: 15071bf78bd3ee0253a3b57c2e092980dbd91a87
[2/6] io_uring: remove extra ifs around io_commit_cqring
      commit: c9be622494c012d56c71e00cb90be841820c3e34
[3/6] io_uring: refactor io_req_find_next
      commit: 096b50e6c27e3e983871a585116f2ae49a394196
[4/6] io_uring: optimise io_free_batch_list
      commit: 0eab1c46d7db864c6cb4405a4d9e6e75d541c97c
[5/6] io_uring: move poll recycling later in compl flushing
      commit: 2b095f5b2dfc49b79fdd68b03eec02ba278c7ea1
[6/6] io_uring: clean up io_queue_next()
      commit: b0b2d42c8384eb6068e751c4943452a756f433f1

Best regards,
-- 
Jens Axboe


