* [PATCH 0/5] another round of small tinkering
@ 2021-02-12  3:23 Pavel Begunkov
  2021-02-12  3:23 ` [PATCH 1/5] io_uring: take compl state from submit state Pavel Begunkov
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: Pavel Begunkov @ 2021-02-12  3:23 UTC (permalink / raw)
  To: Jens Axboe, io-uring

Some other small improvements with a negative diffstat.
Patches 2-3 address the acquire_mm_files overhead.

Pavel Begunkov (5):
  io_uring: take compl state from submit state
  io_uring: optimise out unlikely link queue
  io_uring: optimise SQPOLL mm/files grabbing
  io_uring: don't duplicate io_req_task_queue()
  io_uring: save ctx put/get for task_work submit

 fs/io_uring.c | 103 +++++++++++++++++++-------------------------------
 1 file changed, 39 insertions(+), 64 deletions(-)

-- 
2.24.0



* [PATCH 1/5] io_uring: take compl state from submit state
  2021-02-12  3:23 [PATCH 0/5] another round of small tinkering Pavel Begunkov
@ 2021-02-12  3:23 ` Pavel Begunkov
  2021-02-12  3:23 ` [PATCH 2/5] io_uring: optimise out unlikely link queue Pavel Begunkov
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Pavel Begunkov @ 2021-02-12  3:23 UTC (permalink / raw)
  To: Jens Axboe, io-uring

Completion and submission states are now coupled together, so it's
inconsistent to take one from an argument and the other from the ctx.
Do it consistently for io_req_free_batch(). It's also faster, as @state
is already cached in registers.
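
For illustration, a simplified sketch of the layout this relies on (not
the full definitions, only the fields relevant here):

	struct io_submit_state {
		/* ... */
		struct io_comp_state comp;
	};

	struct io_ring_ctx {
		/* ... */
		struct io_submit_state submit_state;
	};

With @state == &req->ctx->submit_state, &state->comp.free_list names the
same list as &req->ctx->submit_state.comp.free_list, just without
re-deriving it through req->ctx.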

Signed-off-by: Pavel Begunkov <[email protected]>
---
 fs/io_uring.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 7b1979624320..8c2613bf54d3 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2417,13 +2417,10 @@ static void io_req_free_batch(struct req_batch *rb, struct io_kiocb *req,
 	rb->ctx_refs++;
 
 	io_dismantle_req(req);
-	if (state->free_reqs != ARRAY_SIZE(state->reqs)) {
+	if (state->free_reqs != ARRAY_SIZE(state->reqs))
 		state->reqs[state->free_reqs++] = req;
-	} else {
-		struct io_comp_state *cs = &req->ctx->submit_state.comp;
-
-		list_add(&req->compl.list, &cs->free_list);
-	}
+	else
+		list_add(&req->compl.list, &state->comp.free_list);
 }
 
 static void io_submit_flush_completions(struct io_comp_state *cs,
-- 
2.24.0



* [PATCH 2/5] io_uring: optimise out unlikely link queue
  2021-02-12  3:23 [PATCH 0/5] another round of small tinkering Pavel Begunkov
  2021-02-12  3:23 ` [PATCH 1/5] io_uring: take compl state from submit state Pavel Begunkov
@ 2021-02-12  3:23 ` Pavel Begunkov
  2021-02-12  3:23 ` [PATCH 3/5] io_uring: optimise SQPOLL mm/files grabbing Pavel Begunkov
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Pavel Begunkov @ 2021-02-12  3:23 UTC (permalink / raw)
  To: Jens Axboe, io-uring

__io_queue_sqe() tries to issue as many requests of a link as it can,
using io_put_req_find_next() to extract the next one, targeting
inline-completed requests. Now that __io_queue_sqe() is always used
together with struct io_comp_state, next propagation is left only a
small window, and only for async reqs, which doesn't justify its
existence.

Remove it and make __io_queue_sqe() issue only the head request. That
simplifies the code and will allow further optimisations.
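
In other words, the old shape was roughly (trimmed down, see the removed
lines below):

	again:
		linked_timeout = io_prep_linked_timeout(req);
		ret = io_issue_sqe(req, IO_URING_F_NONBLOCK|IO_URING_F_COMPLETE_DEFER);
		/* ... */
		req = io_put_req_find_next(req);
		if (req && !(req->flags & REQ_F_FORCE_ASYNC))
			goto again;

while now __io_queue_sqe() handles a single request and the rest of the
link is driven from the completion side.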

Signed-off-by: Pavel Begunkov <[email protected]>
---
 fs/io_uring.c | 42 ++++++++++--------------------------------
 1 file changed, 10 insertions(+), 32 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 8c2613bf54d3..26d1080217e5 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -6563,26 +6563,20 @@ static struct io_kiocb *io_prep_linked_timeout(struct io_kiocb *req)
 
 static void __io_queue_sqe(struct io_kiocb *req)
 {
-	struct io_kiocb *linked_timeout;
+	struct io_kiocb *linked_timeout = io_prep_linked_timeout(req);
 	const struct cred *old_creds = NULL;
 	int ret;
 
-again:
-	linked_timeout = io_prep_linked_timeout(req);
-
 	if ((req->flags & REQ_F_WORK_INITIALIZED) &&
 	    (req->work.flags & IO_WQ_WORK_CREDS) &&
-	    req->work.identity->creds != current_cred()) {
-		if (old_creds)
-			revert_creds(old_creds);
-		if (old_creds == req->work.identity->creds)
-			old_creds = NULL; /* restored original creds */
-		else
-			old_creds = override_creds(req->work.identity->creds);
-	}
+	    req->work.identity->creds != current_cred())
+		old_creds = override_creds(req->work.identity->creds);
 
 	ret = io_issue_sqe(req, IO_URING_F_NONBLOCK|IO_URING_F_COMPLETE_DEFER);
 
+	if (old_creds)
+		revert_creds(old_creds);
+
 	/*
 	 * We async punt it if the file wasn't marked NOWAIT, or if the file
 	 * doesn't support non-blocking read/write attempts
@@ -6595,9 +6589,6 @@ static void __io_queue_sqe(struct io_kiocb *req)
 			 */
 			io_queue_async_work(req);
 		}
-
-		if (linked_timeout)
-			io_queue_linked_timeout(linked_timeout);
 	} else if (likely(!ret)) {
 		/* drop submission reference */
 		if (req->flags & REQ_F_COMPLETE_INLINE) {
@@ -6605,31 +6596,18 @@ static void __io_queue_sqe(struct io_kiocb *req)
 			struct io_comp_state *cs = &ctx->submit_state.comp;
 
 			cs->reqs[cs->nr++] = req;
-			if (cs->nr == IO_COMPL_BATCH)
+			if (cs->nr == ARRAY_SIZE(cs->reqs))
 				io_submit_flush_completions(cs, ctx);
-			req = NULL;
 		} else {
-			req = io_put_req_find_next(req);
-		}
-
-		if (linked_timeout)
-			io_queue_linked_timeout(linked_timeout);
-
-		if (req) {
-			if (!(req->flags & REQ_F_FORCE_ASYNC))
-				goto again;
-			io_queue_async_work(req);
+			io_put_req(req);
 		}
 	} else {
-		/* un-prep timeout, so it'll be killed as any other linked */
-		req->flags &= ~REQ_F_LINK_TIMEOUT;
 		req_set_fail_links(req);
 		io_put_req(req);
 		io_req_complete(req, ret);
 	}
-
-	if (old_creds)
-		revert_creds(old_creds);
+	if (linked_timeout)
+		io_queue_linked_timeout(linked_timeout);
 }
 
 static void io_queue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe)
-- 
2.24.0



* [PATCH 3/5] io_uring: optimise SQPOLL mm/files grabbing
  2021-02-12  3:23 [PATCH 0/5] another round of small tinkering Pavel Begunkov
  2021-02-12  3:23 ` [PATCH 1/5] io_uring: take compl state from submit state Pavel Begunkov
  2021-02-12  3:23 ` [PATCH 2/5] io_uring: optimise out unlikely link queue Pavel Begunkov
@ 2021-02-12  3:23 ` Pavel Begunkov
  2021-02-12  3:23 ` [PATCH 4/5] io_uring: don't duplicate io_req_task_queue() Pavel Begunkov
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Pavel Begunkov @ 2021-02-12  3:23 UTC (permalink / raw)
  To: Jens Axboe, io-uring

There are two reasons for this. The first is to optimise
io_sq_thread_acquire_mm_files() for the non-SQPOLL case, which currently
does too many checks and function calls in the hot path, e.g. in
io_init_req().

The second is to not grab mm/files when they are not needed. As
__io_queue_sqe() now issues only one request at a time, we can reuse
io_sq_thread_acquire_mm_files() instead of unconditionally acquiring
mm/files.
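
The hot path then boils down to an inline helper along the lines of (as
added below):

	static inline int io_sq_thread_acquire_mm_files(struct io_ring_ctx *ctx,
							struct io_kiocb *req)
	{
		if (unlikely(current->flags & PF_EXITING))
			return -EFAULT;
		if (!(ctx->flags & IORING_SETUP_SQPOLL))
			return 0;
		return __io_sq_thread_acquire_mm_files(ctx, req);
	}

i.e. a couple of flag tests for the common non-SQPOLL case, with the
actual mm/files grabbing kept out of line for SQPOLL only.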

Signed-off-by: Pavel Begunkov <[email protected]>
---
 fs/io_uring.c | 27 +++++++++++++--------------
 1 file changed, 13 insertions(+), 14 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 26d1080217e5..813d1ccd7a69 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1145,9 +1145,6 @@ static void io_sq_thread_drop_mm_files(void)
 
 static int __io_sq_thread_acquire_files(struct io_ring_ctx *ctx)
 {
-	if (current->flags & PF_EXITING)
-		return -EFAULT;
-
 	if (!current->files) {
 		struct files_struct *files;
 		struct nsproxy *nsproxy;
@@ -1175,15 +1172,9 @@ static int __io_sq_thread_acquire_mm(struct io_ring_ctx *ctx)
 {
 	struct mm_struct *mm;
 
-	if (current->flags & PF_EXITING)
-		return -EFAULT;
 	if (current->mm)
 		return 0;
 
-	/* Should never happen */
-	if (unlikely(!(ctx->flags & IORING_SETUP_SQPOLL)))
-		return -EFAULT;
-
 	task_lock(ctx->sqo_task);
 	mm = ctx->sqo_task->mm;
 	if (unlikely(!mm || !mmget_not_zero(mm)))
@@ -1198,8 +1189,8 @@ static int __io_sq_thread_acquire_mm(struct io_ring_ctx *ctx)
 	return -EFAULT;
 }
 
-static int io_sq_thread_acquire_mm_files(struct io_ring_ctx *ctx,
-					 struct io_kiocb *req)
+static int __io_sq_thread_acquire_mm_files(struct io_ring_ctx *ctx,
+					   struct io_kiocb *req)
 {
 	const struct io_op_def *def = &io_op_defs[req->opcode];
 	int ret;
@@ -1219,6 +1210,16 @@ static int io_sq_thread_acquire_mm_files(struct io_ring_ctx *ctx,
 	return 0;
 }
 
+static inline int io_sq_thread_acquire_mm_files(struct io_ring_ctx *ctx,
+						struct io_kiocb *req)
+{
+	if (unlikely(current->flags & PF_EXITING))
+		return -EFAULT;
+	if (!(ctx->flags & IORING_SETUP_SQPOLL))
+		return 0;
+	return __io_sq_thread_acquire_mm_files(ctx, req);
+}
+
 static void io_sq_thread_associate_blkcg(struct io_ring_ctx *ctx,
 					 struct cgroup_subsys_state **cur_css)
 
@@ -2336,9 +2337,7 @@ static void __io_req_task_submit(struct io_kiocb *req)
 	struct io_ring_ctx *ctx = req->ctx;
 
 	mutex_lock(&ctx->uring_lock);
-	if (!ctx->sqo_dead &&
-	    !__io_sq_thread_acquire_mm(ctx) &&
-	    !__io_sq_thread_acquire_files(ctx))
+	if (!ctx->sqo_dead && !io_sq_thread_acquire_mm_files(ctx, req))
 		__io_queue_sqe(req);
 	else
 		__io_req_task_cancel(req, -EFAULT);
-- 
2.24.0



* [PATCH 4/5] io_uring: don't duplicate io_req_task_queue()
  2021-02-12  3:23 [PATCH 0/5] another round of small tinkering Pavel Begunkov
                   ` (2 preceding siblings ...)
  2021-02-12  3:23 ` [PATCH 3/5] io_uring: optimise SQPOLL mm/files grabbing Pavel Begunkov
@ 2021-02-12  3:23 ` Pavel Begunkov
  2021-02-12  3:23 ` [PATCH 5/5] io_uring: save ctx put/get for task_work submit Pavel Begunkov
  2021-02-12 12:44 ` [PATCH 0/5] another round of small tinkering Jens Axboe
  5 siblings, 0 replies; 7+ messages in thread
From: Pavel Begunkov @ 2021-02-12  3:23 UTC (permalink / raw)
  To: Jens Axboe, io-uring

Don't hand-code io_req_task_queue() inside io_async_buf_func(); just
call it.

Signed-off-by: Pavel Begunkov <[email protected]>
---
 fs/io_uring.c | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 813d1ccd7a69..5c0b1a7dba80 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -3494,7 +3494,6 @@ static int io_async_buf_func(struct wait_queue_entry *wait, unsigned mode,
 	struct wait_page_queue *wpq;
 	struct io_kiocb *req = wait->private;
 	struct wait_page_key *key = arg;
-	int ret;
 
 	wpq = container_of(wait, struct wait_page_queue, wait);
 
@@ -3504,14 +3503,9 @@ static int io_async_buf_func(struct wait_queue_entry *wait, unsigned mode,
 	req->rw.kiocb.ki_flags &= ~IOCB_WAITQ;
 	list_del_init(&wait->entry);
 
-	req->task_work.func = io_req_task_submit;
-	percpu_ref_get(&req->ctx->refs);
-
 	/* submit ref gets dropped, acquire a new one */
 	refcount_inc(&req->refs);
-	ret = io_req_task_work_add(req);
-	if (unlikely(ret))
-		io_req_task_work_add_fallback(req, io_req_task_cancel);
+	io_req_task_queue(req);
 	return 1;
 }
 
-- 
2.24.0



* [PATCH 5/5] io_uring: save ctx put/get for task_work submit
  2021-02-12  3:23 [PATCH 0/5] another round of small tinkering Pavel Begunkov
                   ` (3 preceding siblings ...)
  2021-02-12  3:23 ` [PATCH 4/5] io_uring: don't duplicate io_req_task_queue() Pavel Begunkov
@ 2021-02-12  3:23 ` Pavel Begunkov
  2021-02-12 12:44 ` [PATCH 0/5] another round of small tinkering Jens Axboe
  5 siblings, 0 replies; 7+ messages in thread
From: Pavel Begunkov @ 2021-02-12  3:23 UTC (permalink / raw)
  To: Jens Axboe, io-uring

Do a little trick in io_ring_ctx_free(): briefly take and release
uring_lock, which waits for everyone currently holding it to finish.
That lets us skip pinning the ctx with ctx->refs for
__io_req_task_submit(), which runs and drops its refs/reqs while
holding the lock.
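
The trick itself is just a paired lock/unlock acting as a barrier (as in
the hunk below):

	/* wait out everyone who may still be under uring_lock */
	mutex_lock(&ctx->uring_lock);
	mutex_unlock(&ctx->uring_lock);

Anything still inside __io_req_task_submit() does its ctx work under
uring_lock, so once the mutex has been taken and dropped here there can
be no such user left, and freeing the ctx is safe even though they never
took extra ctx->refs.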

Signed-off-by: Pavel Begunkov <[email protected]>
---
 fs/io_uring.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 5c0b1a7dba80..87f2f8e660e8 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2336,6 +2336,7 @@ static void __io_req_task_submit(struct io_kiocb *req)
 {
 	struct io_ring_ctx *ctx = req->ctx;
 
+	/* ctx stays valid until unlock, even if we drop all ours ctx->refs */
 	mutex_lock(&ctx->uring_lock);
 	if (!ctx->sqo_dead && !io_sq_thread_acquire_mm_files(ctx, req))
 		__io_queue_sqe(req);
@@ -2347,10 +2348,8 @@ static void __io_req_task_submit(struct io_kiocb *req)
 static void io_req_task_submit(struct callback_head *cb)
 {
 	struct io_kiocb *req = container_of(cb, struct io_kiocb, task_work);
-	struct io_ring_ctx *ctx = req->ctx;
 
 	__io_req_task_submit(req);
-	percpu_ref_put(&ctx->refs);
 }
 
 static void io_req_task_queue(struct io_kiocb *req)
@@ -2358,11 +2357,11 @@ static void io_req_task_queue(struct io_kiocb *req)
 	int ret;
 
 	req->task_work.func = io_req_task_submit;
-	percpu_ref_get(&req->ctx->refs);
-
 	ret = io_req_task_work_add(req);
-	if (unlikely(ret))
+	if (unlikely(ret)) {
+		percpu_ref_get(&req->ctx->refs);
 		io_req_task_work_add_fallback(req, io_req_task_cancel);
+	}
 }
 
 static inline void io_queue_next(struct io_kiocb *req)
@@ -8707,6 +8706,14 @@ static void io_ring_ctx_free(struct io_ring_ctx *ctx)
 {
 	struct io_submit_state *submit_state = &ctx->submit_state;
 
+	/*
+	 * Some may use context even when all refs and requests have been put,
+	 * and they are free to do so while still holding uring_lock, see
+	 * __io_req_task_submit(). Wait for them to finish.
+	 */
+	mutex_lock(&ctx->uring_lock);
+	mutex_unlock(&ctx->uring_lock);
+
 	io_finish_async(ctx);
 	io_sqe_buffers_unregister(ctx);
 
-- 
2.24.0



* Re: [PATCH 0/5] another round of small tinkering
  2021-02-12  3:23 [PATCH 0/5] another round of small tinkering Pavel Begunkov
                   ` (4 preceding siblings ...)
  2021-02-12  3:23 ` [PATCH 5/5] io_uring: save ctx put/get for task_work submit Pavel Begunkov
@ 2021-02-12 12:44 ` Jens Axboe
  5 siblings, 0 replies; 7+ messages in thread
From: Jens Axboe @ 2021-02-12 12:44 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring

On 2/11/21 8:23 PM, Pavel Begunkov wrote:
> Some other small improvements with a negative diffstat.
> Patches 2-3 address the acquire_mm_files overhead.
> 
> Pavel Begunkov (5):
>   io_uring: take compl state from submit state
>   io_uring: optimise out unlikely link queue
>   io_uring: optimise SQPOLL mm/files grabbing
>   io_uring: don't duplicate io_req_task_queue()
>   io_uring: save ctx put/get for task_work submit
> 
>  fs/io_uring.c | 103 +++++++++++++++++++-------------------------------
>  1 file changed, 39 insertions(+), 64 deletions(-)

LGTM, applied, thanks.

-- 
Jens Axboe


