* [PATCH 1/5] io_uring: take compl state from submit state
2021-02-12 3:23 [PATCH 0/5] another round of small tinkering Pavel Begunkov
@ 2021-02-12 3:23 ` Pavel Begunkov
2021-02-12 3:23 ` [PATCH 2/5] io_uring: optimise out unlikely link queue Pavel Begunkov
` (4 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Pavel Begunkov @ 2021-02-12 3:23 UTC (permalink / raw)
To: Jens Axboe, io-uring
Completion and submission states are now coupled together, so it's odd to
get one from a function argument and the other from ctx. Do it consistently
for io_req_free_batch(). It's also faster, as we already have @state cached
in registers.
Signed-off-by: Pavel Begunkov <[email protected]>
---
fs/io_uring.c | 9 +++------
1 file changed, 3 insertions(+), 6 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 7b1979624320..8c2613bf54d3 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2417,13 +2417,10 @@ static void io_req_free_batch(struct req_batch *rb, struct io_kiocb *req,
rb->ctx_refs++;
io_dismantle_req(req);
- if (state->free_reqs != ARRAY_SIZE(state->reqs)) {
+ if (state->free_reqs != ARRAY_SIZE(state->reqs))
state->reqs[state->free_reqs++] = req;
- } else {
- struct io_comp_state *cs = &req->ctx->submit_state.comp;
-
- list_add(&req->compl.list, &cs->free_list);
- }
+ else
+ list_add(&req->compl.list, &state->comp.free_list);
}
static void io_submit_flush_completions(struct io_comp_state *cs,
--
2.24.0
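
For readers who prefer the result over the diff, the tail of io_req_free_batch()
after this patch reads roughly as below. The @state parameter and the elided
earlier statements are inferred from the hunk context, not spelled out in the
diff itself.

static void io_req_free_batch(struct req_batch *rb, struct io_kiocb *req,
                              struct io_submit_state *state)
{
        /* ... earlier batching bookkeeping elided ... */
        rb->ctx_refs++;

        io_dismantle_req(req);
        /* recycle the request into the submit-side cache if there is room */
        if (state->free_reqs != ARRAY_SIZE(state->reqs))
                state->reqs[state->free_reqs++] = req;
        else
                /* otherwise park it on the completion-side free list */
                list_add(&req->compl.list, &state->comp.free_list);
}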
* [PATCH 2/5] io_uring: optimise out unlikely link queue
2021-02-12 3:23 [PATCH 0/5] another round of small tinkering Pavel Begunkov
2021-02-12 3:23 ` [PATCH 1/5] io_uring: take compl state from submit state Pavel Begunkov
@ 2021-02-12 3:23 ` Pavel Begunkov
2021-02-12 3:23 ` [PATCH 3/5] io_uring: optimise SQPOLL mm/files grabbing Pavel Begunkov
` (3 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Pavel Begunkov @ 2021-02-12 3:23 UTC (permalink / raw)
To: Jens Axboe, io-uring
__io_queue_sqe() tries to issue as many requests of a link as it can, and
uses io_put_req_find_next() to extract the next one, targeting inline
completed requests. Now that __io_queue_sqe() is always used together with
struct io_comp_state, next propagation is left only a small window, and
only for async reqs, which doesn't justify its existence.

Remove it and make __io_queue_sqe() issue only the head request. This
simplifies the code and will allow further optimisations.
Signed-off-by: Pavel Begunkov <[email protected]>
---
fs/io_uring.c | 42 ++++++++++--------------------------------
1 file changed, 10 insertions(+), 32 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 8c2613bf54d3..26d1080217e5 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -6563,26 +6563,20 @@ static struct io_kiocb *io_prep_linked_timeout(struct io_kiocb *req)
static void __io_queue_sqe(struct io_kiocb *req)
{
- struct io_kiocb *linked_timeout;
+ struct io_kiocb *linked_timeout = io_prep_linked_timeout(req);
const struct cred *old_creds = NULL;
int ret;
-again:
- linked_timeout = io_prep_linked_timeout(req);
-
if ((req->flags & REQ_F_WORK_INITIALIZED) &&
(req->work.flags & IO_WQ_WORK_CREDS) &&
- req->work.identity->creds != current_cred()) {
- if (old_creds)
- revert_creds(old_creds);
- if (old_creds == req->work.identity->creds)
- old_creds = NULL; /* restored original creds */
- else
- old_creds = override_creds(req->work.identity->creds);
- }
+ req->work.identity->creds != current_cred())
+ old_creds = override_creds(req->work.identity->creds);
ret = io_issue_sqe(req, IO_URING_F_NONBLOCK|IO_URING_F_COMPLETE_DEFER);
+ if (old_creds)
+ revert_creds(old_creds);
+
/*
* We async punt it if the file wasn't marked NOWAIT, or if the file
* doesn't support non-blocking read/write attempts
@@ -6595,9 +6589,6 @@ static void __io_queue_sqe(struct io_kiocb *req)
*/
io_queue_async_work(req);
}
-
- if (linked_timeout)
- io_queue_linked_timeout(linked_timeout);
} else if (likely(!ret)) {
/* drop submission reference */
if (req->flags & REQ_F_COMPLETE_INLINE) {
@@ -6605,31 +6596,18 @@ static void __io_queue_sqe(struct io_kiocb *req)
struct io_comp_state *cs = &ctx->submit_state.comp;
cs->reqs[cs->nr++] = req;
- if (cs->nr == IO_COMPL_BATCH)
+ if (cs->nr == ARRAY_SIZE(cs->reqs))
io_submit_flush_completions(cs, ctx);
- req = NULL;
} else {
- req = io_put_req_find_next(req);
- }
-
- if (linked_timeout)
- io_queue_linked_timeout(linked_timeout);
-
- if (req) {
- if (!(req->flags & REQ_F_FORCE_ASYNC))
- goto again;
- io_queue_async_work(req);
+ io_put_req(req);
}
} else {
- /* un-prep timeout, so it'll be killed as any other linked */
- req->flags &= ~REQ_F_LINK_TIMEOUT;
req_set_fail_links(req);
io_put_req(req);
io_req_complete(req, ret);
}
-
- if (old_creds)
- revert_creds(old_creds);
+ if (linked_timeout)
+ io_queue_linked_timeout(linked_timeout);
}
static void io_queue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe)
--
2.24.0
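
Pieced together from the hunks above, __io_queue_sqe() now looks roughly as
follows. The -EAGAIN branch between the hunks is unchanged by this patch and
is abbreviated here; its exact condition is paraphrased from the surrounding
comment rather than taken from the diff.

static void __io_queue_sqe(struct io_kiocb *req)
{
        struct io_kiocb *linked_timeout = io_prep_linked_timeout(req);
        const struct cred *old_creds = NULL;
        int ret;

        /* creds are overridden once; no re-check loop any more */
        if ((req->flags & REQ_F_WORK_INITIALIZED) &&
            (req->work.flags & IO_WQ_WORK_CREDS) &&
            req->work.identity->creds != current_cred())
                old_creds = override_creds(req->work.identity->creds);

        ret = io_issue_sqe(req, IO_URING_F_NONBLOCK|IO_URING_F_COMPLETE_DEFER);

        if (old_creds)
                revert_creds(old_creds);

        if (ret == -EAGAIN && !(req->flags & REQ_F_NOWAIT)) {
                /* unchanged: arm poll or punt via io_queue_async_work() */
        } else if (likely(!ret)) {
                /* drop submission reference */
                if (req->flags & REQ_F_COMPLETE_INLINE) {
                        struct io_ring_ctx *ctx = req->ctx;
                        struct io_comp_state *cs = &ctx->submit_state.comp;

                        cs->reqs[cs->nr++] = req;
                        if (cs->nr == ARRAY_SIZE(cs->reqs))
                                io_submit_flush_completions(cs, ctx);
                } else {
                        io_put_req(req);
                }
        } else {
                req_set_fail_links(req);
                io_put_req(req);
                io_req_complete(req, ret);
        }
        /* a linked timeout, if any, is now queued exactly once, at the end */
        if (linked_timeout)
                io_queue_linked_timeout(linked_timeout);
}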
* [PATCH 3/5] io_uring: optimise SQPOLL mm/files grabbing
2021-02-12 3:23 [PATCH 0/5] another round of small tinkering Pavel Begunkov
2021-02-12 3:23 ` [PATCH 1/5] io_uring: take compl state from submit state Pavel Begunkov
2021-02-12 3:23 ` [PATCH 2/5] io_uring: optimise out unlikely link queue Pavel Begunkov
@ 2021-02-12 3:23 ` Pavel Begunkov
2021-02-12 3:23 ` [PATCH 4/5] io_uring: don't duplicate io_req_task_queue() Pavel Begunkov
` (2 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Pavel Begunkov @ 2021-02-12 3:23 UTC (permalink / raw)
To: Jens Axboe, io-uring
There are two reasons for this. The first is to optimise
io_sq_thread_acquire_mm_files() for the non-SQPOLL case, which currently
does too many checks and function calls in the hot path, e.g. in
io_init_req().

The second is to not grab mm/files when they are not needed. As
__io_queue_sqe() issues only one request now, we can reuse
io_sq_thread_acquire_mm_files() instead of unconditionally acquiring
mm/files.
Signed-off-by: Pavel Begunkov <[email protected]>
---
fs/io_uring.c | 27 +++++++++++++--------------
1 file changed, 13 insertions(+), 14 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 26d1080217e5..813d1ccd7a69 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1145,9 +1145,6 @@ static void io_sq_thread_drop_mm_files(void)
static int __io_sq_thread_acquire_files(struct io_ring_ctx *ctx)
{
- if (current->flags & PF_EXITING)
- return -EFAULT;
-
if (!current->files) {
struct files_struct *files;
struct nsproxy *nsproxy;
@@ -1175,15 +1172,9 @@ static int __io_sq_thread_acquire_mm(struct io_ring_ctx *ctx)
{
struct mm_struct *mm;
- if (current->flags & PF_EXITING)
- return -EFAULT;
if (current->mm)
return 0;
- /* Should never happen */
- if (unlikely(!(ctx->flags & IORING_SETUP_SQPOLL)))
- return -EFAULT;
-
task_lock(ctx->sqo_task);
mm = ctx->sqo_task->mm;
if (unlikely(!mm || !mmget_not_zero(mm)))
@@ -1198,8 +1189,8 @@ static int __io_sq_thread_acquire_mm(struct io_ring_ctx *ctx)
return -EFAULT;
}
-static int io_sq_thread_acquire_mm_files(struct io_ring_ctx *ctx,
- struct io_kiocb *req)
+static int __io_sq_thread_acquire_mm_files(struct io_ring_ctx *ctx,
+ struct io_kiocb *req)
{
const struct io_op_def *def = &io_op_defs[req->opcode];
int ret;
@@ -1219,6 +1210,16 @@ static int io_sq_thread_acquire_mm_files(struct io_ring_ctx *ctx,
return 0;
}
+static inline int io_sq_thread_acquire_mm_files(struct io_ring_ctx *ctx,
+ struct io_kiocb *req)
+{
+ if (unlikely(current->flags & PF_EXITING))
+ return -EFAULT;
+ if (!(ctx->flags & IORING_SETUP_SQPOLL))
+ return 0;
+ return __io_sq_thread_acquire_mm_files(ctx, req);
+}
+
static void io_sq_thread_associate_blkcg(struct io_ring_ctx *ctx,
struct cgroup_subsys_state **cur_css)
@@ -2336,9 +2337,7 @@ static void __io_req_task_submit(struct io_kiocb *req)
struct io_ring_ctx *ctx = req->ctx;
mutex_lock(&ctx->uring_lock);
- if (!ctx->sqo_dead &&
- !__io_sq_thread_acquire_mm(ctx) &&
- !__io_sq_thread_acquire_files(ctx))
+ if (!ctx->sqo_dead && !io_sq_thread_acquire_mm_files(ctx, req))
__io_queue_sqe(req);
else
__io_req_task_cancel(req, -EFAULT);
--
2.24.0
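
The net effect for a plain (non-SQPOLL) ring is that the whole acquire path
collapses into the new inline helper, reproduced below from the hunk with
comments added; only SQPOLL rings fall through to the out-of-line
__io_sq_thread_acquire_mm_files().

static inline int io_sq_thread_acquire_mm_files(struct io_ring_ctx *ctx,
                                                struct io_kiocb *req)
{
        /* still refuse to do anything from an exiting task */
        if (unlikely(current->flags & PF_EXITING))
                return -EFAULT;
        /* non-SQPOLL: the submitter's own mm/files are already current */
        if (!(ctx->flags & IORING_SETUP_SQPOLL))
                return 0;
        /* SQPOLL thread: take the slow path and grab mm/files as needed */
        return __io_sq_thread_acquire_mm_files(ctx, req);
}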
* [PATCH 4/5] io_uring: don't duplicate io_req_task_queue()
2021-02-12 3:23 [PATCH 0/5] another round of small tinkering Pavel Begunkov
` (2 preceding siblings ...)
2021-02-12 3:23 ` [PATCH 3/5] io_uring: optimise SQPOLL mm/files grabbing Pavel Begunkov
@ 2021-02-12 3:23 ` Pavel Begunkov
2021-02-12 3:23 ` [PATCH 5/5] io_uring: save ctx put/get for task_work submit Pavel Begunkov
2021-02-12 12:44 ` [PATCH 0/5] another round of small tinkering Jens Axboe
5 siblings, 0 replies; 7+ messages in thread
From: Pavel Begunkov @ 2021-02-12 3:23 UTC (permalink / raw)
To: Jens Axboe, io-uring
Don't hand-code io_req_task_queue() inside io_async_buf_func(); just
call it.
Signed-off-by: Pavel Begunkov <[email protected]>
---
fs/io_uring.c | 8 +-------
1 file changed, 1 insertion(+), 7 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 813d1ccd7a69..5c0b1a7dba80 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -3494,7 +3494,6 @@ static int io_async_buf_func(struct wait_queue_entry *wait, unsigned mode,
struct wait_page_queue *wpq;
struct io_kiocb *req = wait->private;
struct wait_page_key *key = arg;
- int ret;
wpq = container_of(wait, struct wait_page_queue, wait);
@@ -3504,14 +3503,9 @@ static int io_async_buf_func(struct wait_queue_entry *wait, unsigned mode,
req->rw.kiocb.ki_flags &= ~IOCB_WAITQ;
list_del_init(&wait->entry);
- req->task_work.func = io_req_task_submit;
- percpu_ref_get(&req->ctx->refs);
-
/* submit ref gets dropped, acquire a new one */
refcount_inc(&req->refs);
- ret = io_req_task_work_add(req);
- if (unlikely(ret))
- io_req_task_work_add_fallback(req, io_req_task_cancel);
+ io_req_task_queue(req);
return 1;
}
--
2.24.0
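
The helper being reused, io_req_task_queue(), is exactly the sequence that was
open-coded here; before the next patch touches it, it reads roughly as below
(reconstructed from the context lines of patch 5's hunk).

static void io_req_task_queue(struct io_kiocb *req)
{
        int ret;

        req->task_work.func = io_req_task_submit;
        percpu_ref_get(&req->ctx->refs);

        ret = io_req_task_work_add(req);
        if (unlikely(ret))
                io_req_task_work_add_fallback(req, io_req_task_cancel);
}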
* [PATCH 5/5] io_uring: save ctx put/get for task_work submit
2021-02-12 3:23 [PATCH 0/5] another round of small tinkering Pavel Begunkov
` (3 preceding siblings ...)
2021-02-12 3:23 ` [PATCH 4/5] io_uring: don't duplicate io_req_task_queue() Pavel Begunkov
@ 2021-02-12 3:23 ` Pavel Begunkov
2021-02-12 12:44 ` [PATCH 0/5] another round of small tinkering Jens Axboe
5 siblings, 0 replies; 7+ messages in thread
From: Pavel Begunkov @ 2021-02-12 3:23 UTC (permalink / raw)
To: Jens Axboe, io-uring
Do a little trick in io_ring_ctx_free(): briefly take uring_lock, which
will wait for everyone currently holding it. That lets us skip pinning the
ctx with ctx->refs for __io_req_task_submit(), which runs, and drops its
refs/reqs, while holding the lock.
Signed-off-by: Pavel Begunkov <[email protected]>
---
fs/io_uring.c | 17 ++++++++++++-----
1 file changed, 12 insertions(+), 5 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 5c0b1a7dba80..87f2f8e660e8 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2336,6 +2336,7 @@ static void __io_req_task_submit(struct io_kiocb *req)
{
struct io_ring_ctx *ctx = req->ctx;
+ /* ctx stays valid until unlock, even if we drop all our ctx->refs */
mutex_lock(&ctx->uring_lock);
if (!ctx->sqo_dead && !io_sq_thread_acquire_mm_files(ctx, req))
__io_queue_sqe(req);
@@ -2347,10 +2348,8 @@ static void __io_req_task_submit(struct io_kiocb *req)
static void io_req_task_submit(struct callback_head *cb)
{
struct io_kiocb *req = container_of(cb, struct io_kiocb, task_work);
- struct io_ring_ctx *ctx = req->ctx;
__io_req_task_submit(req);
- percpu_ref_put(&ctx->refs);
}
static void io_req_task_queue(struct io_kiocb *req)
@@ -2358,11 +2357,11 @@ static void io_req_task_queue(struct io_kiocb *req)
int ret;
req->task_work.func = io_req_task_submit;
- percpu_ref_get(&req->ctx->refs);
-
ret = io_req_task_work_add(req);
- if (unlikely(ret))
+ if (unlikely(ret)) {
+ percpu_ref_get(&req->ctx->refs);
io_req_task_work_add_fallback(req, io_req_task_cancel);
+ }
}
static inline void io_queue_next(struct io_kiocb *req)
@@ -8707,6 +8706,14 @@ static void io_ring_ctx_free(struct io_ring_ctx *ctx)
{
struct io_submit_state *submit_state = &ctx->submit_state;
+ /*
+ * Some may use context even when all refs and requests have been put,
+ * and they are free to do so while still holding uring_lock, see
+ * __io_req_task_submit(). Wait for them to finish.
+ */
+ mutex_lock(&ctx->uring_lock);
+ mutex_unlock(&ctx->uring_lock);
+
io_finish_async(ctx);
io_sqe_buffers_unregister(ctx);
--
2.24.0
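
The resulting pairing is worth spelling out: __io_req_task_submit() only
touches the ctx while holding uring_lock, and io_ring_ctx_free() uses an
empty lock/unlock cycle as a barrier that waits out any such holder before
teardown continues. A sketch of both sides after this patch, assembled from
the hunks above with comments added (the trailing unlock in
__io_req_task_submit() is implied by the lock, it is not part of the hunk):

/* task_work side: no ctx->refs pin is needed any more */
static void __io_req_task_submit(struct io_kiocb *req)
{
        struct io_ring_ctx *ctx = req->ctx;

        /* ctx stays valid until unlock, even if we drop all our ctx->refs */
        mutex_lock(&ctx->uring_lock);
        if (!ctx->sqo_dead && !io_sq_thread_acquire_mm_files(ctx, req))
                __io_queue_sqe(req);
        else
                __io_req_task_cancel(req, -EFAULT);
        mutex_unlock(&ctx->uring_lock);
}

/* teardown side: the empty lock/unlock pair waits for any holder above */
static void io_ring_ctx_free(struct io_ring_ctx *ctx)
{
        struct io_submit_state *submit_state = &ctx->submit_state;

        mutex_lock(&ctx->uring_lock);
        mutex_unlock(&ctx->uring_lock);

        io_finish_async(ctx);
        /* ... the rest of teardown unchanged ... */
}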
* Re: [PATCH 0/5] another round of small tinkering
2021-02-12 3:23 [PATCH 0/5] another round of small tinkering Pavel Begunkov
` (4 preceding siblings ...)
2021-02-12 3:23 ` [PATCH 5/5] io_uring: save ctx put/get for task_work submit Pavel Begunkov
@ 2021-02-12 12:44 ` Jens Axboe
5 siblings, 0 replies; 7+ messages in thread
From: Jens Axboe @ 2021-02-12 12:44 UTC (permalink / raw)
To: Pavel Begunkov, io-uring
On 2/11/21 8:23 PM, Pavel Begunkov wrote:
> Some other small improvements with a negative diffstat.
> Patches 2-3 address the acquire_mm_files overhead
>
> Pavel Begunkov (5):
> io_uring: take compl state from submit state
> io_uring: optimise out unlikely link queue
> io_uring: optimise SQPOLL mm/files grabbing
> io_uring: don't duplicate io_req_task_queue()
> io_uring: save ctx put/get for task_work submit
>
> fs/io_uring.c | 103 +++++++++++++++++++-------------------------------
> 1 file changed, 39 insertions(+), 64 deletions(-)
LGTM, applied, thanks.
--
Jens Axboe