From: Jens Axboe <[email protected]>
To: [email protected]
Cc: Pavel Begunkov <[email protected]>,
	[email protected],
	Jens Axboe <[email protected]>
Subject: [PATCH 20/33] io_uring: fix __tctx_task_work() ctx race
Date: Wed,  3 Mar 2021 17:26:47 -0700	[thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>

From: Pavel Begunkov <[email protected]>

There is an unlikely but possible race that uses a freed context:
req->task_work.func() can free a request, but we won't necessarily
find a completion in submit_state.comp, so all ctx refs may already
have been put by the time we do mutex_lock(&ctx->uring_lock).

There are several ways a request can miss going through
submit_state.comp:

1) req->task_work.func() didn't complete it itself but punted it to
   io-wq (e.g. for reissue) and it got freed later, or it overflowed
   and was flushed by someone else, or it was submitted for IRQ
   completion.
2) As we don't hold the uring_lock, someone else can do
   io_submit_flush_completions() and put our ref.
3) Bugs and code obscurities, e.g. failing to propagate issue_flags
   properly.

One example is as follows:

  CPU1                                  |  CPU2
=======================================================================
@req->task_work.func()                  |
  -> @req overflowed,                   |
     so submit_state.comp.nr == 0       |
                                        | flush overflows, and free @req
                                        | ctx refs == 0, free it
ctx is dead, but we do                  |
	lock + flush + unlock           |

So take a ctx reference for each new ctx we see in __tctx_task_work(),
and don't release it until we've done all our flushing.
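
For reference, here is a sketch of what the __tctx_task_work() loop
looks like with this change applied (reconstructed from the hunks
below; the list splicing at the top of the function is elided):

	while (node) {
		struct io_wq_work_node *next = node->next;
		struct io_kiocb *req;

		req = container_of(node, struct io_kiocb, io_task_work.node);
		if (req->ctx != ctx) {
			/* flush and drop the ref on the previous ctx */
			ctx_flush_and_put(ctx);
			ctx = req->ctx;
			/* hold a ref so the ctx survives task_work.func() */
			percpu_ref_get(&ctx->refs);
		}

		req->task_work.func(&req->task_work);
		node = next;
	}
	/* flush and put the last ctx we took a ref on (NULL-safe) */
	ctx_flush_and_put(ctx);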

Fixes: 65453d1efbd2 ("io_uring: enable req cache for task_work items")
Reported-by: [email protected]
Signed-off-by: Pavel Begunkov <[email protected]>
[axboe: fold in my one-liner and fix ref mismatch]
Signed-off-by: Jens Axboe <[email protected]>
---
 fs/io_uring.c | 36 +++++++++++++++++++-----------------
 1 file changed, 19 insertions(+), 17 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index f6bc0254cd01..befa0f4dd575 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1800,6 +1800,18 @@ static inline struct io_kiocb *io_req_find_next(struct io_kiocb *req)
 	return __io_req_find_next(req);
 }
 
+static void ctx_flush_and_put(struct io_ring_ctx *ctx)
+{
+	if (!ctx)
+		return;
+	if (ctx->submit_state.comp.nr) {
+		mutex_lock(&ctx->uring_lock);
+		io_submit_flush_completions(&ctx->submit_state.comp, ctx);
+		mutex_unlock(&ctx->uring_lock);
+	}
+	percpu_ref_put(&ctx->refs);
+}
+
 static bool __tctx_task_work(struct io_uring_task *tctx)
 {
 	struct io_ring_ctx *ctx = NULL;
@@ -1817,30 +1829,20 @@ static bool __tctx_task_work(struct io_uring_task *tctx)
 	node = list.first;
 	while (node) {
 		struct io_wq_work_node *next = node->next;
-		struct io_ring_ctx *this_ctx;
 		struct io_kiocb *req;
 
 		req = container_of(node, struct io_kiocb, io_task_work.node);
-		this_ctx = req->ctx;
-		req->task_work.func(&req->task_work);
-		node = next;
-
-		if (!ctx) {
-			ctx = this_ctx;
-		} else if (ctx != this_ctx) {
-			mutex_lock(&ctx->uring_lock);
-			io_submit_flush_completions(&ctx->submit_state.comp, ctx);
-			mutex_unlock(&ctx->uring_lock);
-			ctx = this_ctx;
+		if (req->ctx != ctx) {
+			ctx_flush_and_put(ctx);
+			ctx = req->ctx;
+			percpu_ref_get(&ctx->refs);
 		}
-	}
 
-	if (ctx && ctx->submit_state.comp.nr) {
-		mutex_lock(&ctx->uring_lock);
-		io_submit_flush_completions(&ctx->submit_state.comp, ctx);
-		mutex_unlock(&ctx->uring_lock);
+		req->task_work.func(&req->task_work);
+		node = next;
 	}
 
+	ctx_flush_and_put(ctx);
 	return list.first != NULL;
 }
 
-- 
2.30.1



Thread overview: 36+ messages
2021-03-04  0:26 [PATCHSET 0/33] Fixes queued up for 5.12 Jens Axboe
2021-03-04  0:26 ` [PATCH 01/33] io-wq: wait for worker startup when forking a new one Jens Axboe
2021-03-04  0:26 ` [PATCH 02/33] io-wq: have manager wait for all workers to exit Jens Axboe
2021-03-04  0:26 ` [PATCH 03/33] io-wq: don't ask for a new worker if we're exiting Jens Axboe
2021-03-04  0:26 ` [PATCH 04/33] io-wq: rename wq->done completion to wq->started Jens Axboe
2021-03-04  0:26 ` [PATCH 05/33] io-wq: wait for manager exit on wq destroy Jens Axboe
2021-03-04  0:26 ` [PATCH 06/33] io-wq: fix double put of 'wq' in error path Jens Axboe
2021-03-04  0:26 ` [PATCH 07/33] io_uring: SQPOLL stop error handling fixes Jens Axboe
2021-03-04  0:26 ` [PATCH 08/33] io_uring: run fallback on cancellation Jens Axboe
2021-03-04  0:26 ` [PATCH 09/33] io_uring: don't use complete_all() on SQPOLL thread exit Jens Axboe
2021-03-04  0:26 ` [PATCH 10/33] io-wq: provide an io_wq_put_and_exit() helper Jens Axboe
2021-03-04  0:26 ` [PATCH 11/33] io_uring: fix race condition in task_work add and clear Jens Axboe
2021-03-04  0:26 ` [PATCH 12/33] io_uring: signal worker thread unshare Jens Axboe
2021-03-04 12:15   ` Stefan Metzmacher
2021-03-04 14:05     ` Jens Axboe
2021-03-04  0:26 ` [PATCH 13/33] io_uring: warn on not destroyed io-wq Jens Axboe
2021-03-04  0:26 ` [PATCH 14/33] io_uring: destroy io-wq on exec Jens Axboe
2021-03-04  0:26 ` [PATCH 15/33] io_uring: remove unused argument 'tsk' from io_req_caches_free() Jens Axboe
2021-03-04  0:26 ` [PATCH 16/33] io_uring: kill unnecessary REQ_F_WORK_INITIALIZED checks Jens Axboe
2021-03-04  0:26 ` [PATCH 17/33] io_uring: move cred assignment into io_issue_sqe() Jens Axboe
2021-03-04  0:26 ` [PATCH 18/33] io_uring: kill unnecessary io_run_ctx_fallback() in io_ring_exit_work() Jens Axboe
2021-03-04  0:26 ` [PATCH 19/33] io_uring: kill io_uring_flush() Jens Axboe
2021-03-04  0:26 ` Jens Axboe [this message]
2021-03-04  0:26 ` [PATCH 21/33] io_uring: replace cmpxchg in fallback with xchg Jens Axboe
2021-03-04  0:26 ` [PATCH 22/33] io_uring: ensure that SQPOLL thread is started for exit Jens Axboe
2021-03-04  0:26 ` [PATCH 23/33] io_uring: ignore double poll add on the same waitqueue head Jens Axboe
2021-03-04  0:26 ` [PATCH 24/33] io_uring: kill sqo_dead and sqo submission halting Jens Axboe
2021-03-04  0:26 ` [PATCH 25/33] io_uring: remove sqo_task Jens Axboe
2021-03-04  0:26 ` [PATCH 26/33] io-wq: fix error path leak of buffered write hash map Jens Axboe
2021-03-04  0:26 ` [PATCH 27/33] io_uring: fix -EAGAIN retry with IOPOLL Jens Axboe
2021-03-04  0:26 ` [PATCH 28/33] io_uring: choose right tctx->io_wq for try cancel Jens Axboe
2021-03-04  0:26 ` [PATCH 29/33] io_uring: inline io_req_clean_work() Jens Axboe
2021-03-04  0:26 ` [PATCH 30/33] io_uring: inline __io_queue_async_work() Jens Axboe
2021-03-04  0:26 ` [PATCH 31/33] io_uring: remove extra in_idle wake up Jens Axboe
2021-03-04  0:26 ` [PATCH 32/33] io_uring: ensure that threads freeze on suspend Jens Axboe
2021-03-04  0:27 ` [PATCH 33/33] io-wq: ensure all pending work is canceled on exit Jens Axboe
