public inbox for [email protected]
 help / color / mirror / Atom feed
From: Marco Elver <[email protected]>
To: [email protected], [email protected]
Cc: syzbot <[email protected]>,
	[email protected], [email protected],
	[email protected], [email protected]
Subject: Re: [syzbot] KCSAN: data-race in __io_uring_cancel / io_uring_try_cancel_requests
Date: Wed, 26 May 2021 17:52:42 +0200	[thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>

On Wed, May 26, 2021 at 05:48PM +0200, Marco Elver wrote:
> On Wed, May 26, 2021 at 08:44AM -0700, syzbot wrote:
> > Hello,
> > 
> > syzbot found the following issue on:
> > 
> > HEAD commit:    a050a6d2 Merge tag 'perf-tools-fixes-for-v5.13-2021-05-24'..
> > git tree:       upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=13205087d00000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=3bcc8a6b51ef8094
> > dashboard link: https://syzkaller.appspot.com/bug?extid=73554e2258b7b8bf0bbf
> > compiler:       Debian clang version 11.0.1-2
> [...]
> > write to 0xffff88811d8df330 of 8 bytes by task 3709 on cpu 1:
> >  io_uring_clean_tctx fs/io_uring.c:9042 [inline]
> >  __io_uring_cancel+0x261/0x3b0 fs/io_uring.c:9136
> >  io_uring_files_cancel include/linux/io_uring.h:16 [inline]
> >  do_exit+0x185/0x1560 kernel/exit.c:781
> >  do_group_exit+0xce/0x1a0 kernel/exit.c:923
> >  get_signal+0xfc3/0x1610 kernel/signal.c:2835
> >  arch_do_signal_or_restart+0x2a/0x220 arch/x86/kernel/signal.c:789
> >  handle_signal_work kernel/entry/common.c:147 [inline]
> >  exit_to_user_mode_loop kernel/entry/common.c:171 [inline]
> >  exit_to_user_mode_prepare+0x109/0x190 kernel/entry/common.c:208
> >  __syscall_exit_to_user_mode_work kernel/entry/common.c:290 [inline]
> >  syscall_exit_to_user_mode+0x20/0x40 kernel/entry/common.c:301
> >  do_syscall_64+0x56/0x90 arch/x86/entry/common.c:57
> >  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > 
> > read to 0xffff88811d8df330 of 8 bytes by task 6412 on cpu 0:
> >  io_uring_try_cancel_iowq fs/io_uring.c:8911 [inline]
> >  io_uring_try_cancel_requests+0x1ce/0x8e0 fs/io_uring.c:8933
> >  io_ring_exit_work+0x7c/0x1110 fs/io_uring.c:8736
> >  process_one_work+0x3e9/0x8f0 kernel/workqueue.c:2276
> >  worker_thread+0x636/0xae0 kernel/workqueue.c:2422
> >  kthread+0x1d0/0x1f0 kernel/kthread.c:313
> >  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
> 
> I wasn't entirely sure if io_wq is guaranteed to remain live in this
> case in io_uring_try_cancel_iowq(), but the comment there suggests it
> does. In that case, I think the below patch would explain the situation
> better and also propose a fix.
> 
> Thoughts?

Due to some moving around of code, the patch lost the actual fix (using
atomically read io_wq) -- so here it is again ... hopefully as intended.
:-)

Thanks,
-- Marco

From: Marco Elver <[email protected]>
Date: Wed, 26 May 2021 16:56:37 +0200
Subject: [PATCH] io_uring: fix data race to avoid potential NULL-deref

Commit ba5ef6dc8a82 ("io_uring: fortify tctx/io_wq cleanup") introduced
setting tctx->io_wq to NULL a bit earlier. This has caused KCSAN to
detect a data race between between accesses to tctx->io_wq:

  write to 0xffff88811d8df330 of 8 bytes by task 3709 on cpu 1:
   io_uring_clean_tctx                  fs/io_uring.c:9042 [inline]
   __io_uring_cancel                    fs/io_uring.c:9136
   io_uring_files_cancel                include/linux/io_uring.h:16 [inline]
   do_exit                              kernel/exit.c:781
   do_group_exit                        kernel/exit.c:923
   get_signal                           kernel/signal.c:2835
   arch_do_signal_or_restart            arch/x86/kernel/signal.c:789
   handle_signal_work                   kernel/entry/common.c:147 [inline]
   exit_to_user_mode_loop               kernel/entry/common.c:171 [inline]
   ...
  read to 0xffff88811d8df330 of 8 bytes by task 6412 on cpu 0:
   io_uring_try_cancel_iowq             fs/io_uring.c:8911 [inline]
   io_uring_try_cancel_requests         fs/io_uring.c:8933
   io_ring_exit_work                    fs/io_uring.c:8736
   process_one_work                     kernel/workqueue.c:2276
   ...

With the config used, KCSAN only reports data races with value changes:
this implies that in the case here we also know that tctx->io_wq was
non-NULL. Therefore, depending on interleaving, we may end up with:

              [CPU 0]                 |        [CPU 1]
  io_uring_try_cancel_iowq()          | io_uring_clean_tctx()
    if (!tctx->io_wq) // false        |   ...
    ...                               |   tctx->io_wq = NULL
    io_wq_cancel_cb(tctx->io_wq, ...) |   ...
      -> NULL-deref                   |

Note: It is likely that thus far we've gotten lucky and the compiler
optimizes the double-read into a single read into a register -- but this
is never guaranteed, and can easily change with a different config!

Fix the data race by atomically accessing tctx->io_wq. Of course, this
assumes that a valid io_wq remains alive for the duration of
io_uring_try_cancel_iowq(), which should be the case per comment there.

Reported-by: [email protected]
Signed-off-by: Marco Elver <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Pavel Begunkov <[email protected]>
---
 fs/io_uring.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 5f82954004f6..e681ece1bbca 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -8903,14 +8903,18 @@ static bool io_uring_try_cancel_iowq(struct io_ring_ctx *ctx)
 	mutex_lock(&ctx->uring_lock);
 	list_for_each_entry(node, &ctx->tctx_list, ctx_node) {
 		struct io_uring_task *tctx = node->task->io_uring;
+		struct io_wq *io_wq;
 
+		if (!tctx)
+			continue;
 		/*
 		 * io_wq will stay alive while we hold uring_lock, because it's
 		 * killed after ctx nodes, which requires to take the lock.
 		 */
-		if (!tctx || !tctx->io_wq)
+		io_wq = READ_ONCE(tctx->io_wq);
+		if (!io_wq)
 			continue;
-		cret = io_wq_cancel_cb(tctx->io_wq, io_cancel_ctx_cb, ctx, true);
+		cret = io_wq_cancel_cb(io_wq, io_cancel_ctx_cb, ctx, true);
 		ret |= (cret != IO_WQ_CANCEL_NOTFOUND);
 	}
 	mutex_unlock(&ctx->uring_lock);
@@ -9039,7 +9043,7 @@ static void io_uring_clean_tctx(struct io_uring_task *tctx)
 	struct io_tctx_node *node;
 	unsigned long index;
 
-	tctx->io_wq = NULL;
+	WRITE_ONCE(tctx->io_wq, NULL);
 	xa_for_each(&tctx->xa, index, node)
 		io_uring_del_task_file(index);
 	if (wq)
-- 
2.31.1.818.g46aad6cb9e-goog


  reply	other threads:[~2021-05-26 15:52 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-26 15:44 [syzbot] KCSAN: data-race in __io_uring_cancel / io_uring_try_cancel_requests syzbot
2021-05-26 15:48 ` Marco Elver
2021-05-26 15:52   ` Marco Elver [this message]
2021-05-26 16:29     ` Pavel Begunkov
2021-05-26 16:33       ` Pavel Begunkov
2021-05-26 16:36       ` Marco Elver
2021-05-26 20:31         ` Pavel Begunkov
2021-05-27  9:32           ` Marco Elver
2021-05-27 10:05             ` Pavel Begunkov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox