* [syzbot] KCSAN: data-race in __io_uring_cancel / io_uring_try_cancel_requests
  From: syzbot @ 2021-05-26 15:44 UTC
  To: asml.silence, axboe, io-uring, linux-kernel, syzkaller-bugs

Hello,

syzbot found the following issue on:

HEAD commit:    a050a6d2 Merge tag 'perf-tools-fixes-for-v5.13-2021-05-24'..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=13205087d00000
kernel config:  https://syzkaller.appspot.com/x/.config?x=3bcc8a6b51ef8094
dashboard link: https://syzkaller.appspot.com/bug?extid=73554e2258b7b8bf0bbf
compiler:       Debian clang version 11.0.1-2

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: [email protected]

==================================================================
BUG: KCSAN: data-race in __io_uring_cancel / io_uring_try_cancel_requests

write to 0xffff88811d8df330 of 8 bytes by task 3709 on cpu 1:
 io_uring_clean_tctx fs/io_uring.c:9042 [inline]
 __io_uring_cancel+0x261/0x3b0 fs/io_uring.c:9136
 io_uring_files_cancel include/linux/io_uring.h:16 [inline]
 do_exit+0x185/0x1560 kernel/exit.c:781
 do_group_exit+0xce/0x1a0 kernel/exit.c:923
 get_signal+0xfc3/0x1610 kernel/signal.c:2835
 arch_do_signal_or_restart+0x2a/0x220 arch/x86/kernel/signal.c:789
 handle_signal_work kernel/entry/common.c:147 [inline]
 exit_to_user_mode_loop kernel/entry/common.c:171 [inline]
 exit_to_user_mode_prepare+0x109/0x190 kernel/entry/common.c:208
 __syscall_exit_to_user_mode_work kernel/entry/common.c:290 [inline]
 syscall_exit_to_user_mode+0x20/0x40 kernel/entry/common.c:301
 do_syscall_64+0x56/0x90 arch/x86/entry/common.c:57
 entry_SYSCALL_64_after_hwframe+0x44/0xae

read to 0xffff88811d8df330 of 8 bytes by task 6412 on cpu 0:
 io_uring_try_cancel_iowq fs/io_uring.c:8911 [inline]
 io_uring_try_cancel_requests+0x1ce/0x8e0 fs/io_uring.c:8933
 io_ring_exit_work+0x7c/0x1110 fs/io_uring.c:8736
 process_one_work+0x3e9/0x8f0 kernel/workqueue.c:2276
 worker_thread+0x636/0xae0 kernel/workqueue.c:2422
 kthread+0x1d0/0x1f0 kernel/kthread.c:313
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 6412 Comm: kworker/u4:9 Not tainted 5.13.0-rc3-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: events_unbound io_ring_exit_work
==================================================================

---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at [email protected].

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
* Re: [syzbot] KCSAN: data-race in __io_uring_cancel / io_uring_try_cancel_requests
  From: Marco Elver @ 2021-05-26 15:48 UTC
  To: asml.silence, axboe
  Cc: syzbot, io-uring, linux-kernel, syzkaller-bugs, dvyukov

On Wed, May 26, 2021 at 08:44AM -0700, syzbot wrote:
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit:    a050a6d2 Merge tag 'perf-tools-fixes-for-v5.13-2021-05-24'..
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=13205087d00000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=3bcc8a6b51ef8094
> dashboard link: https://syzkaller.appspot.com/bug?extid=73554e2258b7b8bf0bbf
> compiler:       Debian clang version 11.0.1-2
[...]
> write to 0xffff88811d8df330 of 8 bytes by task 3709 on cpu 1:
>  io_uring_clean_tctx fs/io_uring.c:9042 [inline]
>  __io_uring_cancel+0x261/0x3b0 fs/io_uring.c:9136
>  io_uring_files_cancel include/linux/io_uring.h:16 [inline]
>  do_exit+0x185/0x1560 kernel/exit.c:781
>  do_group_exit+0xce/0x1a0 kernel/exit.c:923
>  get_signal+0xfc3/0x1610 kernel/signal.c:2835
>  arch_do_signal_or_restart+0x2a/0x220 arch/x86/kernel/signal.c:789
>  handle_signal_work kernel/entry/common.c:147 [inline]
>  exit_to_user_mode_loop kernel/entry/common.c:171 [inline]
>  exit_to_user_mode_prepare+0x109/0x190 kernel/entry/common.c:208
>  __syscall_exit_to_user_mode_work kernel/entry/common.c:290 [inline]
>  syscall_exit_to_user_mode+0x20/0x40 kernel/entry/common.c:301
>  do_syscall_64+0x56/0x90 arch/x86/entry/common.c:57
>  entry_SYSCALL_64_after_hwframe+0x44/0xae
>
> read to 0xffff88811d8df330 of 8 bytes by task 6412 on cpu 0:
>  io_uring_try_cancel_iowq fs/io_uring.c:8911 [inline]
>  io_uring_try_cancel_requests+0x1ce/0x8e0 fs/io_uring.c:8933
>  io_ring_exit_work+0x7c/0x1110 fs/io_uring.c:8736
>  process_one_work+0x3e9/0x8f0 kernel/workqueue.c:2276
>  worker_thread+0x636/0xae0 kernel/workqueue.c:2422
>  kthread+0x1d0/0x1f0 kernel/kthread.c:313
>  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294

I wasn't entirely sure if io_wq is guaranteed to remain live in this
case in io_uring_try_cancel_iowq(), but the comment there suggests it
does. In that case, I think the patch below would explain the situation
better and also propose a fix.

Thoughts?

Thanks,
-- Marco

------ >8 ------

From: Marco Elver <[email protected]>
Date: Wed, 26 May 2021 16:56:37 +0200
Subject: [PATCH] io_uring: fix data race to avoid potential NULL-deref

Commit ba5ef6dc8a82 ("io_uring: fortify tctx/io_wq cleanup") introduced
setting tctx->io_wq to NULL a bit earlier. This has caused KCSAN to
detect a data race between accesses to tctx->io_wq:

write to 0xffff88811d8df330 of 8 bytes by task 3709 on cpu 1:
 io_uring_clean_tctx fs/io_uring.c:9042 [inline]
 __io_uring_cancel fs/io_uring.c:9136
 io_uring_files_cancel include/linux/io_uring.h:16 [inline]
 do_exit kernel/exit.c:781
 do_group_exit kernel/exit.c:923
 get_signal kernel/signal.c:2835
 arch_do_signal_or_restart arch/x86/kernel/signal.c:789
 handle_signal_work kernel/entry/common.c:147 [inline]
 exit_to_user_mode_loop kernel/entry/common.c:171 [inline]
 ...

read to 0xffff88811d8df330 of 8 bytes by task 6412 on cpu 0:
 io_uring_try_cancel_iowq fs/io_uring.c:8911 [inline]
 io_uring_try_cancel_requests fs/io_uring.c:8933
 io_ring_exit_work fs/io_uring.c:8736
 process_one_work kernel/workqueue.c:2276
 ...

With the config used, KCSAN only reports data races with value changes:
this implies that in the case here we also know that tctx->io_wq was
non-NULL. Therefore, depending on interleaving, we may end up with:

              [CPU 0]                 |             [CPU 1]
io_uring_try_cancel_iowq()            | io_uring_clean_tctx()
  if (!tctx->io_wq) // false          |   ...
  ...                                 |   tctx->io_wq = NULL
  io_wq_cancel_cb(tctx->io_wq, ...)   |   ...
    -> NULL-deref                     |

Note: It is likely that thus far we've gotten lucky and the compiler
optimizes the double-read into a single read into a register -- but
this is never guaranteed, and can easily change with a different
config!

Fix the data race by atomically accessing tctx->io_wq. Of course, this
assumes that a valid io_wq remains alive for the duration of
io_uring_try_cancel_iowq(), which should be the case per the comment
there.

Reported-by: [email protected]
Signed-off-by: Marco Elver <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Pavel Begunkov <[email protected]>
---
 fs/io_uring.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 5f82954004f6..c7e27b464cb6 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -8903,12 +8903,16 @@ static bool io_uring_try_cancel_iowq(struct io_ring_ctx *ctx)
 	mutex_lock(&ctx->uring_lock);
 	list_for_each_entry(node, &ctx->tctx_list, ctx_node) {
 		struct io_uring_task *tctx = node->task->io_uring;
+		struct io_wq *io_wq;
 
+		if (!tctx)
+			continue;
 		/*
 		 * io_wq will stay alive while we hold uring_lock, because it's
 		 * killed after ctx nodes, which requires to take the lock.
 		 */
-		if (!tctx || !tctx->io_wq)
+		io_wq = READ_ONCE(tctx->io_wq);
+		if (!io_wq)
 			continue;
 		cret = io_wq_cancel_cb(tctx->io_wq, io_cancel_ctx_cb, ctx, true);
 		ret |= (cret != IO_WQ_CANCEL_NOTFOUND);
@@ -9039,7 +9043,7 @@ static void io_uring_clean_tctx(struct io_uring_task *tctx)
 	struct io_tctx_node *node;
 	unsigned long index;
 
-	tctx->io_wq = NULL;
+	WRITE_ONCE(tctx->io_wq, NULL);
 	xa_for_each(&tctx->xa, index, node)
 		io_uring_del_task_file(index);
 	if (wq)
--
2.31.1.818.g46aad6cb9e-goog
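The failure mode spelled out in the patch above comes down to the compiler
being free to load tctx->io_wq twice: once for the NULL check and once for
the call. The following standalone userspace sketch -- hypothetical names
throughout, with READ_ONCE() approximated by a volatile cast rather than
the kernel's real macro -- shows the racy check-then-use shape and the
single-snapshot fix:

struct io_wq { int unused; };
struct tctx_like { struct io_wq *io_wq; };   /* stand-in for struct io_uring_task */

static void cancel_on(struct io_wq *wq) { (void)wq; }   /* stub for io_wq_cancel_cb() */

/* Racy: two plain loads of a shared pointer. A concurrent
 * "tctx->io_wq = NULL" between load #1 and load #2 makes the
 * call below dereference NULL. */
static void try_cancel_racy(struct tctx_like *tctx)
{
        if (!tctx->io_wq)               /* load #1: may observe non-NULL */
                return;
        cancel_on(tctx->io_wq);         /* load #2: may observe NULL */
}

/* Approximation of the kernel's READ_ONCE() for this sketch. */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Fixed: one forced load into a local; the check and the call use
 * the same snapshot, so the call can never see NULL. */
static void try_cancel_fixed(struct tctx_like *tctx)
{
        struct io_wq *wq = READ_ONCE(tctx->io_wq);

        if (!wq)
                return;
        cancel_on(wq);
}

int main(void)
{
        struct io_wq wq = { 0 };
        struct tctx_like tctx = { .io_wq = &wq };

        try_cancel_racy(&tctx);    /* single-threaded demo; the bug needs two CPUs */
        try_cancel_fixed(&tctx);
        return 0;
}

Note that the snapshot only removes the double-read (and torn-read)
hazard; it still relies on the pointed-to io_wq staying alive, which is
exactly the lifetime property the uring_lock comment in
io_uring_try_cancel_iowq() asserts.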
* Re: [syzbot] KCSAN: data-race in __io_uring_cancel / io_uring_try_cancel_requests
  From: Marco Elver @ 2021-05-26 15:52 UTC
  To: asml.silence, axboe
  Cc: syzbot, io-uring, linux-kernel, syzkaller-bugs, dvyukov

On Wed, May 26, 2021 at 05:48PM +0200, Marco Elver wrote:
> On Wed, May 26, 2021 at 08:44AM -0700, syzbot wrote:
> > Hello,
> >
> > syzbot found the following issue on:
> >
> > HEAD commit:    a050a6d2 Merge tag 'perf-tools-fixes-for-v5.13-2021-05-24'..
> > git tree:       upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=13205087d00000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=3bcc8a6b51ef8094
> > dashboard link: https://syzkaller.appspot.com/bug?extid=73554e2258b7b8bf0bbf
> > compiler:       Debian clang version 11.0.1-2
> [...]
> > write to 0xffff88811d8df330 of 8 bytes by task 3709 on cpu 1:
> >  io_uring_clean_tctx fs/io_uring.c:9042 [inline]
> >  __io_uring_cancel+0x261/0x3b0 fs/io_uring.c:9136
> >  io_uring_files_cancel include/linux/io_uring.h:16 [inline]
> >  do_exit+0x185/0x1560 kernel/exit.c:781
> >  do_group_exit+0xce/0x1a0 kernel/exit.c:923
> >  get_signal+0xfc3/0x1610 kernel/signal.c:2835
> >  arch_do_signal_or_restart+0x2a/0x220 arch/x86/kernel/signal.c:789
> >  handle_signal_work kernel/entry/common.c:147 [inline]
> >  exit_to_user_mode_loop kernel/entry/common.c:171 [inline]
> >  exit_to_user_mode_prepare+0x109/0x190 kernel/entry/common.c:208
> >  __syscall_exit_to_user_mode_work kernel/entry/common.c:290 [inline]
> >  syscall_exit_to_user_mode+0x20/0x40 kernel/entry/common.c:301
> >  do_syscall_64+0x56/0x90 arch/x86/entry/common.c:57
> >  entry_SYSCALL_64_after_hwframe+0x44/0xae
> >
> > read to 0xffff88811d8df330 of 8 bytes by task 6412 on cpu 0:
> >  io_uring_try_cancel_iowq fs/io_uring.c:8911 [inline]
> >  io_uring_try_cancel_requests+0x1ce/0x8e0 fs/io_uring.c:8933
> >  io_ring_exit_work+0x7c/0x1110 fs/io_uring.c:8736
> >  process_one_work+0x3e9/0x8f0 kernel/workqueue.c:2276
> >  worker_thread+0x636/0xae0 kernel/workqueue.c:2422
> >  kthread+0x1d0/0x1f0 kernel/kthread.c:313
> >  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
>
> I wasn't entirely sure if io_wq is guaranteed to remain live in this
> case in io_uring_try_cancel_iowq(), but the comment there suggests it
> does. In that case, I think the patch below would explain the situation
> better and also propose a fix.
>
> Thoughts?

Due to some moving around of code, the patch lost the actual fix (using
atomically read io_wq) -- so here it is again ... hopefully as intended.
:-)

Thanks,
-- Marco

From: Marco Elver <[email protected]>
Date: Wed, 26 May 2021 16:56:37 +0200
Subject: [PATCH] io_uring: fix data race to avoid potential NULL-deref

Commit ba5ef6dc8a82 ("io_uring: fortify tctx/io_wq cleanup") introduced
setting tctx->io_wq to NULL a bit earlier. This has caused KCSAN to
detect a data race between accesses to tctx->io_wq:

write to 0xffff88811d8df330 of 8 bytes by task 3709 on cpu 1:
 io_uring_clean_tctx fs/io_uring.c:9042 [inline]
 __io_uring_cancel fs/io_uring.c:9136
 io_uring_files_cancel include/linux/io_uring.h:16 [inline]
 do_exit kernel/exit.c:781
 do_group_exit kernel/exit.c:923
 get_signal kernel/signal.c:2835
 arch_do_signal_or_restart arch/x86/kernel/signal.c:789
 handle_signal_work kernel/entry/common.c:147 [inline]
 exit_to_user_mode_loop kernel/entry/common.c:171 [inline]
 ...

read to 0xffff88811d8df330 of 8 bytes by task 6412 on cpu 0:
 io_uring_try_cancel_iowq fs/io_uring.c:8911 [inline]
 io_uring_try_cancel_requests fs/io_uring.c:8933
 io_ring_exit_work fs/io_uring.c:8736
 process_one_work kernel/workqueue.c:2276
 ...

With the config used, KCSAN only reports data races with value changes:
this implies that in the case here we also know that tctx->io_wq was
non-NULL. Therefore, depending on interleaving, we may end up with:

              [CPU 0]                 |             [CPU 1]
io_uring_try_cancel_iowq()            | io_uring_clean_tctx()
  if (!tctx->io_wq) // false          |   ...
  ...                                 |   tctx->io_wq = NULL
  io_wq_cancel_cb(tctx->io_wq, ...)   |   ...
    -> NULL-deref                     |

Note: It is likely that thus far we've gotten lucky and the compiler
optimizes the double-read into a single read into a register -- but
this is never guaranteed, and can easily change with a different
config!

Fix the data race by atomically accessing tctx->io_wq. Of course, this
assumes that a valid io_wq remains alive for the duration of
io_uring_try_cancel_iowq(), which should be the case per the comment
there.

Reported-by: [email protected]
Signed-off-by: Marco Elver <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Pavel Begunkov <[email protected]>
---
 fs/io_uring.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 5f82954004f6..e681ece1bbca 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -8903,14 +8903,18 @@ static bool io_uring_try_cancel_iowq(struct io_ring_ctx *ctx)
 	mutex_lock(&ctx->uring_lock);
 	list_for_each_entry(node, &ctx->tctx_list, ctx_node) {
 		struct io_uring_task *tctx = node->task->io_uring;
+		struct io_wq *io_wq;
 
+		if (!tctx)
+			continue;
 		/*
 		 * io_wq will stay alive while we hold uring_lock, because it's
 		 * killed after ctx nodes, which requires to take the lock.
 		 */
-		if (!tctx || !tctx->io_wq)
+		io_wq = READ_ONCE(tctx->io_wq);
+		if (!io_wq)
 			continue;
-		cret = io_wq_cancel_cb(tctx->io_wq, io_cancel_ctx_cb, ctx, true);
+		cret = io_wq_cancel_cb(io_wq, io_cancel_ctx_cb, ctx, true);
 		ret |= (cret != IO_WQ_CANCEL_NOTFOUND);
 	}
 	mutex_unlock(&ctx->uring_lock);
@@ -9039,7 +9043,7 @@ static void io_uring_clean_tctx(struct io_uring_task *tctx)
 	struct io_tctx_node *node;
 	unsigned long index;
 
-	tctx->io_wq = NULL;
+	WRITE_ONCE(tctx->io_wq, NULL);
 	xa_for_each(&tctx->xa, index, node)
 		io_uring_del_task_file(index);
 	if (wq)
--
2.31.1.818.g46aad6cb9e-goog
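For readers who want to reproduce reports like this one: the behaviour
Marco relies on above ("KCSAN only reports data races with value changes")
is controlled by a Kconfig option. A plausible minimal fragment, assuming
a kernel with KCSAN support -- option names are from lib/Kconfig.kcsan
and may vary across versions:

CONFIG_KCSAN=y
# Report a race only when the observed value actually changed; this is
# why the report above implies tctx->io_wq went from non-NULL to NULL.
CONFIG_KCSAN_REPORT_VALUE_CHANGE_ONLY=y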
* Re: [syzbot] KCSAN: data-race in __io_uring_cancel / io_uring_try_cancel_requests
  From: Pavel Begunkov @ 2021-05-26 16:29 UTC
  To: Marco Elver, axboe
  Cc: syzbot, io-uring, linux-kernel, syzkaller-bugs, dvyukov

On 5/26/21 4:52 PM, Marco Elver wrote:
> Due to some moving around of code, the patch lost the actual fix (using
> atomically read io_wq) -- so here it is again ... hopefully as intended.
> :-)

"fortify" damn it... It was synchronised with &ctx->uring_lock
before, see io_uring_try_cancel_iowq() and io_uring_del_tctx_node(),
so it should not be cleared before *del_tctx_node().

The fix should just move it after this sync point. Will you send
it out as a patch?

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 7db6aaf31080..b76ba26b4c6c 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -9075,11 +9075,12 @@ static void io_uring_clean_tctx(struct io_uring_task *tctx)
 	struct io_tctx_node *node;
 	unsigned long index;
 
-	tctx->io_wq = NULL;
 	xa_for_each(&tctx->xa, index, node)
 		io_uring_del_tctx_node(index);
-	if (wq)
+	if (wq) {
+		tctx->io_wq = NULL;
 		io_wq_put_and_exit(wq);
+	}
 }
 
 static s64 tctx_inflight(struct io_uring_task *tctx, bool tracked)

--
Pavel Begunkov
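The ordering argument behind this diff can be restated with a plain
mutex: the reader dereferences io_wq only for nodes it can still find on
the list, and only while holding the lock; the writer unlinks every node
under the same lock before it ever clears the pointer. A rough pthread
analogy -- illustrative names, not the real io_uring types:

#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct io_wq { int unused; };

static pthread_mutex_t uring_lock = PTHREAD_MUTEX_INITIALIZER;
static struct io_wq the_wq;
static bool node_on_list = true;           /* membership of ctx->tctx_list */
static struct io_wq *tctx_io_wq = &the_wq;

/* Reader: io_uring_try_cancel_iowq() analogue. */
static void try_cancel(void)
{
        pthread_mutex_lock(&uring_lock);
        if (node_on_list && tctx_io_wq) {
                /* Safe: the writer unlinks the node under this same lock
                 * *before* clearing tctx_io_wq, so a node we can still
                 * see implies a valid io_wq. */
        }
        pthread_mutex_unlock(&uring_lock);
}

/* Writer: io_uring_clean_tctx() analogue, with the fixed ordering. */
static void clean_tctx(void)
{
        pthread_mutex_lock(&uring_lock);
        node_on_list = false;              /* io_uring_del_tctx_node() */
        pthread_mutex_unlock(&uring_lock);

        tctx_io_wq = NULL;                 /* no reader can reach the node now */
        /* ... io_wq_put_and_exit(&the_wq) ... */
}

int main(void)
{
        try_cancel();
        clean_tctx();
        return 0;
}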
* Re: [syzbot] KCSAN: data-race in __io_uring_cancel / io_uring_try_cancel_requests
  From: Pavel Begunkov @ 2021-05-26 16:33 UTC
  To: Marco Elver, axboe
  Cc: syzbot, io-uring, linux-kernel, syzkaller-bugs, dvyukov

On 5/26/21 5:29 PM, Pavel Begunkov wrote:
> On 5/26/21 4:52 PM, Marco Elver wrote:
>> Due to some moving around of code, the patch lost the actual fix (using
>> atomically read io_wq) -- so here it is again ... hopefully as intended.
>> :-)
>
> "fortify" damn it...

fwiw, it's a reference to my own commit that came after -rc

--
Pavel Begunkov
* Re: [syzbot] KCSAN: data-race in __io_uring_cancel / io_uring_try_cancel_requests
  From: Marco Elver @ 2021-05-26 16:36 UTC
  To: Pavel Begunkov
  Cc: Jens Axboe, syzbot, io-uring, LKML, syzkaller-bugs, Dmitry Vyukov

On Wed, 26 May 2021 at 18:29, Pavel Begunkov <[email protected]> wrote:
> On 5/26/21 4:52 PM, Marco Elver wrote:
> > Due to some moving around of code, the patch lost the actual fix (using
> > atomically read io_wq) -- so here it is again ... hopefully as intended.
> > :-)
>
> "fortify" damn it... It was synchronised with &ctx->uring_lock
> before, see io_uring_try_cancel_iowq() and io_uring_del_tctx_node(),
> so it should not be cleared before *del_tctx_node().

Ah, so if I understand right, the property stated by the comment in
io_uring_try_cancel_iowq() was broken, and your patch below would fix
that, right?

> The fix should just move it after this sync point. Will you send
> it out as a patch?

Do you mean your move of the write to io_wq goes on top of the patch I
proposed? (If so, please also leave your Signed-off-by so I can squash
it.)

So if I understand right, we do in fact have 2 problems:
1. the data race as I noted in my patch, and
2. the fact that io_wq does not live long enough.

Did I get it right?

Thanks,
-- Marco

> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 7db6aaf31080..b76ba26b4c6c 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -9075,11 +9075,12 @@ static void io_uring_clean_tctx(struct io_uring_task *tctx)
>  	struct io_tctx_node *node;
>  	unsigned long index;
>  
> -	tctx->io_wq = NULL;
>  	xa_for_each(&tctx->xa, index, node)
>  		io_uring_del_tctx_node(index);
> -	if (wq)
> +	if (wq) {
> +		tctx->io_wq = NULL;
>  		io_wq_put_and_exit(wq);
> +	}
>  }
>  
>  static s64 tctx_inflight(struct io_uring_task *tctx, bool tracked)
>
> --
> Pavel Begunkov
* Re: [syzbot] KCSAN: data-race in __io_uring_cancel / io_uring_try_cancel_requests
  From: Pavel Begunkov @ 2021-05-26 20:31 UTC
  To: Marco Elver
  Cc: Jens Axboe, syzbot, io-uring, LKML, syzkaller-bugs, Dmitry Vyukov

On 5/26/21 5:36 PM, Marco Elver wrote:
> On Wed, 26 May 2021 at 18:29, Pavel Begunkov <[email protected]> wrote:
>> On 5/26/21 4:52 PM, Marco Elver wrote:
>>> Due to some moving around of code, the patch lost the actual fix (using
>>> atomically read io_wq) -- so here it is again ... hopefully as intended.
>>> :-)
>>
>> "fortify" damn it... It was synchronised with &ctx->uring_lock
>> before, see io_uring_try_cancel_iowq() and io_uring_del_tctx_node(),
>> so it should not be cleared before *del_tctx_node().
>
> Ah, so if I understand right, the property stated by the comment in
> io_uring_try_cancel_iowq() was broken, and your patch below would fix
> that, right?

"io_uring: fortify tctx/io_wq cleanup" broke it, and the diff
should fix it.

>> The fix should just move it after this sync point. Will you send
>> it out as a patch?
>
> Do you mean your move of the write to io_wq goes on top of the patch I
> proposed? (If so, please also leave your Signed-off-by so I can squash
> it.)

No, only my diff, but you hinted at what has happened, so I would
prefer you to take care of patching. If you want, of course.

To be entirely fair, assuming that aligned ptr reads can't be torn,
I don't see any _real_ problem. But surely the report is very helpful,
and the current state is too wonky, so it should be patched.

TL;DR:
The synchronisation goes like this: it's usually used by the owner
task, and the owner task deletes it, so it is mostly naturally
synchronised. An exception is a worker (though not only workers) that
accesses it for cancellation purposes, but it uses it only under
->uring_lock, so if removal also takes the lock it should be fine;
see io_uring_del_tctx_node() locking.

> So if I understand right, we do in fact have 2 problems:
> 1. the data race as I noted in my patch, and

Yes, and it deals with it.

> 2. the fact that io_wq does not live long enough.

Nope, io_wq outlives them fine.

> Did I get it right?
>
>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>> index 7db6aaf31080..b76ba26b4c6c 100644
>> --- a/fs/io_uring.c
>> +++ b/fs/io_uring.c
>> @@ -9075,11 +9075,12 @@ static void io_uring_clean_tctx(struct io_uring_task *tctx)
>>  	struct io_tctx_node *node;
>>  	unsigned long index;
>>  
>> -	tctx->io_wq = NULL;
>>  	xa_for_each(&tctx->xa, index, node)
>>  		io_uring_del_tctx_node(index);
>> -	if (wq)
>> +	if (wq) {
>> +		tctx->io_wq = NULL;
>>  		io_wq_put_and_exit(wq);
>> +	}
>>  }
>>  
>>  static s64 tctx_inflight(struct io_uring_task *tctx, bool tracked)

--
Pavel Begunkov
* Re: [syzbot] KCSAN: data-race in __io_uring_cancel / io_uring_try_cancel_requests
  From: Marco Elver @ 2021-05-27 9:32 UTC
  To: Pavel Begunkov
  Cc: Jens Axboe, syzbot, io-uring, LKML, syzkaller-bugs, Dmitry Vyukov

On Wed, May 26, 2021 at 09:31PM +0100, Pavel Begunkov wrote:
> On 5/26/21 5:36 PM, Marco Elver wrote:
> > On Wed, 26 May 2021 at 18:29, Pavel Begunkov <[email protected]> wrote:
> >> On 5/26/21 4:52 PM, Marco Elver wrote:
> >>> Due to some moving around of code, the patch lost the actual fix (using
> >>> atomically read io_wq) -- so here it is again ... hopefully as intended.
> >>> :-)
> >>
> >> "fortify" damn it... It was synchronised with &ctx->uring_lock
> >> before, see io_uring_try_cancel_iowq() and io_uring_del_tctx_node(),
> >> so it should not be cleared before *del_tctx_node().
> >
> > Ah, so if I understand right, the property stated by the comment in
> > io_uring_try_cancel_iowq() was broken, and your patch below would fix
> > that, right?
>
> "io_uring: fortify tctx/io_wq cleanup" broke it, and the diff
> should fix it.
>
> >> The fix should just move it after this sync point. Will you send
> >> it out as a patch?
> >
> > Do you mean your move of the write to io_wq goes on top of the patch I
> > proposed? (If so, please also leave your Signed-off-by so I can squash
> > it.)
>
> No, only my diff, but you hinted at what has happened, so I would
> prefer you to take care of patching. If you want, of course.
>
> To be entirely fair, assuming that aligned ptr reads can't be torn,
> I don't see any _real_ problem. But surely the report is very helpful,
> and the current state is too wonky, so it should be patched.

In the current version, it is a problem if we end up with a double-read,
as we do in the current C code. The compiler might of course optimize it
into 1 read into a register.

Tangent: I avoid reasoning in terms of compiler optimizations where I
can. :-) It is a slippery slope if the code in question isn't tolerant
to data races by design (examples are stats counting, or other
heuristics -- in the case here that's certainly not the case).
Therefore, my wish is that we really ought to resolve as many data races
as we can (+ mark intentional ones appropriately). Also, so that we're
left with only the interesting cases like the one here. (More background
if you're interested: https://lwn.net/Articles/816850/)

The problem here, however, has a nicer resolution, as you suggested.

> TL;DR:
> The synchronisation goes like this: it's usually used by the owner
> task, and the owner task deletes it, so it is mostly naturally
> synchronised. An exception is a worker (though not only workers) that
> accesses it for cancellation purposes, but it uses it only under
> ->uring_lock, so if removal also takes the lock it should be fine;
> see io_uring_del_tctx_node() locking.

Did you mean io_uring_del_task_file()? There is no
io_uring_del_tctx_node().

> > So if I understand right, we do in fact have 2 problems:
> > 1. the data race as I noted in my patch, and
>
> Yes, and it deals with it.
>
> > 2. the fact that io_wq does not live long enough.
>
> Nope, io_wq outlives them fine.

I've sent:
https://lkml.kernel.org/r/[email protected]

Thanks,
-- Marco
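On "mark intentional ones appropriately": the kernel's annotation for
deliberately racy accesses is data_race() from <linux/compiler.h>. A
sketch of the idiom as a runnable userspace approximation -- here
data_race() is a plain pass-through, and the counter and function names
are made up:

/* In the kernel, data_race() additionally tells KCSAN the access is
 * intentionally racy; functionally it just evaluates its argument. */
#define data_race(expr) (expr)

static unsigned long hits;      /* hypothetical best-effort statistic */

static void note_hit(void)
{
        hits++;                 /* plain increment; lost updates are acceptable */
}

static unsigned long read_hits(void)
{
        /* The annotation documents -- to KCSAN and to human readers --
         * that this racy read is by design, so it is not reported. */
        return data_race(hits);
}

int main(void)
{
        note_hit();
        return (int)read_hits();
}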
* Re: [syzbot] KCSAN: data-race in __io_uring_cancel / io_uring_try_cancel_requests
  From: Pavel Begunkov @ 2021-05-27 10:05 UTC
  To: Marco Elver
  Cc: Jens Axboe, syzbot, io-uring, LKML, syzkaller-bugs, Dmitry Vyukov

On 5/27/21 10:32 AM, Marco Elver wrote:
> On Wed, May 26, 2021 at 09:31PM +0100, Pavel Begunkov wrote:
>> On 5/26/21 5:36 PM, Marco Elver wrote:
>>> On Wed, 26 May 2021 at 18:29, Pavel Begunkov <[email protected]> wrote:
>>>> On 5/26/21 4:52 PM, Marco Elver wrote:
>>>>> Due to some moving around of code, the patch lost the actual fix (using
>>>>> atomically read io_wq) -- so here it is again ... hopefully as intended.
>>>>> :-)
>>>>
>>>> "fortify" damn it... It was synchronised with &ctx->uring_lock
>>>> before, see io_uring_try_cancel_iowq() and io_uring_del_tctx_node(),
>>>> so it should not be cleared before *del_tctx_node().
>>>
>>> Ah, so if I understand right, the property stated by the comment in
>>> io_uring_try_cancel_iowq() was broken, and your patch below would fix
>>> that, right?
>>
>> "io_uring: fortify tctx/io_wq cleanup" broke it, and the diff
>> should fix it.
>>
>>>> The fix should just move it after this sync point. Will you send
>>>> it out as a patch?
>>>
>>> Do you mean your move of the write to io_wq goes on top of the patch I
>>> proposed? (If so, please also leave your Signed-off-by so I can squash
>>> it.)
>>
>> No, only my diff, but you hinted at what has happened, so I would
>> prefer you to take care of patching. If you want, of course.
>>
>> To be entirely fair, assuming that aligned ptr reads can't be torn,
>> I don't see any _real_ problem. But surely the report is very helpful,
>> and the current state is too wonky, so it should be patched.
>
> In the current version, it is a problem if we end up with a double-read,
> as we do in the current C code. The compiler might of course optimize it
> into 1 read into a register.

Absolutely agree on that.

> Tangent: I avoid reasoning in terms of compiler optimizations where I
> can. :-) It is a slippery slope if the code in question isn't tolerant
> to data races by design (examples are stats counting, or other
> heuristics -- in the case here that's certainly not the case).
> Therefore, my wish is that we really ought to resolve as many data races
> as we can (+ mark intentional ones appropriately). Also, so that we're
> left with only the interesting cases like the one here. (More background
> if you're interested: https://lwn.net/Articles/816850/)
>
> The problem here, however, has a nicer resolution, as you suggested.
>
>> TL;DR:
>> The synchronisation goes like this: it's usually used by the owner
>> task, and the owner task deletes it, so it is mostly naturally
>> synchronised. An exception is a worker (though not only workers) that
>> accesses it for cancellation purposes, but it uses it only under
>> ->uring_lock, so if removal also takes the lock it should be fine;
>> see io_uring_del_tctx_node() locking.
>
> Did you mean io_uring_del_task_file()? There is no
> io_uring_del_tctx_node().

Ah, yes, that's from patches I sent for next.

--
Pavel Begunkov