* [syzbot] KCSAN: data-race in __io_uring_cancel / io_uring_try_cancel_requests
@ 2021-05-26 15:44 syzbot
2021-05-26 15:48 ` Marco Elver
0 siblings, 1 reply; 9+ messages in thread
From: syzbot @ 2021-05-26 15:44 UTC (permalink / raw)
To: asml.silence, axboe, io-uring, linux-kernel, syzkaller-bugs
Hello,
syzbot found the following issue on:
HEAD commit: a050a6d2 Merge tag 'perf-tools-fixes-for-v5.13-2021-05-24'..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=13205087d00000
kernel config: https://syzkaller.appspot.com/x/.config?x=3bcc8a6b51ef8094
dashboard link: https://syzkaller.appspot.com/bug?extid=73554e2258b7b8bf0bbf
compiler: Debian clang version 11.0.1-2
Unfortunately, I don't have any reproducer for this issue yet.
IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: [email protected]
==================================================================
BUG: KCSAN: data-race in __io_uring_cancel / io_uring_try_cancel_requests
write to 0xffff88811d8df330 of 8 bytes by task 3709 on cpu 1:
io_uring_clean_tctx fs/io_uring.c:9042 [inline]
__io_uring_cancel+0x261/0x3b0 fs/io_uring.c:9136
io_uring_files_cancel include/linux/io_uring.h:16 [inline]
do_exit+0x185/0x1560 kernel/exit.c:781
do_group_exit+0xce/0x1a0 kernel/exit.c:923
get_signal+0xfc3/0x1610 kernel/signal.c:2835
arch_do_signal_or_restart+0x2a/0x220 arch/x86/kernel/signal.c:789
handle_signal_work kernel/entry/common.c:147 [inline]
exit_to_user_mode_loop kernel/entry/common.c:171 [inline]
exit_to_user_mode_prepare+0x109/0x190 kernel/entry/common.c:208
__syscall_exit_to_user_mode_work kernel/entry/common.c:290 [inline]
syscall_exit_to_user_mode+0x20/0x40 kernel/entry/common.c:301
do_syscall_64+0x56/0x90 arch/x86/entry/common.c:57
entry_SYSCALL_64_after_hwframe+0x44/0xae
read to 0xffff88811d8df330 of 8 bytes by task 6412 on cpu 0:
io_uring_try_cancel_iowq fs/io_uring.c:8911 [inline]
io_uring_try_cancel_requests+0x1ce/0x8e0 fs/io_uring.c:8933
io_ring_exit_work+0x7c/0x1110 fs/io_uring.c:8736
process_one_work+0x3e9/0x8f0 kernel/workqueue.c:2276
worker_thread+0x636/0xae0 kernel/workqueue.c:2422
kthread+0x1d0/0x1f0 kernel/kthread.c:313
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 6412 Comm: kworker/u4:9 Not tainted 5.13.0-rc3-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: events_unbound io_ring_exit_work
==================================================================
---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at [email protected].
syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
* Re: [syzbot] KCSAN: data-race in __io_uring_cancel / io_uring_try_cancel_requests
2021-05-26 15:44 [syzbot] KCSAN: data-race in __io_uring_cancel / io_uring_try_cancel_requests syzbot
@ 2021-05-26 15:48 ` Marco Elver
2021-05-26 15:52 ` Marco Elver
0 siblings, 1 reply; 9+ messages in thread
From: Marco Elver @ 2021-05-26 15:48 UTC (permalink / raw)
To: asml.silence, axboe
Cc: syzbot, io-uring, linux-kernel, syzkaller-bugs, dvyukov
On Wed, May 26, 2021 at 08:44AM -0700, syzbot wrote:
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: a050a6d2 Merge tag 'perf-tools-fixes-for-v5.13-2021-05-24'..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=13205087d00000
> kernel config: https://syzkaller.appspot.com/x/.config?x=3bcc8a6b51ef8094
> dashboard link: https://syzkaller.appspot.com/bug?extid=73554e2258b7b8bf0bbf
> compiler: Debian clang version 11.0.1-2
[...]
> write to 0xffff88811d8df330 of 8 bytes by task 3709 on cpu 1:
> io_uring_clean_tctx fs/io_uring.c:9042 [inline]
> __io_uring_cancel+0x261/0x3b0 fs/io_uring.c:9136
> io_uring_files_cancel include/linux/io_uring.h:16 [inline]
> do_exit+0x185/0x1560 kernel/exit.c:781
> do_group_exit+0xce/0x1a0 kernel/exit.c:923
> get_signal+0xfc3/0x1610 kernel/signal.c:2835
> arch_do_signal_or_restart+0x2a/0x220 arch/x86/kernel/signal.c:789
> handle_signal_work kernel/entry/common.c:147 [inline]
> exit_to_user_mode_loop kernel/entry/common.c:171 [inline]
> exit_to_user_mode_prepare+0x109/0x190 kernel/entry/common.c:208
> __syscall_exit_to_user_mode_work kernel/entry/common.c:290 [inline]
> syscall_exit_to_user_mode+0x20/0x40 kernel/entry/common.c:301
> do_syscall_64+0x56/0x90 arch/x86/entry/common.c:57
> entry_SYSCALL_64_after_hwframe+0x44/0xae
>
> read to 0xffff88811d8df330 of 8 bytes by task 6412 on cpu 0:
> io_uring_try_cancel_iowq fs/io_uring.c:8911 [inline]
> io_uring_try_cancel_requests+0x1ce/0x8e0 fs/io_uring.c:8933
> io_ring_exit_work+0x7c/0x1110 fs/io_uring.c:8736
> process_one_work+0x3e9/0x8f0 kernel/workqueue.c:2276
> worker_thread+0x636/0xae0 kernel/workqueue.c:2422
> kthread+0x1d0/0x1f0 kernel/kthread.c:313
> ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
I wasn't entirely sure if io_wq is guaranteed to remain live in this
case in io_uring_try_cancel_iowq(), but the comment there suggests it
does. In that case, I think the below patch would explain the situation
better and also propose a fix.
Thoughts?
Thanks,
-- Marco
------ >8 ------
From: Marco Elver <[email protected]>
Date: Wed, 26 May 2021 16:56:37 +0200
Subject: [PATCH] io_uring: fix data race to avoid potential NULL-deref
Commit ba5ef6dc8a82 ("io_uring: fortify tctx/io_wq cleanup") introduced
setting tctx->io_wq to NULL a bit earlier. This has caused KCSAN to
detect a data race between accesses to tctx->io_wq:
write to 0xffff88811d8df330 of 8 bytes by task 3709 on cpu 1:
io_uring_clean_tctx fs/io_uring.c:9042 [inline]
__io_uring_cancel fs/io_uring.c:9136
io_uring_files_cancel include/linux/io_uring.h:16 [inline]
do_exit kernel/exit.c:781
do_group_exit kernel/exit.c:923
get_signal kernel/signal.c:2835
arch_do_signal_or_restart arch/x86/kernel/signal.c:789
handle_signal_work kernel/entry/common.c:147 [inline]
exit_to_user_mode_loop kernel/entry/common.c:171 [inline]
...
read to 0xffff88811d8df330 of 8 bytes by task 6412 on cpu 0:
io_uring_try_cancel_iowq fs/io_uring.c:8911 [inline]
io_uring_try_cancel_requests fs/io_uring.c:8933
io_ring_exit_work fs/io_uring.c:8736
process_one_work kernel/workqueue.c:2276
...
With the config used, KCSAN only reports data races with value changes:
this implies that in the case here we also know that tctx->io_wq was
non-NULL. Therefore, depending on interleaving, we may end up with:
  [CPU 0]                             | [CPU 1]
  io_uring_try_cancel_iowq()          | io_uring_clean_tctx()
    if (!tctx->io_wq) // false        |   ...
    ...                               |   tctx->io_wq = NULL
    io_wq_cancel_cb(tctx->io_wq, ...) |   ...
      -> NULL-deref                   |
Note: It is likely that thus far we've gotten lucky and the compiler
optimizes the double-read into a single read into a register -- but this
is never guaranteed, and can easily change with a different config!
Fix the data race by atomically accessing tctx->io_wq. Of course, this
assumes that a valid io_wq remains alive for the duration of
io_uring_try_cancel_iowq(), which should be the case per comment there.
Reported-by: [email protected]
Signed-off-by: Marco Elver <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Pavel Begunkov <[email protected]>
---
fs/io_uring.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 5f82954004f6..c7e27b464cb6 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -8903,12 +8903,16 @@ static bool io_uring_try_cancel_iowq(struct io_ring_ctx *ctx)
mutex_lock(&ctx->uring_lock);
list_for_each_entry(node, &ctx->tctx_list, ctx_node) {
struct io_uring_task *tctx = node->task->io_uring;
+ struct io_wq *io_wq;
+ if (!tctx)
+ continue;
/*
* io_wq will stay alive while we hold uring_lock, because it's
* killed after ctx nodes, which requires to take the lock.
*/
- if (!tctx || !tctx->io_wq)
+ io_wq = READ_ONCE(tctx->io_wq);
+ if (!io_wq)
continue;
cret = io_wq_cancel_cb(tctx->io_wq, io_cancel_ctx_cb, ctx, true);
ret |= (cret != IO_WQ_CANCEL_NOTFOUND);
@@ -9039,7 +9043,7 @@ static void io_uring_clean_tctx(struct io_uring_task *tctx)
struct io_tctx_node *node;
unsigned long index;
- tctx->io_wq = NULL;
+ WRITE_ONCE(tctx->io_wq, NULL);
xa_for_each(&tctx->xa, index, node)
io_uring_del_task_file(index);
if (wq)
--
2.31.1.818.g46aad6cb9e-goog
* Re: [syzbot] KCSAN: data-race in __io_uring_cancel / io_uring_try_cancel_requests
2021-05-26 15:48 ` Marco Elver
@ 2021-05-26 15:52 ` Marco Elver
2021-05-26 16:29 ` Pavel Begunkov
0 siblings, 1 reply; 9+ messages in thread
From: Marco Elver @ 2021-05-26 15:52 UTC (permalink / raw)
To: asml.silence, axboe
Cc: syzbot, io-uring, linux-kernel, syzkaller-bugs, dvyukov
On Wed, May 26, 2021 at 05:48PM +0200, Marco Elver wrote:
> On Wed, May 26, 2021 at 08:44AM -0700, syzbot wrote:
> > Hello,
> >
> > syzbot found the following issue on:
> >
> > HEAD commit: a050a6d2 Merge tag 'perf-tools-fixes-for-v5.13-2021-05-24'..
> > git tree: upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=13205087d00000
> > kernel config: https://syzkaller.appspot.com/x/.config?x=3bcc8a6b51ef8094
> > dashboard link: https://syzkaller.appspot.com/bug?extid=73554e2258b7b8bf0bbf
> > compiler: Debian clang version 11.0.1-2
> [...]
> > write to 0xffff88811d8df330 of 8 bytes by task 3709 on cpu 1:
> > io_uring_clean_tctx fs/io_uring.c:9042 [inline]
> > __io_uring_cancel+0x261/0x3b0 fs/io_uring.c:9136
> > io_uring_files_cancel include/linux/io_uring.h:16 [inline]
> > do_exit+0x185/0x1560 kernel/exit.c:781
> > do_group_exit+0xce/0x1a0 kernel/exit.c:923
> > get_signal+0xfc3/0x1610 kernel/signal.c:2835
> > arch_do_signal_or_restart+0x2a/0x220 arch/x86/kernel/signal.c:789
> > handle_signal_work kernel/entry/common.c:147 [inline]
> > exit_to_user_mode_loop kernel/entry/common.c:171 [inline]
> > exit_to_user_mode_prepare+0x109/0x190 kernel/entry/common.c:208
> > __syscall_exit_to_user_mode_work kernel/entry/common.c:290 [inline]
> > syscall_exit_to_user_mode+0x20/0x40 kernel/entry/common.c:301
> > do_syscall_64+0x56/0x90 arch/x86/entry/common.c:57
> > entry_SYSCALL_64_after_hwframe+0x44/0xae
> >
> > read to 0xffff88811d8df330 of 8 bytes by task 6412 on cpu 0:
> > io_uring_try_cancel_iowq fs/io_uring.c:8911 [inline]
> > io_uring_try_cancel_requests+0x1ce/0x8e0 fs/io_uring.c:8933
> > io_ring_exit_work+0x7c/0x1110 fs/io_uring.c:8736
> > process_one_work+0x3e9/0x8f0 kernel/workqueue.c:2276
> > worker_thread+0x636/0xae0 kernel/workqueue.c:2422
> > kthread+0x1d0/0x1f0 kernel/kthread.c:313
> > ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
>
> I wasn't entirely sure if io_wq is guaranteed to remain live in this
> case in io_uring_try_cancel_iowq(), but the comment there suggests it
> does. In that case, I think the below patch would explain the situation
> better and also propose a fix.
>
> Thoughts?
Due to some moving around of code, the patch lost the actual fix (using
atomically read io_wq) -- so here it is again ... hopefully as intended.
:-)
Thanks,
-- Marco
From: Marco Elver <[email protected]>
Date: Wed, 26 May 2021 16:56:37 +0200
Subject: [PATCH] io_uring: fix data race to avoid potential NULL-deref
Commit ba5ef6dc8a82 ("io_uring: fortify tctx/io_wq cleanup") introduced
setting tctx->io_wq to NULL a bit earlier. This has caused KCSAN to
detect a data race between accesses to tctx->io_wq:
write to 0xffff88811d8df330 of 8 bytes by task 3709 on cpu 1:
io_uring_clean_tctx fs/io_uring.c:9042 [inline]
__io_uring_cancel fs/io_uring.c:9136
io_uring_files_cancel include/linux/io_uring.h:16 [inline]
do_exit kernel/exit.c:781
do_group_exit kernel/exit.c:923
get_signal kernel/signal.c:2835
arch_do_signal_or_restart arch/x86/kernel/signal.c:789
handle_signal_work kernel/entry/common.c:147 [inline]
exit_to_user_mode_loop kernel/entry/common.c:171 [inline]
...
read to 0xffff88811d8df330 of 8 bytes by task 6412 on cpu 0:
io_uring_try_cancel_iowq fs/io_uring.c:8911 [inline]
io_uring_try_cancel_requests fs/io_uring.c:8933
io_ring_exit_work fs/io_uring.c:8736
process_one_work kernel/workqueue.c:2276
...
With the config used, KCSAN only reports data races with value changes:
this implies that in the case here we also know that tctx->io_wq was
non-NULL. Therefore, depending on interleaving, we may end up with:
  [CPU 0]                             | [CPU 1]
  io_uring_try_cancel_iowq()          | io_uring_clean_tctx()
    if (!tctx->io_wq) // false        |   ...
    ...                               |   tctx->io_wq = NULL
    io_wq_cancel_cb(tctx->io_wq, ...) |   ...
      -> NULL-deref                   |
Note: It is likely that thus far we've gotten lucky and the compiler
optimizes the double-read into a single read into a register -- but this
is never guaranteed, and can easily change with a different config!
Fix the data race by atomically accessing tctx->io_wq. Of course, this
assumes that a valid io_wq remains alive for the duration of
io_uring_try_cancel_iowq(), which should be the case per comment there.
Reported-by: [email protected]
Signed-off-by: Marco Elver <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Pavel Begunkov <[email protected]>
---
fs/io_uring.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 5f82954004f6..e681ece1bbca 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -8903,14 +8903,18 @@ static bool io_uring_try_cancel_iowq(struct io_ring_ctx *ctx)
mutex_lock(&ctx->uring_lock);
list_for_each_entry(node, &ctx->tctx_list, ctx_node) {
struct io_uring_task *tctx = node->task->io_uring;
+ struct io_wq *io_wq;
+ if (!tctx)
+ continue;
/*
* io_wq will stay alive while we hold uring_lock, because it's
* killed after ctx nodes, which requires to take the lock.
*/
- if (!tctx || !tctx->io_wq)
+ io_wq = READ_ONCE(tctx->io_wq);
+ if (!io_wq)
continue;
- cret = io_wq_cancel_cb(tctx->io_wq, io_cancel_ctx_cb, ctx, true);
+ cret = io_wq_cancel_cb(io_wq, io_cancel_ctx_cb, ctx, true);
ret |= (cret != IO_WQ_CANCEL_NOTFOUND);
}
mutex_unlock(&ctx->uring_lock);
@@ -9039,7 +9043,7 @@ static void io_uring_clean_tctx(struct io_uring_task *tctx)
struct io_tctx_node *node;
unsigned long index;
- tctx->io_wq = NULL;
+ WRITE_ONCE(tctx->io_wq, NULL);
xa_for_each(&tctx->xa, index, node)
io_uring_del_task_file(index);
if (wq)
--
2.31.1.818.g46aad6cb9e-goog
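
For illustration, the check-then-use shape described in the commit message above can be sketched in userspace C. This is only a sketch under stated assumptions: the READ_ONCE()/WRITE_ONCE() macros below are simplified volatile-cast stand-ins (relying on the GNU C __typeof__ extension), not the kernel's definitions, and struct wq, struct tctx and the three helpers are hypothetical names that do not exist in fs/io_uring.c.

```c
#include <stddef.h>

/* Simplified userspace stand-ins; the kernel's real macros do more. */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

struct wq { int pending; };         /* hypothetical stand-in for struct io_wq */
struct tctx { struct wq *io_wq; };  /* hypothetical stand-in for io_uring_task */

/*
 * Racy shape: the pointer is loaded once for the check and again for the
 * use. A concurrent store of NULL between the two loads gets dereferenced.
 */
static int cancel_double_read(struct tctx *t)
{
	if (!t->io_wq)
		return -1;
	return t->io_wq->pending;   /* second, independent load */
}

/* Snapshot shape: load once, then test and use the same local copy. */
static int cancel_single_read(struct tctx *t)
{
	struct wq *wq = READ_ONCE(t->io_wq);

	if (!wq)
		return -1;
	return wq->pending;
}

/* Writer side: a marked store of NULL. */
static void clear_wq(struct tctx *t)
{
	WRITE_ONCE(t->io_wq, NULL);
}
```

The snapshot only removes the NULL dereference; as the commit message notes, it still assumes the object behind the pointer stays alive for as long as the local copy is used.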
* Re: [syzbot] KCSAN: data-race in __io_uring_cancel / io_uring_try_cancel_requests
2021-05-26 15:52 ` Marco Elver
@ 2021-05-26 16:29 ` Pavel Begunkov
2021-05-26 16:33 ` Pavel Begunkov
2021-05-26 16:36 ` Marco Elver
0 siblings, 2 replies; 9+ messages in thread
From: Pavel Begunkov @ 2021-05-26 16:29 UTC (permalink / raw)
To: Marco Elver, axboe
Cc: syzbot, io-uring, linux-kernel, syzkaller-bugs, dvyukov
On 5/26/21 4:52 PM, Marco Elver wrote:
> Due to some moving around of code, the patch lost the actual fix (using
> atomically read io_wq) -- so here it is again ... hopefully as intended.
> :-)
"fortify" damn it... It was synchronised with &ctx->uring_lock
before, see io_uring_try_cancel_iowq() and io_uring_del_tctx_node(),
so should not clear before *del_tctx_node()
The fix should just move it after this sync point. Will you send
it out as a patch?
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 7db6aaf31080..b76ba26b4c6c 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -9075,11 +9075,12 @@ static void io_uring_clean_tctx(struct io_uring_task *tctx)
struct io_tctx_node *node;
unsigned long index;
- tctx->io_wq = NULL;
xa_for_each(&tctx->xa, index, node)
io_uring_del_tctx_node(index);
- if (wq)
+ if (wq) {
+ tctx->io_wq = NULL;
io_wq_put_and_exit(wq);
+ }
}
static s64 tctx_inflight(struct io_uring_task *tctx, bool tracked)
--
Pavel Begunkov
* Re: [syzbot] KCSAN: data-race in __io_uring_cancel / io_uring_try_cancel_requests
2021-05-26 16:29 ` Pavel Begunkov
@ 2021-05-26 16:33 ` Pavel Begunkov
2021-05-26 16:36 ` Marco Elver
1 sibling, 0 replies; 9+ messages in thread
From: Pavel Begunkov @ 2021-05-26 16:33 UTC (permalink / raw)
To: Marco Elver, axboe
Cc: syzbot, io-uring, linux-kernel, syzkaller-bugs, dvyukov
On 5/26/21 5:29 PM, Pavel Begunkov wrote:
> On 5/26/21 4:52 PM, Marco Elver wrote:
>> Due to some moving around of code, the patch lost the actual fix (using
>> atomically read io_wq) -- so here it is again ... hopefully as intended.
>> :-)
>
> "fortify" damn it...
fwiw, it's a reference to my own commit that came after -rc
--
Pavel Begunkov
* Re: [syzbot] KCSAN: data-race in __io_uring_cancel / io_uring_try_cancel_requests
2021-05-26 16:29 ` Pavel Begunkov
2021-05-26 16:33 ` Pavel Begunkov
@ 2021-05-26 16:36 ` Marco Elver
2021-05-26 20:31 ` Pavel Begunkov
1 sibling, 1 reply; 9+ messages in thread
From: Marco Elver @ 2021-05-26 16:36 UTC (permalink / raw)
To: Pavel Begunkov
Cc: Jens Axboe, syzbot, io-uring, LKML, syzkaller-bugs, Dmitry Vyukov
On Wed, 26 May 2021 at 18:29, Pavel Begunkov <[email protected]> wrote:
> On 5/26/21 4:52 PM, Marco Elver wrote:
> > Due to some moving around of code, the patch lost the actual fix (using
> > atomically read io_wq) -- so here it is again ... hopefully as intended.
> > :-)
>
> "fortify" damn it... It was synchronised with &ctx->uring_lock
> before, see io_uring_try_cancel_iowq() and io_uring_del_tctx_node(),
> so should not clear before *del_tctx_node()
Ah, so if I understand right, the property stated by the comment in
io_uring_try_cancel_iowq() was broken, and your patch below would fix
that, right?
> The fix should just move it after this sync point. Will you send
> it out as a patch?
Do you mean your move of write to io_wq goes on top of the patch I
proposed? (If so, please also leave your Signed-off-by so I can squash
it.)
So if I understand right, we do in fact have 2 problems:
1. the data race as I noted in my patch, and
2. the fact that io_wq does not live long enough.
Did I get it right?
Thanks,
-- Marco
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 7db6aaf31080..b76ba26b4c6c 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -9075,11 +9075,12 @@ static void io_uring_clean_tctx(struct io_uring_task *tctx)
> struct io_tctx_node *node;
> unsigned long index;
>
> - tctx->io_wq = NULL;
> xa_for_each(&tctx->xa, index, node)
> io_uring_del_tctx_node(index);
> - if (wq)
> + if (wq) {
> + tctx->io_wq = NULL;
> io_wq_put_and_exit(wq);
> + }
> }
>
> static s64 tctx_inflight(struct io_uring_task *tctx, bool tracked)
>
>
> --
> Pavel Begunkov
* Re: [syzbot] KCSAN: data-race in __io_uring_cancel / io_uring_try_cancel_requests
2021-05-26 16:36 ` Marco Elver
@ 2021-05-26 20:31 ` Pavel Begunkov
2021-05-27 9:32 ` Marco Elver
0 siblings, 1 reply; 9+ messages in thread
From: Pavel Begunkov @ 2021-05-26 20:31 UTC (permalink / raw)
To: Marco Elver
Cc: Jens Axboe, syzbot, io-uring, LKML, syzkaller-bugs, Dmitry Vyukov
On 5/26/21 5:36 PM, Marco Elver wrote:
> On Wed, 26 May 2021 at 18:29, Pavel Begunkov <[email protected]> wrote:
>> On 5/26/21 4:52 PM, Marco Elver wrote:
>>> Due to some moving around of code, the patch lost the actual fix (using
>>> atomically read io_wq) -- so here it is again ... hopefully as intended.
>>> :-)
>>
>> "fortify" damn it... It was synchronised with &ctx->uring_lock
>> before, see io_uring_try_cancel_iowq() and io_uring_del_tctx_node(),
>> so should not clear before *del_tctx_node()
>
> Ah, so if I understand right, the property stated by the comment in
> io_uring_try_cancel_iowq() was broken, and your patch below would fix
> that, right?
"io_uring: fortify tctx/io_wq cleanup" broke it and the diff
should fix it.
>> The fix should just move it after this sync point. Will you send
>> it out as a patch?
>
> Do you mean your move of write to io_wq goes on top of the patch I
> proposed? (If so, please also leave your Signed-off-by so I can squash
> it.)
No, only my diff, but you hinted on what has happened, so I would
prefer you to take care of patching. If you want of course.
To be entirely fair, assuming that aligned ptr
reads can't be torn, I don't see any _real_ problem. But surely
the report is very helpful and the current state is too wonky, so
should be patched.
TL;DR;
The synchronisation goes as this: it's usually used by the owner
task, and the owner task deletes it, so is mostly naturally
synchronised. An exception is a worker (not only) that accesses
it for cancellation purpose, but it uses it only under ->uring_lock,
so if removal is also taking the lock it should be fine. see
io_uring_del_tctx_node() locking.
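
As a rough userspace analogue of the rule described above (the worker only looks at the pointer under the lock, and the owner clears it only after passing through the same lock), consider the sketch below. It is an assumption-laden simplification: ring_lock stands in for ctx->uring_lock, the registered flag stands in for the task's node being on ctx->tctx_list, and struct task_ctx, worker_try_cancel() and owner_cleanup() are hypothetical names, not io_uring code.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct wq { int pending; };   /* hypothetical stand-in for struct io_wq */

struct task_ctx {
	struct wq *io_wq;     /* cleared only after 'registered' is false */
	bool registered;      /* stand-in for being on ctx->tctx_list */
};

/* Stand-in for ctx->uring_lock. */
static pthread_mutex_t ring_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Worker side: reads io_wq only while holding the lock, and only for a
 * task that is still registered.
 */
static int worker_try_cancel(struct task_ctx *t)
{
	int ret = -1;

	pthread_mutex_lock(&ring_lock);
	if (t->registered && t->io_wq)
		ret = t->io_wq->pending;
	pthread_mutex_unlock(&ring_lock);
	return ret;
}

/*
 * Owner side: unregister under the lock first (the sync point), and
 * clear the pointer only afterwards.
 */
static void owner_cleanup(struct task_ctx *t)
{
	pthread_mutex_lock(&ring_lock);
	t->registered = false;
	pthread_mutex_unlock(&ring_lock);

	t->io_wq = NULL;   /* no locked reader can reach it any more */
}
```

If the pointer were cleared before the locked unregister step, a worker already inside its critical section could observe the store, which is the ordering the report flagged.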
>
> So if I understand right, we do in fact have 2 problems:
> 1. the data race as I noted in my patch, and
Yes, and it deals with it
> 2. the fact that io_wq does not live long enough.
Nope, io_wq outlives them fine.
> Did I get it right?
>
>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>> index 7db6aaf31080..b76ba26b4c6c 100644
>> --- a/fs/io_uring.c
>> +++ b/fs/io_uring.c
>> @@ -9075,11 +9075,12 @@ static void io_uring_clean_tctx(struct io_uring_task *tctx)
>> struct io_tctx_node *node;
>> unsigned long index;
>>
>> - tctx->io_wq = NULL;
>> xa_for_each(&tctx->xa, index, node)
>> io_uring_del_tctx_node(index);
>> - if (wq)
>> + if (wq) {
>> + tctx->io_wq = NULL;
>> io_wq_put_and_exit(wq);
>> + }
>> }
>>
>> static s64 tctx_inflight(struct io_uring_task *tctx, bool tracked)
--
Pavel Begunkov
* Re: [syzbot] KCSAN: data-race in __io_uring_cancel / io_uring_try_cancel_requests
2021-05-26 20:31 ` Pavel Begunkov
@ 2021-05-27 9:32 ` Marco Elver
2021-05-27 10:05 ` Pavel Begunkov
0 siblings, 1 reply; 9+ messages in thread
From: Marco Elver @ 2021-05-27 9:32 UTC (permalink / raw)
To: Pavel Begunkov
Cc: Jens Axboe, syzbot, io-uring, LKML, syzkaller-bugs, Dmitry Vyukov
On Wed, May 26, 2021 at 09:31PM +0100, Pavel Begunkov wrote:
> On 5/26/21 5:36 PM, Marco Elver wrote:
> > On Wed, 26 May 2021 at 18:29, Pavel Begunkov <[email protected]> wrote:
> >> On 5/26/21 4:52 PM, Marco Elver wrote:
> >>> Due to some moving around of code, the patch lost the actual fix (using
> >>> atomically read io_wq) -- so here it is again ... hopefully as intended.
> >>> :-)
> >>
> >> "fortify" damn it... It was synchronised with &ctx->uring_lock
> >> before, see io_uring_try_cancel_iowq() and io_uring_del_tctx_node(),
> >> so should not clear before *del_tctx_node()
> >
> > Ah, so if I understand right, the property stated by the comment in
> > io_uring_try_cancel_iowq() was broken, and your patch below would fix
> > that, right?
>
> "io_uring: fortify tctx/io_wq cleanup" broke it and the diff
> should fix it.
>
> >> The fix should just move it after this sync point. Will you send
> >> it out as a patch?
> >
> > Do you mean your move of write to io_wq goes on top of the patch I
> > proposed? (If so, please also leave your Signed-off-by so I can squash
> > it.)
>
> No, only my diff, but you hinted on what has happened, so I would
> prefer you to take care of patching. If you want of course.
>
> To be entirely fair, assuming that aligned ptr
> reads can't be torn, I don't see any _real_ problem. But surely
> the report is very helpful and the current state is too wonky, so
> should be patched.
In the current version, it is a problem if we end up with a double-read,
as it is in the current C code. The compiler might of course optimize
it into 1 read into a register.
Tangent: I avoid reasoning in terms of compiler optimizations where
I can. :-) It's a slippery slope if the code in question isn't
tolerant to data races by design (examples are stats counting, or other
heuristics -- in the case here that's certainly not the case).
Therefore, my wish is that we really ought to resolve as many data races
as we can (+ mark intentional ones appropriately). Also, so that we're
left with only the interesting cases like in the case here. (More
background if you're interested: https://lwn.net/Articles/816850/)
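
To make the "tolerant by design" case concrete, here is a small userspace sketch of a statistics counter whose concurrent accesses are explicitly marked. It uses C11 relaxed atomics as the userspace way to say "concurrent access is intended and no ordering is needed"; in kernel code the marking would be done with kernel-specific annotations instead. The names nr_events, count_event() and read_events() are hypothetical.

```c
#include <stdatomic.h>

/* Hypothetical statistics counter updated from many threads. */
static atomic_ulong nr_events;

/* Increment without imposing any ordering on surrounding accesses. */
static void count_event(void)
{
	atomic_fetch_add_explicit(&nr_events, 1, memory_order_relaxed);
}

/* Read a possibly slightly stale value; fine for a heuristic or a stat. */
static unsigned long read_events(void)
{
	return atomic_load_explicit(&nr_events, memory_order_relaxed);
}
```

A pointer that is checked and then dereferenced, as tctx->io_wq is, does not fall into this category, which is why it needs real synchronisation rather than just a marking.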
The problem here, however, has a nicer resolution as you suggested.
> TL;DR;
> The synchronisation goes as this: it's usually used by the owner
> task, and the owner task deletes it, so is mostly naturally
> synchronised. An exception is a worker (not only) that accesses
> it for cancellation purpose, but it uses it only under ->uring_lock,
> so if removal is also taking the lock it should be fine. see
> io_uring_del_tctx_node() locking.
Did you mean io_uring_del_task_file()? There is no
io_uring_del_tctx_node().
> > So if I understand right, we do in fact have 2 problems:
> > 1. the data race as I noted in my patch, and
>
> Yes, and it deals with it
>
> > 2. the fact that io_wq does not live long enough.
>
> Nope, io_wq outlives them fine.
I've sent:
https://lkml.kernel.org/r/[email protected]
Thanks,
-- Marco
* Re: [syzbot] KCSAN: data-race in __io_uring_cancel / io_uring_try_cancel_requests
2021-05-27 9:32 ` Marco Elver
@ 2021-05-27 10:05 ` Pavel Begunkov
0 siblings, 0 replies; 9+ messages in thread
From: Pavel Begunkov @ 2021-05-27 10:05 UTC (permalink / raw)
To: Marco Elver
Cc: Jens Axboe, syzbot, io-uring, LKML, syzkaller-bugs, Dmitry Vyukov
On 5/27/21 10:32 AM, Marco Elver wrote:
> On Wed, May 26, 2021 at 09:31PM +0100, Pavel Begunkov wrote:
>> On 5/26/21 5:36 PM, Marco Elver wrote:
>>> On Wed, 26 May 2021 at 18:29, Pavel Begunkov <[email protected]> wrote:
>>>> On 5/26/21 4:52 PM, Marco Elver wrote:
>>>>> Due to some moving around of code, the patch lost the actual fix (using
>>>>> atomically read io_wq) -- so here it is again ... hopefully as intended.
>>>>> :-)
>>>>
>>>> "fortify" damn it... It was synchronised with &ctx->uring_lock
>>>> before, see io_uring_try_cancel_iowq() and io_uring_del_tctx_node(),
>>>> so should not clear before *del_tctx_node()
>>>
>>> Ah, so if I understand right, the property stated by the comment in
>>> io_uring_try_cancel_iowq() was broken, and your patch below would fix
>>> that, right?
>>
>> "io_uring: fortify tctx/io_wq cleanup" broke it and the diff
>> should fix it.
>>
>>>> The fix should just move it after this sync point. Will you send
>>>> it out as a patch?
>>>
>>> Do you mean your move of write to io_wq goes on top of the patch I
>>> proposed? (If so, please also leave your Signed-off-by so I can squash
>>> it.)
>>
>> No, only my diff, but you hinted on what has happened, so I would
>> prefer you to take care of patching. If you want of course.
>>
>> To be entirely fair, assuming that aligned ptr
>> reads can't be torn, I don't see any _real_ problem. But surely
>> the report is very helpful and the current state is too wonky, so
>> should be patched.
>
> In the current version, it is a problem if we end up with a double-read,
> as it is in the current C code. The compiler might of course optimize
> it into 1 read into a register.
Absolutely agree on that
> Tangent: I avoid reasoning in terms of compiler optimizations where
> I can. :-) It's a slippery slope if the code in question isn't
> tolerant to data races by design (examples are stats counting, or other
> heuristics -- in the case here that's certainly not the case).
> Therefore, my wish is that we really ought to resolve as many data races
> as we can (+ mark intentional ones appropriately). Also, so that we're
> left with only the interesting cases like in the case here. (More
> background if you're interested: https://lwn.net/Articles/816850/)
>
> The problem here, however, has a nicer resolution as you suggested.
>
>> TL;DR;
>> The synchronisation goes as this: it's usually used by the owner
>> task, and the owner task deletes it, so is mostly naturally
>> synchronised. An exception is a worker (not only) that accesses
>> it for cancellation purpose, but it uses it only under ->uring_lock,
>> so if removal is also taking the lock it should be fine. see
>> io_uring_del_tctx_node() locking.
>
> Did you mean io_uring_del_task_file()? There is no
> io_uring_del_tctx_node().
Ah, yes, that's from patches I sent for next.
--
Pavel Begunkov