* [syzbot] [io-uring] KCSAN: data-race in copy_mm / percpu_counter_destroy_many
@ 2025-05-12 6:36 syzbot
2025-05-12 13:51 ` Jens Axboe
0 siblings, 1 reply; 3+ messages in thread
From: syzbot @ 2025-05-12 6:36 UTC (permalink / raw)
To: asml.silence, axboe, io-uring, linux-kernel, syzkaller-bugs
Hello,
syzbot found the following issue on:
HEAD commit: 3ce9925823c7 Merge tag 'mm-hotfixes-stable-2025-05-10-14-2..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=14ff74d4580000
kernel config: https://syzkaller.appspot.com/x/.config?x=6154604431d9aaf9
dashboard link: https://syzkaller.appspot.com/bug?extid=8be9bf36c3cf574426c8
compiler: Debian clang version 20.1.2 (++20250402124445+58df0ef89dd6-1~exp1~20250402004600.97), Debian LLD 20.1.2
Unfortunately, I don't have any reproducer for this issue yet.
Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/afdc6302fc05/disk-3ce99258.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/fc7f98d3c420/vmlinux-3ce99258.xz
kernel image: https://storage.googleapis.com/syzbot-assets/ea7ca2da2258/bzImage-3ce99258.xz
IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+8be9bf36c3cf574426c8@syzkaller.appspotmail.com
==================================================================
BUG: KCSAN: data-race in copy_mm / percpu_counter_destroy_many
write to 0xffff8881045e19d8 of 8 bytes by task 2123 on cpu 0:
__list_del include/linux/list.h:195 [inline]
__list_del_entry include/linux/list.h:218 [inline]
list_del include/linux/list.h:229 [inline]
percpu_counter_destroy_many+0xc7/0x2b0 lib/percpu_counter.c:244
__mmdrop+0x22e/0x350 kernel/fork.c:947
mmdrop include/linux/sched/mm.h:55 [inline]
io_ring_ctx_free+0x31e/0x360 io_uring/io_uring.c:2740
io_ring_exit_work+0x529/0x560 io_uring/io_uring.c:2962
process_one_work kernel/workqueue.c:3238 [inline]
process_scheduled_works+0x4cb/0x9d0 kernel/workqueue.c:3319
worker_thread+0x582/0x770 kernel/workqueue.c:3400
kthread+0x486/0x510 kernel/kthread.c:464
ret_from_fork+0x4b/0x60 arch/x86/kernel/process.c:153
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
read to 0xffff8881045e1600 of 1344 bytes by task 5051 on cpu 1:
dup_mm kernel/fork.c:1728 [inline]
copy_mm+0xfb/0x1310 kernel/fork.c:1786
copy_process+0xcf1/0x1f90 kernel/fork.c:2429
kernel_clone+0x16c/0x5b0 kernel/fork.c:2844
__do_sys_clone kernel/fork.c:2987 [inline]
__se_sys_clone kernel/fork.c:2971 [inline]
__x64_sys_clone+0xe6/0x120 kernel/fork.c:2971
x64_sys_call+0x2c59/0x2fb0 arch/x86/include/generated/asm/syscalls_64.h:57
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xd0/0x1a0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Reported by Kernel Concurrency Sanitizer on:
CPU: 1 UID: 0 PID: 5051 Comm: syz.1.494 Not tainted 6.15.0-rc5-syzkaller-00300-g3ce9925823c7 #0 PREEMPT(voluntary)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/19/2025
==================================================================
---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.
syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title
If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)
If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report
If you want to undo deduplication, reply with:
#syz undup
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [syzbot] [io-uring] KCSAN: data-race in copy_mm / percpu_counter_destroy_many
2025-05-12 6:36 [syzbot] [io-uring] KCSAN: data-race in copy_mm / percpu_counter_destroy_many syzbot
@ 2025-05-12 13:51 ` Jens Axboe
2025-05-12 17:08 ` Jann Horn
0 siblings, 1 reply; 3+ messages in thread
From: Jens Axboe @ 2025-05-12 13:51 UTC (permalink / raw)
To: syzbot, asml.silence, io-uring, linux-kernel, syzkaller-bugs,
Linux-MM, Andrew Morton, dennis
On 5/12/25 12:36 AM, syzbot wrote:
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: 3ce9925823c7 Merge tag 'mm-hotfixes-stable-2025-05-10-14-2..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=14ff74d4580000
> kernel config: https://syzkaller.appspot.com/x/.config?x=6154604431d9aaf9
> dashboard link: https://syzkaller.appspot.com/bug?extid=8be9bf36c3cf574426c8
> compiler: Debian clang version 20.1.2 (++20250402124445+58df0ef89dd6-1~exp1~20250402004600.97), Debian LLD 20.1.2
>
> Unfortunately, I don't have any reproducer for this issue yet.
>
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/afdc6302fc05/disk-3ce99258.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/fc7f98d3c420/vmlinux-3ce99258.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/ea7ca2da2258/bzImage-3ce99258.xz
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+8be9bf36c3cf574426c8@syzkaller.appspotmail.com
>
> ==================================================================
> BUG: KCSAN: data-race in copy_mm / percpu_counter_destroy_many
>
> write to 0xffff8881045e19d8 of 8 bytes by task 2123 on cpu 0:
> __list_del include/linux/list.h:195 [inline]
> __list_del_entry include/linux/list.h:218 [inline]
> list_del include/linux/list.h:229 [inline]
> percpu_counter_destroy_many+0xc7/0x2b0 lib/percpu_counter.c:244
> __mmdrop+0x22e/0x350 kernel/fork.c:947
> mmdrop include/linux/sched/mm.h:55 [inline]
> io_ring_ctx_free+0x31e/0x360 io_uring/io_uring.c:2740
> io_ring_exit_work+0x529/0x560 io_uring/io_uring.c:2962
> process_one_work kernel/workqueue.c:3238 [inline]
> process_scheduled_works+0x4cb/0x9d0 kernel/workqueue.c:3319
> worker_thread+0x582/0x770 kernel/workqueue.c:3400
> kthread+0x486/0x510 kernel/kthread.c:464
> ret_from_fork+0x4b/0x60 arch/x86/kernel/process.c:153
> ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
>
> read to 0xffff8881045e1600 of 1344 bytes by task 5051 on cpu 1:
> dup_mm kernel/fork.c:1728 [inline]
> copy_mm+0xfb/0x1310 kernel/fork.c:1786
> copy_process+0xcf1/0x1f90 kernel/fork.c:2429
> kernel_clone+0x16c/0x5b0 kernel/fork.c:2844
> __do_sys_clone kernel/fork.c:2987 [inline]
> __se_sys_clone kernel/fork.c:2971 [inline]
> __x64_sys_clone+0xe6/0x120 kernel/fork.c:2971
> x64_sys_call+0x2c59/0x2fb0 arch/x86/include/generated/asm/syscalls_64.h:57
> do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
> do_syscall_64+0xd0/0x1a0 arch/x86/entry/syscall_64.c:94
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>
> Reported by Kernel Concurrency Sanitizer on:
> CPU: 1 UID: 0 PID: 5051 Comm: syz.1.494 Not tainted 6.15.0-rc5-syzkaller-00300-g3ce9925823c7 #0 PREEMPT(voluntary)
This doesn't look like an io_uring issue, it's successive setup and teardown
of a percpu counter. Adding some relevant folks.
#syz set subsystems: kernel
--
Jens Axboe
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [syzbot] [io-uring] KCSAN: data-race in copy_mm / percpu_counter_destroy_many
2025-05-12 13:51 ` Jens Axboe
@ 2025-05-12 17:08 ` Jann Horn
0 siblings, 0 replies; 3+ messages in thread
From: Jann Horn @ 2025-05-12 17:08 UTC (permalink / raw)
To: Jens Axboe, Andrew Morton, Liam R. Howlett, Lorenzo Stoakes,
Vlastimil Babka, Pedro Falcato
Cc: syzbot, asml.silence, io-uring, linux-kernel, syzkaller-bugs,
Linux-MM, dennis
+mmap maintainers
On Mon, May 12, 2025 at 3:51 PM Jens Axboe <axboe@kernel.dk> wrote:
> On 5/12/25 12:36 AM, syzbot wrote:
> > Hello,
> >
> > syzbot found the following issue on:
> >
> > HEAD commit: 3ce9925823c7 Merge tag 'mm-hotfixes-stable-2025-05-10-14-2..
> > git tree: upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=14ff74d4580000
> > kernel config: https://syzkaller.appspot.com/x/.config?x=6154604431d9aaf9
> > dashboard link: https://syzkaller.appspot.com/bug?extid=8be9bf36c3cf574426c8
> > compiler: Debian clang version 20.1.2 (++20250402124445+58df0ef89dd6-1~exp1~20250402004600.97), Debian LLD 20.1.2
> >
> > Unfortunately, I don't have any reproducer for this issue yet.
> >
> > Downloadable assets:
> > disk image: https://storage.googleapis.com/syzbot-assets/afdc6302fc05/disk-3ce99258.raw.xz
> > vmlinux: https://storage.googleapis.com/syzbot-assets/fc7f98d3c420/vmlinux-3ce99258.xz
> > kernel image: https://storage.googleapis.com/syzbot-assets/ea7ca2da2258/bzImage-3ce99258.xz
> >
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: syzbot+8be9bf36c3cf574426c8@syzkaller.appspotmail.com
> >
> > ==================================================================
> > BUG: KCSAN: data-race in copy_mm / percpu_counter_destroy_many
> >
> > write to 0xffff8881045e19d8 of 8 bytes by task 2123 on cpu 0:
> > __list_del include/linux/list.h:195 [inline]
> > __list_del_entry include/linux/list.h:218 [inline]
> > list_del include/linux/list.h:229 [inline]
> > percpu_counter_destroy_many+0xc7/0x2b0 lib/percpu_counter.c:244
> > __mmdrop+0x22e/0x350 kernel/fork.c:947
> > mmdrop include/linux/sched/mm.h:55 [inline]
> > io_ring_ctx_free+0x31e/0x360 io_uring/io_uring.c:2740
> > io_ring_exit_work+0x529/0x560 io_uring/io_uring.c:2962
> > process_one_work kernel/workqueue.c:3238 [inline]
> > process_scheduled_works+0x4cb/0x9d0 kernel/workqueue.c:3319
> > worker_thread+0x582/0x770 kernel/workqueue.c:3400
> > kthread+0x486/0x510 kernel/kthread.c:464
> > ret_from_fork+0x4b/0x60 arch/x86/kernel/process.c:153
> > ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
> >
> > read to 0xffff8881045e1600 of 1344 bytes by task 5051 on cpu 1:
> > dup_mm kernel/fork.c:1728 [inline]
> > copy_mm+0xfb/0x1310 kernel/fork.c:1786
> > copy_process+0xcf1/0x1f90 kernel/fork.c:2429
> > kernel_clone+0x16c/0x5b0 kernel/fork.c:2844
> > __do_sys_clone kernel/fork.c:2987 [inline]
> > __se_sys_clone kernel/fork.c:2971 [inline]
> > __x64_sys_clone+0xe6/0x120 kernel/fork.c:2971
> > x64_sys_call+0x2c59/0x2fb0 arch/x86/include/generated/asm/syscalls_64.h:57
> > do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
> > do_syscall_64+0xd0/0x1a0 arch/x86/entry/syscall_64.c:94
> > entry_SYSCALL_64_after_hwframe+0x77/0x7f
> >
> > Reported by Kernel Concurrency Sanitizer on:
> > CPU: 1 UID: 0 PID: 5051 Comm: syz.1.494 Not tainted 6.15.0-rc5-syzkaller-00300-g3ce9925823c7 #0 PREEMPT(voluntary)
>
> This doesn't look like an io_uring issue, it's successive setup and teardown
> of a percpu counter. Adding some relevant folks.
This is an intentional-but-dodgy data race:
dup_mm() does memcpy() of the entire old mm_struct, while the old
mm_struct is not yet locked, in order to initialize some parts of the
new mm_struct that are expected to be stable (I guess?); and then
afterwards re-initializes all the important fields with proper
locking.
(What happens in the KCSAN report is that one mm is being forked while
an **unrelated** mm is being destroyed; and the unrelated mm's
destruction involves a linked list deletion at mm.rss_stat[].list, and
the two mm's happen to be adjacent on this linked list, so destroying
the unrelated mm updates a linked list element embedded in the first
mm.)
This design is pretty dirty, and it would be a good idea to get rid of
that memcpy() by going through every mm_struct field, checking whether
it is already initialized after the memcpy(), and figuring out proper
initialization if it isn't; another easier, less clean approach would
be to just slap a data_race() annotation around the memcpy() for now.
From a quick look, I think these are the fields that might not be reinitialized:
Fields that are safe to copy without lock because they're immutable:
- mmap_{compat_,}{legacy_,}base
- task_size
- binfmt
Fields which could actually race but only if CRIU-specific APIs raced
with execve:
- {start,end}_{code,data}
- start_brk, start_stack
- {arg,env}_{start,end}
- saved_auxv
Fields which actually look like they might be data racing in practice
(but with most of these, probably not much bad stuff actually results
from that):
- membarrier_state
- mm_cid_next_scan
- brk
- numa_scan_offset
- tlb_flush_batched
- ksm_merging_pages, ksm_rmap_items, ksm_zero_pages
And then there's arch-specific stuff in mm_context_t, I haven't looked
at that for all architectures. But there is some pretty dodgy stuff in
there too, for example on X86 the mm->context.vdso_image pointer can
be updated through the CRIU API concurrently with the memcpy(), and it
doesn't look like that pointer is re-initialized later, so I think
that could theoretically result in a torn pointer being accessed later
on in the child. (Note that memcpy() is very much not guaranteed to
copy pointer-sized fields atomically, though it tends to often do that
in practice.)
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2025-05-12 17:09 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-12 6:36 [syzbot] [io-uring] KCSAN: data-race in copy_mm / percpu_counter_destroy_many syzbot
2025-05-12 13:51 ` Jens Axboe
2025-05-12 17:08 ` Jann Horn
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox