public inbox for [email protected]
* [syzbot] [io-uring?] BUG: unable to handle kernel NULL pointer dereference in percpu_ref_put_many
@ 2024-12-23 19:52 syzbot
  2024-12-23 20:33 ` Jens Axboe
  2024-12-23 20:51 ` Jens Axboe
  0 siblings, 2 replies; 5+ messages in thread
From: syzbot @ 2024-12-23 19:52 UTC (permalink / raw)
  To: asml.silence, axboe, io-uring, linux-kernel, syzkaller-bugs

Hello,

syzbot found the following issue on:

HEAD commit:    eabcdba3ad40 Merge tag 'for-6.13-rc3-tag' of git://git.ker..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=10871f44580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=c22efbd20f8da769
dashboard link: https://syzkaller.appspot.com/bug?extid=3dcac84cc1d50f43ed31
compiler:       gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=141bccf8580000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=135f7730580000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/a9904ed2be77/disk-eabcdba3.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/fb8d571e1cb3/vmlinux-eabcdba3.xz
kernel image: https://storage.googleapis.com/syzbot-assets/76349070db25/bzImage-eabcdba3.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: [email protected]

BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor instruction fetch in kernel mode
#PF: error_code(0x0010) - not-present page
PGD 0 P4D 0 
Oops: Oops: 0010 [#1] PREEMPT SMP KASAN PTI
CPU: 0 UID: 0 PID: 11082 Comm: syz-executor246 Not tainted 6.13.0-rc3-syzkaller-00073-geabcdba3ad40 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/25/2024
RIP: 0010:0x0
Code: Unable to access opcode bytes at 0xffffffffffffffd6.
RSP: 0018:ffffc9000413f9e0 EFLAGS: 00010046
RAX: 0000000000000000 RBX: ffff88807c722018 RCX: ffffffff8497d56c
RDX: 1ffff110287e09e1 RSI: ffffffff8497d57a RDI: ffff88807c722018
RBP: ffff888143f04f00 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000002 R12: ffff88807c722020
R13: 0000000000000000 R14: 0000000000000000 R15: ffff8880745b4a10
FS:  0000000000000000(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 000000000db7e000 CR4: 00000000003526f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 percpu_ref_put_many.constprop.0+0x269/0x2a0 include/linux/percpu-refcount.h:335
 percpu_ref_put include/linux/percpu-refcount.h:351 [inline]
 percpu_ref_kill_and_confirm+0x94/0x180 lib/percpu-refcount.c:396
 percpu_ref_kill include/linux/percpu-refcount.h:149 [inline]
 io_ring_ctx_wait_and_kill+0x86/0x250 io_uring/io_uring.c:2973
 io_uring_release+0x39/0x50 io_uring/io_uring.c:2995
 __fput+0x3f8/0xb60 fs/file_table.c:450
 task_work_run+0x14e/0x250 kernel/task_work.c:239
 exit_task_work include/linux/task_work.h:43 [inline]
 do_exit+0xadd/0x2d70 kernel/exit.c:938
 do_group_exit+0xd3/0x2a0 kernel/exit.c:1087
 get_signal+0x2576/0x2610 kernel/signal.c:3017
 arch_do_signal_or_restart+0x90/0x7e0 arch/x86/kernel/signal.c:337
 exit_to_user_mode_loop kernel/entry/common.c:111 [inline]
 exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline]
 __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline]
 syscall_exit_to_user_mode+0x150/0x2a0 kernel/entry/common.c:218
 do_syscall_64+0xda/0x250 arch/x86/entry/common.c:89
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f1575ca04e9
Code: Unable to access opcode bytes at 0x7f1575ca04bf.
RSP: 002b:00007f1575c5b218 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: fffffffffffffe00 RBX: 00007f1575d2a308 RCX: 00007f1575ca04e9
RDX: 0000000000000000 RSI: 0000000000000080 RDI: 00007f1575d2a308
RBP: 00007f1575d2a300 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f1575d2a30c
R13: 00007f1575cf7074 R14: 006e716e5f797265 R15: 0030656c69662f2e
 </TASK>
Modules linked in:
CR2: 0000000000000000
---[ end trace 0000000000000000 ]---
RIP: 0010:0x0
Code: Unable to access opcode bytes at 0xffffffffffffffd6.
RSP: 0018:ffffc9000413f9e0 EFLAGS: 00010046
RAX: 0000000000000000 RBX: ffff88807c722018 RCX: ffffffff8497d56c
RDX: 1ffff110287e09e1 RSI: ffffffff8497d57a RDI: ffff88807c722018
RBP: ffff888143f04f00 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000002 R12: ffff88807c722020
R13: 0000000000000000 R14: 0000000000000000 R15: ffff8880745b4a10
FS:  0000000000000000(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 000000000db7e000 CR4: 00000000003526f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
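
The "#PF: error_code(0x0010)" line in the report can be read bit by bit. Below is a minimal sketch, assuming the architectural x86 page-fault error-code layout. The PF_* macros and pf_decode() are illustrative names chosen for this example, not kernel identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Low bits of the x86 page-fault error code (illustrative names). */
#define PF_PROT  (1u << 0) /* 0: page not present, 1: protection violation */
#define PF_WRITE (1u << 1) /* 0: read access, 1: write access */
#define PF_USER  (1u << 2) /* 0: supervisor-mode access, 1: user-mode access */
#define PF_RSVD  (1u << 3) /* reserved bit was set in a paging entry */
#define PF_INSTR (1u << 4) /* the faulting access was an instruction fetch */

static_assert(PF_INSTR == 0x10, "instruction-fetch flag is bit 4");

struct pf_info {
    bool present;     /* fault on a present page (protection violation) */
    bool write;       /* faulting access was a write */
    bool user;        /* fault happened in user mode */
    bool reserved;    /* reserved-bit violation */
    bool instr_fetch; /* CPU was fetching an instruction */
};

/* Split a raw error code into its individual flags. */
struct pf_info pf_decode(uint32_t error_code)
{
    struct pf_info info = {
        .present     = (error_code & PF_PROT)  != 0,
        .write       = (error_code & PF_WRITE) != 0,
        .user        = (error_code & PF_USER)  != 0,
        .reserved    = (error_code & PF_RSVD)  != 0,
        .instr_fetch = (error_code & PF_INSTR) != 0,
    };
    return info;
}
```

Calling pf_decode(0x0010) leaves only instr_fetch set: a supervisor-mode instruction fetch from a not-present page. Together with "RIP: 0010:0x0" this is the usual signature of a call through a NULL function pointer, here reached from percpu_ref_put_many().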


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at [email protected].

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [syzbot] [io-uring?] BUG: unable to handle kernel NULL pointer dereference in percpu_ref_put_many
  2024-12-23 19:52 [syzbot] [io-uring?] BUG: unable to handle kernel NULL pointer dereference in percpu_ref_put_many syzbot
@ 2024-12-23 20:33 ` Jens Axboe
  2024-12-23 20:52   ` Caleb Sander
  2024-12-23 20:51 ` Jens Axboe
  1 sibling, 1 reply; 5+ messages in thread
From: Jens Axboe @ 2024-12-23 20:33 UTC (permalink / raw)
  To: syzbot, asml.silence, io-uring, linux-kernel, syzkaller-bugs,
	[email protected], Hannes Reinecke, Sagi Grimberg

On 12/23/24 12:52 PM, syzbot wrote:
> Hello,
> 
> syzbot found the following issue on:
> 
> HEAD commit:    eabcdba3ad40 Merge tag 'for-6.13-rc3-tag' of git://git.ker..
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=10871f44580000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=c22efbd20f8da769
> dashboard link: https://syzkaller.appspot.com/bug?extid=3dcac84cc1d50f43ed31
> compiler:       gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=141bccf8580000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=135f7730580000

I ran this one but hit this instead:

==================================================================
BUG: KASAN: slab-out-of-bounds in nvmet_root_discovery_nqn_store+0x110/0x180
Write of size 256 at addr ffff000009e71180 by task refcrash/775

CPU: 0 UID: 0 PID: 775 Comm: refcrash Not tainted 6.13.0-rc4 #2
Hardware name: linux,dummy-virt (DT)
Call trace:
 show_stack+0x1c/0x30 (C)
 __dump_stack+0x24/0x30
 dump_stack_lvl+0x60/0x80
 print_address_description+0x88/0x220
 print_report+0x4c/0x60
 kasan_report+0x94/0xf0
 kasan_check_range+0x248/0x288
 __asan_memset+0x30/0x60
 nvmet_root_discovery_nqn_store+0x110/0x180
 configfs_write_iter+0x220/0x2e8
 do_iter_readv_writev+0x2e0/0x458
 vfs_writev+0x220/0x728
 do_writev+0xf8/0x1a8
 __arm64_sys_writev+0x80/0x98
 invoke_syscall+0x7c/0x258
 el0_svc_common+0x108/0x1d0
 do_el0_svc+0x4c/0x60
 el0_svc+0x4c/0xa0
 el0t_64_sync_handler+0x70/0x100
 el0t_64_sync+0x170/0x178

Allocated by task 1:
 kasan_save_track+0x2c/0x60
 kasan_save_alloc_info+0x3c/0x48
 __kasan_kmalloc+0x80/0x98
 __kmalloc_node_track_caller_noprof+0x2f0/0x590
 kstrndup+0x4c/0xb8
 nvmet_subsys_alloc+0x1c4/0x498
 nvmet_init_discovery+0x20/0x48
 nvmet_init+0x18c/0x1c0
 do_one_initcall+0x1a4/0x718
 do_initcall_level+0x178/0x348
 do_initcalls+0x58/0xa0
 do_basic_setup+0x7c/0x98
 kernel_init_freeable+0x268/0x380
 kernel_init+0x24/0x148
 ret_from_fork+0x10/0x20

The buggy address belongs to the object at ffff000009e71180
 which belongs to the cache kmalloc-64 of size 64
The buggy address is located 0 bytes inside of
 allocated 37-byte region [ffff000009e71180, ffff000009e711a5)

The buggy address belongs to the physical page:
page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x49e71
anon flags: 0x3ffe00000000000(node=0|zone=0|lastcpupid=0x1fff)
page_type: f5(slab)
raw: 03ffe00000000000 ffff0000070028c0 fffffdffc0523d80 dead000000000005
raw: 0000000000000000 0000000000200020 00000001f5000000 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff000009e71080: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
 ffff000009e71100: 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc
>ffff000009e71180: 00 00 00 00 05 fc fc fc fc fc fc fc fc fc fc fc
                               ^
 ffff000009e71200: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
 ffff000009e71280: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
==================================================================
Disabling lock debugging due to kernel taint

which makes me think something else is the culprit here. The test case
doesn't do much outside of creating two rings, it doesn't actually use
them.
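
For illustration only, "creating two rings without using them" could look roughly like the sketch below. This is not the syzbot reproducer (that is the repro.c linked in the report); ring_create() and two_idle_rings() are hypothetical helper names, and the queue depth of 8 is an arbitrary assumption.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/io_uring.h>

#ifndef __NR_io_uring_setup
#define __NR_io_uring_setup 425 /* syscall number on x86-64 and arm64 */
#endif

/* Create one ring with `entries` SQ entries; returns the ring fd or -errno. */
int ring_create(unsigned int entries)
{
    struct io_uring_params params;
    long fd;

    assert(entries > 0);
    memset(&params, 0, sizeof(params)); /* no flags, kernel picks CQ size */
    fd = syscall(__NR_io_uring_setup, entries, &params);
    return fd < 0 ? -errno : (int)fd;
}

/*
 * Set up two rings, submit nothing, and close both again.
 * Returns 0 on success or the -errno of the first failing setup call
 * (for example -ENOSYS or -EPERM where io_uring is unavailable).
 */
int two_idle_rings(void)
{
    int first, second;

    first = ring_create(8);
    if (first < 0)
        return first;

    second = ring_create(8);
    if (second < 0) {
        close(first);
        return second;
    }

    /* Dropping the last fd reference is what ends up in io_uring_release(). */
    close(second);
    close(first);
    return 0;
}
```

Nothing here touches the submission or completion queues, which is why a crash in the ring teardown path points at state corrupted from somewhere else.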

CC'ing likely suspects on the nvme front. This is on 6.13-rc4 fwiw.
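
The "Memory state around the buggy address" rows can be checked by hand. A minimal sketch, assuming generic KASAN, where one shadow byte describes 8 bytes of memory: 00 means all 8 bytes are accessible, 01 to 07 means only that many leading bytes are accessible, and values such as fc mark a slab redzone. The function names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GRANULE 8 /* bytes of memory covered by one generic-KASAN shadow byte */

/*
 * Count the accessible bytes described by a shadow row, starting at its
 * first byte and stopping at the first partial or poisoned granule.
 */
size_t shadow_accessible_bytes(const uint8_t *shadow, size_t n)
{
    size_t total = 0;

    assert(shadow != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        uint8_t s = shadow[i];

        if (s == 0) {          /* whole granule is valid */
            total += GRANULE;
            continue;
        }
        if (s < GRANULE)       /* partial granule: first s bytes are valid */
            total += s;
        break;                 /* partial or poisoned granule ends the object */
    }
    return total;
}

/* Bytes written past the end of the object by a write starting at offset 0. */
size_t overflow_bytes(size_t write_size, size_t accessible)
{
    return write_size > accessible ? write_size - accessible : 0;
}
```

Feeding in the row for ffff000009e71180 (00 00 00 00 05 fc ...) gives 4 * 8 + 5 = 37 accessible bytes, matching the "allocated 37-byte region" line. A 256-byte write starting at offset 0 therefore runs 219 bytes past the object, well beyond its 64-byte slab slot.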

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [syzbot] [io-uring?] BUG: unable to handle kernel NULL pointer dereference in percpu_ref_put_many
  2024-12-23 19:52 [syzbot] [io-uring?] BUG: unable to handle kernel NULL pointer dereference in percpu_ref_put_many syzbot
  2024-12-23 20:33 ` Jens Axboe
@ 2024-12-23 20:51 ` Jens Axboe
  1 sibling, 0 replies; 5+ messages in thread
From: Jens Axboe @ 2024-12-23 20:51 UTC (permalink / raw)
  To: syzbot, asml.silence, io-uring, linux-kernel, syzkaller-bugs

#syz set subsystems: nvme

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [syzbot] [io-uring?] BUG: unable to handle kernel NULL pointer dereference in percpu_ref_put_many
  2024-12-23 20:33 ` Jens Axboe
@ 2024-12-23 20:52   ` Caleb Sander
  2024-12-23 20:55     ` Jens Axboe
  0 siblings, 1 reply; 5+ messages in thread
From: Caleb Sander @ 2024-12-23 20:52 UTC (permalink / raw)
  To: Jens Axboe
  Cc: syzbot, asml.silence, io-uring, linux-kernel, syzkaller-bugs,
	[email protected], Hannes Reinecke, Sagi Grimberg

This is probably the same bug that is being addressed by
https://lore.kernel.org/lkml/[email protected]/T/

On Mon, Dec 23, 2024 at 12:35 PM Jens Axboe <[email protected]> wrote:
>
> On 12/23/24 12:52 PM, syzbot wrote:
> > Hello,
> >
> > syzbot found the following issue on:
> >
> > HEAD commit:    eabcdba3ad40 Merge tag 'for-6.13-rc3-tag' of git://git.ker..
> > git tree:       upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=10871f44580000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=c22efbd20f8da769
> > dashboard link: https://syzkaller.appspot.com/bug?extid=3dcac84cc1d50f43ed31
> > compiler:       gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
> > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=141bccf8580000
> > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=135f7730580000
>
> I ran this one but hit this instead:
>
> ==================================================================
> BUG: KASAN: slab-out-of-bounds in nvmet_root_discovery_nqn_store+0x110/0x180
> Write of size 256 at addr ffff000009e71180 by task refcrash/775
>
> CPU: 0 UID: 0 PID: 775 Comm: refcrash Not tainted 6.13.0-rc4 #2
> Hardware name: linux,dummy-virt (DT)
> Call trace:
>  show_stack+0x1c/0x30 (C)
>  __dump_stack+0x24/0x30
>  dump_stack_lvl+0x60/0x80
>  print_address_description+0x88/0x220
>  print_report+0x4c/0x60
>  kasan_report+0x94/0xf0
>  kasan_check_range+0x248/0x288
>  __asan_memset+0x30/0x60
>  nvmet_root_discovery_nqn_store+0x110/0x180
>  configfs_write_iter+0x220/0x2e8
>  do_iter_readv_writev+0x2e0/0x458
>  vfs_writev+0x220/0x728
>  do_writev+0xf8/0x1a8
>  __arm64_sys_writev+0x80/0x98
>  invoke_syscall+0x7c/0x258
>  el0_svc_common+0x108/0x1d0
>  do_el0_svc+0x4c/0x60
>  el0_svc+0x4c/0xa0
>  el0t_64_sync_handler+0x70/0x100
>  el0t_64_sync+0x170/0x178
>
> Allocated by task 1:
>  kasan_save_track+0x2c/0x60
>  kasan_save_alloc_info+0x3c/0x48
>  __kasan_kmalloc+0x80/0x98
>  __kmalloc_node_track_caller_noprof+0x2f0/0x590
>  kstrndup+0x4c/0xb8
>  nvmet_subsys_alloc+0x1c4/0x498
>  nvmet_init_discovery+0x20/0x48
>  nvmet_init+0x18c/0x1c0
>  do_one_initcall+0x1a4/0x718
>  do_initcall_level+0x178/0x348
>  do_initcalls+0x58/0xa0
>  do_basic_setup+0x7c/0x98
>  kernel_init_freeable+0x268/0x380
>  kernel_init+0x24/0x148
>  ret_from_fork+0x10/0x20
>
> The buggy address belongs to the object at ffff000009e71180
>  which belongs to the cache kmalloc-64 of size 64
> The buggy address is located 0 bytes inside of
>  allocated 37-byte region [ffff000009e71180, ffff000009e711a5)
>
> The buggy address belongs to the physical page:
> page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x49e71
> anon flags: 0x3ffe00000000000(node=0|zone=0|lastcpupid=0x1fff)
> page_type: f5(slab)
> raw: 03ffe00000000000 ffff0000070028c0 fffffdffc0523d80 dead000000000005
> raw: 0000000000000000 0000000000200020 00000001f5000000 0000000000000000
> page dumped because: kasan: bad access detected
>
> Memory state around the buggy address:
>  ffff000009e71080: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
>  ffff000009e71100: 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc
> >ffff000009e71180: 00 00 00 00 05 fc fc fc fc fc fc fc fc fc fc fc
>                                ^
>  ffff000009e71200: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
>  ffff000009e71280: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
> ==================================================================
> Disabling lock debugging due to kernel taint
>
> which makes me think something else is the culprit here. The test case
> doesn't do much outside of creating two rings, it doesn't actually use
> them.
>
> CC'ing likely suspects on the nvme front. This is on 6.13-rc4 fwiw.
>
> --
> Jens Axboe
>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [syzbot] [io-uring?] BUG: unable to handle kernel NULL pointer dereference in percpu_ref_put_many
  2024-12-23 20:52   ` Caleb Sander
@ 2024-12-23 20:55     ` Jens Axboe
  0 siblings, 0 replies; 5+ messages in thread
From: Jens Axboe @ 2024-12-23 20:55 UTC (permalink / raw)
  To: Caleb Sander
  Cc: syzbot, asml.silence, io-uring, linux-kernel, syzkaller-bugs,
	[email protected], Hannes Reinecke, Sagi Grimberg

On 12/23/24 1:52 PM, Caleb Sander wrote:
> This is probably the same bug that is being addressed by
> https://lore.kernel.org/lkml/[email protected]/T/

Yep that looks highly plausible. We should get this queued for 6.13 and
marked for stable, it's missing the cc stable tag.
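
One plausible way the two reports connect, sketched in userspace under stated assumptions: a 256-byte clear that starts in a 64-byte slab slot also zeroes the neighbouring slots, and if a neighbour holds a release callback, the next call through it jumps to address 0. The thread does not confirm this mechanism. The arena below only simulates adjacent kmalloc-64 slots, and struct fake_ref_data is a hypothetical stand-in, not the kernel's layout.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SLOT_SIZE  64  /* mirrors the kmalloc-64 cache named in the report */
#define NSLOTS     5
#define CLEAR_SIZE 256 /* size of the reported memset */

typedef void (*release_fn)(void *);

/* Hypothetical neighbour object carrying a release callback. */
struct fake_ref_data {
    long count;
    release_fn release;
};

static_assert(sizeof(struct fake_ref_data) <= SLOT_SIZE, "must fit one slot");
static_assert(CLEAR_SIZE <= SLOT_SIZE * NSLOTS, "clear stays inside the arena");

/* Simulated run of adjacent slab slots. */
static unsigned char arena[SLOT_SIZE * NSLOTS];

void dummy_release(void *ref)
{
    (void)ref;
}

/*
 * Place a callback-carrying object in slot 1, then clear CLEAR_SIZE bytes
 * starting at slot 0. Returns 1 if the neighbour's callback reads back as
 * NULL (assumes all-zero bytes represent a null pointer, as on mainstream
 * platforms).
 */
int neighbor_release_clobbered(void)
{
    struct fake_ref_data before = { .count = 1, .release = dummy_release };
    struct fake_ref_data after;

    memset(arena, 0xaa, sizeof(arena));
    memcpy(arena + SLOT_SIZE, &before, sizeof(before)); /* slot 1 */

    /* The oversized clear: aimed at slot 0, but it spans slots 0 to 3. */
    memset(arena, 0, CLEAR_SIZE);

    memcpy(&after, arena + SLOT_SIZE, sizeof(after));
    return after.release == NULL;
}
```

neighbor_release_clobbered() returns 1 here: the callback in the adjacent slot has been wiped, which is the shape of failure the "RIP: 0010:0x0" oops shows.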

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 5+ messages in thread
