* INFO: task hung in io_uring_cancel_generic @ 2021-09-07 11:50 Hao Sun 2021-09-07 19:30 ` Jens Axboe 0 siblings, 1 reply; 11+ messages in thread From: Hao Sun @ 2021-09-07 11:50 UTC (permalink / raw) To: axboe, io-uring; +Cc: Pavel Begunkov, linux-kernel Hello, When using Healer to fuzz the latest Linux kernel, the following crash was triggered. HEAD commit: 7d2a07b76933 Linux 5.14 git tree: upstream console output: https://drive.google.com/file/d/1c8uRooM0TwJiTIwEviOCB4RC-hhOgGHR/view?usp=sharing kernel config: https://drive.google.com/file/d/1XD9WYDViQLSXN7RGwH8AGGDvP9JvOghx/view?usp=sharing Similar report: https://groups.google.com/u/1/g/syzkaller-bugs/c/FvdcTiJIGtY/m/PcXkoenUAAAJ Sorry, I don't have a reproducer for this crash, hope the symbolized report can help. If you fix this issue, please add the following tag to the commit: Reported-by: Hao Sun <[email protected]> INFO: task syz-executor:10969 blocked for more than 143 seconds. Not tainted 5.14.0+ #26 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:syz-executor state:D stack:28528 pid:10969 ppid: 9213 flags:0x00024004 Call Trace: context_switch kernel/sched/core.c:4940 [inline] __schedule+0xcd9/0x2530 kernel/sched/core.c:6287 schedule+0xd3/0x270 kernel/sched/core.c:6366 io_uring_cancel_generic+0x458/0x740 fs/io_uring.c:9646 io_uring_files_cancel include/linux/io_uring.h:16 [inline] do_exit+0x25c/0x2dd0 kernel/exit.c:780 do_group_exit+0x125/0x340 kernel/exit.c:922 get_signal+0x4d5/0x25a0 kernel/signal.c:2868 arch_do_signal_or_restart+0x2ed/0x1c40 arch/x86/kernel/signal.c:865 handle_signal_work kernel/entry/common.c:148 [inline] exit_to_user_mode_loop kernel/entry/common.c:172 [inline] exit_to_user_mode_prepare+0x192/0x2a0 kernel/entry/common.c:209 __syscall_exit_to_user_mode_work kernel/entry/common.c:291 [inline] syscall_exit_to_user_mode+0x19/0x60 kernel/entry/common.c:302 do_syscall_64+0x42/0xb0 arch/x86/entry/common.c:86 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x4739cd RSP: 002b:00007fb8de229218 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca RAX: fffffffffffffe00 RBX: 000000000059c0a0 RCX: 00000000004739cd RDX: 0000000000000000 RSI: 0000000000000080 RDI: 000000000059c0a8 RBP: 000000000059c0a8 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 000000000059c0ac R13: 00007ffc3738731f R14: 00007ffc373874c0 R15: 00007fb8de229300 Showing all locks held in the system: 1 lock held by khungtaskd/1674: #0: ffffffff8b97dde0 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x53/0x260 kernel/locking/lockdep.c:6446 1 lock held by in:imklog/8129: #0: ffff888016f972f0 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0xe9/0x100 fs/file.c:990 ============================================= NMI backtrace for cpu 1 CPU: 1 PID: 1674 Comm: khungtaskd Not tainted 5.14.0+ #26 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:105 nmi_cpu_backtrace.cold+0x47/0x144 lib/nmi_backtrace.c:105 nmi_trigger_cpumask_backtrace+0x1e1/0x220 lib/nmi_backtrace.c:62 trigger_all_cpu_backtrace include/linux/nmi.h:146 [inline] check_hung_uninterruptible_tasks kernel/hung_task.c:210 [inline] watchdog+0xcc8/0x1010 kernel/hung_task.c:295 kthread+0x3e5/0x4d0 kernel/kthread.c:319 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295 Sending NMI from CPU 1 to CPUs 0,2-3: NMI backtrace for cpu 2 CPU: 2 PID: 4927 Comm: systemd-journal Not tainted 5.14.0+ #26 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 RIP: 0010:preempt_count arch/x86/include/asm/preempt.h:27 [inline] RIP: 0010:check_kcov_mode+0x0/0x40 kernel/kcov.c:163 Code: 48 89 df e8 02 80 47 00 e9 59 fe ff ff 48 8b 7c 24 08 e8 f3 7f 47 00 e9 61 fd ff ff cc cc cc cc cc cc cc cc cc cc cc cc cc cc <65> 8b 05 29 d0 8b 7e 89 c2 81 e2 00 01 00 00 a9 00 01 ff 00 74 10 RSP: 0018:ffffc90001077d08 EFLAGS: 00000046 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff888014fa9d40 RDX: 0000000000000000 RSI: ffff888014fa9d40 RDI: 0000000000000002 RBP: ffff88802cb20000 R08: ffffffff817c1bfe R09: 0000000000000000 R10: 0000000000000005 R11: ffffed100c7e6531 R12: ffff888010e0af00 R13: 0000000000000200 R14: 0000000000000000 R15: 0000000000000000 FS: 00007f9028f938c0(0000) GS:ffff888063f00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f90256da000 CR3: 00000000193df000 CR4: 0000000000750ee0 PKRU: 55555554 Call Trace: __sanitizer_cov_trace_pc+0x1a/0x40 kernel/kcov.c:197 trace_hardirqs_on+0x6e/0x1c0 kernel/trace/trace_preemptirq.c:44 kasan_quarantine_put+0x11d/0x1c0 mm/kasan/quarantine.c:220 kasan_slab_free include/linux/kasan.h:230 [inline] slab_free_hook mm/slub.c:1628 [inline] slab_free_freelist_hook+0xe1/0x240 mm/slub.c:1653 slab_free mm/slub.c:3213 [inline] kmem_cache_free+0x8a/0x5c0 mm/slub.c:3229 putname+0x10c/0x150 fs/namei.c:270 do_mkdirat+0x18a/0x2b0 fs/namei.c:3920 __do_sys_mkdir fs/namei.c:3931 [inline] __se_sys_mkdir fs/namei.c:3929 [inline] __x64_sys_mkdir+0x61/0x80 fs/namei.c:3929 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f902824f687 Code: 00 b8 ff ff ff ff c3 0f 1f 40 00 48 8b 05 09 d8 2b 00 64 c7 00 5f 00 00 00 b8 ff ff ff ff c3 0f 1f 40 00 b8 53 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d e1 d7 2b 00 f7 d8 64 89 01 48 RSP: 002b:00007ffd1a764c18 EFLAGS: 00000293 ORIG_RAX: 0000000000000053 RAX: ffffffffffffffda RBX: 00007ffd1a767b30 RCX: 00007f902824f687 RDX: 00007f9028cc0a00 RSI: 00000000000001ed RDI: 00005650418998a0 RBP: 00007ffd1a764c50 R08: 000000000000eec0 R09: 0000000000000000 R10: 0000000000000069 R11: 0000000000000293 R12: 0000000000000000 R13: 0000000000000000 R14: 00007ffd1a767b30 R15: 00007ffd1a765140 NMI backtrace for cpu 3 CPU: 3 PID: 15560 Comm: syz-executor Not tainted 5.14.0+ #26 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 RIP: 0010:memchr+0x2e/0x80 lib/string.c:1070 Code: 48 bd 00 00 00 00 00 fc ff df 53 48 83 ec 18 eb 26 48 8d 5f 01 48 89 f8 48 89 f9 48 c1 e8 03 83 e1 07 0f b6 04 28 38 c8 7f 04 <84> c0 75 21 40 3a 73 ff 74 11 48 89 df 48 39 d7 75 d5 48 83 c4 18 RSP: 0018:ffffc90007f7f910 EFLAGS: 00000293 RAX: 0000000000000000 RBX: ffffffff90212e86 RCX: 0000000000000005 RDX: ffffffff90212e93 RSI: 000000000000000a RDI: ffffffff90212e85 RBP: dffffc0000000000 R08: ffffffff815dff8f R09: 0000000000000000 R10: 0000000000000005 R11: fffff52000feff84 R12: 0000000000000001 R13: dffffc0000000000 R14: 000000000000001b R15: ffffffff90212e78 FS: 00007f917dcd5700(0000) GS:ffff888135d00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000000000005f CR3: 00000001133e5000 CR4: 0000000000750ee0 PKRU: 55555554 Call Trace: memchr include/linux/fortify-string.h:246 [inline] count_lines kernel/printk/printk_ringbuffer.c:1675 [inline] copy_data kernel/printk/printk_ringbuffer.c:1721 [inline] prb_read kernel/printk/printk_ringbuffer.c:1814 [inline] _prb_read_valid+0x3ae/0x6b0 kernel/printk/printk_ringbuffer.c:1880 prb_read_valid_info+0xa7/0xe0 kernel/printk/printk_ringbuffer.c:1963 find_first_fitting_seq+0xfd/0x210 kernel/printk/printk.c:1450 syslog_print_all+0x15a/0x3a0 kernel/printk/printk.c:1590 do_syslog kernel/printk/printk.c:1669 [inline] do_syslog+0x53f/0x650 kernel/printk/printk.c:1632 __do_sys_syslog kernel/printk/printk.c:1747 [inline] __se_sys_syslog kernel/printk/printk.c:1745 [inline] __x64_sys_syslog+0x71/0xb0 kernel/printk/printk.c:1745 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x2000010c Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fc 0d 00 00 00 00 00 00 36 e1 06 4a 2a e9 2c b8 b6 4c 0f 05 <bf> 03 00 00 00 c4 a3 7b f0 c5 01 41 e2 e9 c4 22 e9 aa bb 3c 00 00 RSP: 002b:00007f917dcd50f8 EFLAGS: 00000203 ORIG_RAX: 0000000000000067 RAX: ffffffffffffffda RBX: 0000000000000009 RCX: 000000002000010c RDX: 0000000000004c01 RSI: 0000000000000003 RDI: 0000000000000003 RBP: 0000000000000033 R08: 0000000000000005 R09: 0000000000000006 R10: 0000000000000007 R11: 0000000000000203 R12: 000000000000000b R13: 000000000000000c R14: 000000000000000d R15: 00007f917dcd5300 NMI backtrace for cpu 0 skipped: idling at native_safe_halt arch/x86/include/asm/irqflags.h:51 [inline] NMI backtrace for cpu 0 skipped: idling at arch_safe_halt arch/x86/include/asm/irqflags.h:89 [inline] NMI backtrace for cpu 0 skipped: idling at default_idle+0xb/0x10 arch/x86/kernel/process.c:716 ---------------- Code disassembly (best guess): 0: 48 89 df mov %rbx,%rdi 3: e8 02 80 47 00 callq 0x47800a 8: e9 59 fe ff ff jmpq 0xfffffe66 d: 48 8b 7c 24 08 mov 0x8(%rsp),%rdi 12: e8 f3 7f 47 00 callq 0x47800a 17: e9 61 fd ff ff jmpq 0xfffffd7d 1c: cc int3 1d: cc int3 1e: cc int3 1f: cc int3 20: cc int3 21: cc int3 22: cc int3 23: cc int3 24: cc int3 25: cc int3 26: cc int3 27: cc int3 28: cc int3 29: cc int3 * 2a: 65 8b 05 29 d0 8b 7e mov %gs:0x7e8bd029(%rip),%eax # 0x7e8bd05a <-- trapping instruction 31: 89 c2 mov %eax,%edx 33: 81 e2 00 01 00 00 and $0x100,%edx 39: a9 00 01 ff 00 test $0xff0100,%eax 3e: 74 10 je 0x50% ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: INFO: task hung in io_uring_cancel_generic 2021-09-07 11:50 INFO: task hung in io_uring_cancel_generic Hao Sun @ 2021-09-07 19:30 ` Jens Axboe 2021-09-07 21:31 ` Pavel Begunkov 0 siblings, 1 reply; 11+ messages in thread From: Jens Axboe @ 2021-09-07 19:30 UTC (permalink / raw) To: Hao Sun, io-uring; +Cc: Pavel Begunkov, linux-kernel On 9/7/21 5:50 AM, Hao Sun wrote: > Hello, > > When using Healer to fuzz the latest Linux kernel, the following crash > was triggered. > > HEAD commit: 7d2a07b76933 Linux 5.14 > git tree: upstream > console output: > https://drive.google.com/file/d/1c8uRooM0TwJiTIwEviOCB4RC-hhOgGHR/view?usp=sharing > kernel config: https://drive.google.com/file/d/1XD9WYDViQLSXN7RGwH8AGGDvP9JvOghx/view?usp=sharing > Similar report: > https://groups.google.com/u/1/g/syzkaller-bugs/c/FvdcTiJIGtY/m/PcXkoenUAAAJ > > Sorry, I don't have a reproducer for this crash, hope the symbolized > report can help. > If you fix this issue, please add the following tag to the commit: > Reported-by: Hao Sun <[email protected]> Would be great with a reproducer for this one, though... -- Jens Axboe ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: INFO: task hung in io_uring_cancel_generic 2021-09-07 19:30 ` Jens Axboe @ 2021-09-07 21:31 ` Pavel Begunkov 2021-09-08 1:01 ` Hao Sun 0 siblings, 1 reply; 11+ messages in thread From: Pavel Begunkov @ 2021-09-07 21:31 UTC (permalink / raw) To: Jens Axboe, Hao Sun, io-uring; +Cc: linux-kernel On 9/7/21 8:30 PM, Jens Axboe wrote: > On 9/7/21 5:50 AM, Hao Sun wrote: >> Hello, >> >> When using Healer to fuzz the latest Linux kernel, the following crash >> was triggered. >> >> HEAD commit: 7d2a07b76933 Linux 5.14 >> git tree: upstream >> console output: >> https://drive.google.com/file/d/1c8uRooM0TwJiTIwEviOCB4RC-hhOgGHR/view?usp=sharing >> kernel config: https://drive.google.com/file/d/1XD9WYDViQLSXN7RGwH8AGGDvP9JvOghx/view?usp=sharing >> Similar report: >> https://groups.google.com/u/1/g/syzkaller-bugs/c/FvdcTiJIGtY/m/PcXkoenUAAAJ >> >> Sorry, I don't have a reproducer for this crash, hope the symbolized >> report can help. >> If you fix this issue, please add the following tag to the commit: >> Reported-by: Hao Sun <[email protected]> > > Would be great with a reproducer for this one, though... And syzbot usually sends an execution log with all syz programs it run, which may be helpful. Any chance you have anything similar left? -- Pavel Begunkov ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: INFO: task hung in io_uring_cancel_generic 2021-09-07 21:31 ` Pavel Begunkov @ 2021-09-08 1:01 ` Hao Sun 2021-09-08 6:57 ` Hao Sun 0 siblings, 1 reply; 11+ messages in thread From: Hao Sun @ 2021-09-08 1:01 UTC (permalink / raw) To: Pavel Begunkov; +Cc: Jens Axboe, io-uring, linux-kernel Pavel Begunkov <[email protected]> 于2021年9月8日周三 上午5:31写道: > > On 9/7/21 8:30 PM, Jens Axboe wrote: > > On 9/7/21 5:50 AM, Hao Sun wrote: > >> Hello, > >> > >> When using Healer to fuzz the latest Linux kernel, the following crash > >> was triggered. > >> > >> HEAD commit: 7d2a07b76933 Linux 5.14 > >> git tree: upstream > >> console output: > >> https://drive.google.com/file/d/1c8uRooM0TwJiTIwEviOCB4RC-hhOgGHR/view?usp=sharing > >> kernel config: https://drive.google.com/file/d/1XD9WYDViQLSXN7RGwH8AGGDvP9JvOghx/view?usp=sharing > >> Similar report: > >> https://groups.google.com/u/1/g/syzkaller-bugs/c/FvdcTiJIGtY/m/PcXkoenUAAAJ > >> > >> Sorry, I don't have a reproducer for this crash, hope the symbolized > >> report can help. > >> If you fix this issue, please add the following tag to the commit: > >> Reported-by: Hao Sun <[email protected]> > > > > Would be great with a reproducer for this one, though... > > And syzbot usually sends an execution log with all syz programs > it run, which may be helpful. Any chance you have anything similar > left? > Yes, found it[1]. Here is an execution history with latest 1024 executed progs before crash saved. Hope it can help. I'll also follow this crash closely, see if Healer can find a reproducer and send it to you once it found. [1] https://drive.google.com/file/d/14k8qOFeyKPD4HsqOpIjud3b9jsxFSo-u/view?usp=sharing > -- > Pavel Begunkov ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: INFO: task hung in io_uring_cancel_generic 2021-09-08 1:01 ` Hao Sun @ 2021-09-08 6:57 ` Hao Sun 2021-09-08 20:16 ` Pavel Begunkov 0 siblings, 1 reply; 11+ messages in thread From: Hao Sun @ 2021-09-08 6:57 UTC (permalink / raw) To: Pavel Begunkov; +Cc: Jens Axboe, io-uring, linux-kernel Another similar report: INFO: task hung in io_ring_exit_work INFO: task kworker/u9:3:1867 blocked for more than 143 seconds. Not tainted 5.14.0+ #13 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:kworker/u9:3 state:D stack:12344 pid: 1867 ppid: 2 flags:0x00004000 Workqueue: events_unbound io_ring_exit_work Call Trace: context_switch kernel/sched/core.c:4940 [inline] __schedule+0x323/0xae0 kernel/sched/core.c:6287 schedule+0x36/0xe0 kernel/sched/core.c:6366 schedule_timeout+0x189/0x430 kernel/time/timer.c:1857 do_wait_for_common kernel/sched/completion.c:85 [inline] __wait_for_common kernel/sched/completion.c:106 [inline] wait_for_common kernel/sched/completion.c:117 [inline] wait_for_completion+0xb4/0x110 kernel/sched/completion.c:138 io_ring_exit_work+0x287/0x850 fs/io_uring.c:9297 process_one_work+0x359/0x850 kernel/workqueue.c:2297 worker_thread+0x41/0x4d0 kernel/workqueue.c:2444 kthread+0x178/0x1b0 kernel/kthread.c:319 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295 INFO: task kworker/u9:4:2050 blocked for more than 143 seconds. Not tainted 5.14.0+ #13 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:kworker/u9:4 state:D stack:12232 pid: 2050 ppid: 2 flags:0x00004000 Workqueue: events_unbound io_ring_exit_work Call Trace: context_switch kernel/sched/core.c:4940 [inline] __schedule+0x323/0xae0 kernel/sched/core.c:6287 schedule+0x36/0xe0 kernel/sched/core.c:6366 schedule_timeout+0x189/0x430 kernel/time/timer.c:1857 do_wait_for_common kernel/sched/completion.c:85 [inline] __wait_for_common kernel/sched/completion.c:106 [inline] wait_for_common kernel/sched/completion.c:117 [inline] wait_for_completion+0xb4/0x110 kernel/sched/completion.c:138 io_ring_exit_work+0x287/0x850 fs/io_uring.c:9297 process_one_work+0x359/0x850 kernel/workqueue.c:2297 worker_thread+0x41/0x4d0 kernel/workqueue.c:2444 kthread+0x178/0x1b0 kernel/kthread.c:319 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295 Showing all locks held in the system: 1 lock held by khungtaskd/1340: #0: ffffffff85a1d3a0 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0xe/0x1a0 kernel/locking/lockdep.c:6446 2 locks held by kworker/u9:3/1867: #0: ffff888008c5c938 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:633 [inline] #0: ffff888008c5c938 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:661 [inline] #0: ffff888008c5c938 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x2a0/0x850 kernel/workqueue.c:2268 #1: ffffc90003e67e70 ((work_completion)(&ctx->exit_work)){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:633 [inline] #1: ffffc90003e67e70 ((work_completion)(&ctx->exit_work)){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:661 [inline] #1: ffffc90003e67e70 ((work_completion)(&ctx->exit_work)){+.+.}-{0:0}, at: process_one_work+0x2a0/0x850 kernel/workqueue.c:2268 2 locks held by kworker/u9:4/2050: #0: ffff888008c5c938 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:633 [inline] #0: ffff888008c5c938 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:661 [inline] #0: ffff888008c5c938 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x2a0/0x850 kernel/workqueue.c:2268 #1: ffffc90004327e70 ((work_completion)(&ctx->exit_work)){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:633 [inline] #1: ffffc90004327e70 ((work_completion)(&ctx->exit_work)){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:661 [inline] #1: ffffc90004327e70 ((work_completion)(&ctx->exit_work)){+.+.}-{0:0}, at: process_one_work+0x2a0/0x850 kernel/workqueue.c:2268 1 lock held by in:imklog/8089: #0: ffff88800ece7ef0 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0x55/0x60 fs/file.c:990 ============================================= NMI backtrace for cpu 0 CPU: 0 PID: 1340 Comm: khungtaskd Not tainted 5.14.0+ #13 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x8d/0xcf lib/dump_stack.c:105 nmi_cpu_backtrace+0x1e9/0x210 lib/nmi_backtrace.c:105 nmi_trigger_cpumask_backtrace+0x120/0x180 lib/nmi_backtrace.c:62 trigger_all_cpu_backtrace include/linux/nmi.h:146 [inline] check_hung_uninterruptible_tasks kernel/hung_task.c:210 [inline] watchdog+0x4e1/0x980 kernel/hung_task.c:295 kthread+0x178/0x1b0 kernel/kthread.c:319 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295 Sending NMI from CPU 0 to CPUs 1-3: NMI backtrace for cpu 1 CPU: 1 PID: 4934 Comm: systemd-journal Not tainted 5.14.0+ #13 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 RIP: 0010:lockdep_enabled kernel/locking/lockdep.c:91 [inline] RIP: 0010:lock_acquire+0x1a4/0x340 kernel/locking/lockdep.c:5601 Code: 04 25 28 00 00 00 0f 85 b5 01 00 00 48 83 c4 38 5b 5d 41 5c 41 5d 41 5e 41 5f c3 65 48 8b 04 25 40 70 01 00 8b 80 f4 09 00 00 <85> c0 0f 85 c5 fe ff ff 9c 8f 04 24 fa 48 c7 c7 a0 36 5d 85 e8 e3 RSP: 0018:ffffc9000086fe18 EFLAGS: 00000246 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000001 RSI: ffffffff852a8b94 RDI: ffffffff853cae8e RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff85a1d3a0 R13: 0000000000000002 R14: 0000000000000000 R15: 0000000000000000 FS: 00007f940a97b8c0(0000) GS:ffff88813dc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f9407aa3000 CR3: 000000000f49a000 CR4: 0000000000750ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: rcu_lock_acquire include/linux/rcupdate.h:267 [inline] rcu_read_lock include/linux/rcupdate.h:687 [inline] dput+0x72/0x650 fs/dcache.c:874 path_put+0x12/0x20 fs/namei.c:557 do_faccessat+0x1fa/0x370 fs/open.c:455 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x34/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f9409c379c7 Code: 83 c4 08 48 3d 01 f0 ff ff 73 01 c3 48 8b 0d c8 d4 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 b8 15 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a1 d4 2b 00 f7 d8 64 89 01 48 RSP: 002b:00007ffeead2d338 EFLAGS: 00000246 ORIG_RAX: 0000000000000015 RAX: ffffffffffffffda RBX: 00007ffeead30250 RCX: 00007f9409c379c7 RDX: 00007f940a6a8a00 RSI: 0000000000000000 RDI: 000055d9e89aa9a3 RBP: 00007ffeead2d370 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000069 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000000 R14: 00007ffeead30250 R15: 00007ffeead2d860 NMI backtrace for cpu 2 skipped: idling at native_safe_halt arch/x86/include/asm/irqflags.h:51 [inline] NMI backtrace for cpu 2 skipped: idling at arch_safe_halt arch/x86/include/asm/irqflags.h:89 [inline] NMI backtrace for cpu 2 skipped: idling at default_idle+0xb/0x10 arch/x86/kernel/process.c:716 NMI backtrace for cpu 3 CPU: 3 PID: 8091 Comm: rs:main Q:Reg Not tainted 5.14.0+ #13 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 RIP: 0010:memset_erms+0x9/0x10 arch/x86/lib/memset_64.S:64 Code: c1 e9 03 40 0f b6 f6 48 b8 01 01 01 01 01 01 01 01 48 0f af c6 f3 48 ab 89 d1 f3 aa 4c 89 c8 c3 90 49 89 f9 40 88 f0 48 89 d1 <f3> aa 4c 89 c8 c3 90 49 89 fa 40 0f b6 ce 48 b8 01 01 01 01 01 01 RSP: 0018:ffffc90007a23ba8 EFLAGS: 00010216 RAX: 0000000000000000 RBX: ffff88810cd3ec78 RCX: 0000000000000c85 RDX: 0000000000000d05 RSI: 0000000000000000 RDI: ffff888107f7b37b RBP: 0000000000001000 R08: 0000000000000000 R09: ffff888107f7b2fb R10: ffffc90007a23978 R11: 0000000000000005 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000001000 R15: ffffea00041fdec0 FS: 00007fd86dba4700(0000) GS:ffff88813dd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055d6e0599858 CR3: 0000000012b38000 CR4: 0000000000750ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: memset include/linux/fortify-string.h:175 [inline] zero_user_segments include/linux/highmem.h:211 [inline] ext4_block_write_begin+0x63e/0x980 fs/ext4/inode.c:1089 ext4_da_write_begin+0x275/0x610 fs/ext4/inode.c:3021 generic_perform_write+0xce/0x220 mm/filemap.c:3770 ext4_buffered_write_iter+0xd6/0x190 fs/ext4/file.c:269 ext4_file_write_iter+0x80/0x940 fs/ext4/file.c:680 call_write_iter include/linux/fs.h:2163 [inline] new_sync_write+0x18d/0x260 fs/read_write.c:507 vfs_write+0x43b/0x4a0 fs/read_write.c:594 ksys_write+0xd2/0x120 fs/read_write.c:647 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x34/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7fd8705e81cd Code: c2 20 00 00 75 10 b8 01 00 00 00 0f 05 48 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 ae fc ff ff 48 89 04 24 b8 01 00 00 00 0f 05 <48> 8b 3c 24 48 89 c2 e8 f7 fc ff ff 48 89 d0 48 83 c4 08 48 3d 01 RSP: 002b:00007fd86dba3590 EFLAGS: 00000293 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 00007fd8640268f0 RCX: 00007fd8705e81cd RDX: 00000000000004ce RSI: 00007fd8640268f0 RDI: 0000000000000008 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000293 R12: 00007fd864026670 R13: 00007fd86dba35b0 R14: 00005563e5cf4440 R15: 00000000000004ce ---------------- Code disassembly (best guess): 0: 04 25 add $0x25,%al 2: 28 00 sub %al,(%rax) 4: 00 00 add %al,(%rax) 6: 0f 85 b5 01 00 00 jne 0x1c1 c: 48 83 c4 38 add $0x38,%rsp 10: 5b pop %rbx 11: 5d pop %rbp 12: 41 5c pop %r12 14: 41 5d pop %r13 16: 41 5e pop %r14 18: 41 5f pop %r15 1a: c3 retq 1b: 65 48 8b 04 25 40 70 mov %gs:0x17040,%rax 22: 01 00 24: 8b 80 f4 09 00 00 mov 0x9f4(%rax),%eax * 2a: 85 c0 test %eax,%eax <-- trapping instruction 2c: 0f 85 c5 fe ff ff jne 0xfffffef7 32: 9c pushfq 33: 8f 04 24 popq (%rsp) 36: fa cli 37: 48 c7 c7 a0 36 5d 85 mov $0xffffffff855d36a0,%rdi 3e: e8 .byte 0xe8 3f: e3 .byte 0xe3% Hao Sun <[email protected]> 于2021年9月8日周三 上午9:01写道: > > Pavel Begunkov <[email protected]> 于2021年9月8日周三 上午5:31写道: > > > > On 9/7/21 8:30 PM, Jens Axboe wrote: > > > On 9/7/21 5:50 AM, Hao Sun wrote: > > >> Hello, > > >> > > >> When using Healer to fuzz the latest Linux kernel, the following crash > > >> was triggered. > > >> > > >> HEAD commit: 7d2a07b76933 Linux 5.14 > > >> git tree: upstream > > >> console output: > > >> https://drive.google.com/file/d/1c8uRooM0TwJiTIwEviOCB4RC-hhOgGHR/view?usp=sharing > > >> kernel config: https://drive.google.com/file/d/1XD9WYDViQLSXN7RGwH8AGGDvP9JvOghx/view?usp=sharing > > >> Similar report: > > >> https://groups.google.com/u/1/g/syzkaller-bugs/c/FvdcTiJIGtY/m/PcXkoenUAAAJ > > >> > > >> Sorry, I don't have a reproducer for this crash, hope the symbolized > > >> report can help. > > >> If you fix this issue, please add the following tag to the commit: > > >> Reported-by: Hao Sun <[email protected]> > > > > > > Would be great with a reproducer for this one, though... > > > > And syzbot usually sends an execution log with all syz programs > > it run, which may be helpful. Any chance you have anything similar > > left? > > > > Yes, found it[1]. Here is an execution history with latest 1024 > executed progs before crash saved. > Hope it can help. I'll also follow this crash closely, see if Healer > can find a reproducer and send it to you once it found. > > [1] https://drive.google.com/file/d/14k8qOFeyKPD4HsqOpIjud3b9jsxFSo-u/view?usp=sharing > > > -- > > Pavel Begunkov ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: INFO: task hung in io_uring_cancel_generic 2021-09-08 6:57 ` Hao Sun @ 2021-09-08 20:16 ` Pavel Begunkov 2021-09-13 2:26 ` Hao Sun 0 siblings, 1 reply; 11+ messages in thread From: Pavel Begunkov @ 2021-09-08 20:16 UTC (permalink / raw) To: Hao Sun; +Cc: Jens Axboe, io-uring, linux-kernel On 9/8/21 7:57 AM, Hao Sun wrote: > Another similar report: INFO: task hung in io_ring_exit_work > > INFO: task kworker/u9:3:1867 blocked for more than 143 seconds. > Not tainted 5.14.0+ #13 > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > task:kworker/u9:3 state:D stack:12344 pid: 1867 ppid: 2 flags:0x00004000 > Workqueue: events_unbound io_ring_exit_work This one is different. I guess it may be difficult for syz to find a repro because of the delay before the warning appears. How about to reduce it to a lower value? like 2-5 seconds echo 5 > /proc/sys/kernel/hung_task_timeout_secs > Call Trace: > context_switch kernel/sched/core.c:4940 [inline] > __schedule+0x323/0xae0 kernel/sched/core.c:6287 > schedule+0x36/0xe0 kernel/sched/core.c:6366 > schedule_timeout+0x189/0x430 kernel/time/timer.c:1857 > do_wait_for_common kernel/sched/completion.c:85 [inline] > __wait_for_common kernel/sched/completion.c:106 [inline] > wait_for_common kernel/sched/completion.c:117 [inline] > wait_for_completion+0xb4/0x110 kernel/sched/completion.c:138 > io_ring_exit_work+0x287/0x850 fs/io_uring.c:9297 > process_one_work+0x359/0x850 kernel/workqueue.c:2297 > worker_thread+0x41/0x4d0 kernel/workqueue.c:2444 > kthread+0x178/0x1b0 kernel/kthread.c:319 > ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295 > INFO: task kworker/u9:4:2050 blocked for more than 143 seconds. > Not tainted 5.14.0+ #13 > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > task:kworker/u9:4 state:D stack:12232 pid: 2050 ppid: 2 flags:0x00004000 > Workqueue: events_unbound io_ring_exit_work > Call Trace: > context_switch kernel/sched/core.c:4940 [inline] > __schedule+0x323/0xae0 kernel/sched/core.c:6287 > schedule+0x36/0xe0 kernel/sched/core.c:6366 > schedule_timeout+0x189/0x430 kernel/time/timer.c:1857 > do_wait_for_common kernel/sched/completion.c:85 [inline] > __wait_for_common kernel/sched/completion.c:106 [inline] > wait_for_common kernel/sched/completion.c:117 [inline] > wait_for_completion+0xb4/0x110 kernel/sched/completion.c:138 > io_ring_exit_work+0x287/0x850 fs/io_uring.c:9297 > process_one_work+0x359/0x850 kernel/workqueue.c:2297 > worker_thread+0x41/0x4d0 kernel/workqueue.c:2444 > kthread+0x178/0x1b0 kernel/kthread.c:319 > ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295 > > Showing all locks held in the system: > 1 lock held by khungtaskd/1340: > #0: ffffffff85a1d3a0 (rcu_read_lock){....}-{1:2}, at: > debug_show_all_locks+0xe/0x1a0 kernel/locking/lockdep.c:6446 > 2 locks held by kworker/u9:3/1867: > #0: ffff888008c5c938 ((wq_completion)events_unbound){+.+.}-{0:0}, at: > set_work_data kernel/workqueue.c:633 [inline] > #0: ffff888008c5c938 ((wq_completion)events_unbound){+.+.}-{0:0}, at: > set_work_pool_and_clear_pending kernel/workqueue.c:661 [inline] > #0: ffff888008c5c938 ((wq_completion)events_unbound){+.+.}-{0:0}, at: > process_one_work+0x2a0/0x850 kernel/workqueue.c:2268 > #1: ffffc90003e67e70 > ((work_completion)(&ctx->exit_work)){+.+.}-{0:0}, at: set_work_data > kernel/workqueue.c:633 [inline] > #1: ffffc90003e67e70 > ((work_completion)(&ctx->exit_work)){+.+.}-{0:0}, at: > set_work_pool_and_clear_pending kernel/workqueue.c:661 [inline] > #1: ffffc90003e67e70 > ((work_completion)(&ctx->exit_work)){+.+.}-{0:0}, at: > process_one_work+0x2a0/0x850 kernel/workqueue.c:2268 > 2 locks held by kworker/u9:4/2050: > #0: ffff888008c5c938 ((wq_completion)events_unbound){+.+.}-{0:0}, at: > set_work_data kernel/workqueue.c:633 [inline] > #0: ffff888008c5c938 ((wq_completion)events_unbound){+.+.}-{0:0}, at: > set_work_pool_and_clear_pending kernel/workqueue.c:661 [inline] > #0: ffff888008c5c938 ((wq_completion)events_unbound){+.+.}-{0:0}, at: > process_one_work+0x2a0/0x850 kernel/workqueue.c:2268 > #1: ffffc90004327e70 > ((work_completion)(&ctx->exit_work)){+.+.}-{0:0}, at: set_work_data > kernel/workqueue.c:633 [inline] > #1: ffffc90004327e70 > ((work_completion)(&ctx->exit_work)){+.+.}-{0:0}, at: > set_work_pool_and_clear_pending kernel/workqueue.c:661 [inline] > #1: ffffc90004327e70 > ((work_completion)(&ctx->exit_work)){+.+.}-{0:0}, at: > process_one_work+0x2a0/0x850 kernel/workqueue.c:2268 > 1 lock held by in:imklog/8089: > #0: ffff88800ece7ef0 (&f->f_pos_lock){+.+.}-{3:3}, at: > __fdget_pos+0x55/0x60 fs/file.c:990 > > ============================================= > > NMI backtrace for cpu 0 > CPU: 0 PID: 1340 Comm: khungtaskd Not tainted 5.14.0+ #13 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 > Call Trace: > __dump_stack lib/dump_stack.c:88 [inline] > dump_stack_lvl+0x8d/0xcf lib/dump_stack.c:105 > nmi_cpu_backtrace+0x1e9/0x210 lib/nmi_backtrace.c:105 > nmi_trigger_cpumask_backtrace+0x120/0x180 lib/nmi_backtrace.c:62 > trigger_all_cpu_backtrace include/linux/nmi.h:146 [inline] > check_hung_uninterruptible_tasks kernel/hung_task.c:210 [inline] > watchdog+0x4e1/0x980 kernel/hung_task.c:295 > kthread+0x178/0x1b0 kernel/kthread.c:319 > ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295 > Sending NMI from CPU 0 to CPUs 1-3: > NMI backtrace for cpu 1 > CPU: 1 PID: 4934 Comm: systemd-journal Not tainted 5.14.0+ #13 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 > RIP: 0010:lockdep_enabled kernel/locking/lockdep.c:91 [inline] > RIP: 0010:lock_acquire+0x1a4/0x340 kernel/locking/lockdep.c:5601 > Code: 04 25 28 00 00 00 0f 85 b5 01 00 00 48 83 c4 38 5b 5d 41 5c 41 > 5d 41 5e 41 5f c3 65 48 8b 04 25 40 70 01 00 8b 80 f4 09 00 00 <85> c0 > 0f 85 c5 fe ff ff 9c 8f 04 24 fa 48 c7 c7 a0 36 5d 85 e8 e3 > RSP: 0018:ffffc9000086fe18 EFLAGS: 00000246 > RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 > RDX: 0000000000000001 RSI: ffffffff852a8b94 RDI: ffffffff853cae8e > RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 > R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff85a1d3a0 > R13: 0000000000000002 R14: 0000000000000000 R15: 0000000000000000 > FS: 00007f940a97b8c0(0000) GS:ffff88813dc00000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 00007f9407aa3000 CR3: 000000000f49a000 CR4: 0000000000750ee0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > PKRU: 55555554 > Call Trace: > rcu_lock_acquire include/linux/rcupdate.h:267 [inline] > rcu_read_lock include/linux/rcupdate.h:687 [inline] > dput+0x72/0x650 fs/dcache.c:874 > path_put+0x12/0x20 fs/namei.c:557 > do_faccessat+0x1fa/0x370 fs/open.c:455 > do_syscall_x64 arch/x86/entry/common.c:50 [inline] > do_syscall_64+0x34/0xb0 arch/x86/entry/common.c:80 > entry_SYSCALL_64_after_hwframe+0x44/0xae > RIP: 0033:0x7f9409c379c7 > Code: 83 c4 08 48 3d 01 f0 ff ff 73 01 c3 48 8b 0d c8 d4 2b 00 f7 d8 > 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 b8 15 00 00 00 0f 05 <48> 3d > 01 f0 ff ff 73 01 c3 48 8b 0d a1 d4 2b 00 f7 d8 64 89 01 48 > RSP: 002b:00007ffeead2d338 EFLAGS: 00000246 ORIG_RAX: 0000000000000015 > RAX: ffffffffffffffda RBX: 00007ffeead30250 RCX: 00007f9409c379c7 > RDX: 00007f940a6a8a00 RSI: 0000000000000000 RDI: 000055d9e89aa9a3 > RBP: 00007ffeead2d370 R08: 0000000000000000 R09: 0000000000000000 > R10: 0000000000000069 R11: 0000000000000246 R12: 0000000000000000 > R13: 0000000000000000 R14: 00007ffeead30250 R15: 00007ffeead2d860 > NMI backtrace for cpu 2 skipped: idling at native_safe_halt > arch/x86/include/asm/irqflags.h:51 [inline] > NMI backtrace for cpu 2 skipped: idling at arch_safe_halt > arch/x86/include/asm/irqflags.h:89 [inline] > NMI backtrace for cpu 2 skipped: idling at default_idle+0xb/0x10 > arch/x86/kernel/process.c:716 > NMI backtrace for cpu 3 > CPU: 3 PID: 8091 Comm: rs:main Q:Reg Not tainted 5.14.0+ #13 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 > RIP: 0010:memset_erms+0x9/0x10 arch/x86/lib/memset_64.S:64 > Code: c1 e9 03 40 0f b6 f6 48 b8 01 01 01 01 01 01 01 01 48 0f af c6 > f3 48 ab 89 d1 f3 aa 4c 89 c8 c3 90 49 89 f9 40 88 f0 48 89 d1 <f3> aa > 4c 89 c8 c3 90 49 89 fa 40 0f b6 ce 48 b8 01 01 01 01 01 01 > RSP: 0018:ffffc90007a23ba8 EFLAGS: 00010216 > RAX: 0000000000000000 RBX: ffff88810cd3ec78 RCX: 0000000000000c85 > RDX: 0000000000000d05 RSI: 0000000000000000 RDI: ffff888107f7b37b > RBP: 0000000000001000 R08: 0000000000000000 R09: ffff888107f7b2fb > R10: ffffc90007a23978 R11: 0000000000000005 R12: 0000000000000000 > R13: 0000000000000000 R14: 0000000000001000 R15: ffffea00041fdec0 > FS: 00007fd86dba4700(0000) GS:ffff88813dd00000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 000055d6e0599858 CR3: 0000000012b38000 CR4: 0000000000750ee0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > PKRU: 55555554 > Call Trace: > memset include/linux/fortify-string.h:175 [inline] > zero_user_segments include/linux/highmem.h:211 [inline] > ext4_block_write_begin+0x63e/0x980 fs/ext4/inode.c:1089 > ext4_da_write_begin+0x275/0x610 fs/ext4/inode.c:3021 > generic_perform_write+0xce/0x220 mm/filemap.c:3770 > ext4_buffered_write_iter+0xd6/0x190 fs/ext4/file.c:269 > ext4_file_write_iter+0x80/0x940 fs/ext4/file.c:680 > call_write_iter include/linux/fs.h:2163 [inline] > new_sync_write+0x18d/0x260 fs/read_write.c:507 > vfs_write+0x43b/0x4a0 fs/read_write.c:594 > ksys_write+0xd2/0x120 fs/read_write.c:647 > do_syscall_x64 arch/x86/entry/common.c:50 [inline] > do_syscall_64+0x34/0xb0 arch/x86/entry/common.c:80 > entry_SYSCALL_64_after_hwframe+0x44/0xae > RIP: 0033:0x7fd8705e81cd > Code: c2 20 00 00 75 10 b8 01 00 00 00 0f 05 48 3d 01 f0 ff ff 73 31 > c3 48 83 ec 08 e8 ae fc ff ff 48 89 04 24 b8 01 00 00 00 0f 05 <48> 8b > 3c 24 48 89 c2 e8 f7 fc ff ff 48 89 d0 48 83 c4 08 48 3d 01 > RSP: 002b:00007fd86dba3590 EFLAGS: 00000293 ORIG_RAX: 0000000000000001 > RAX: ffffffffffffffda RBX: 00007fd8640268f0 RCX: 00007fd8705e81cd > RDX: 00000000000004ce RSI: 00007fd8640268f0 RDI: 0000000000000008 > RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 > R10: 0000000000000000 R11: 0000000000000293 R12: 00007fd864026670 > R13: 00007fd86dba35b0 R14: 00005563e5cf4440 R15: 00000000000004ce > ---------------- > Code disassembly (best guess): > 0: 04 25 add $0x25,%al > 2: 28 00 sub %al,(%rax) > 4: 00 00 add %al,(%rax) > 6: 0f 85 b5 01 00 00 jne 0x1c1 > c: 48 83 c4 38 add $0x38,%rsp > 10: 5b pop %rbx > 11: 5d pop %rbp > 12: 41 5c pop %r12 > 14: 41 5d pop %r13 > 16: 41 5e pop %r14 > 18: 41 5f pop %r15 > 1a: c3 retq > 1b: 65 48 8b 04 25 40 70 mov %gs:0x17040,%rax > 22: 01 00 > 24: 8b 80 f4 09 00 00 mov 0x9f4(%rax),%eax > * 2a: 85 c0 test %eax,%eax <-- trapping instruction > 2c: 0f 85 c5 fe ff ff jne 0xfffffef7 > 32: 9c pushfq > 33: 8f 04 24 popq (%rsp) > 36: fa cli > 37: 48 c7 c7 a0 36 5d 85 mov $0xffffffff855d36a0,%rdi > 3e: e8 .byte 0xe8 > 3f: e3 .byte 0xe3% > > Hao Sun <[email protected]> 于2021年9月8日周三 上午9:01写道: >> >> Pavel Begunkov <[email protected]> 于2021年9月8日周三 上午5:31写道: >>> >>> On 9/7/21 8:30 PM, Jens Axboe wrote: >>>> On 9/7/21 5:50 AM, Hao Sun wrote: >>>>> Hello, >>>>> >>>>> When using Healer to fuzz the latest Linux kernel, the following crash >>>>> was triggered. >>>>> >>>>> HEAD commit: 7d2a07b76933 Linux 5.14 >>>>> git tree: upstream >>>>> console output: >>>>> https://drive.google.com/file/d/1c8uRooM0TwJiTIwEviOCB4RC-hhOgGHR/view?usp=sharing >>>>> kernel config: https://drive.google.com/file/d/1XD9WYDViQLSXN7RGwH8AGGDvP9JvOghx/view?usp=sharing >>>>> Similar report: >>>>> https://groups.google.com/u/1/g/syzkaller-bugs/c/FvdcTiJIGtY/m/PcXkoenUAAAJ >>>>> >>>>> Sorry, I don't have a reproducer for this crash, hope the symbolized >>>>> report can help. >>>>> If you fix this issue, please add the following tag to the commit: >>>>> Reported-by: Hao Sun <[email protected]> >>>> >>>> Would be great with a reproducer for this one, though... >>> >>> And syzbot usually sends an execution log with all syz programs >>> it run, which may be helpful. Any chance you have anything similar >>> left? >>> >> >> Yes, found it[1]. Here is an execution history with latest 1024 >> executed progs before crash saved. >> Hope it can help. I'll also follow this crash closely, see if Healer >> can find a reproducer and send it to you once it found. >> >> [1] https://drive.google.com/file/d/14k8qOFeyKPD4HsqOpIjud3b9jsxFSo-u/view?usp=sharing >> >>> -- >>> Pavel Begunkov -- Pavel Begunkov ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: INFO: task hung in io_uring_cancel_generic 2021-09-08 20:16 ` Pavel Begunkov @ 2021-09-13 2:26 ` Hao Sun 2021-09-13 2:50 ` Jens Axboe 2021-09-13 8:30 ` Pavel Begunkov 0 siblings, 2 replies; 11+ messages in thread From: Hao Sun @ 2021-09-13 2:26 UTC (permalink / raw) To: Pavel Begunkov, Jens Axboe; +Cc: io-uring, linux-kernel Hi Healer found a C reproducer for this crash ("INFO: task hung in io_ring_exit_work"). HEAD commit: 4b93c544e90e-thunderbolt: test: split up test cases git tree: upstream console output: https://drive.google.com/file/d/1NswMU2yMRTc8-EqbZcVvcJejV92cuZIk/view?usp=sharing kernel config: https://drive.google.com/file/d/1c0u2EeRDhRO-ZCxr9MP2VvAtJd6kfg-p/view?usp=sharing C reproducer: https://drive.google.com/file/d/170wk5_T8mYDaAtDcrdVi2UU9_dW1894s/view?usp=sharing Syzlang reproducer: https://drive.google.com/file/d/1eo-jAS9lncm4i-1kaCBkexrjpQHXboBq/view?usp=sharing If you fix this issue, please add the following tag to the commit: Reported-by: Hao Sun <[email protected]> INFO: task kworker/u10:0:24 blocked for more than 143 seconds. Not tainted 5.14.0+ #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:kworker/u10:0 state:D stack:27224 pid: 24 ppid: 2 flags:0x00004000 Workqueue: events_unbound io_ring_exit_work Call Trace: context_switch kernel/sched/core.c:4940 [inline] __schedule+0xcd9/0x2530 kernel/sched/core.c:6287 schedule+0xd3/0x270 kernel/sched/core.c:6366 schedule_timeout+0x5e5/0x890 kernel/time/timer.c:1857 do_wait_for_common kernel/sched/completion.c:85 [inline] __wait_for_common kernel/sched/completion.c:106 [inline] wait_for_common kernel/sched/completion.c:117 [inline] wait_for_completion+0x17d/0x280 kernel/sched/completion.c:138 io_ring_exit_work+0x530/0x1540 fs/io_uring.c:9333 process_one_work+0x9df/0x16d0 kernel/workqueue.c:2297 worker_thread+0x90/0xed0 kernel/workqueue.c:2444 kthread+0x3e5/0x4d0 kernel/kthread.c:319 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295 Showing all locks held in the system: 2 locks held by kworker/u10:0/24: #0: ffff888010c71138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline] #0: ffff888010c71138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: arch_atomic_long_set include/linux/atomic/atomic-long.h:41 [inline] #0: ffff888010c71138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: atomic_long_set include/linux/atomic/atomic-instrumented.h:1198 [inline] #0: ffff888010c71138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:634 [inline] #0: ffff888010c71138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:661 [inline] #0: ffff888010c71138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x8c3/0x16d0 kernel/workqueue.c:2268 #1: ffffc900007efdc8 ((work_completion)(&ctx->exit_work)){+.+.}-{0:0}, at: process_one_work+0x8f7/0x16d0 kernel/workqueue.c:2272 1 lock held by khungtaskd/1161: #0: ffffffff8b97dde0 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x53/0x260 kernel/locking/lockdep.c:6446 1 lock held by in:imklog/8220: #0: ffff88802031d4f0 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0xe9/0x100 fs/file.c:990 ============================================= NMI backtrace for cpu 0 CPU: 0 PID: 1161 Comm: khungtaskd Not tainted 5.14.0+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:105 nmi_cpu_backtrace.cold+0x47/0x144 lib/nmi_backtrace.c:105 nmi_trigger_cpumask_backtrace+0x1e1/0x220 lib/nmi_backtrace.c:62 trigger_all_cpu_backtrace include/linux/nmi.h:146 [inline] check_hung_uninterruptible_tasks kernel/hung_task.c:210 [inline] watchdog+0xcc8/0x1010 kernel/hung_task.c:295 kthread+0x3e5/0x4d0 kernel/kthread.c:319 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295 Sending NMI from CPU 0 to CPUs 1-3: NMI backtrace for cpu 1 CPU: 1 PID: 10654 Comm: syz-executor Not tainted 5.14.0+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 RIP: 0010:__sanitizer_cov_trace_pc+0x1c/0x40 kernel/kcov.c:197 Code: 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 65 48 8b 0c 25 40 f0 01 00 bf 02 00 00 00 48 89 ce 4c 8b 04 24 e8 76 ff ff ff 84 c0 <74> 20 48 8b 91 20 15 00 00 8b 89 1c 15 00 00 48 8b 02 48 83 c0 01 RSP: 0018:ffffc900085df560 EFLAGS: 00000202 RAX: 0000000000000001 RBX: 0000000000000002 RCX: ffff888011b08000 RDX: 0000000000000000 RSI: ffff888011b08000 RDI: 0000000000000002 RBP: ffffc900085df728 R08: ffffffff88b6a994 R09: 0000000000000002 R10: 0000000000000005 R11: ffffed1002361000 R12: ffff88802d8fa100 R13: ffffc900085df620 R14: 00000000fffffe00 R15: 0000000000000002 FS: 00007f7828b86700(0000) GS:ffff888135c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f3570c3c0a0 CR3: 0000000106990000 CR4: 0000000000350ee0 Call Trace: p9_client_rpc+0x3d4/0x11c0 net/9p/client.c:759 p9_client_flush+0xe9/0x1e0 net/9p/client.c:667 p9_client_rpc+0x107e/0x11c0 net/9p/client.c:784 p9_client_version net/9p/client.c:955 [inline] p9_client_create+0xbe0/0x13e0 net/9p/client.c:1055 v9fs_session_init+0x1e4/0x1900 fs/9p/v9fs.c:406 v9fs_mount+0x73/0x9d0 fs/9p/vfs_super.c:126 legacy_get_tree+0x105/0x220 fs/fs_context.c:610 vfs_get_tree+0x89/0x2f0 fs/super.c:1498 do_new_mount fs/namespace.c:2988 [inline] path_mount+0x1228/0x1cb0 fs/namespace.c:3318 do_mount+0xf3/0x110 fs/namespace.c:3331 __do_sys_mount fs/namespace.c:3539 [inline] __se_sys_mount fs/namespace.c:3516 [inline] __x64_sys_mount+0x18f/0x230 fs/namespace.c:3516 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x4739cd Code: 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f7828b85c58 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5 RAX: ffffffffffffffda RBX: 000000000059c0a0 RCX: 00000000004739cd RDX: 0000000020000380 RSI: 0000000020000340 RDI: 0000000000000000 RBP: 00000000004ebd80 R08: 00000000200003c0 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 000000000059c0a0 R13: 00007ffd0845607f R14: 00007ffd08456220 R15: 00007f7828b85dc0 NMI backtrace for cpu 3 CPU: 3 PID: 4948 Comm: systemd-journal Not tainted 5.14.0+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 RIP: 0010:lookup_chain_cache kernel/locking/lockdep.c:3682 [inline] RIP: 0010:lookup_chain_cache_add kernel/locking/lockdep.c:3702 [inline] RIP: 0010:validate_chain kernel/locking/lockdep.c:3757 [inline] RIP: 0010:__lock_acquire+0x1877/0x57e0 kernel/locking/lockdep.c:5015 Code: e5 40 08 c7 8f 48 89 5c 24 08 0f 84 6f 04 00 00 4a 8d 14 e5 40 08 c7 8f 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 80 3c 02 00 <0f> 85 66 2f 00 00 4a 8b 1c e5 40 08 c7 8f 48 85 db 74 58 48 83 eb RSP: 0018:ffffc9000107f868 EFLAGS: 00000046 RAX: dffffc0000000000 RBX: ffffffff8fc8e4b0 RCX: ffffffff815b148b RDX: 1ffffffff1f91c96 RSI: 0000000000000008 RDI: ffffffff8fd008f8 RBP: ffff888013f4dfa0 R08: 0000000000000001 R09: fffffbfff1fa0120 R10: ffffffff8fd008ff R11: fffffbfff1fa011f R12: 0000000000003b8e R13: ffff888013f4d580 R14: d03c86651a37c6c3 R15: 0000000000000000 FS: 00007f35737908c0(0000) GS:ffff888135d00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f3570caf000 CR3: 000000001b4d1000 CR4: 0000000000350ee0 Call Trace: lock_acquire kernel/locking/lockdep.c:5625 [inline] lock_acquire+0x1ab/0x520 kernel/locking/lockdep.c:5590 seqcount_lockdep_reader_access include/linux/seqlock.h:103 [inline] set_root+0x3be/0x560 fs/namei.c:940 nd_jump_root+0x38d/0x520 fs/namei.c:961 path_init+0xf81/0x1700 fs/namei.c:2359 path_lookupat.isra.0+0x30/0x580 fs/namei.c:2439 __filename_lookup+0x1ca/0x410 fs/namei.c:2478 filename_lookup fs/namei.c:2494 [inline] user_path_at_empty+0x42/0x60 fs/namei.c:2801 user_path_at include/linux/namei.h:57 [inline] do_faccessat+0x127/0x850 fs/open.c:421 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f3572a4c9c7 Code: 83 c4 08 48 3d 01 f0 ff ff 73 01 c3 48 8b 0d c8 d4 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 b8 15 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a1 d4 2b 00 f7 d8 64 89 01 48 RSP: 002b:00007ffcba7c9ca8 EFLAGS: 00000246 ORIG_RAX: 0000000000000015 RAX: ffffffffffffffda RBX: 00007ffcba7cccd0 RCX: 00007f3572a4c9c7 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00005623bd6869a3 RBP: 00007ffcba7c9df0 R08: 00005623bd67c3e5 R09: 0000000000000018 R10: 0000000000000069 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000000 R14: 00005623bd7558a0 R15: 00007ffcba7ca2e0 NMI backtrace for cpu 2 skipped: idling at native_safe_halt arch/x86/include/asm/irqflags.h:51 [inline] NMI backtrace for cpu 2 skipped: idling at arch_safe_halt arch/x86/include/asm/irqflags.h:89 [inline] NMI backtrace for cpu 2 skipped: idling at default_idle+0xb/0x10 arch/x86/kernel/process.c:716 ---------------- Code disassembly (best guess): 0: 66 66 2e 0f 1f 84 00 data16 nopw %cs:0x0(%rax,%rax,1) 7: 00 00 00 00 b: 0f 1f 00 nopl (%rax) e: 65 48 8b 0c 25 40 f0 mov %gs:0x1f040,%rcx 15: 01 00 17: bf 02 00 00 00 mov $0x2,%edi 1c: 48 89 ce mov %rcx,%rsi 1f: 4c 8b 04 24 mov (%rsp),%r8 23: e8 76 ff ff ff callq 0xffffff9e 28: 84 c0 test %al,%al * 2a: 74 20 je 0x4c <-- trapping instruction 2c: 48 8b 91 20 15 00 00 mov 0x1520(%rcx),%rdx 33: 8b 89 1c 15 00 00 mov 0x151c(%rcx),%ecx 39: 48 8b 02 mov (%rdx),%rax 3c: 48 83 c0 01 add $0x1,%rax% Pavel Begunkov <[email protected]> 于2021年9月9日周四 上午4:16写道: > > On 9/8/21 7:57 AM, Hao Sun wrote: > > Another similar report: INFO: task hung in io_ring_exit_work > > > > INFO: task kworker/u9:3:1867 blocked for more than 143 seconds. > > Not tainted 5.14.0+ #13 > > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > > task:kworker/u9:3 state:D stack:12344 pid: 1867 ppid: 2 flags:0x00004000 > > Workqueue: events_unbound io_ring_exit_work > > This one is different. I guess it may be difficult for syz to > find a repro because of the delay before the warning appears. > How about to reduce it to a lower value? like 2-5 seconds > > echo 5 > /proc/sys/kernel/hung_task_timeout_secs > > > Call Trace: > > context_switch kernel/sched/core.c:4940 [inline] > > __schedule+0x323/0xae0 kernel/sched/core.c:6287 > > schedule+0x36/0xe0 kernel/sched/core.c:6366 > > schedule_timeout+0x189/0x430 kernel/time/timer.c:1857 > > do_wait_for_common kernel/sched/completion.c:85 [inline] > > __wait_for_common kernel/sched/completion.c:106 [inline] > > wait_for_common kernel/sched/completion.c:117 [inline] > > wait_for_completion+0xb4/0x110 kernel/sched/completion.c:138 > > io_ring_exit_work+0x287/0x850 fs/io_uring.c:9297 > > process_one_work+0x359/0x850 kernel/workqueue.c:2297 > > worker_thread+0x41/0x4d0 kernel/workqueue.c:2444 > > kthread+0x178/0x1b0 kernel/kthread.c:319 > > ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295 > > INFO: task kworker/u9:4:2050 blocked for more than 143 seconds. > > Not tainted 5.14.0+ #13 > > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > > task:kworker/u9:4 state:D stack:12232 pid: 2050 ppid: 2 flags:0x00004000 > > Workqueue: events_unbound io_ring_exit_work > > Call Trace: > > context_switch kernel/sched/core.c:4940 [inline] > > __schedule+0x323/0xae0 kernel/sched/core.c:6287 > > schedule+0x36/0xe0 kernel/sched/core.c:6366 > > schedule_timeout+0x189/0x430 kernel/time/timer.c:1857 > > do_wait_for_common kernel/sched/completion.c:85 [inline] > > __wait_for_common kernel/sched/completion.c:106 [inline] > > wait_for_common kernel/sched/completion.c:117 [inline] > > wait_for_completion+0xb4/0x110 kernel/sched/completion.c:138 > > io_ring_exit_work+0x287/0x850 fs/io_uring.c:9297 > > process_one_work+0x359/0x850 kernel/workqueue.c:2297 > > worker_thread+0x41/0x4d0 kernel/workqueue.c:2444 > > kthread+0x178/0x1b0 kernel/kthread.c:319 > > ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295 > > > > Showing all locks held in the system: > > 1 lock held by khungtaskd/1340: > > #0: ffffffff85a1d3a0 (rcu_read_lock){....}-{1:2}, at: > > debug_show_all_locks+0xe/0x1a0 kernel/locking/lockdep.c:6446 > > 2 locks held by kworker/u9:3/1867: > > #0: ffff888008c5c938 ((wq_completion)events_unbound){+.+.}-{0:0}, at: > > set_work_data kernel/workqueue.c:633 [inline] > > #0: ffff888008c5c938 ((wq_completion)events_unbound){+.+.}-{0:0}, at: > > set_work_pool_and_clear_pending kernel/workqueue.c:661 [inline] > > #0: ffff888008c5c938 ((wq_completion)events_unbound){+.+.}-{0:0}, at: > > process_one_work+0x2a0/0x850 kernel/workqueue.c:2268 > > #1: ffffc90003e67e70 > > ((work_completion)(&ctx->exit_work)){+.+.}-{0:0}, at: set_work_data > > kernel/workqueue.c:633 [inline] > > #1: ffffc90003e67e70 > > ((work_completion)(&ctx->exit_work)){+.+.}-{0:0}, at: > > set_work_pool_and_clear_pending kernel/workqueue.c:661 [inline] > > #1: ffffc90003e67e70 > > ((work_completion)(&ctx->exit_work)){+.+.}-{0:0}, at: > > process_one_work+0x2a0/0x850 kernel/workqueue.c:2268 > > 2 locks held by kworker/u9:4/2050: > > #0: ffff888008c5c938 ((wq_completion)events_unbound){+.+.}-{0:0}, at: > > set_work_data kernel/workqueue.c:633 [inline] > > #0: ffff888008c5c938 ((wq_completion)events_unbound){+.+.}-{0:0}, at: > > set_work_pool_and_clear_pending kernel/workqueue.c:661 [inline] > > #0: ffff888008c5c938 ((wq_completion)events_unbound){+.+.}-{0:0}, at: > > process_one_work+0x2a0/0x850 kernel/workqueue.c:2268 > > #1: ffffc90004327e70 > > ((work_completion)(&ctx->exit_work)){+.+.}-{0:0}, at: set_work_data > > kernel/workqueue.c:633 [inline] > > #1: ffffc90004327e70 > > ((work_completion)(&ctx->exit_work)){+.+.}-{0:0}, at: > > set_work_pool_and_clear_pending kernel/workqueue.c:661 [inline] > > #1: ffffc90004327e70 > > ((work_completion)(&ctx->exit_work)){+.+.}-{0:0}, at: > > process_one_work+0x2a0/0x850 kernel/workqueue.c:2268 > > 1 lock held by in:imklog/8089: > > #0: ffff88800ece7ef0 (&f->f_pos_lock){+.+.}-{3:3}, at: > > __fdget_pos+0x55/0x60 fs/file.c:990 > > > > ============================================= > > > > NMI backtrace for cpu 0 > > CPU: 0 PID: 1340 Comm: khungtaskd Not tainted 5.14.0+ #13 > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > > rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 > > Call Trace: > > __dump_stack lib/dump_stack.c:88 [inline] > > dump_stack_lvl+0x8d/0xcf lib/dump_stack.c:105 > > nmi_cpu_backtrace+0x1e9/0x210 lib/nmi_backtrace.c:105 > > nmi_trigger_cpumask_backtrace+0x120/0x180 lib/nmi_backtrace.c:62 > > trigger_all_cpu_backtrace include/linux/nmi.h:146 [inline] > > check_hung_uninterruptible_tasks kernel/hung_task.c:210 [inline] > > watchdog+0x4e1/0x980 kernel/hung_task.c:295 > > kthread+0x178/0x1b0 kernel/kthread.c:319 > > ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295 > > Sending NMI from CPU 0 to CPUs 1-3: > > NMI backtrace for cpu 1 > > CPU: 1 PID: 4934 Comm: systemd-journal Not tainted 5.14.0+ #13 > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > > rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 > > RIP: 0010:lockdep_enabled kernel/locking/lockdep.c:91 [inline] > > RIP: 0010:lock_acquire+0x1a4/0x340 kernel/locking/lockdep.c:5601 > > Code: 04 25 28 00 00 00 0f 85 b5 01 00 00 48 83 c4 38 5b 5d 41 5c 41 > > 5d 41 5e 41 5f c3 65 48 8b 04 25 40 70 01 00 8b 80 f4 09 00 00 <85> c0 > > 0f 85 c5 fe ff ff 9c 8f 04 24 fa 48 c7 c7 a0 36 5d 85 e8 e3 > > RSP: 0018:ffffc9000086fe18 EFLAGS: 00000246 > > RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 > > RDX: 0000000000000001 RSI: ffffffff852a8b94 RDI: ffffffff853cae8e > > RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 > > R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff85a1d3a0 > > R13: 0000000000000002 R14: 0000000000000000 R15: 0000000000000000 > > FS: 00007f940a97b8c0(0000) GS:ffff88813dc00000(0000) knlGS:0000000000000000 > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > CR2: 00007f9407aa3000 CR3: 000000000f49a000 CR4: 0000000000750ee0 > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > > PKRU: 55555554 > > Call Trace: > > rcu_lock_acquire include/linux/rcupdate.h:267 [inline] > > rcu_read_lock include/linux/rcupdate.h:687 [inline] > > dput+0x72/0x650 fs/dcache.c:874 > > path_put+0x12/0x20 fs/namei.c:557 > > do_faccessat+0x1fa/0x370 fs/open.c:455 > > do_syscall_x64 arch/x86/entry/common.c:50 [inline] > > do_syscall_64+0x34/0xb0 arch/x86/entry/common.c:80 > > entry_SYSCALL_64_after_hwframe+0x44/0xae > > RIP: 0033:0x7f9409c379c7 > > Code: 83 c4 08 48 3d 01 f0 ff ff 73 01 c3 48 8b 0d c8 d4 2b 00 f7 d8 > > 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 b8 15 00 00 00 0f 05 <48> 3d > > 01 f0 ff ff 73 01 c3 48 8b 0d a1 d4 2b 00 f7 d8 64 89 01 48 > > RSP: 002b:00007ffeead2d338 EFLAGS: 00000246 ORIG_RAX: 0000000000000015 > > RAX: ffffffffffffffda RBX: 00007ffeead30250 RCX: 00007f9409c379c7 > > RDX: 00007f940a6a8a00 RSI: 0000000000000000 RDI: 000055d9e89aa9a3 > > RBP: 00007ffeead2d370 R08: 0000000000000000 R09: 0000000000000000 > > R10: 0000000000000069 R11: 0000000000000246 R12: 0000000000000000 > > R13: 0000000000000000 R14: 00007ffeead30250 R15: 00007ffeead2d860 > > NMI backtrace for cpu 2 skipped: idling at native_safe_halt > > arch/x86/include/asm/irqflags.h:51 [inline] > > NMI backtrace for cpu 2 skipped: idling at arch_safe_halt > > arch/x86/include/asm/irqflags.h:89 [inline] > > NMI backtrace for cpu 2 skipped: idling at default_idle+0xb/0x10 > > arch/x86/kernel/process.c:716 > > NMI backtrace for cpu 3 > > CPU: 3 PID: 8091 Comm: rs:main Q:Reg Not tainted 5.14.0+ #13 > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > > rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 > > RIP: 0010:memset_erms+0x9/0x10 arch/x86/lib/memset_64.S:64 > > Code: c1 e9 03 40 0f b6 f6 48 b8 01 01 01 01 01 01 01 01 48 0f af c6 > > f3 48 ab 89 d1 f3 aa 4c 89 c8 c3 90 49 89 f9 40 88 f0 48 89 d1 <f3> aa > > 4c 89 c8 c3 90 49 89 fa 40 0f b6 ce 48 b8 01 01 01 01 01 01 > > RSP: 0018:ffffc90007a23ba8 EFLAGS: 00010216 > > RAX: 0000000000000000 RBX: ffff88810cd3ec78 RCX: 0000000000000c85 > > RDX: 0000000000000d05 RSI: 0000000000000000 RDI: ffff888107f7b37b > > RBP: 0000000000001000 R08: 0000000000000000 R09: ffff888107f7b2fb > > R10: ffffc90007a23978 R11: 0000000000000005 R12: 0000000000000000 > > R13: 0000000000000000 R14: 0000000000001000 R15: ffffea00041fdec0 > > FS: 00007fd86dba4700(0000) GS:ffff88813dd00000(0000) knlGS:0000000000000000 > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > CR2: 000055d6e0599858 CR3: 0000000012b38000 CR4: 0000000000750ee0 > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > > PKRU: 55555554 > > Call Trace: > > memset include/linux/fortify-string.h:175 [inline] > > zero_user_segments include/linux/highmem.h:211 [inline] > > ext4_block_write_begin+0x63e/0x980 fs/ext4/inode.c:1089 > > ext4_da_write_begin+0x275/0x610 fs/ext4/inode.c:3021 > > generic_perform_write+0xce/0x220 mm/filemap.c:3770 > > ext4_buffered_write_iter+0xd6/0x190 fs/ext4/file.c:269 > > ext4_file_write_iter+0x80/0x940 fs/ext4/file.c:680 > > call_write_iter include/linux/fs.h:2163 [inline] > > new_sync_write+0x18d/0x260 fs/read_write.c:507 > > vfs_write+0x43b/0x4a0 fs/read_write.c:594 > > ksys_write+0xd2/0x120 fs/read_write.c:647 > > do_syscall_x64 arch/x86/entry/common.c:50 [inline] > > do_syscall_64+0x34/0xb0 arch/x86/entry/common.c:80 > > entry_SYSCALL_64_after_hwframe+0x44/0xae > > RIP: 0033:0x7fd8705e81cd > > Code: c2 20 00 00 75 10 b8 01 00 00 00 0f 05 48 3d 01 f0 ff ff 73 31 > > c3 48 83 ec 08 e8 ae fc ff ff 48 89 04 24 b8 01 00 00 00 0f 05 <48> 8b > > 3c 24 48 89 c2 e8 f7 fc ff ff 48 89 d0 48 83 c4 08 48 3d 01 > > RSP: 002b:00007fd86dba3590 EFLAGS: 00000293 ORIG_RAX: 0000000000000001 > > RAX: ffffffffffffffda RBX: 00007fd8640268f0 RCX: 00007fd8705e81cd > > RDX: 00000000000004ce RSI: 00007fd8640268f0 RDI: 0000000000000008 > > RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 > > R10: 0000000000000000 R11: 0000000000000293 R12: 00007fd864026670 > > R13: 00007fd86dba35b0 R14: 00005563e5cf4440 R15: 00000000000004ce > > ---------------- > > Code disassembly (best guess): > > 0: 04 25 add $0x25,%al > > 2: 28 00 sub %al,(%rax) > > 4: 00 00 add %al,(%rax) > > 6: 0f 85 b5 01 00 00 jne 0x1c1 > > c: 48 83 c4 38 add $0x38,%rsp > > 10: 5b pop %rbx > > 11: 5d pop %rbp > > 12: 41 5c pop %r12 > > 14: 41 5d pop %r13 > > 16: 41 5e pop %r14 > > 18: 41 5f pop %r15 > > 1a: c3 retq > > 1b: 65 48 8b 04 25 40 70 mov %gs:0x17040,%rax > > 22: 01 00 > > 24: 8b 80 f4 09 00 00 mov 0x9f4(%rax),%eax > > * 2a: 85 c0 test %eax,%eax <-- trapping instruction > > 2c: 0f 85 c5 fe ff ff jne 0xfffffef7 > > 32: 9c pushfq > > 33: 8f 04 24 popq (%rsp) > > 36: fa cli > > 37: 48 c7 c7 a0 36 5d 85 mov $0xffffffff855d36a0,%rdi > > 3e: e8 .byte 0xe8 > > 3f: e3 .byte 0xe3% > > > > Hao Sun <[email protected]> 于2021年9月8日周三 上午9:01写道: > >> > >> Pavel Begunkov <[email protected]> 于2021年9月8日周三 上午5:31写道: > >>> > >>> On 9/7/21 8:30 PM, Jens Axboe wrote: > >>>> On 9/7/21 5:50 AM, Hao Sun wrote: > >>>>> Hello, > >>>>> > >>>>> When using Healer to fuzz the latest Linux kernel, the following crash > >>>>> was triggered. > >>>>> > >>>>> HEAD commit: 7d2a07b76933 Linux 5.14 > >>>>> git tree: upstream > >>>>> console output: > >>>>> https://drive.google.com/file/d/1c8uRooM0TwJiTIwEviOCB4RC-hhOgGHR/view?usp=sharing > >>>>> kernel config: https://drive.google.com/file/d/1XD9WYDViQLSXN7RGwH8AGGDvP9JvOghx/view?usp=sharing > >>>>> Similar report: > >>>>> https://groups.google.com/u/1/g/syzkaller-bugs/c/FvdcTiJIGtY/m/PcXkoenUAAAJ > >>>>> > >>>>> Sorry, I don't have a reproducer for this crash, hope the symbolized > >>>>> report can help. > >>>>> If you fix this issue, please add the following tag to the commit: > >>>>> Reported-by: Hao Sun <[email protected]> > >>>> > >>>> Would be great with a reproducer for this one, though... > >>> > >>> And syzbot usually sends an execution log with all syz programs > >>> it run, which may be helpful. Any chance you have anything similar > >>> left? > >>> > >> > >> Yes, found it[1]. Here is an execution history with latest 1024 > >> executed progs before crash saved. > >> Hope it can help. I'll also follow this crash closely, see if Healer > >> can find a reproducer and send it to you once it found. > >> > >> [1] https://drive.google.com/file/d/14k8qOFeyKPD4HsqOpIjud3b9jsxFSo-u/view?usp=sharing > >> > >>> -- > >>> Pavel Begunkov > > -- > Pavel Begunkov ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: INFO: task hung in io_uring_cancel_generic 2021-09-13 2:26 ` Hao Sun @ 2021-09-13 2:50 ` Jens Axboe 2021-09-13 3:21 ` Hao Sun 2021-09-13 8:30 ` Pavel Begunkov 1 sibling, 1 reply; 11+ messages in thread From: Jens Axboe @ 2021-09-13 2:50 UTC (permalink / raw) To: Hao Sun, Pavel Begunkov; +Cc: io-uring, linux-kernel On 9/12/21 8:26 PM, Hao Sun wrote: > Hi > > Healer found a C reproducer for this crash ("INFO: task hung in > io_ring_exit_work"). > > HEAD commit: 4b93c544e90e-thunderbolt: test: split up test cases Does this reproduce on 5.15-rc1? We had a few hang cases fixed for io-wq since that commit 6 days ago. -- Jens Axboe ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: INFO: task hung in io_uring_cancel_generic 2021-09-13 2:50 ` Jens Axboe @ 2021-09-13 3:21 ` Hao Sun 0 siblings, 0 replies; 11+ messages in thread From: Hao Sun @ 2021-09-13 3:21 UTC (permalink / raw) To: Jens Axboe; +Cc: Pavel Begunkov, io-uring, linux-kernel > Jens Axboe <[email protected]> 于2021年9月13日周一 上午10:50写道: > > On 9/12/21 8:26 PM, Hao Sun wrote: > > Hi > > > > Healer found a C reproducer for this crash ("INFO: task hung in > > io_ring_exit_work"). > > > > HEAD commit: 4b93c544e90e-thunderbolt: test: split up test cases > > Does this reproduce on 5.15-rc1? We had a few hang cases fixed for io-wq > since that commit 6 days ago. Just tried it. No, it did not crash the kernel on 5.15-rc1. Following log was printed repeatedly: [ 647.478557][ T6807] loop0: p1 p2 < > p3 p4 [ 647.478922][ T6807] loop0: p1 size 11290111 extends beyond EOD, truncated [ 647.481111][ T6807] loop0: p3 size 1914664839 extends beyond EOD, truncated [ 647.482512][ T6807] loop0: p4 size 3389030400 extends beyond EOD, truncated It seems that this crash was fixed. Sorry for the dup report. > > -- > Jens Axboe > ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: INFO: task hung in io_uring_cancel_generic 2021-09-13 2:26 ` Hao Sun 2021-09-13 2:50 ` Jens Axboe @ 2021-09-13 8:30 ` Pavel Begunkov 2021-09-14 3:01 ` Hao Sun 1 sibling, 1 reply; 11+ messages in thread From: Pavel Begunkov @ 2021-09-13 8:30 UTC (permalink / raw) To: Hao Sun, Jens Axboe; +Cc: io-uring, linux-kernel On 9/13/21 3:26 AM, Hao Sun wrote: > Hi > > Healer found a C reproducer for this crash ("INFO: task hung in > io_ring_exit_work"). > > HEAD commit: 4b93c544e90e-thunderbolt: test: split up test cases > git tree: upstream > console output: > https://drive.google.com/file/d/1NswMU2yMRTc8-EqbZcVvcJejV92cuZIk/view?usp=sharing > kernel config: https://drive.google.com/file/d/1c0u2EeRDhRO-ZCxr9MP2VvAtJd6kfg-p/view?usp=sharing > C reproducer: https://drive.google.com/file/d/170wk5_T8mYDaAtDcrdVi2UU9_dW1894s/view?usp=sharing > Syzlang reproducer: > https://drive.google.com/file/d/1eo-jAS9lncm4i-1kaCBkexrjpQHXboBq/view?usp=sharing > > If you fix this issue, please add the following tag to the commit: > Reported-by: Hao Sun <[email protected]> I don't see the repro using io_uring at all. Can it be because of the delay before the warning shows itself? 120 secs, this appeared after 143. [...] -- Pavel Begunkov ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: INFO: task hung in io_uring_cancel_generic 2021-09-13 8:30 ` Pavel Begunkov @ 2021-09-14 3:01 ` Hao Sun 0 siblings, 0 replies; 11+ messages in thread From: Hao Sun @ 2021-09-14 3:01 UTC (permalink / raw) To: Pavel Begunkov; +Cc: Jens Axboe, io-uring, linux-kernel Pavel Begunkov <[email protected]> 于2021年9月13日周一 下午4:30写道: > > On 9/13/21 3:26 AM, Hao Sun wrote: > > Hi > > > > Healer found a C reproducer for this crash ("INFO: task hung in > > io_ring_exit_work"). > > > > HEAD commit: 4b93c544e90e-thunderbolt: test: split up test cases > > git tree: upstream > > console output: > > https://drive.google.com/file/d/1NswMU2yMRTc8-EqbZcVvcJejV92cuZIk/view?usp=sharing > > kernel config: https://drive.google.com/file/d/1c0u2EeRDhRO-ZCxr9MP2VvAtJd6kfg-p/view?usp=sharing > > C reproducer: https://drive.google.com/file/d/170wk5_T8mYDaAtDcrdVi2UU9_dW1894s/view?usp=sharing > > Syzlang reproducer: > > https://drive.google.com/file/d/1eo-jAS9lncm4i-1kaCBkexrjpQHXboBq/view?usp=sharing > > > > If you fix this issue, please add the following tag to the commit: > > Reported-by: Hao Sun <[email protected]> > > I don't see the repro using io_uring at all. Can it be because of > the delay before the warning shows itself? 120 secs, this appeared > after 143. > I think the crash was most likely fixed. Here is what I've done. First, I re-run the whole execution history (https://drive.google.com/file/d/1NswMU2yMRTc8-EqbZcVvcJejV92cuZIk/view?usp=sharing) with `syz-repro` on latest kernel (6880fa6c5660 Linux 5.15-rc1). The kernel did not crash at all. Then, I re-run the history on the original version of the kernel (4b93c544e90e-thunderbolt: test: split up test cases). It crashed and task hang happened but with a different location ("io_wq_submit_work"). Since `syz-repro` is smart enough and will give prog enough timeout to be executed when the crash type is `Hang` (see https://github.com/google/syzkaller/blob/master/pkg/repro/repro.go#L98), the delay before a warning can be handled properly. However, I'll still keep track of this crash since it was still not reproduced yet. > [...] > > -- > Pavel Begunkov ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2021-09-14 3:01 UTC | newest] Thread overview: 11+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2021-09-07 11:50 INFO: task hung in io_uring_cancel_generic Hao Sun 2021-09-07 19:30 ` Jens Axboe 2021-09-07 21:31 ` Pavel Begunkov 2021-09-08 1:01 ` Hao Sun 2021-09-08 6:57 ` Hao Sun 2021-09-08 20:16 ` Pavel Begunkov 2021-09-13 2:26 ` Hao Sun 2021-09-13 2:50 ` Jens Axboe 2021-09-13 3:21 ` Hao Sun 2021-09-13 8:30 ` Pavel Begunkov 2021-09-14 3:01 ` Hao Sun
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox