INFO: task hung in linkwatch

public inbox for [email protected]

* INFO: task hung in linkwatch_event (2)
@ 2020-04-29  9:59 syzbot
  2020-05-06  1:38 ` Yunsheng Lin
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: syzbot @ 2020-04-29  9:59 UTC (permalink / raw)
  To: allison, aviad.krawczyk, axboe, davem, gregkh, io-uring, kuba,
	linux-fsdevel, linux-kernel, linyunsheng, luobin9, netdev,
	syzkaller-bugs, tglx, viro

Hello,

syzbot found the following crash on:

HEAD commit:    b4f63322 Merge branch 'for-linus' of git://git.kernel.org/..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1558936fe00000
kernel config:  https://syzkaller.appspot.com/x/.config?x=b7a70e992f2f9b68
dashboard link: https://syzkaller.appspot.com/bug?extid=96ff6cfc4551fcc29342
compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=14a57828100000

The bug was bisected to:

commit 386d4716fd91869e07c731657f2cde5a33086516
Author: Luo bin <[email protected]>
Date:   Thu Feb 27 06:34:44 2020 +0000

    hinic: fix a bug of rss configuration

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=16626fcfe00000
final crash:    https://syzkaller.appspot.com/x/report.txt?x=15626fcfe00000
console output: https://syzkaller.appspot.com/x/log.txt?x=11626fcfe00000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: [email protected]
Fixes: 386d4716fd91 ("hinic: fix a bug of rss configuration")

INFO: task kworker/1:5:2724 blocked for more than 143 seconds.
      Not tainted 5.7.0-rc2-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/1:5     D27416  2724      2 0x80004000
Workqueue: events linkwatch_event
Call Trace:
 schedule+0xd0/0x2a0 kernel/sched/core.c:4163
 schedule_preempt_disabled+0xf/0x20 kernel/sched/core.c:4222
 __mutex_lock_common kernel/locking/mutex.c:1033 [inline]
 __mutex_lock+0x7ab/0x13c0 kernel/locking/mutex.c:1103
 linkwatch_event+0xb/0x60 net/core/link_watch.c:242
 process_one_work+0x965/0x16a0 kernel/workqueue.c:2268
 worker_thread+0x96/0xe20 kernel/workqueue.c:2414
 kthread+0x388/0x470 kernel/kthread.c:268
 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
INFO: task syz-executor.0:7053 blocked for more than 143 seconds.
      Not tainted 5.7.0-rc2-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor.0  D23512  7053      1 0x80004006
Call Trace:
 schedule+0xd0/0x2a0 kernel/sched/core.c:4163
 schedule_timeout+0x55b/0x850 kernel/time/timer.c:1874
 do_wait_for_common kernel/sched/completion.c:85 [inline]
 __wait_for_common kernel/sched/completion.c:106 [inline]
 wait_for_common kernel/sched/completion.c:117 [inline]
 wait_for_completion+0x16a/0x270 kernel/sched/completion.c:138
 __flush_work+0x4fd/0xa80 kernel/workqueue.c:3045
 flush_all_backlogs net/core/dev.c:5527 [inline]
 rollback_registered_many+0x562/0xe70 net/core/dev.c:8813
 rollback_registered+0xf2/0x1c0 net/core/dev.c:8873
 unregister_netdevice_queue net/core/dev.c:9969 [inline]
 unregister_netdevice_queue+0x1d7/0x2b0 net/core/dev.c:9962
 unregister_netdevice include/linux/netdevice.h:2725 [inline]
 __tun_detach+0xe42/0x1110 drivers/net/tun.c:690
 tun_detach drivers/net/tun.c:707 [inline]
 tun_chr_close+0xd9/0x180 drivers/net/tun.c:3413
 __fput+0x33e/0x880 fs/file_table.c:280
 task_work_run+0xf4/0x1b0 kernel/task_work.c:123
 exit_task_work include/linux/task_work.h:22 [inline]
 do_exit+0xb34/0x2dd0 kernel/exit.c:795
 do_group_exit+0x125/0x340 kernel/exit.c:893
 get_signal+0x47b/0x24e0 kernel/signal.c:2739
 do_signal+0x81/0x2240 arch/x86/kernel/signal.c:784
 exit_to_usermode_loop+0x26c/0x360 arch/x86/entry/common.c:161
 prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
 syscall_return_slowpath arch/x86/entry/common.c:279 [inline]
 do_syscall_64+0x6b1/0x7d0 arch/x86/entry/common.c:305
 entry_SYSCALL_64_after_hwframe+0x49/0xb3
RIP: 0033:0x4166ca
Code: Bad RIP value.
RSP: 002b:00007ffd4022d478 EFLAGS: 00000246 ORIG_RAX: 000000000000003d
RAX: fffffffffffffe00 RBX: 0000000001d60940 RCX: 00000000004166ca
RDX: 0000000040000000 RSI: 00007ffd4022d4b0 RDI: ffffffffffffffff
RBP: 0000000000002996 R08: 0000000000000001 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000003
R13: 00007ffd4022d4b0 R14: 0000000001d6099b R15: 00007ffd4022d4c0

Showing all locks held in the system:
1 lock held by khungtaskd/1125:
 #0: ffffffff899beb00 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x53/0x260 kernel/locking/lockdep.c:5754
3 locks held by kworker/1:5/2724:
 #0: ffff8880aa026d38 ((wq_completion)events){+.+.}-{0:0}, at: __write_once_size include/linux/compiler.h:226 [inline]
 #0: ffff8880aa026d38 ((wq_completion)events){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
 #0: ffff8880aa026d38 ((wq_completion)events){+.+.}-{0:0}, at: atomic64_set include/asm-generic/atomic-instrumented.h:855 [inline]
 #0: ffff8880aa026d38 ((wq_completion)events){+.+.}-{0:0}, at: atomic_long_set include/asm-generic/atomic-long.h:40 [inline]
 #0: ffff8880aa026d38 ((wq_completion)events){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:615 [inline]
 #0: ffff8880aa026d38 ((wq_completion)events){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:642 [inline]
 #0: ffff8880aa026d38 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x844/0x16a0 kernel/workqueue.c:2239
 #1: ffffc90008367dc0 ((linkwatch_work).work){+.+.}-{0:0}, at: process_one_work+0x878/0x16a0 kernel/workqueue.c:2243
 #2: ffffffff8a582268 (rtnl_mutex){+.+.}-{3:3}, at: linkwatch_event+0xb/0x60 net/core/link_watch.c:242
1 lock held by in:imklog/6717:
 #0: ffff888098d271b0 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0xe9/0x100 fs/file.c:826
2 locks held by syz-executor.0/7053:
 #0: ffffffff8a582268 (rtnl_mutex){+.+.}-{3:3}, at: tun_detach drivers/net/tun.c:704 [inline]
 #0: ffffffff8a582268 (rtnl_mutex){+.+.}-{3:3}, at: tun_chr_close+0x3a/0x180 drivers/net/tun.c:3413
 #1: ffffffff89979ad0 (cpu_hotplug_lock){++++}-{0:0}, at: get_online_cpus include/linux/cpu.h:143 [inline]
 #1: ffffffff89979ad0 (cpu_hotplug_lock){++++}-{0:0}, at: flush_all_backlogs net/core/dev.c:5520 [inline]
 #1: ffffffff89979ad0 (cpu_hotplug_lock){++++}-{0:0}, at: rollback_registered_many+0x45b/0xe70 net/core/dev.c:8813
3 locks held by kworker/1:6/14336:
 #0: ffff88809ace8d38 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: __write_once_size include/linux/compiler.h:226 [inline]
 #0: ffff88809ace8d38 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
 #0: ffff88809ace8d38 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: atomic64_set include/asm-generic/atomic-instrumented.h:855 [inline]
 #0: ffff88809ace8d38 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: atomic_long_set include/asm-generic/atomic-long.h:40 [inline]
 #0: ffff88809ace8d38 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:615 [inline]
 #0: ffff88809ace8d38 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:642 [inline]
 #0: ffff88809ace8d38 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: process_one_work+0x844/0x16a0 kernel/workqueue.c:2239
 #1: ffffc90004637dc0 ((addr_chk_work).work){+.+.}-{0:0}, at: process_one_work+0x878/0x16a0 kernel/workqueue.c:2243
 #2: ffffffff8a582268 (rtnl_mutex){+.+.}-{3:3}, at: addrconf_verify_work+0xa/0x20 net/ipv6/addrconf.c:4584

=============================================

NMI backtrace for cpu 1
CPU: 1 PID: 1125 Comm: khungtaskd Not tainted 5.7.0-rc2-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x188/0x20d lib/dump_stack.c:118
 nmi_cpu_backtrace.cold+0x70/0xb1 lib/nmi_backtrace.c:101
 nmi_trigger_cpumask_backtrace+0x231/0x27e lib/nmi_backtrace.c:62
 trigger_all_cpu_backtrace include/linux/nmi.h:146 [inline]
 check_hung_uninterruptible_tasks kernel/hung_task.c:205 [inline]
 watchdog+0xa8c/0x1010 kernel/hung_task.c:289
 kthread+0x388/0x470 kernel/kthread.c:268
 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
Sending NMI from CPU 1 to CPUs 0:
NMI backtrace for cpu 0
CPU: 0 PID: 28894 Comm: syz-executor.0 Not tainted 5.7.0-rc2-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:io_ring_ctx_wait_and_kill+0x98/0x5e0 fs/io_uring.c:7329
Code: 01 00 00 4d 89 f4 48 b8 00 00 00 00 00 fc ff df 4c 89 ed 49 c1 ec 03 48 c1 ed 03 49 01 c4 48 01 c5 eb 1c e8 6a f2 9d ff f3 90 <41> 80 3c 24 00 0f 85 b0 04 00 00 48 83 bb 10 01 00 00 00 74 21 e8
RSP: 0018:ffffc90004e17a48 EFLAGS: 00000293
RAX: ffff888091758480 RBX: ffff888094860000 RCX: 1ffff920009c2f36
RDX: 0000000000000000 RSI: ffffffff81d53c26 RDI: ffff888094860300
RBP: ffffed101290c02c R08: 0000000000000001 R09: ffffed101290c061
R10: ffff888094860307 R11: ffffed101290c060 R12: ffffed101290c022
R13: ffff888094860160 R14: ffff888094860110 R15: ffffffff81d54170
FS:  00007fac6c1a8700(0000) GS:ffff8880ae600000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000560ad6a654a7 CR3: 0000000009879000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 io_uring_release+0x3e/0x50 fs/io_uring.c:7352
 __fput+0x33e/0x880 fs/file_table.c:280
 task_work_run+0xf4/0x1b0 kernel/task_work.c:123
 exit_task_work include/linux/task_work.h:22 [inline]
 do_exit+0xb34/0x2dd0 kernel/exit.c:795
 do_group_exit+0x125/0x340 kernel/exit.c:893
 get_signal+0x47b/0x24e0 kernel/signal.c:2739
 do_signal+0x81/0x2240 arch/x86/kernel/signal.c:784
 exit_to_usermode_loop+0x26c/0x360 arch/x86/entry/common.c:161
 prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
 syscall_return_slowpath arch/x86/entry/common.c:279 [inline]
 do_syscall_64+0x6b1/0x7d0 arch/x86/entry/common.c:305
 entry_SYSCALL_64_after_hwframe+0x49/0xb3
RIP: 0033:0x45c829
Code: Bad RIP value.
RSP: 002b:00007fac6c1a7c78 EFLAGS: 00000246 ORIG_RAX: 00000000000001a9
RAX: 0000000000000003 RBX: 00000000004e0bc0 RCX: 000000000045c829
RDX: 0000000000000000 RSI: 0000000020000580 RDI: 00000000000000f1
RBP: 000000000078bf00 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
R13: 0000000000000204 R14: 00000000004c425f R15: 00007fac6c1a86d4


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at [email protected].

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches


* Re: INFO: task hung in linkwatch_event (2)
  2020-04-29  9:59 INFO: task hung in linkwatch_event (2) syzbot
@ 2020-05-06  1:38 ` Yunsheng Lin
       [not found] ` <[email protected]>
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 7+ messages in thread
From: Yunsheng Lin @ 2020-05-06  1:38 UTC (permalink / raw)
  To: syzbot, allison, aviad.krawczyk, axboe, davem, gregkh, io-uring,
	kuba, linux-fsdevel, linux-kernel, luobin9, netdev,
	syzkaller-bugs, tglx, viro

On 2020/4/29 17:59, syzbot wrote:
> Hello,
> 
> syzbot found the following crash on:
> 
> HEAD commit:    b4f63322 Merge branch 'for-linus' of git://git.kernel.org/..
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=1558936fe00000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=b7a70e992f2f9b68
> dashboard link: https://syzkaller.appspot.com/bug?extid=96ff6cfc4551fcc29342
> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=14a57828100000
> 
> The bug was bisected to:
> 
> commit 386d4716fd91869e07c731657f2cde5a33086516
> Author: Luo bin <[email protected]>
> Date:   Thu Feb 27 06:34:44 2020 +0000
> 
>     hinic: fix a bug of rss configuration

The above patch does not seem to be the cause of the crash.

From the call trace below, the blocking seems to be caused by
tun_detach(), which needs to flush all the pending work for each
online CPU; in this case it appears to be linkwatch_work that needs
to be flushed. But linkwatch_event() needs to take the RTNL lock,
which is already held by tun_detach(), and that is where the
blocking happens.

Possible ways to fix or avoid this:
1. Call flush_all_backlogs() without holding the RTNL lock; I am not
   sure whether it is safe to do this.
2. Stop adding link events for the unregistering netdev, and flush
   all the pending link events without taking the RTNL lock before
   calling unregister_netdevice() in tun_detach().

Any better suggestion? Thanks.
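
For illustration, the blocking described above can be modelled as a
wait-for graph. This is only a sketch of the hypothesis in this message,
written in Python rather than kernel C: the node names are labels
borrowed from the report, and hypothesis_graph() and find_cycle() are
made-up helpers, not kernel or syzbot APIs. With RTNL held across the
flush the graph has a cycle; dropping that edge (option 1) removes it,
which says nothing about whether doing so is safe in the real code.

```python
def hypothesis_graph(rtnl_held_during_flush=True):
    """Wait-for edges as assumed in this message (not a verified fact)."""
    graph = {
        # tun_detach() waits for the flushed work to complete.
        "tun_detach": ["linkwatch_work"],
        # The work function needs rtnl_mutex before it can finish.
        "linkwatch_work": ["rtnl_mutex"],
        "rtnl_mutex": [],
    }
    if rtnl_held_during_flush:
        # rtnl_mutex is released only once tun_detach() returns.
        graph["rtnl_mutex"] = ["tun_detach"]
    return graph


def find_cycle(waits_for):
    """Return one cycle as a list of nodes (first == last), or None."""
    visiting, done, path = set(), set(), []

    def visit(node):
        if node in done:
            return None
        if node in visiting:
            # Back edge found: the cycle is the tail of the current path.
            return path[path.index(node):] + [node]
        visiting.add(node)
        path.append(node)
        for nxt in waits_for.get(node, ()):
            cycle = visit(nxt)
            if cycle:
                return cycle
        path.pop()
        visiting.discard(node)
        done.add(node)
        return None

    for start in waits_for:
        cycle = visit(start)
        if cycle:
            return cycle
    return None
```

Calling find_cycle(hypothesis_graph()) returns a cycle through
tun_detach, linkwatch_work and rtnl_mutex, while
find_cycle(hypothesis_graph(rtnl_held_during_flush=False)) returns None.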

> 
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=16626fcfe00000
> final crash:    https://syzkaller.appspot.com/x/report.txt?x=15626fcfe00000
> console output: https://syzkaller.appspot.com/x/log.txt?x=11626fcfe00000
> 
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: [email protected]
> Fixes: 386d4716fd91 ("hinic: fix a bug of rss configuration")
> 
> INFO: task kworker/1:5:2724 blocked for more than 143 seconds.
>       Not tainted 5.7.0-rc2-syzkaller #0
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> kworker/1:5     D27416  2724      2 0x80004000
> Workqueue: events linkwatch_event
> Call Trace:
>  schedule+0xd0/0x2a0 kernel/sched/core.c:4163
>  schedule_preempt_disabled+0xf/0x20 kernel/sched/core.c:4222
>  __mutex_lock_common kernel/locking/mutex.c:1033 [inline]
>  __mutex_lock+0x7ab/0x13c0 kernel/locking/mutex.c:1103
>  linkwatch_event+0xb/0x60 net/core/link_watch.c:242
>  process_one_work+0x965/0x16a0 kernel/workqueue.c:2268
>  worker_thread+0x96/0xe20 kernel/workqueue.c:2414
>  kthread+0x388/0x470 kernel/kthread.c:268
>  ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
> INFO: task syz-executor.0:7053 blocked for more than 143 seconds.
>       Not tainted 5.7.0-rc2-syzkaller #0
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> syz-executor.0  D23512  7053      1 0x80004006
> Call Trace:
>  schedule+0xd0/0x2a0 kernel/sched/core.c:4163
>  schedule_timeout+0x55b/0x850 kernel/time/timer.c:1874
>  do_wait_for_common kernel/sched/completion.c:85 [inline]
>  __wait_for_common kernel/sched/completion.c:106 [inline]
>  wait_for_common kernel/sched/completion.c:117 [inline]
>  wait_for_completion+0x16a/0x270 kernel/sched/completion.c:138
>  __flush_work+0x4fd/0xa80 kernel/workqueue.c:3045
>  flush_all_backlogs net/core/dev.c:5527 [inline]
>  rollback_registered_many+0x562/0xe70 net/core/dev.c:8813
>  rollback_registered+0xf2/0x1c0 net/core/dev.c:8873
>  unregister_netdevice_queue net/core/dev.c:9969 [inline]
>  unregister_netdevice_queue+0x1d7/0x2b0 net/core/dev.c:9962
>  unregister_netdevice include/linux/netdevice.h:2725 [inline]
>  __tun_detach+0xe42/0x1110 drivers/net/tun.c:690
>  tun_detach drivers/net/tun.c:707 [inline]
>  tun_chr_close+0xd9/0x180 drivers/net/tun.c:3413
>  __fput+0x33e/0x880 fs/file_table.c:280
>  task_work_run+0xf4/0x1b0 kernel/task_work.c:123
>  exit_task_work include/linux/task_work.h:22 [inline]
>  do_exit+0xb34/0x2dd0 kernel/exit.c:795
>  do_group_exit+0x125/0x340 kernel/exit.c:893
>  get_signal+0x47b/0x24e0 kernel/signal.c:2739
>  do_signal+0x81/0x2240 arch/x86/kernel/signal.c:784
>  exit_to_usermode_loop+0x26c/0x360 arch/x86/entry/common.c:161
>  prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
>  syscall_return_slowpath arch/x86/entry/common.c:279 [inline]
>  do_syscall_64+0x6b1/0x7d0 arch/x86/entry/common.c:305
>  entry_SYSCALL_64_after_hwframe+0x49/0xb3
> RIP: 0033:0x4166ca
> Code: Bad RIP value.
> RSP: 002b:00007ffd4022d478 EFLAGS: 00000246 ORIG_RAX: 000000000000003d
> RAX: fffffffffffffe00 RBX: 0000000001d60940 RCX: 00000000004166ca
> RDX: 0000000040000000 RSI: 00007ffd4022d4b0 RDI: ffffffffffffffff
> RBP: 0000000000002996 R08: 0000000000000001 R09: 0000000000000001
> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000003
> R13: 00007ffd4022d4b0 R14: 0000000001d6099b R15: 00007ffd4022d4c0
> 
> Showing all locks held in the system:
> 1 lock held by khungtaskd/1125:
>  #0: ffffffff899beb00 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x53/0x260 kernel/locking/lockdep.c:5754
> 3 locks held by kworker/1:5/2724:
>  #0: ffff8880aa026d38 ((wq_completion)events){+.+.}-{0:0}, at: __write_once_size include/linux/compiler.h:226 [inline]
>  #0: ffff8880aa026d38 ((wq_completion)events){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
>  #0: ffff8880aa026d38 ((wq_completion)events){+.+.}-{0:0}, at: atomic64_set include/asm-generic/atomic-instrumented.h:855 [inline]
>  #0: ffff8880aa026d38 ((wq_completion)events){+.+.}-{0:0}, at: atomic_long_set include/asm-generic/atomic-long.h:40 [inline]
>  #0: ffff8880aa026d38 ((wq_completion)events){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:615 [inline]
>  #0: ffff8880aa026d38 ((wq_completion)events){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:642 [inline]
>  #0: ffff8880aa026d38 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x844/0x16a0 kernel/workqueue.c:2239
>  #1: ffffc90008367dc0 ((linkwatch_work).work){+.+.}-{0:0}, at: process_one_work+0x878/0x16a0 kernel/workqueue.c:2243
>  #2: ffffffff8a582268 (rtnl_mutex){+.+.}-{3:3}, at: linkwatch_event+0xb/0x60 net/core/link_watch.c:242
> 1 lock held by in:imklog/6717:
>  #0: ffff888098d271b0 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0xe9/0x100 fs/file.c:826
> 2 locks held by syz-executor.0/7053:
>  #0: ffffffff8a582268 (rtnl_mutex){+.+.}-{3:3}, at: tun_detach drivers/net/tun.c:704 [inline]
>  #0: ffffffff8a582268 (rtnl_mutex){+.+.}-{3:3}, at: tun_chr_close+0x3a/0x180 drivers/net/tun.c:3413
>  #1: ffffffff89979ad0 (cpu_hotplug_lock){++++}-{0:0}, at: get_online_cpus include/linux/cpu.h:143 [inline]
>  #1: ffffffff89979ad0 (cpu_hotplug_lock){++++}-{0:0}, at: flush_all_backlogs net/core/dev.c:5520 [inline]
>  #1: ffffffff89979ad0 (cpu_hotplug_lock){++++}-{0:0}, at: rollback_registered_many+0x45b/0xe70 net/core/dev.c:8813
> 3 locks held by kworker/1:6/14336:
>  #0: ffff88809ace8d38 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: __write_once_size include/linux/compiler.h:226 [inline]
>  #0: ffff88809ace8d38 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
>  #0: ffff88809ace8d38 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: atomic64_set include/asm-generic/atomic-instrumented.h:855 [inline]
>  #0: ffff88809ace8d38 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: atomic_long_set include/asm-generic/atomic-long.h:40 [inline]
>  #0: ffff88809ace8d38 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:615 [inline]
>  #0: ffff88809ace8d38 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:642 [inline]
>  #0: ffff88809ace8d38 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: process_one_work+0x844/0x16a0 kernel/workqueue.c:2239
>  #1: ffffc90004637dc0 ((addr_chk_work).work){+.+.}-{0:0}, at: process_one_work+0x878/0x16a0 kernel/workqueue.c:2243
>  #2: ffffffff8a582268 (rtnl_mutex){+.+.}-{3:3}, at: addrconf_verify_work+0xa/0x20 net/ipv6/addrconf.c:4584
> 
> =============================================
> 
> NMI backtrace for cpu 1
> CPU: 1 PID: 1125 Comm: khungtaskd Not tainted 5.7.0-rc2-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x188/0x20d lib/dump_stack.c:118
>  nmi_cpu_backtrace.cold+0x70/0xb1 lib/nmi_backtrace.c:101
>  nmi_trigger_cpumask_backtrace+0x231/0x27e lib/nmi_backtrace.c:62
>  trigger_all_cpu_backtrace include/linux/nmi.h:146 [inline]
>  check_hung_uninterruptible_tasks kernel/hung_task.c:205 [inline]
>  watchdog+0xa8c/0x1010 kernel/hung_task.c:289
>  kthread+0x388/0x470 kernel/kthread.c:268
>  ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
> Sending NMI from CPU 1 to CPUs 0:
> NMI backtrace for cpu 0
> CPU: 0 PID: 28894 Comm: syz-executor.0 Not tainted 5.7.0-rc2-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> RIP: 0010:io_ring_ctx_wait_and_kill+0x98/0x5e0 fs/io_uring.c:7329
> Code: 01 00 00 4d 89 f4 48 b8 00 00 00 00 00 fc ff df 4c 89 ed 49 c1 ec 03 48 c1 ed 03 49 01 c4 48 01 c5 eb 1c e8 6a f2 9d ff f3 90 <41> 80 3c 24 00 0f 85 b0 04 00 00 48 83 bb 10 01 00 00 00 74 21 e8
> RSP: 0018:ffffc90004e17a48 EFLAGS: 00000293
> RAX: ffff888091758480 RBX: ffff888094860000 RCX: 1ffff920009c2f36
> RDX: 0000000000000000 RSI: ffffffff81d53c26 RDI: ffff888094860300
> RBP: ffffed101290c02c R08: 0000000000000001 R09: ffffed101290c061
> R10: ffff888094860307 R11: ffffed101290c060 R12: ffffed101290c022
> R13: ffff888094860160 R14: ffff888094860110 R15: ffffffff81d54170
> FS:  00007fac6c1a8700(0000) GS:ffff8880ae600000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000560ad6a654a7 CR3: 0000000009879000 CR4: 00000000001406f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  io_uring_release+0x3e/0x50 fs/io_uring.c:7352
>  __fput+0x33e/0x880 fs/file_table.c:280
>  task_work_run+0xf4/0x1b0 kernel/task_work.c:123
>  exit_task_work include/linux/task_work.h:22 [inline]
>  do_exit+0xb34/0x2dd0 kernel/exit.c:795
>  do_group_exit+0x125/0x340 kernel/exit.c:893
>  get_signal+0x47b/0x24e0 kernel/signal.c:2739
>  do_signal+0x81/0x2240 arch/x86/kernel/signal.c:784
>  exit_to_usermode_loop+0x26c/0x360 arch/x86/entry/common.c:161
>  prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
>  syscall_return_slowpath arch/x86/entry/common.c:279 [inline]
>  do_syscall_64+0x6b1/0x7d0 arch/x86/entry/common.c:305
>  entry_SYSCALL_64_after_hwframe+0x49/0xb3
> RIP: 0033:0x45c829
> Code: Bad RIP value.
> RSP: 002b:00007fac6c1a7c78 EFLAGS: 00000246 ORIG_RAX: 00000000000001a9
> RAX: 0000000000000003 RBX: 00000000004e0bc0 RCX: 000000000045c829
> RDX: 0000000000000000 RSI: 0000000020000580 RDI: 00000000000000f1
> RBP: 000000000078bf00 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
> R13: 0000000000000204 R14: 00000000004c425f R15: 00007fac6c1a86d4
> 
> 
> ---
> This bug is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at [email protected].
> 
> syzbot will keep track of this bug report. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection
> syzbot can test patches for this bug, for details see:
> https://goo.gl/tpsmEJ#testing-patches
> .
> 


* Re: INFO: task hung in linkwatch_event (2)
       [not found] ` <[email protected]>
@ 2020-05-06 11:20   ` Yunsheng Lin
  0 siblings, 0 replies; 7+ messages in thread
From: Yunsheng Lin @ 2020-05-06 11:20 UTC (permalink / raw)
  To: Hillf Danton
  Cc: syzbot, allison, aviad.krawczyk, axboe, davem, gregkh, io-uring,
	kuba, linux-fsdevel, linux-kernel, luobin9, netdev,
	syzkaller-bugs, tglx, viro

On 2020/5/6 12:25, Hillf Danton wrote:
> 
> On Wed, 6 May 2020 09:38:21 Yunsheng Lin wrote:
>>
>> On 2020/4/29 17:59, syzbot wrote:
>>> Hello,
>>>
>>> syzbot found the following crash on:
>>>
>>> HEAD commit:    b4f63322 Merge branch 'for-linus' of git://git.kernel.org/..
>>> git tree:       upstream
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=1558936fe00000
>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=b7a70e992f2f9b68
>>> dashboard link: https://syzkaller.appspot.com/bug?extid=96ff6cfc4551fcc29342
>>> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
>>> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=14a57828100000
>>>
>>> The bug was bisected to:
>>>
>>> commit 386d4716fd91869e07c731657f2cde5a33086516
>>> Author: Luo bin <[email protected]>
>>> Date:   Thu Feb 27 06:34:44 2020 +0000
>>>
>>>     hinic: fix a bug of rss configuration
>>
>> The above patch does not seem to be the cause of the crash.
>>
>> From the call trace below, the blocking seems to be caused by
>> tun_detach(), which needs to flush all the pending work
> 
> queued on system_highpri_wq
> 
>> for each online CPU; in this case it appears to be linkwatch_work
>> that needs to be flushed.
> 
> Not so sure it's linkwatch_work because it's on system_wq.

Yes, you are right. The flush_backlog() work is queued on
system_highpri_wq, and maybe that is the work tun_detach() is
trying to flush.

So tun_detach() is flushing work queued on system_highpri_wq while
holding the RTNL lock, and the linkwatch_event work is running and
trying to take the RTNL lock. Do they compete for the same worker on
the same CPU even though they are queued on different workqueues? I
do not understand workqueues very well; if so, could there be a
deadlock here?
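
To see which tasks are involved with rtnl_mutex, the "Showing all locks
held in the system" section of the report can be grouped by lock name.
The following is a rough Python sketch; tasks_by_lock() and its regular
expressions are assumptions written for the dump format seen in this
report, and SAMPLE is abridged from it. Note that lockdep also lists a
lock for a task that is still blocked acquiring it (the three entries
for rtnl_mutex here cannot all be owners), so the result shows the tasks
involved, not the owner; the call traces are still needed for that.

```python
import re
from collections import defaultdict

# "3 locks held by kworker/1:5/2724:"
HEADER = re.compile(r"^\s*\d+ locks? held by (?P<task>.+?)/(?P<pid>\d+):\s*$")
# " #2: ffffffff8a582268 (rtnl_mutex){+.+.}-{3:3}, at: ..."
LOCK = re.compile(
    r"^\s*#\d+: [0-9a-f]+ \((?P<name>.+)\)\{[^}]*\}-\{[^}]*\}, at: "
)
# Leading "> " markers from quoted mail.
QUOTE = re.compile(r"^(?:>\s?)+")

# Abridged from the report: only a few lock lines per task are kept.
SAMPLE = """\
3 locks held by kworker/1:5/2724:
 #2: ffffffff8a582268 (rtnl_mutex){+.+.}-{3:3}, at: linkwatch_event+0xb/0x60 net/core/link_watch.c:242
2 locks held by syz-executor.0/7053:
 #0: ffffffff8a582268 (rtnl_mutex){+.+.}-{3:3}, at: tun_detach drivers/net/tun.c:704 [inline]
 #0: ffffffff8a582268 (rtnl_mutex){+.+.}-{3:3}, at: tun_chr_close+0x3a/0x180 drivers/net/tun.c:3413
 #1: ffffffff89979ad0 (cpu_hotplug_lock){++++}-{0:0}, at: get_online_cpus include/linux/cpu.h:143 [inline]
3 locks held by kworker/1:6/14336:
 #2: ffffffff8a582268 (rtnl_mutex){+.+.}-{3:3}, at: addrconf_verify_work+0xa/0x20 net/ipv6/addrconf.c:4584
"""


def tasks_by_lock(dump):
    """Map each lock name to the tasks listed with it, in dump order."""
    holders = defaultdict(list)
    task = None
    for raw in dump.splitlines():
        line = QUOTE.sub("", raw)
        header = HEADER.match(line)
        if header:
            task = f"{header['task']}/{header['pid']}"
            continue
        lock = LOCK.match(line)
        # Inlined frames repeat the same lock; record each task once.
        if lock and task is not None and task not in holders[lock["name"]]:
            holders[lock["name"]].append(task)
    return dict(holders)
```

For the abridged sample, tasks_by_lock(SAMPLE)["rtnl_mutex"] lists
kworker/1:5/2724, syz-executor.0/7053 and kworker/1:6/14336.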

> 
>> But linkwatch_event() needs to take the
>> RTNL lock, which is already held by tun_detach(), and that is
>> where the blocking happens.
>>
>> Possible ways to fix or avoid this:
>> 1. Call flush_all_backlogs() without holding the RTNL lock; I am not
>>    sure whether it is safe to do this.
>> 2. Stop adding link events for the unregistering netdev, and flush
>>    all the pending link events without taking the RTNL lock before
>>    calling unregister_netdevice() in tun_detach().
>>
>> Any better suggestion? Thanks.
>>
> Not before some extra info about what's going on in the highpri wq is available.
> 
>>>
>>> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=16626fcfe00000
>>> final crash:    https://syzkaller.appspot.com/x/report.txt?x=15626fcfe00000
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=11626fcfe00000
>>>
>>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>>> Reported-by: [email protected]
>>> Fixes: 386d4716fd91 ("hinic: fix a bug of rss configuration")
>>>
>>> INFO: task kworker/1:5:2724 blocked for more than 143 seconds.
>>>       Not tainted 5.7.0-rc2-syzkaller #0
>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>> kworker/1:5     D27416  2724      2 0x80004000
>>> Workqueue: events linkwatch_event
>>> Call Trace:
>>>  schedule+0xd0/0x2a0 kernel/sched/core.c:4163
>>>  schedule_preempt_disabled+0xf/0x20 kernel/sched/core.c:4222
>>>  __mutex_lock_common kernel/locking/mutex.c:1033 [inline]
>>>  __mutex_lock+0x7ab/0x13c0 kernel/locking/mutex.c:1103
>>>  linkwatch_event+0xb/0x60 net/core/link_watch.c:242
>>>  process_one_work+0x965/0x16a0 kernel/workqueue.c:2268
>>>  worker_thread+0x96/0xe20 kernel/workqueue.c:2414
>>>  kthread+0x388/0x470 kernel/kthread.c:268
>>>  ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
>>> INFO: task syz-executor.0:7053 blocked for more than 143 seconds.
>>>       Not tainted 5.7.0-rc2-syzkaller #0
>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>> syz-executor.0  D23512  7053      1 0x80004006
>>> Call Trace:
>>>  schedule+0xd0/0x2a0 kernel/sched/core.c:4163
>>>  schedule_timeout+0x55b/0x850 kernel/time/timer.c:1874
>>>  do_wait_for_common kernel/sched/completion.c:85 [inline]
>>>  __wait_for_common kernel/sched/completion.c:106 [inline]
>>>  wait_for_common kernel/sched/completion.c:117 [inline]
>>>  wait_for_completion+0x16a/0x270 kernel/sched/completion.c:138
>>>  __flush_work+0x4fd/0xa80 kernel/workqueue.c:3045
>>>  flush_all_backlogs net/core/dev.c:5527 [inline]
>>>  rollback_registered_many+0x562/0xe70 net/core/dev.c:8813
>>>  rollback_registered+0xf2/0x1c0 net/core/dev.c:8873
>>>  unregister_netdevice_queue net/core/dev.c:9969 [inline]
>>>  unregister_netdevice_queue+0x1d7/0x2b0 net/core/dev.c:9962
>>>  unregister_netdevice include/linux/netdevice.h:2725 [inline]
>>>  __tun_detach+0xe42/0x1110 drivers/net/tun.c:690
>>>  tun_detach drivers/net/tun.c:707 [inline]
>>>  tun_chr_close+0xd9/0x180 drivers/net/tun.c:3413
>>>  __fput+0x33e/0x880 fs/file_table.c:280
>>>  task_work_run+0xf4/0x1b0 kernel/task_work.c:123
>>>  exit_task_work include/linux/task_work.h:22 [inline]
>>>  do_exit+0xb34/0x2dd0 kernel/exit.c:795
>>>  do_group_exit+0x125/0x340 kernel/exit.c:893
>>>  get_signal+0x47b/0x24e0 kernel/signal.c:2739
>>>  do_signal+0x81/0x2240 arch/x86/kernel/signal.c:784
>>>  exit_to_usermode_loop+0x26c/0x360 arch/x86/entry/common.c:161
>>>  prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
>>>  syscall_return_slowpath arch/x86/entry/common.c:279 [inline]
>>>  do_syscall_64+0x6b1/0x7d0 arch/x86/entry/common.c:305
>>>  entry_SYSCALL_64_after_hwframe+0x49/0xb3
>>> RIP: 0033:0x4166ca
>>> Code: Bad RIP value.
>>> RSP: 002b:00007ffd4022d478 EFLAGS: 00000246 ORIG_RAX: 000000000000003d
>>> RAX: fffffffffffffe00 RBX: 0000000001d60940 RCX: 00000000004166ca
>>> RDX: 0000000040000000 RSI: 00007ffd4022d4b0 RDI: ffffffffffffffff
>>> RBP: 0000000000002996 R08: 0000000000000001 R09: 0000000000000001
>>> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000003
>>> R13: 00007ffd4022d4b0 R14: 0000000001d6099b R15: 00007ffd4022d4c0
>>>
>>> Showing all locks held in the system:
>>> 1 lock held by khungtaskd/1125:
>>>  #0: ffffffff899beb00 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x53/0x260 kernel/locking/lockdep.c:5754
>>> 3 locks held by kworker/1:5/2724:
>>>  #0: ffff8880aa026d38 ((wq_completion)events){+.+.}-{0:0}, at: __write_once_size include/linux/compiler.h:226 [inline]
>>>  #0: ffff8880aa026d38 ((wq_completion)events){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
>>>  #0: ffff8880aa026d38 ((wq_completion)events){+.+.}-{0:0}, at: atomic64_set include/asm-generic/atomic-instrumented.h:855 [inline]
>>>  #0: ffff8880aa026d38 ((wq_completion)events){+.+.}-{0:0}, at: atomic_long_set include/asm-generic/atomic-long.h:40 [inline]
>>>  #0: ffff8880aa026d38 ((wq_completion)events){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:615 [inline]
>>>  #0: ffff8880aa026d38 ((wq_completion)events){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:642 [inline]
>>>  #0: ffff8880aa026d38 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x844/0x16a0 kernel/workqueue.c:2239
>>>  #1: ffffc90008367dc0 ((linkwatch_work).work){+.+.}-{0:0}, at: process_one_work+0x878/0x16a0 kernel/workqueue.c:2243
>>>  #2: ffffffff8a582268 (rtnl_mutex){+.+.}-{3:3}, at: linkwatch_event+0xb/0x60 net/core/link_watch.c:242
>>> 1 lock held by in:imklog/6717:
>>>  #0: ffff888098d271b0 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0xe9/0x100 fs/file.c:826
>>> 2 locks held by syz-executor.0/7053:
>>>  #0: ffffffff8a582268 (rtnl_mutex){+.+.}-{3:3}, at: tun_detach drivers/net/tun.c:704 [inline]
>>>  #0: ffffffff8a582268 (rtnl_mutex){+.+.}-{3:3}, at: tun_chr_close+0x3a/0x180 drivers/net/tun.c:3413
>>>  #1: ffffffff89979ad0 (cpu_hotplug_lock){++++}-{0:0}, at: get_online_cpus include/linux/cpu.h:143 [inline]
>>>  #1: ffffffff89979ad0 (cpu_hotplug_lock){++++}-{0:0}, at: flush_all_backlogs net/core/dev.c:5520 [inline]
>>>  #1: ffffffff89979ad0 (cpu_hotplug_lock){++++}-{0:0}, at: rollback_registered_many+0x45b/0xe70 net/core/dev.c:8813
>>> 3 locks held by kworker/1:6/14336:
>>>  #0: ffff88809ace8d38 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: __write_once_size include/linux/compiler.h:226 [inline]
>>>  #0: ffff88809ace8d38 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
>>>  #0: ffff88809ace8d38 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: atomic64_set include/asm-generic/atomic-instrumented.h:855 [inline]
>>>  #0: ffff88809ace8d38 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: atomic_long_set include/asm-generic/atomic-long.h:40 [inline]
>>>  #0: ffff88809ace8d38 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:615 [inline]
>>>  #0: ffff88809ace8d38 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:642 [inline]
>>>  #0: ffff88809ace8d38 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: process_one_work+0x844/0x16a0 kernel/workqueue.c:2239
>>>  #1: ffffc90004637dc0 ((addr_chk_work).work){+.+.}-{0:0}, at: process_one_work+0x878/0x16a0 kernel/workqueue.c:2243
>>>  #2: ffffffff8a582268 (rtnl_mutex){+.+.}-{3:3}, at: addrconf_verify_work+0xa/0x20 net/ipv6/addrconf.c:4584
>>>
>>> =============================================
>>>
>>> NMI backtrace for cpu 1
>>> CPU: 1 PID: 1125 Comm: khungtaskd Not tainted 5.7.0-rc2-syzkaller #0
>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
>>> Call Trace:
>>>  __dump_stack lib/dump_stack.c:77 [inline]
>>>  dump_stack+0x188/0x20d lib/dump_stack.c:118
>>>  nmi_cpu_backtrace.cold+0x70/0xb1 lib/nmi_backtrace.c:101
>>>  nmi_trigger_cpumask_backtrace+0x231/0x27e lib/nmi_backtrace.c:62
>>>  trigger_all_cpu_backtrace include/linux/nmi.h:146 [inline]
>>>  check_hung_uninterruptible_tasks kernel/hung_task.c:205 [inline]
>>>  watchdog+0xa8c/0x1010 kernel/hung_task.c:289
>>>  kthread+0x388/0x470 kernel/kthread.c:268
>>>  ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
>>> Sending NMI from CPU 1 to CPUs 0:
>>> NMI backtrace for cpu 0
>>> CPU: 0 PID: 28894 Comm: syz-executor.0 Not tainted 5.7.0-rc2-syzkaller #0
>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
>>> RIP: 0010:io_ring_ctx_wait_and_kill+0x98/0x5e0 fs/io_uring.c:7329
>>> Code: 01 00 00 4d 89 f4 48 b8 00 00 00 00 00 fc ff df 4c 89 ed 49 c1 ec 03 48 c1 ed 03 49 01 c4 48 01 c5 eb 1c e8 6a f2 9d ff f3 90 <41> 80 3c 24 00 0f 85 b0 04 00 00 48 83 bb 10 01 00 00 00 74 21 e8
>>> RSP: 0018:ffffc90004e17a48 EFLAGS: 00000293
>>> RAX: ffff888091758480 RBX: ffff888094860000 RCX: 1ffff920009c2f36
>>> RDX: 0000000000000000 RSI: ffffffff81d53c26 RDI: ffff888094860300
>>> RBP: ffffed101290c02c R08: 0000000000000001 R09: ffffed101290c061
>>> R10: ffff888094860307 R11: ffffed101290c060 R12: ffffed101290c022
>>> R13: ffff888094860160 R14: ffff888094860110 R15: ffffffff81d54170
>>> FS:  00007fac6c1a8700(0000) GS:ffff8880ae600000(0000) knlGS:0000000000000000
>>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> CR2: 0000560ad6a654a7 CR3: 0000000009879000 CR4: 00000000001406f0
>>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>>> Call Trace:
>>>  io_uring_release+0x3e/0x50 fs/io_uring.c:7352
>>>  __fput+0x33e/0x880 fs/file_table.c:280
>>>  task_work_run+0xf4/0x1b0 kernel/task_work.c:123
>>>  exit_task_work include/linux/task_work.h:22 [inline]
>>>  do_exit+0xb34/0x2dd0 kernel/exit.c:795
>>>  do_group_exit+0x125/0x340 kernel/exit.c:893
>>>  get_signal+0x47b/0x24e0 kernel/signal.c:2739
>>>  do_signal+0x81/0x2240 arch/x86/kernel/signal.c:784
>>>  exit_to_usermode_loop+0x26c/0x360 arch/x86/entry/common.c:161
>>>  prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
>>>  syscall_return_slowpath arch/x86/entry/common.c:279 [inline]
>>>  do_syscall_64+0x6b1/0x7d0 arch/x86/entry/common.c:305
>>>  entry_SYSCALL_64_after_hwframe+0x49/0xb3
>>> RIP: 0033:0x45c829
>>> Code: Bad RIP value.
>>> RSP: 002b:00007fac6c1a7c78 EFLAGS: 00000246 ORIG_RAX: 00000000000001a9
>>> RAX: 0000000000000003 RBX: 00000000004e0bc0 RCX: 000000000045c829
>>> RDX: 0000000000000000 RSI: 0000000020000580 RDI: 00000000000000f1
>>> RBP: 000000000078bf00 R08: 0000000000000000 R09: 0000000000000000
>>> R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
>>> R13: 0000000000000204 R14: 00000000004c425f R15: 00007fac6c1a86d4
>>>
>>>
>>> ---
>>> This bug is generated by a bot. It may contain errors.
>>> See https://goo.gl/tpsmEJ for more information about syzbot.
>>> syzbot engineers can be reached at [email protected].
>>>
>>> syzbot will keep track of this bug report. See:
>>> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
>>> For information about bisection process see: https://goo.gl/tpsmEJ#bisection
>>> syzbot can test patches for this bug, for details see:
>>> https://goo.gl/tpsmEJ#testing-patches

* Re: INFO: task hung in linkwatch_event (2)
       [not found] <[email protected]>
@ 2020-05-06 11:22 ` Yunsheng Lin
  0 siblings, 0 replies; 7+ messages in thread
From: Yunsheng Lin @ 2020-05-06 11:22 UTC
  To: Hillf Danton, syzbot
  Cc: allison, aviad.krawczyk, axboe, davem, gregkh, io-uring, kuba,
	linux-fsdevel, linux-kernel, luobin9, netdev, syzkaller-bugs,
	tglx, viro, xiaoguang.wang, [email protected]

+cc Xiaoguang & Jens
On 2020/5/6 14:56, Hillf Danton wrote:
> 
> Wed, 29 Apr 2020 02:59:13 -0700
>> syzbot found the following crash on:
>>
>> HEAD commit:    b4f63322 Merge branch 'for-linus' of git://git.kernel.org/..
>> git tree:       upstream
>> console output: https://syzkaller.appspot.com/x/log.txt?x=1558936fe00000
>> kernel config:  https://syzkaller.appspot.com/x/.config?x=b7a70e992f2f9b68
>> dashboard link: https://syzkaller.appspot.com/bug?extid=96ff6cfc4551fcc29342
>> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
>> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=14a57828100000
>>
>> The bug was bisected to:
>>
>> commit 386d4716fd91869e07c731657f2cde5a33086516
>> Author: Luo bin <[email protected]>
>> Date:   Thu Feb 27 06:34:44 2020 +0000
>>
>>     hinic: fix a bug of rss configuration
>>
>> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=16626fcfe00000
>> final crash:    https://syzkaller.appspot.com/x/report.txt?x=15626fcfe00000
>> console output: https://syzkaller.appspot.com/x/log.txt?x=11626fcfe00000
>>
>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>> Reported-by: [email protected]
>> Fixes: 386d4716fd91 ("hinic: fix a bug of rss configuration")
>>
>> INFO: task kworker/1:5:2724 blocked for more than 143 seconds.
>>       Not tainted 5.7.0-rc2-syzkaller #0
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> kworker/1:5     D27416  2724      2 0x80004000
>> Workqueue: events linkwatch_event
>> Call Trace:
>>  schedule+0xd0/0x2a0 kernel/sched/core.c:4163
>>  schedule_preempt_disabled+0xf/0x20 kernel/sched/core.c:4222
>>  __mutex_lock_common kernel/locking/mutex.c:1033 [inline]
>>  __mutex_lock+0x7ab/0x13c0 kernel/locking/mutex.c:1103
>>  linkwatch_event+0xb/0x60 net/core/link_watch.c:242
>>  process_one_work+0x965/0x16a0 kernel/workqueue.c:2268
>>  worker_thread+0x96/0xe20 kernel/workqueue.c:2414
>>  kthread+0x388/0x470 kernel/kthread.c:268
>>  ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
>> INFO: task syz-executor.0:7053 blocked for more than 143 seconds.
>>       Not tainted 5.7.0-rc2-syzkaller #0
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> syz-executor.0  D23512  7053      1 0x80004006
>> Call Trace:
>>  schedule+0xd0/0x2a0 kernel/sched/core.c:4163
>>  schedule_timeout+0x55b/0x850 kernel/time/timer.c:1874
>>  do_wait_for_common kernel/sched/completion.c:85 [inline]
>>  __wait_for_common kernel/sched/completion.c:106 [inline]
>>  wait_for_common kernel/sched/completion.c:117 [inline]
>>  wait_for_completion+0x16a/0x270 kernel/sched/completion.c:138
>>  __flush_work+0x4fd/0xa80 kernel/workqueue.c:3045
>>  flush_all_backlogs net/core/dev.c:5527 [inline]
>>  rollback_registered_many+0x562/0xe70 net/core/dev.c:8813
>>  rollback_registered+0xf2/0x1c0 net/core/dev.c:8873
>>  unregister_netdevice_queue net/core/dev.c:9969 [inline]
>>  unregister_netdevice_queue+0x1d7/0x2b0 net/core/dev.c:9962
>>  unregister_netdevice include/linux/netdevice.h:2725 [inline]
>>  __tun_detach+0xe42/0x1110 drivers/net/tun.c:690
>>  tun_detach drivers/net/tun.c:707 [inline]
>>  tun_chr_close+0xd9/0x180 drivers/net/tun.c:3413
>>  __fput+0x33e/0x880 fs/file_table.c:280
>>  task_work_run+0xf4/0x1b0 kernel/task_work.c:123
>>  exit_task_work include/linux/task_work.h:22 [inline]
>>  do_exit+0xb34/0x2dd0 kernel/exit.c:795
>>  do_group_exit+0x125/0x340 kernel/exit.c:893
>>  get_signal+0x47b/0x24e0 kernel/signal.c:2739
>>  do_signal+0x81/0x2240 arch/x86/kernel/signal.c:784
>>  exit_to_usermode_loop+0x26c/0x360 arch/x86/entry/common.c:161
>>  prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
>>  syscall_return_slowpath arch/x86/entry/common.c:279 [inline]
>>  do_syscall_64+0x6b1/0x7d0 arch/x86/entry/common.c:305
>>  entry_SYSCALL_64_after_hwframe+0x49/0xb3
>> RIP: 0033:0x4166ca
>> Code: Bad RIP value.
>> RSP: 002b:00007ffd4022d478 EFLAGS: 00000246 ORIG_RAX: 000000000000003d
>> RAX: fffffffffffffe00 RBX: 0000000001d60940 RCX: 00000000004166ca
>> RDX: 0000000040000000 RSI: 00007ffd4022d4b0 RDI: ffffffffffffffff
>> RBP: 0000000000002996 R08: 0000000000000001 R09: 0000000000000001
>> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000003
>> R13: 00007ffd4022d4b0 R14: 0000000001d6099b R15: 00007ffd4022d4c0
>>
>> Showing all locks held in the system:
>> 1 lock held by khungtaskd/1125:
>>  #0: ffffffff899beb00 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x53/0x260 kernel/locking/lockdep.c:5754
>> 3 locks held by kworker/1:5/2724:
>>  #0: ffff8880aa026d38 ((wq_completion)events){+.+.}-{0:0}, at: __write_once_size include/linux/compiler.h:226 [inline]
>>  #0: ffff8880aa026d38 ((wq_completion)events){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
>>  #0: ffff8880aa026d38 ((wq_completion)events){+.+.}-{0:0}, at: atomic64_set include/asm-generic/atomic-instrumented.h:855 [inline]
>>  #0: ffff8880aa026d38 ((wq_completion)events){+.+.}-{0:0}, at: atomic_long_set include/asm-generic/atomic-long.h:40 [inline]
>>  #0: ffff8880aa026d38 ((wq_completion)events){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:615 [inline]
>>  #0: ffff8880aa026d38 ((wq_completion)events){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:642 [inline]
>>  #0: ffff8880aa026d38 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x844/0x16a0 kernel/workqueue.c:2239
>>  #1: ffffc90008367dc0 ((linkwatch_work).work){+.+.}-{0:0}, at: process_one_work+0x878/0x16a0 kernel/workqueue.c:2243
>>  #2: ffffffff8a582268 (rtnl_mutex){+.+.}-{3:3}, at: linkwatch_event+0xb/0x60 net/core/link_watch.c:242
>> 1 lock held by in:imklog/6717:
>>  #0: ffff888098d271b0 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0xe9/0x100 fs/file.c:826
>> 2 locks held by syz-executor.0/7053:
>>  #0: ffffffff8a582268 (rtnl_mutex){+.+.}-{3:3}, at: tun_detach drivers/net/tun.c:704 [inline]
>>  #0: ffffffff8a582268 (rtnl_mutex){+.+.}-{3:3}, at: tun_chr_close+0x3a/0x180 drivers/net/tun.c:3413
>>  #1: ffffffff89979ad0 (cpu_hotplug_lock){++++}-{0:0}, at: get_online_cpus include/linux/cpu.h:143 [inline]
>>  #1: ffffffff89979ad0 (cpu_hotplug_lock){++++}-{0:0}, at: flush_all_backlogs net/core/dev.c:5520 [inline]
>>  #1: ffffffff89979ad0 (cpu_hotplug_lock){++++}-{0:0}, at: rollback_registered_many+0x45b/0xe70 net/core/dev.c:8813
>> 3 locks held by kworker/1:6/14336:
>>  #0: ffff88809ace8d38 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: __write_once_size include/linux/compiler.h:226 [inline]
>>  #0: ffff88809ace8d38 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
>>  #0: ffff88809ace8d38 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: atomic64_set include/asm-generic/atomic-instrumented.h:855 [inline]
>>  #0: ffff88809ace8d38 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: atomic_long_set include/asm-generic/atomic-long.h:40 [inline]
>>  #0: ffff88809ace8d38 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:615 [inline]
>>  #0: ffff88809ace8d38 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:642 [inline]
>>  #0: ffff88809ace8d38 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: process_one_work+0x844/0x16a0 kernel/workqueue.c:2239
>>  #1: ffffc90004637dc0 ((addr_chk_work).work){+.+.}-{0:0}, at: process_one_work+0x878/0x16a0 kernel/workqueue.c:2243
>>  #2: ffffffff8a582268 (rtnl_mutex){+.+.}-{3:3}, at: addrconf_verify_work+0xa/0x20 net/ipv6/addrconf.c:4584
>>
>> =============================================
>>
>> NMI backtrace for cpu 1
>> CPU: 1 PID: 1125 Comm: khungtaskd Not tainted 5.7.0-rc2-syzkaller #0
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
>> Call Trace:
>>  __dump_stack lib/dump_stack.c:77 [inline]
>>  dump_stack+0x188/0x20d lib/dump_stack.c:118
>>  nmi_cpu_backtrace.cold+0x70/0xb1 lib/nmi_backtrace.c:101
>>  nmi_trigger_cpumask_backtrace+0x231/0x27e lib/nmi_backtrace.c:62
>>  trigger_all_cpu_backtrace include/linux/nmi.h:146 [inline]
>>  check_hung_uninterruptible_tasks kernel/hung_task.c:205 [inline]
>>  watchdog+0xa8c/0x1010 kernel/hung_task.c:289
>>  kthread+0x388/0x470 kernel/kthread.c:268
>>  ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
>> Sending NMI from CPU 1 to CPUs 0:
>> NMI backtrace for cpu 0
>> CPU: 0 PID: 28894 Comm: syz-executor.0 Not tainted 5.7.0-rc2-syzkaller #0
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
>> RIP: 0010:io_ring_ctx_wait_and_kill+0x98/0x5e0 fs/io_uring.c:7329
> 
> Suspect 3fd44c86711f ("io_uring: use cond_resched() in
> io_ring_ctx_wait_and_kill()") is the right cure.
> 
>> Code: 01 00 00 4d 89 f4 48 b8 00 00 00 00 00 fc ff df 4c 89 ed 49 c1 ec 03 48 c1 ed 03 49 01 c4 48 01 c5 eb 1c e8 6a f2 9d ff f3 90 <41> 80 3c 24 00 0f 85 b0 04 00 00 48 83 bb 10 01 00 00 00 74 21 e8
>> RSP: 0018:ffffc90004e17a48 EFLAGS: 00000293
>> RAX: ffff888091758480 RBX: ffff888094860000 RCX: 1ffff920009c2f36
>> RDX: 0000000000000000 RSI: ffffffff81d53c26 RDI: ffff888094860300
>> RBP: ffffed101290c02c R08: 0000000000000001 R09: ffffed101290c061
>> R10: ffff888094860307 R11: ffffed101290c060 R12: ffffed101290c022
>> R13: ffff888094860160 R14: ffff888094860110 R15: ffffffff81d54170
>> FS:  00007fac6c1a8700(0000) GS:ffff8880ae600000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 0000560ad6a654a7 CR3: 0000000009879000 CR4: 00000000001406f0
>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> Call Trace:
>>  io_uring_release+0x3e/0x50 fs/io_uring.c:7352
>>  __fput+0x33e/0x880 fs/file_table.c:280
>>  task_work_run+0xf4/0x1b0 kernel/task_work.c:123
>>  exit_task_work include/linux/task_work.h:22 [inline]
>>  do_exit+0xb34/0x2dd0 kernel/exit.c:795
>>  do_group_exit+0x125/0x340 kernel/exit.c:893
>>  get_signal+0x47b/0x24e0 kernel/signal.c:2739
>>  do_signal+0x81/0x2240 arch/x86/kernel/signal.c:784
>>  exit_to_usermode_loop+0x26c/0x360 arch/x86/entry/common.c:161
>>  prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
>>  syscall_return_slowpath arch/x86/entry/common.c:279 [inline]
>>  do_syscall_64+0x6b1/0x7d0 arch/x86/entry/common.c:305
>>  entry_SYSCALL_64_after_hwframe+0x49/0xb3
>> RIP: 0033:0x45c829
>> Code: Bad RIP value.
>> RSP: 002b:00007fac6c1a7c78 EFLAGS: 00000246 ORIG_RAX: 00000000000001a9
>> RAX: 0000000000000003 RBX: 00000000004e0bc0 RCX: 000000000045c829
>> RDX: 0000000000000000 RSI: 0000000020000580 RDI: 00000000000000f1
>> RBP: 000000000078bf00 R08: 0000000000000000 R09: 0000000000000000
>> R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
>> R13: 0000000000000204 R14: 00000000004c425f R15: 00007fac6c1a86d4
>>
>>
>> ---
>> This bug is generated by a bot. It may contain errors.
>> See https://goo.gl/tpsmEJ for more information about syzbot.
>> syzbot engineers can be reached at [email protected].
>>
>> syzbot will keep track of this bug report. See:
>> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
>> For information about bisection process see: https://goo.gl/tpsmEJ#bisection
>> syzbot can test patches for this bug, for details see:
>> https://goo.gl/tpsmEJ#testing-patches

* Re: INFO: task hung in linkwatch_event (2)
  2020-04-29  9:59 INFO: task hung in linkwatch_event (2) syzbot
  2020-05-06  1:38 ` Yunsheng Lin
       [not found] ` <[email protected]>
@ 2020-12-11  2:25 ` syzbot
  2022-04-05  7:38 ` [syzbot] " syzbot
  3 siblings, 0 replies; 7+ messages in thread
From: syzbot @ 2020-12-11  2:25 UTC
  To: allison, andrew, aviad.krawczyk, axboe, davem, gregkh, hdanton,
	io-uring, kuba, linux-fsdevel, linux-kernel, linyunsheng, luobin9,
	netdev, syzkaller-bugs, tglx, viro, xiaoguang.wang

syzbot has found a reproducer for the following issue on:

HEAD commit:    a7105e34 Merge branch 'hns3-next'
git tree:       net-next
console output: https://syzkaller.appspot.com/x/log.txt?x=155af80f500000
kernel config:  https://syzkaller.appspot.com/x/.config?x=2ac2dabe250b3a58
dashboard link: https://syzkaller.appspot.com/bug?extid=96ff6cfc4551fcc29342
compiler:       gcc (GCC) 10.1.0-syz 20200507
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=11bc7b13500000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1674046b500000

The issue was bisected to:

commit 386d4716fd91869e07c731657f2cde5a33086516
Author: Luo bin <[email protected]>
Date:   Thu Feb 27 06:34:44 2020 +0000

    hinic: fix a bug of rss configuration

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=16626fcfe00000
final oops:     https://syzkaller.appspot.com/x/report.txt?x=15626fcfe00000
console output: https://syzkaller.appspot.com/x/log.txt?x=11626fcfe00000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: [email protected]
Fixes: 386d4716fd91 ("hinic: fix a bug of rss configuration")

INFO: task kworker/0:2:3004 blocked for more than 143 seconds.
      Not tainted 5.10.0-rc6-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/0:2     state:D stack:28448 pid: 3004 ppid:     2 flags:0x00004000
Workqueue: events linkwatch_event
Call Trace:
 context_switch kernel/sched/core.c:3779 [inline]
 __schedule+0x893/0x2130 kernel/sched/core.c:4528
 schedule+0xcf/0x270 kernel/sched/core.c:4606
 schedule_preempt_disabled+0xf/0x20 kernel/sched/core.c:4665
 __mutex_lock_common kernel/locking/mutex.c:1033 [inline]
 __mutex_lock+0x3e2/0x10e0 kernel/locking/mutex.c:1103
 linkwatch_event+0xb/0x60 net/core/link_watch.c:250
 process_one_work+0x933/0x15a0 kernel/workqueue.c:2272
 worker_thread+0x64c/0x1120 kernel/workqueue.c:2418
 kthread+0x3b1/0x4a0 kernel/kthread.c:292
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296
INFO: task kworker/0:0:8837 blocked for more than 143 seconds.
      Not tainted 5.10.0-rc6-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/0:0     state:D stack:29768 pid: 8837 ppid:     2 flags:0x00004000
Workqueue: ipv6_addrconf addrconf_verify_work
Call Trace:
 context_switch kernel/sched/core.c:3779 [inline]
 __schedule+0x893/0x2130 kernel/sched/core.c:4528
 schedule+0xcf/0x270 kernel/sched/core.c:4606
 schedule_preempt_disabled+0xf/0x20 kernel/sched/core.c:4665
 __mutex_lock_common kernel/locking/mutex.c:1033 [inline]
 __mutex_lock+0x3e2/0x10e0 kernel/locking/mutex.c:1103
 addrconf_verify_work+0xa/0x20 net/ipv6/addrconf.c:4569
 process_one_work+0x933/0x15a0 kernel/workqueue.c:2272
 worker_thread+0x64c/0x1120 kernel/workqueue.c:2418
 kthread+0x3b1/0x4a0 kernel/kthread.c:292
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296

Showing all locks held in the system:
1 lock held by khungtaskd/1655:
 #0: ffffffff8b337a20 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x53/0x260 kernel/locking/lockdep.c:6254
3 locks held by kworker/0:2/3004:
 #0: ffff888010064d38 ((wq_completion)events){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
 #0: ffff888010064d38 ((wq_completion)events){+.+.}-{0:0}, at: atomic64_set include/asm-generic/atomic-instrumented.h:856 [inline]
 #0: ffff888010064d38 ((wq_completion)events){+.+.}-{0:0}, at: atomic_long_set include/asm-generic/atomic-long.h:41 [inline]
 #0: ffff888010064d38 ((wq_completion)events){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:616 [inline]
 #0: ffff888010064d38 ((wq_completion)events){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:643 [inline]
 #0: ffff888010064d38 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x821/0x15a0 kernel/workqueue.c:2243
 #1: ffffc90001dafda8 ((linkwatch_work).work){+.+.}-{0:0}, at: process_one_work+0x854/0x15a0 kernel/workqueue.c:2247
 #2: ffffffff8c92d448 (rtnl_mutex){+.+.}-{3:3}, at: linkwatch_event+0xb/0x60 net/core/link_watch.c:250
1 lock held by in:imklog/8186:
 #0: ffff888017c900f0 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0xe9/0x100 fs/file.c:932
2 locks held by syz-executor047/8830:
3 locks held by kworker/0:0/8837:
 #0: ffff888147499138 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
 #0: ffff888147499138 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: atomic64_set include/asm-generic/atomic-instrumented.h:856 [inline]
 #0: ffff888147499138 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: atomic_long_set include/asm-generic/atomic-long.h:41 [inline]
 #0: ffff888147499138 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:616 [inline]
 #0: ffff888147499138 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:643 [inline]
 #0: ffff888147499138 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: process_one_work+0x821/0x15a0 kernel/workqueue.c:2243
 #1: ffffc90001aefda8 ((addr_chk_work).work){+.+.}-{0:0}, at: process_one_work+0x854/0x15a0 kernel/workqueue.c:2247
 #2: ffffffff8c92d448 (rtnl_mutex){+.+.}-{3:3}, at: addrconf_verify_work+0xa/0x20 net/ipv6/addrconf.c:4569

=============================================

NMI backtrace for cpu 1
CPU: 1 PID: 1655 Comm: khungtaskd Not tainted 5.10.0-rc6-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x107/0x163 lib/dump_stack.c:118
 nmi_cpu_backtrace.cold+0x44/0xd7 lib/nmi_backtrace.c:105
 nmi_trigger_cpumask_backtrace+0x1b3/0x230 lib/nmi_backtrace.c:62
 trigger_all_cpu_backtrace include/linux/nmi.h:146 [inline]
 check_hung_uninterruptible_tasks kernel/hung_task.c:209 [inline]
 watchdog+0xd43/0xfa0 kernel/hung_task.c:294
 kthread+0x3b1/0x4a0 kernel/kthread.c:292
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296
Sending NMI from CPU 1 to CPUs 0:
NMI backtrace for cpu 0
CPU: 0 PID: 8830 Comm: syz-executor047 Not tainted 5.10.0-rc6-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:__this_cpu_preempt_check+0x0/0x20 lib/smp_processor_id.c:64
Code: 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 c7 c6 00 ae 9d 89 48 c7 c7 40 ae 9d 89 e9 b8 fe ff ff 0f 1f 84 00 00 00 00 00 <55> 48 89 fd 0f 1f 44 00 00 48 89 ee 5d 48 c7 c7 80 ae 9d 89 e9 97
RSP: 0018:ffffc90001a2eb50 EFLAGS: 00000082
RAX: 0000000000000001 RBX: 1ffff92000345d6d RCX: 0000000000000001
RDX: 1ffff11002f507b2 RSI: 0000000000000008 RDI: ffffffff894b60c0
RBP: 0000000000000001 R08: 0000000000000000 R09: ffffffff8ebb6727
R10: fffffbfff1d76ce4 R11: 0000000000000000 R12: 0000000000000000
R13: ffff88801433fa68 R14: 0000000000000000 R15: 0000000000000000
FS:  00007fc7c7ab9700(0000) GS:ffff8880b9e00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fc7c7a97e78 CR3: 000000001292b000 CR4: 00000000001506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 lockdep_recursion_finish kernel/locking/lockdep.c:437 [inline]
 lock_acquire kernel/locking/lockdep.c:5439 [inline]
 lock_acquire+0x2ad/0x740 kernel/locking/lockdep.c:5402
 __mutex_lock_common kernel/locking/mutex.c:956 [inline]
 __mutex_lock+0x134/0x10e0 kernel/locking/mutex.c:1103
 tcf_idr_check_alloc+0x78/0x3b0 net/sched/act_api.c:549
 tcf_police_init+0x347/0x13a0 net/sched/act_police.c:81
 tcf_action_init_1+0x1a3/0x990 net/sched/act_api.c:1013
 tcf_exts_validate+0x138/0x420 net/sched/cls_api.c:3046
 cls_bpf_set_parms net/sched/cls_bpf.c:422 [inline]
 cls_bpf_change+0x60b/0x1b80 net/sched/cls_bpf.c:506
 tc_new_tfilter+0x1394/0x2120 net/sched/cls_api.c:2127
 rtnetlink_rcv_msg+0x80e/0xad0 net/core/rtnetlink.c:5553
 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2494
 netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline]
 netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1330
 netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1919
 sock_sendmsg_nosec net/socket.c:651 [inline]
 sock_sendmsg+0xcf/0x120 net/socket.c:671
 ____sys_sendmsg+0x331/0x810 net/socket.c:2331
 ___sys_sendmsg+0xf3/0x170 net/socket.c:2385
 __sys_sendmmsg+0x195/0x470 net/socket.c:2475
 __do_sys_sendmmsg net/socket.c:2504 [inline]
 __se_sys_sendmmsg net/socket.c:2501 [inline]
 __x64_sys_sendmmsg+0x99/0x100 net/socket.c:2501
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x447219
Code: e8 bc b4 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 2b 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007fc7c7ab8d98 EFLAGS: 00000246 ORIG_RAX: 0000000000000133
RAX: ffffffffffffffda RBX: 00000000006dcc88 RCX: 0000000000447219
RDX: 010efe10675dec16 RSI: 0000000020000200 RDI: 0000000000000004
RBP: 00000000006dcc80 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006dcc8c
R13: 0000000000000000 R14: 0000000000000000 R15: 0507002400000074
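
The "Showing all locks held in the system" dump in this report is long mostly because inlined frames repeat the same lock entry several times. As a reading aid, here is a minimal Python sketch that groups the listed tasks by lock class name. The names `locks_by_name`, `HOLDER_RE`, `LOCK_RE` and `SAMPLE` are hypothetical helpers for this illustration only and are not part of syzbot or the kernel; the regular expressions assume the line layout seen in this report. Note that a task appearing under a lock is not necessarily its owner: in this report the kworker tasks listed under rtnl_mutex are the same ones shown blocked in __mutex_lock().

```python
import re
from collections import defaultdict

# Header line, e.g. "3 locks held by kworker/0:2/3004:".
HOLDER_RE = re.compile(r"^\d+ locks? held by (?P<task>.+)/(?P<pid>\d+):$")
# Lock line, e.g. "#2: ffffffff8c92d448 (rtnl_mutex){+.+.}-{3:3}, at: ...".
LOCK_RE = re.compile(r"^#\d+: [0-9a-f]+ \((?P<name>.+)\)\{")


def locks_by_name(dump):
    """Return {lock class name: sorted list of "task/pid" entries listing it}."""
    tasks = defaultdict(set)
    current = None
    for raw in dump.splitlines():
        # Drop mail quote markers ("> ", ">> ") and surrounding whitespace.
        line = raw.lstrip("> ").strip()
        holder = HOLDER_RE.match(line)
        if holder:
            current = "{}/{}".format(holder["task"], holder["pid"])
            continue
        lock = LOCK_RE.match(line)
        if lock and current is not None:
            # A set removes the repeats caused by inlined frames.
            tasks[lock["name"]].add(current)
    return {name: sorted(entries) for name, entries in tasks.items()}


# A few lines copied from the dump in this report.
SAMPLE = """\
3 locks held by kworker/0:2/3004:
 #0: ffff888010064d38 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x821/0x15a0 kernel/workqueue.c:2243
 #2: ffffffff8c92d448 (rtnl_mutex){+.+.}-{3:3}, at: linkwatch_event+0xb/0x60 net/core/link_watch.c:250
2 locks held by syz-executor047/8830:
3 locks held by kworker/0:0/8837:
 #2: ffffffff8c92d448 (rtnl_mutex){+.+.}-{3:3}, at: addrconf_verify_work+0xa/0x20 net/ipv6/addrconf.c:4569
"""

if __name__ == "__main__":
    for name, entries in sorted(locks_by_name(SAMPLE).items()):
        print(name, entries)
```

In this dump syz-executor047/8830 is reported with 2 locks but no lock lines, so the sketch cannot attribute anything to it; the NMI backtrace for CPU 0 is the only place that shows what that task was doing.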


* Re: [syzbot] INFO: task hung in linkwatch_event (2)
  2020-04-29  9:59 INFO: task hung in linkwatch_event (2) syzbot
                   ` (2 preceding siblings ...)
  2020-12-11  2:25 ` syzbot
@ 2022-04-05  7:38 ` syzbot
  2022-05-12 13:26   ` Dmitry Vyukov
  3 siblings, 1 reply; 7+ messages in thread
From: syzbot @ 2022-04-05  7:38 UTC
  To: allison, andrew, aviad.krawczyk, axboe, davem, gregkh, hdanton,
	io-uring, johannes.berg, johannes, kuba, linux-fsdevel,
	linux-kernel, linux-wireless, linyunsheng, luobin9, netdev,
	pabeni, phind.uet, syzkaller-bugs, tglx, viro, xiaoguang.wang

syzbot suspects this issue was fixed by commit:

commit 563fbefed46ae4c1f70cffb8eb54c02df480b2c2
Author: Nguyen Dinh Phi <[email protected]>
Date:   Wed Oct 27 17:37:22 2021 +0000

    cfg80211: call cfg80211_stop_ap when switch from P2P_GO type

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=1048725f700000
start commit:   dd86e7fa07a3 Merge tag 'pci-v5.11-fixes-2' of git://git.ke..
git tree:       upstream
kernel config:  https://syzkaller.appspot.com/x/.config?x=e83e68d0a6aba5f6
dashboard link: https://syzkaller.appspot.com/bug?extid=96ff6cfc4551fcc29342
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=11847bc4d00000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1267e5a0d00000

If the result looks correct, please mark the issue as fixed by replying with:

#syz fix: cfg80211: call cfg80211_stop_ap when switch from P2P_GO type

For information about bisection process see: https://goo.gl/tpsmEJ#bisection

* Re: [syzbot] INFO: task hung in linkwatch_event (2)
  2022-04-05  7:38 ` [syzbot] " syzbot
@ 2022-05-12 13:26   ` Dmitry Vyukov
  0 siblings, 0 replies; 7+ messages in thread
From: Dmitry Vyukov @ 2022-05-12 13:26 UTC
  To: syzbot
  Cc: allison, andrew, aviad.krawczyk, axboe, davem, gregkh, hdanton,
	io-uring, johannes.berg, johannes, kuba, linux-fsdevel,
	linux-kernel, linux-wireless, linyunsheng, luobin9, netdev,
	pabeni, phind.uet, syzkaller-bugs, tglx, viro, xiaoguang.wang

On Tue, 5 Apr 2022 at 09:38, syzbot
<[email protected]> wrote:
>
> syzbot suspects this issue was fixed by commit:
>
> commit 563fbefed46ae4c1f70cffb8eb54c02df480b2c2
> Author: Nguyen Dinh Phi <[email protected]>
> Date:   Wed Oct 27 17:37:22 2021 +0000
>
>     cfg80211: call cfg80211_stop_ap when switch from P2P_GO type
>
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=1048725f700000
> start commit:   dd86e7fa07a3 Merge tag 'pci-v5.11-fixes-2' of git://git.ke..
> git tree:       upstream
> kernel config:  https://syzkaller.appspot.com/x/.config?x=e83e68d0a6aba5f6
> dashboard link: https://syzkaller.appspot.com/bug?extid=96ff6cfc4551fcc29342
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=11847bc4d00000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1267e5a0d00000
>
> If the result looks correct, please mark the issue as fixed by replying with:
>
> #syz fix: cfg80211: call cfg80211_stop_ap when switch from P2P_GO type

Looks possible:

#syz fix: cfg80211: call cfg80211_stop_ap when switch from P2P_GO type

end of thread, other threads:[~2022-05-12 13:26 UTC | newest]

Thread overview: 7+ messages
2020-04-29  9:59 INFO: task hung in linkwatch_event (2) syzbot
2020-05-06  1:38 ` Yunsheng Lin
     [not found] ` <[email protected]>
2020-05-06 11:20   ` Yunsheng Lin
2020-12-11  2:25 ` syzbot
2022-04-05  7:38 ` [syzbot] " syzbot
2022-05-12 13:26   ` Dmitry Vyukov
     [not found] <[email protected]>
2020-05-06 11:22 ` Yunsheng Lin
