public inbox for [email protected]
 help / color / mirror / Atom feed
* INFO: task hung in __io_uring_files_cancel
@ 2020-11-21 22:35 syzbot
       [not found] ` <[email protected]>
  2020-11-24  9:39 ` syzbot
  0 siblings, 2 replies; 5+ messages in thread
From: syzbot @ 2020-11-21 22:35 UTC (permalink / raw)
  To: axboe, io-uring, linux-fsdevel, linux-kernel, mingo, mingo,
	peterz, rostedt, syzkaller-bugs, viro, will

Hello,

syzbot found the following issue on:

HEAD commit:    7c8ca812 Add linux-next specific files for 20201117
git tree:       linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=12d2d191500000
kernel config:  https://syzkaller.appspot.com/x/.config?x=ff4bc71371dc5b13
dashboard link: https://syzkaller.appspot.com/bug?extid=c0d52d0b3c0c3ffb9525
compiler:       gcc (GCC) 10.1.0-syz 20200507
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1481166a500000

The issue was bisected to:

commit 4d004099a668c41522242aa146a38cc4eb59cb1e
Author: Peter Zijlstra <[email protected]>
Date:   Fri Oct 2 09:04:21 2020 +0000

    lockdep: Fix lockdep recursion

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=10401726500000
final oops:     https://syzkaller.appspot.com/x/report.txt?x=12401726500000
console output: https://syzkaller.appspot.com/x/log.txt?x=14401726500000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: [email protected]
Fixes: 4d004099a668 ("lockdep: Fix lockdep recursion")

INFO: task syz-executor.0:9557 blocked for more than 143 seconds.
      Not tainted 5.10.0-rc4-next-20201117-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor.0  state:D stack:28584 pid: 9557 ppid:  8485 flags:0x00004002
Call Trace:
 context_switch kernel/sched/core.c:4269 [inline]
 __schedule+0x890/0x2030 kernel/sched/core.c:5019
 schedule+0xcf/0x270 kernel/sched/core.c:5098
 io_uring_cancel_files fs/io_uring.c:8720 [inline]
 io_uring_cancel_task_requests fs/io_uring.c:8772 [inline]
 __io_uring_files_cancel+0xc4d/0x14b0 fs/io_uring.c:8868
 io_uring_files_cancel include/linux/io_uring.h:51 [inline]
 exit_files+0xe4/0x170 fs/file.c:456
 do_exit+0xb61/0x29f0 kernel/exit.c:818
 do_group_exit+0x125/0x310 kernel/exit.c:920
 get_signal+0x3ea/0x1f70 kernel/signal.c:2750
 arch_do_signal_or_restart+0x2a6/0x1ea0 arch/x86/kernel/signal.c:811
 handle_signal_work kernel/entry/common.c:145 [inline]
 exit_to_user_mode_loop kernel/entry/common.c:169 [inline]
 exit_to_user_mode_prepare+0x124/0x200 kernel/entry/common.c:199
 syscall_exit_to_user_mode+0x38/0x260 kernel/entry/common.c:274
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x45deb9
Code: Unable to access opcode bytes at RIP 0x45de8f.
RSP: 002b:00007fa68397ccf8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: fffffffffffffe00 RBX: 000000000118bf28 RCX: 000000000045deb9
RDX: 0000000000000000 RSI: 0000000000000080 RDI: 000000000118bf28
RBP: 000000000118bf20 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000118bf2c
R13: 00007fff50acc9af R14: 00007fa68397d9c0 R15: 000000000118bf2c

Showing all locks held in the system:
1 lock held by khungtaskd/1654:
 #0: ffffffff8b339ca0 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x53/0x260 kernel/locking/lockdep.c:6252
1 lock held by in:imklog/8177:
 #0: ffff88801cf22d70 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0xe9/0x100 fs/file.c:932

=============================================

NMI backtrace for cpu 1
CPU: 1 PID: 1654 Comm: khungtaskd Not tainted 5.10.0-rc4-next-20201117-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x107/0x163 lib/dump_stack.c:120
 nmi_cpu_backtrace.cold+0x44/0xd7 lib/nmi_backtrace.c:105
 nmi_trigger_cpumask_backtrace+0x1b3/0x230 lib/nmi_backtrace.c:62
 trigger_all_cpu_backtrace include/linux/nmi.h:147 [inline]
 check_hung_uninterruptible_tasks kernel/hung_task.c:253 [inline]
 watchdog+0xd89/0xf30 kernel/hung_task.c:338
 kthread+0x3af/0x4a0 kernel/kthread.c:292
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296
Sending NMI from CPU 1 to CPUs 0:
NMI backtrace for cpu 0
CPU: 0 PID: 4887 Comm: systemd-journal Not tainted 5.10.0-rc4-next-20201117-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:__sanitizer_cov_trace_const_cmp4+0x4/0x20 kernel/kcov.c:284
Code: 84 00 00 00 00 00 48 8b 0c 24 0f b7 d6 0f b7 f7 bf 03 00 00 00 e9 cc fe ff ff 66 90 66 2e 0f 1f 84 00 00 00 00 00 48 8b 0c 24 <89> f2 89 fe bf 05 00 00 00 e9 ae fe ff ff 0f 1f 40 00 66 2e 0f 1f
RSP: 0018:ffffc90000f0fe90 EFLAGS: 00000206
RAX: 0000000000000000 RBX: 0000000000080042 RCX: ffffffff81bd2a21
RDX: ffff88802594b500 RSI: 0000000000000040 RDI: 0000000000000000
RBP: 1ffff920001e1fd3 R08: 0000000000000000 R09: ffff8880110f3c17
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000040
R13: 00000000000001a0 R14: 00005616bdf3c270 R15: 0000000000000000
FS:  00007f934ff098c0(0000) GS:ffff8880b9e00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000a47b58 CR3: 0000000026162000 CR4: 00000000001506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 build_open_how fs/open.c:989 [inline]
 do_sys_open fs/open.c:1183 [inline]
 __do_sys_open fs/open.c:1192 [inline]
 __se_sys_open fs/open.c:1188 [inline]
 __x64_sys_open+0x171/0x1c0 fs/open.c:1188
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f934f499840
Code: 73 01 c3 48 8b 0d 68 77 20 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 89 bb 20 00 00 75 10 b8 02 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 1e f6 ff ff 48 89 04 24
RSP: 002b:00007ffc6cf30568 EFLAGS: 00000246 ORIG_RAX: 0000000000000002
RAX: ffffffffffffffda RBX: 00007ffc6cf30870 RCX: 00007f934f499840
RDX: 00000000000001a0 RSI: 0000000000080042 RDI: 00005616bdf3c270
RBP: 000000000000000d R08: 0000000000000000 R09: 00000000ffffffff
R10: 0000000000000069 R11: 0000000000000246 R12: 00000000ffffffff
R13: 00005616bdf30040 R14: 00007ffc6cf30830 R15: 00005616bdf3cef0


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at [email protected].

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: INFO: task hung in __io_uring_files_cancel
       [not found] ` <[email protected]>
@ 2020-11-22  3:32   ` Pavel Begunkov
       [not found]   ` <[email protected]>
  1 sibling, 0 replies; 5+ messages in thread
From: Pavel Begunkov @ 2020-11-22  3:32 UTC (permalink / raw)
  To: Hillf Danton, syzbot
  Cc: axboe, io-uring, linux-fsdevel, linux-kernel, peterz,
	syzkaller-bugs, viro, will

On 22/11/2020 03:04, Hillf Danton wrote:
> On Sat, 21 Nov 2020 14:35:15 -0800
>> syzbot found the following issue on:
>>
>> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=10401726500000
>> final oops:     https://syzkaller.appspot.com/x/report.txt?x=12401726500000
>> console output: https://syzkaller.appspot.com/x/log.txt?x=14401726500000
>>
>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>> Reported-by: [email protected]
>> Fixes: 4d004099a668 ("lockdep: Fix lockdep recursion")
>>
>> INFO: task syz-executor.0:9557 blocked for more than 143 seconds.
>>       Not tainted 5.10.0-rc4-next-20201117-syzkaller #0
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> task:syz-executor.0  state:D stack:28584 pid: 9557 ppid:  8485 flags:0x00004002
>> Call Trace:
>>  context_switch kernel/sched/core.c:4269 [inline]
>>  __schedule+0x890/0x2030 kernel/sched/core.c:5019
>>  schedule+0xcf/0x270 kernel/sched/core.c:5098
>>  io_uring_cancel_files fs/io_uring.c:8720 [inline]
>>  io_uring_cancel_task_requests fs/io_uring.c:8772 [inline]
>>  __io_uring_files_cancel+0xc4d/0x14b0 fs/io_uring.c:8868
>>  io_uring_files_cancel include/linux/io_uring.h:51 [inline]
>>  exit_files+0xe4/0x170 fs/file.c:456
>>  do_exit+0xb61/0x29f0 kernel/exit.c:818
>>  do_group_exit+0x125/0x310 kernel/exit.c:920
>>  get_signal+0x3ea/0x1f70 kernel/signal.c:2750
>>  arch_do_signal_or_restart+0x2a6/0x1ea0 arch/x86/kernel/signal.c:811
>>  handle_signal_work kernel/entry/common.c:145 [inline]
>>  exit_to_user_mode_loop kernel/entry/common.c:169 [inline]
>>  exit_to_user_mode_prepare+0x124/0x200 kernel/entry/common.c:199
>>  syscall_exit_to_user_mode+0x38/0x260 kernel/entry/common.c:274
>>  entry_SYSCALL_64_after_hwframe+0x44/0xa9
>> RIP: 0033:0x45deb9
>> Code: Unable to access opcode bytes at RIP 0x45de8f.
>> RSP: 002b:00007fa68397ccf8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
>> RAX: fffffffffffffe00 RBX: 000000000118bf28 RCX: 000000000045deb9
>> RDX: 0000000000000000 RSI: 0000000000000080 RDI: 000000000118bf28
>> RBP: 000000000118bf20 R08: 0000000000000000 R09: 0000000000000000
>> R10: 0000000000000000 R11: 0000000000000246 R12: 000000000118bf2c
>> R13: 00007fff50acc9af R14: 00007fa68397d9c0 R15: 000000000118bf2c
>>
...
> 
> Fix 311daef8013a ("io_uring: replace inflight_wait with tctx->wait")
> by cutting the pre-condition to wakeup because waitqueue_active()
> speaks the language in the East End while atomic_read() may speak
> the language in Paris.

Your description doesn't help, why do you think this is the problem?
->in_idle is always set when io_uring_cancel_files() sleeps on it,
and ->inflight_lock should guarantee ordering.

> 
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -6082,8 +6082,7 @@ static void io_req_drop_files(struct io_
>  
>  	spin_lock_irqsave(&ctx->inflight_lock, flags);
>  	list_del(&req->inflight_entry);
> -	if (atomic_read(&tctx->in_idle))
> -		wake_up(&tctx->wait);
> +	wake_up(&tctx->wait);
>  	spin_unlock_irqrestore(&ctx->inflight_lock, flags);
>  	req->flags &= ~REQ_F_INFLIGHT;
>  	put_files_struct(req->work.identity->files);
> 

-- 
Pavel Begunkov

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: INFO: task hung in __io_uring_files_cancel
       [not found]   ` <[email protected]>
@ 2020-11-22 15:57     ` Pavel Begunkov
       [not found]     ` <[email protected]>
  1 sibling, 0 replies; 5+ messages in thread
From: Pavel Begunkov @ 2020-11-22 15:57 UTC (permalink / raw)
  To: Hillf Danton
  Cc: syzbot, axboe, io-uring, linux-fsdevel, linux-kernel, peterz,
	syzkaller-bugs, viro, will

On 22/11/2020 09:20, Hillf Danton wrote:
> On Sun, 22 Nov 2020 03:32:01 +0000 Pavel Begunkov wrote:
>> On 22/11/2020 03:04, Hillf Danton wrote:
>>> On Sat, 21 Nov 2020 14:35:15 -0800
>>>> syzbot found the following issue on:
>>>>
>>>> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=10401726500000
>>>> final oops:     https://syzkaller.appspot.com/x/report.txt?x=12401726500000
>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=14401726500000
>>>>
>>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>>>> Reported-by: [email protected]
>>>> Fixes: 4d004099a668 ("lockdep: Fix lockdep recursion")
>>>>
>>>> INFO: task syz-executor.0:9557 blocked for more than 143 seconds.
>>>>       Not tainted 5.10.0-rc4-next-20201117-syzkaller #0
>>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>> task:syz-executor.0  state:D stack:28584 pid: 9557 ppid:  8485 flags:0x00004002
>>>> Call Trace:
>>>>  context_switch kernel/sched/core.c:4269 [inline]
>>>>  __schedule+0x890/0x2030 kernel/sched/core.c:5019
>>>>  schedule+0xcf/0x270 kernel/sched/core.c:5098
>>>>  io_uring_cancel_files fs/io_uring.c:8720 [inline]
>>>>  io_uring_cancel_task_requests fs/io_uring.c:8772 [inline]
>>>>  __io_uring_files_cancel+0xc4d/0x14b0 fs/io_uring.c:8868
>>>>  io_uring_files_cancel include/linux/io_uring.h:51 [inline]
>>>>  exit_files+0xe4/0x170 fs/file.c:456
>>>>  do_exit+0xb61/0x29f0 kernel/exit.c:818
>>>>  do_group_exit+0x125/0x310 kernel/exit.c:920
>>>>  get_signal+0x3ea/0x1f70 kernel/signal.c:2750
>>>>  arch_do_signal_or_restart+0x2a6/0x1ea0 arch/x86/kernel/signal.c:811
>>>>  handle_signal_work kernel/entry/common.c:145 [inline]
>>>>  exit_to_user_mode_loop kernel/entry/common.c:169 [inline]
>>>>  exit_to_user_mode_prepare+0x124/0x200 kernel/entry/common.c:199
>>>>  syscall_exit_to_user_mode+0x38/0x260 kernel/entry/common.c:274
>>>>  entry_SYSCALL_64_after_hwframe+0x44/0xa9
>>>> RIP: 0033:0x45deb9
>>>> Code: Unable to access opcode bytes at RIP 0x45de8f.
>>>> RSP: 002b:00007fa68397ccf8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
>>>> RAX: fffffffffffffe00 RBX: 000000000118bf28 RCX: 000000000045deb9
>>>> RDX: 0000000000000000 RSI: 0000000000000080 RDI: 000000000118bf28
>>>> RBP: 000000000118bf20 R08: 0000000000000000 R09: 0000000000000000
>>>> R10: 0000000000000000 R11: 0000000000000246 R12: 000000000118bf2c
>>>> R13: 00007fff50acc9af R14: 00007fa68397d9c0 R15: 000000000118bf2c
>>>>
>> ...
>>>
>>> Fix 311daef8013a ("io_uring: replace inflight_wait with tctx->wait")
>>> by cutting the pre-condition to wakeup because waitqueue_active()
>>> speaks the language in the East End while atomic_read() may speak
>>> the language in Paris.
>>
>> Your description doesn't help,
> 
> That is what I could find to describe the changs added in
> 311daef8013a, given that waitqueue_active() and atomic_read() IMO are
> primary bricks without natural links prebuilt. I dont think either
> of them could be replaced with another even in 311daef8013a.

There are not linked, but that doesn't make it automatically wrong.
I still don't understand how it fixes the problem, and it's better
to find the root cause because there are similar places that might
still be similarly flawed.

>> why do you think this is the problem?
> 
> This is not a tough question, thanks to the reproducer.
> 
>> ->in_idle is always set when io_uring_cancel_files() sleeps on it,
>> and ->inflight_lock should guarantee ordering.
> 
> The syzbot report makes sense, right?
> 
>>>
>>> --- a/fs/io_uring.c
>>> +++ b/fs/io_uring.c
>>> @@ -6082,8 +6082,7 @@ static void io_req_drop_files(struct io_
>>>  
>>>  	spin_lock_irqsave(&ctx->inflight_lock, flags);
>>>  	list_del(&req->inflight_entry);
>>> -	if (atomic_read(&tctx->in_idle))
>>> -		wake_up(&tctx->wait);
>>> +	wake_up(&tctx->wait);
>>>  	spin_unlock_irqrestore(&ctx->inflight_lock, flags);
>>>  	req->flags &= ~REQ_F_INFLIGHT;
>>>  	put_files_struct(req->work.identity->files);

-- 
Pavel Begunkov

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: INFO: task hung in __io_uring_files_cancel
       [not found]     ` <[email protected]>
@ 2020-11-23 20:25       ` Pavel Begunkov
  0 siblings, 0 replies; 5+ messages in thread
From: Pavel Begunkov @ 2020-11-23 20:25 UTC (permalink / raw)
  To: Hillf Danton
  Cc: syzbot, axboe, io-uring, linux-fsdevel, linux-kernel, peterz,
	syzkaller-bugs, viro, will

On 23/11/2020 07:34, Hillf Danton wrote:
[...]
> After staring the report and 311daef8013a a bit more, it seems that we
> can have a simpler fix without the help of wakeup. It is implemented
> by busy waiting until there is no more request in flight found.

I think it's better not. It doesn't happen instantly, so may take a
lot of spinning in some cases. Moreover, I don't want an unkillable
task eating up all CPU if this hangs again (eg because of some other
case).

And, I'd love to keep it working similarly to __io_uring_task_cancel()
to not think twice about all corner cases.

> 
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -6077,13 +6077,10 @@ static int io_req_defer(struct io_kiocb
>  static void io_req_drop_files(struct io_kiocb *req)
>  {
>  	struct io_ring_ctx *ctx = req->ctx;
> -	struct io_uring_task *tctx = req->task->io_uring;
>  	unsigned long flags;
>  
>  	spin_lock_irqsave(&ctx->inflight_lock, flags);
>  	list_del(&req->inflight_entry);
> -	if (atomic_read(&tctx->in_idle))
> -		wake_up(&tctx->wait);
>  	spin_unlock_irqrestore(&ctx->inflight_lock, flags);
>  	req->flags &= ~REQ_F_INFLIGHT;
>  	put_files_struct(req->work.identity->files);
> @@ -8706,7 +8703,6 @@ static void io_uring_cancel_files(struct
>  	while (!list_empty_careful(&ctx->inflight_list)) {
>  		struct io_task_cancel cancel = { .task = task, .files = NULL, };
>  		struct io_kiocb *req;
> -		DEFINE_WAIT(wait);
>  		bool found = false;
>  
>  		spin_lock_irq(&ctx->inflight_lock);
> @@ -8718,9 +8714,6 @@ static void io_uring_cancel_files(struct
>  			found = true;
>  			break;
>  		}
> -		if (found)
> -			prepare_to_wait(&task->io_uring->wait, &wait,
> -					TASK_UNINTERRUPTIBLE);
>  		spin_unlock_irq(&ctx->inflight_lock);
>  
>  		/* We need to keep going until we don't find a matching req */
> @@ -8733,7 +8726,6 @@ static void io_uring_cancel_files(struct
>  		/* cancellations _may_ trigger task work */
>  		io_run_task_work();
>  		schedule();
> -		finish_wait(&task->io_uring->wait, &wait);
>  	}
>  }
>  
> 

-- 
Pavel Begunkov

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: INFO: task hung in __io_uring_files_cancel
  2020-11-21 22:35 INFO: task hung in __io_uring_files_cancel syzbot
       [not found] ` <[email protected]>
@ 2020-11-24  9:39 ` syzbot
  1 sibling, 0 replies; 5+ messages in thread
From: syzbot @ 2020-11-24  9:39 UTC (permalink / raw)
  To: asml.silence, axboe, hdanton, io-uring, linux-fsdevel,
	linux-kernel, mingo, mingo, peterz, rostedt, syzkaller-bugs, viro,
	will

syzbot has found a reproducer for the following issue on:

HEAD commit:    62918e6f Add linux-next specific files for 20201123
git tree:       linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=16efe369500000
kernel config:  https://syzkaller.appspot.com/x/.config?x=5c3a457d7bf812a8
dashboard link: https://syzkaller.appspot.com/bug?extid=c0d52d0b3c0c3ffb9525
compiler:       gcc (GCC) 10.1.0-syz 20200507
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=153ffcae500000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=10b2120d500000

The issue was bisected to:

commit 4d004099a668c41522242aa146a38cc4eb59cb1e
Author: Peter Zijlstra <[email protected]>
Date:   Fri Oct 2 09:04:21 2020 +0000

    lockdep: Fix lockdep recursion

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=10401726500000
final oops:     https://syzkaller.appspot.com/x/report.txt?x=12401726500000
console output: https://syzkaller.appspot.com/x/log.txt?x=14401726500000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: [email protected]
Fixes: 4d004099a668 ("lockdep: Fix lockdep recursion")

INFO: task syz-executor136:9222 blocked for more than 143 seconds.
      Not tainted 5.10.0-rc4-next-20201123-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor136 state:D stack:28952 pid: 9222 ppid:  8504 flags:0x00004000
Call Trace:
 context_switch kernel/sched/core.c:4316 [inline]
 __schedule+0x896/0x2050 kernel/sched/core.c:5067
 schedule+0xcf/0x270 kernel/sched/core.c:5146
 io_uring_cancel_files fs/io_uring.c:8745 [inline]
 io_uring_cancel_task_requests fs/io_uring.c:8797 [inline]
 __io_uring_files_cancel+0xc4d/0x14b0 fs/io_uring.c:8893
 io_uring_files_cancel include/linux/io_uring.h:51 [inline]
 exit_files+0xe4/0x170 fs/file.c:456
 do_exit+0xb61/0x29f0 kernel/exit.c:818
 do_group_exit+0x125/0x310 kernel/exit.c:920
 get_signal+0x3ec/0x2010 kernel/signal.c:2770
 arch_do_signal_or_restart+0x2a8/0x1eb0 arch/x86/kernel/signal.c:811
 handle_signal_work kernel/entry/common.c:144 [inline]
 exit_to_user_mode_loop kernel/entry/common.c:168 [inline]
 exit_to_user_mode_prepare+0x124/0x200 kernel/entry/common.c:198
 syscall_exit_to_user_mode+0x36/0x260 kernel/entry/common.c:275
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x446c69
RSP: 002b:00007f96da049da8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: fffffffffffffe00 RBX: 00000000006dbc58 RCX: 0000000000446c69
RDX: 0000000000000000 RSI: 0000000000000080 RDI: 00000000006dbc58
RBP: 00000000006dbc50 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006dbc5c
R13: 0000000000000004 R14: 00007f96da04a9c0 R15: 000000000000002d
INFO: task syz-executor136:9685 blocked for more than 143 seconds.
      Not tainted 5.10.0-rc4-next-20201123-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor136 state:D stack:28712 pid: 9685 ppid:  8502 flags:0x00004000
Call Trace:
 context_switch kernel/sched/core.c:4316 [inline]
 __schedule+0x896/0x2050 kernel/sched/core.c:5067
 schedule+0xcf/0x270 kernel/sched/core.c:5146
 io_uring_cancel_files fs/io_uring.c:8745 [inline]
 io_uring_cancel_task_requests fs/io_uring.c:8797 [inline]
 __io_uring_files_cancel+0xc4d/0x14b0 fs/io_uring.c:8893
 io_uring_files_cancel include/linux/io_uring.h:51 [inline]
 exit_files+0xe4/0x170 fs/file.c:456
 do_exit+0xb61/0x29f0 kernel/exit.c:818
 do_group_exit+0x125/0x310 kernel/exit.c:920
 get_signal+0x3ec/0x2010 kernel/signal.c:2770
 arch_do_signal_or_restart+0x2a8/0x1eb0 arch/x86/kernel/signal.c:811
 handle_signal_work kernel/entry/common.c:144 [inline]
 exit_to_user_mode_loop kernel/entry/common.c:168 [inline]
 exit_to_user_mode_prepare+0x124/0x200 kernel/entry/common.c:198
 syscall_exit_to_user_mode+0x36/0x260 kernel/entry/common.c:275
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x446c69
RSP: 002b:00007f96da049da8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: fffffffffffffe00 RBX: 00000000006dbc58 RCX: 0000000000446c69
RDX: 0000000000000000 RSI: 0000000000000080 RDI: 00000000006dbc58
RBP: 00000000006dbc50 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006dbc5c
R13: 0000000000000004 R14: 00007f96da04a9c0 R15: 000000000000002d

Showing all locks held in the system:
2 locks held by kworker/u4:3/102:
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: atomic64_set include/asm-generic/atomic-instrumented.h:856 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: atomic_long_set include/asm-generic/atomic-long.h:41 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:616 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:643 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x821/0x15a0 kernel/workqueue.c:2243
 #1: ffff8880b9f20088 (&per_cpu_ptr(group->pcpu, cpu)->seq){-.-.}-{0:0}, at: psi_task_switch+0x305/0x440 kernel/sched/psi.c:833
1 lock held by khungtaskd/1644:
 #0: ffffffff8b339e60 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x53/0x260 kernel/locking/lockdep.c:6254
1 lock held by in:imklog/8173:
 #0: ffff8880209594f0 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0xe9/0x100 fs/file.c:932
2 locks held by kworker/u4:0/8560:
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: atomic64_set include/asm-generic/atomic-instrumented.h:856 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: atomic_long_set include/asm-generic/atomic-long.h:41 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:616 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:643 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x821/0x15a0 kernel/workqueue.c:2243
 #1: ffffc900018efda8 ((work_completion)(&ctx->exit_work)){+.+.}-{0:0}, at: process_one_work+0x854/0x15a0 kernel/workqueue.c:2247
2 locks held by kworker/u4:11/28297:
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: atomic64_set include/asm-generic/atomic-instrumented.h:856 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: atomic_long_set include/asm-generic/atomic-long.h:41 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:616 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:643 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x821/0x15a0 kernel/workqueue.c:2243
 #1: ffffc90003797da8 ((work_completion)(&ctx->exit_work)){+.+.}-{0:0}, at: process_one_work+0x854/0x15a0 kernel/workqueue.c:2247
2 locks held by kworker/u4:13/28309:
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: atomic64_set include/asm-generic/atomic-instrumented.h:856 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: atomic_long_set include/asm-generic/atomic-long.h:41 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:616 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:643 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x821/0x15a0 kernel/workqueue.c:2243
 #1: ffffc900037b7da8 ((work_completion)(&ctx->exit_work)){+.+.}-{0:0}, at: process_one_work+0x854/0x15a0 kernel/workqueue.c:2247
2 locks held by kworker/u4:15/28331:
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: atomic64_set include/asm-generic/atomic-instrumented.h:856 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: atomic_long_set include/asm-generic/atomic-long.h:41 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:616 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:643 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x821/0x15a0 kernel/workqueue.c:2243
 #1: ffffc90003567da8 ((work_completion)(&ctx->exit_work)){+.+.}-{0:0}, at: process_one_work+0x854/0x15a0 kernel/workqueue.c:2247
2 locks held by kworker/u4:16/28341:
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: atomic64_set include/asm-generic/atomic-instrumented.h:856 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: atomic_long_set include/asm-generic/atomic-long.h:41 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:616 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:643 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x821/0x15a0 kernel/workqueue.c:2243
 #1: ffffc90003987da8 ((work_completion)(&ctx->exit_work)){+.+.}-{0:0}, at: process_one_work+0x854/0x15a0 kernel/workqueue.c:2247
2 locks held by kworker/u4:17/28345:
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: atomic64_set include/asm-generic/atomic-instrumented.h:856 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: atomic_long_set include/asm-generic/atomic-long.h:41 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:616 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:643 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x821/0x15a0 kernel/workqueue.c:2243
 #1: ffffc900038f7da8 ((work_completion)(&ctx->exit_work)){+.+.}-{0:0}, at: process_one_work+0x854/0x15a0 kernel/workqueue.c:2247
2 locks held by kworker/u4:18/28349:
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: atomic64_set include/asm-generic/atomic-instrumented.h:856 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: atomic_long_set include/asm-generic/atomic-long.h:41 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:616 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:643 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x821/0x15a0 kernel/workqueue.c:2243
 #1: ffffc900039b7da8 ((work_completion)(&ctx->exit_work)){+.+.}-{0:0}, at: process_one_work+0x854/0x15a0 kernel/workqueue.c:2247
2 locks held by kworker/u4:19/28360:
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: atomic64_set include/asm-generic/atomic-instrumented.h:856 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: atomic_long_set include/asm-generic/atomic-long.h:41 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:616 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:643 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x821/0x15a0 kernel/workqueue.c:2243
 #1: ffffc900038e7da8 ((work_completion)(&ctx->exit_work)){+.+.}-{0:0}, at: process_one_work+0x854/0x15a0 kernel/workqueue.c:2247
2 locks held by kworker/u4:20/28362:
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: atomic64_set include/asm-generic/atomic-instrumented.h:856 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: atomic_long_set include/asm-generic/atomic-long.h:41 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:616 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:643 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x821/0x15a0 kernel/workqueue.c:2243
 #1: ffffc900039d7da8 ((work_completion)(&ctx->exit_work)){+.+.}-{0:0}, at: process_one_work+0x854/0x15a0 kernel/workqueue.c:2247
2 locks held by kworker/u4:21/28363:
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: atomic64_set include/asm-generic/atomic-instrumented.h:856 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: atomic_long_set include/asm-generic/atomic-long.h:41 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:616 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:643 [inline]
 #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x821/0x15a0 kernel/workqueue.c:2243
 #1: ffffc90003787da8 ((work_completion)(&ctx->exit_work)){+.+.}-{0:0}, at: process_one_work+0x854/0x15a0 kernel/workqueue.c:2247
3 locks held by syz-executor136/28466:

=============================================

NMI backtrace for cpu 1
CPU: 1 PID: 1644 Comm: khungtaskd Not tainted 5.10.0-rc4-next-20201123-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x107/0x163 lib/dump_stack.c:120
 nmi_cpu_backtrace.cold+0x44/0xd7 lib/nmi_backtrace.c:105
 nmi_trigger_cpumask_backtrace+0x1b3/0x230 lib/nmi_backtrace.c:62
 trigger_all_cpu_backtrace include/linux/nmi.h:147 [inline]
 check_hung_uninterruptible_tasks kernel/hung_task.c:253 [inline]
 watchdog+0xd89/0xf30 kernel/hung_task.c:338
 kthread+0x3b1/0x4a0 kernel/kthread.c:292
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296
Sending NMI from CPU 1 to CPUs 0:
NMI backtrace for cpu 0
CPU: 0 PID: 28469 Comm: io_wqe_worker-0 Not tainted 5.10.0-rc4-next-20201123-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:get_current arch/x86/include/asm/current.h:15 [inline]
RIP: 0010:__sanitizer_cov_trace_pc+0x0/0x60 kernel/kcov.c:196
Code: fc ff ff 48 c7 c7 c0 01 39 8b 48 89 54 24 08 48 89 34 24 e8 d2 7a 60 02 48 8b 54 24 08 48 8b 34 24 e9 a1 fd ff ff 0f 1f 40 00 <65> 48 8b 14 25 00 f0 01 00 65 8b 05 90 64 91 7e a9 00 01 ff 00 48
RSP: 0018:ffffc90003e3fc48 EFLAGS: 00000046
RAX: 0000000000000000 RBX: ffffc90003e3fd58 RCX: ffffffff81600e1b
RDX: 0000000000000000 RSI: ffff888147e63580 RDI: 0000000000000001
RBP: 0000000000000000 R08: 0000000000000000 R09: ffffffff8ced564f
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
R13: ffffc90003e3fd60 R14: 0000000000000293 R15: ffff888147e1c800
FS:  0000000000000000(0000) GS:ffff8880b9e00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000002001d06c CR3: 0000000013499000 CR4: 00000000001506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 __hlist_del include/linux/list.h:835 [inline]
 detach_timer kernel/time/timer.c:825 [inline]
 detach_if_pending+0xee/0x470 kernel/time/timer.c:844
 try_to_del_timer_sync+0xaa/0x110 kernel/time/timer.c:1232
 del_timer_sync+0xf5/0x160 kernel/time/timer.c:1378
 schedule_timeout+0x150/0x250 kernel/time/timer.c:1879
 io_wqe_worker+0x5b4/0x10e0 fs/io-wq.c:625
 kthread+0x3b1/0x4a0 kernel/kthread.c:292
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2020-11-24  9:39 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2020-11-21 22:35 INFO: task hung in __io_uring_files_cancel syzbot
     [not found] ` <[email protected]>
2020-11-22  3:32   ` Pavel Begunkov
     [not found]   ` <[email protected]>
2020-11-22 15:57     ` Pavel Begunkov
     [not found]     ` <[email protected]>
2020-11-23 20:25       ` Pavel Begunkov
2020-11-24  9:39 ` syzbot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox