* INFO: task hung in io_queue_file_removal
@ 2020-03-21 5:50 syzbot
0 siblings, 0 replies; 3+ messages in thread
From: syzbot @ 2020-03-21 5:50 UTC (permalink / raw)
To: axboe, io-uring, linux-fsdevel, linux-kernel, syzkaller-bugs,
viro
Hello,
syzbot found the following crash on:
HEAD commit: cd607737 Merge tag '5.6-rc6-smb3-fixes' of git://git.samba..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1730c023e00000
kernel config: https://syzkaller.appspot.com/x/.config?x=9f894bd92023de02
dashboard link: https://syzkaller.appspot.com/bug?extid=538d1957ce178382a394
compiler: gcc (GCC) 9.0.0 20181231 (experimental)
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=108ebbe3e00000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=139fb973e00000
The bug was bisected to:
commit 05f3fb3c5397524feae2e73ee8e150a9090a7da2
Author: Jens Axboe <[email protected]>
Date: Mon Dec 9 18:22:50 2019 +0000
io_uring: avoid ring quiesce for fixed file set unregister and update
bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=1237ad73e00000
final crash: https://syzkaller.appspot.com/x/report.txt?x=1137ad73e00000
console output: https://syzkaller.appspot.com/x/log.txt?x=1637ad73e00000
IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: [email protected]
Fixes: 05f3fb3c5397 ("io_uring: avoid ring quiesce for fixed file set unregister and update")
INFO: task syz-executor975:9880 blocked for more than 143 seconds.
Not tainted 5.6.0-rc6-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor975 D27576 9880 9878 0x80004000
Call Trace:
schedule+0xd0/0x2a0 kernel/sched/core.c:4154
schedule_timeout+0x6db/0xba0 kernel/time/timer.c:1871
do_wait_for_common kernel/sched/completion.c:83 [inline]
__wait_for_common kernel/sched/completion.c:104 [inline]
wait_for_common kernel/sched/completion.c:115 [inline]
wait_for_completion+0x26a/0x3c0 kernel/sched/completion.c:136
io_queue_file_removal+0x1af/0x1e0 fs/io_uring.c:5826
__io_sqe_files_update.isra.0+0x3a1/0xb00 fs/io_uring.c:5867
io_sqe_files_update fs/io_uring.c:5918 [inline]
__io_uring_register+0x377/0x2c00 fs/io_uring.c:7131
__do_sys_io_uring_register fs/io_uring.c:7202 [inline]
__se_sys_io_uring_register fs/io_uring.c:7184 [inline]
__x64_sys_io_uring_register+0x192/0x560 fs/io_uring.c:7184
do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:294
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x440659
Code: Bad RIP value.
RSP: 002b:00007ffc4689a358 EFLAGS: 00000246 ORIG_RAX: 00000000000001ab
RAX: ffffffffffffffda RBX: 00007ffc4689a360 RCX: 0000000000440659
RDX: 0000000020000300 RSI: 0000000000000006 RDI: 0000000000000003
RBP: 0000000000000005 R08: 0000000000000001 R09: 00007ffc46890031
R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000401f40
R13: 0000000000401fd0 R14: 0000000000000000 R15: 0000000000000000
Showing all locks held in the system:
1 lock held by khungtaskd/1137:
#0: ffffffff897accc0 (rcu_read_lock){....}, at: debug_show_all_locks+0x53/0x260 kernel/locking/lockdep.c:5331
1 lock held by rsyslogd/9761:
#0: ffff8880a8f3ada0 (&f->f_pos_lock){+.+.}, at: __fdget_pos+0xe3/0x100 fs/file.c:821
2 locks held by getty/9850:
#0: ffff88809fad3090 (&tty->ldisc_sem){++++}, at: tty_ldisc_ref_wait+0x22/0x80 drivers/tty/tty_ldisc.c:267
#1: ffffc900017bb2e0 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x21d/0x1b30 drivers/tty/n_tty.c:2156
2 locks held by getty/9851:
#0: ffff8880a7b96090 (&tty->ldisc_sem){++++}, at: tty_ldisc_ref_wait+0x22/0x80 drivers/tty/tty_ldisc.c:267
#1: ffffc900017cb2e0 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x21d/0x1b30 drivers/tty/n_tty.c:2156
2 locks held by getty/9852:
#0: ffff88809e41c090 (&tty->ldisc_sem){++++}, at: tty_ldisc_ref_wait+0x22/0x80 drivers/tty/tty_ldisc.c:267
#1: ffffc900017eb2e0 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x21d/0x1b30 drivers/tty/n_tty.c:2156
2 locks held by getty/9853:
#0: ffff888090392090 (&tty->ldisc_sem){++++}, at: tty_ldisc_ref_wait+0x22/0x80 drivers/tty/tty_ldisc.c:267
#1: ffffc900017ab2e0 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x21d/0x1b30 drivers/tty/n_tty.c:2156
2 locks held by getty/9854:
#0: ffff88809fb1b090 (&tty->ldisc_sem){++++}, at: tty_ldisc_ref_wait+0x22/0x80 drivers/tty/tty_ldisc.c:267
#1: ffffc900017db2e0 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x21d/0x1b30 drivers/tty/n_tty.c:2156
2 locks held by getty/9855:
#0: ffff88809a302090 (&tty->ldisc_sem){++++}, at: tty_ldisc_ref_wait+0x22/0x80 drivers/tty/tty_ldisc.c:267
#1: ffffc9000178b2e0 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x21d/0x1b30 drivers/tty/n_tty.c:2156
2 locks held by getty/9856:
#0: ffff88809d9dc090 (&tty->ldisc_sem){++++}, at: tty_ldisc_ref_wait+0x22/0x80 drivers/tty/tty_ldisc.c:267
#1: ffffc9000172b2e0 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x21d/0x1b30 drivers/tty/n_tty.c:2156
1 lock held by syz-executor975/9880:
#0: ffff88808f392320 (&ctx->uring_lock){+.+.}, at: __do_sys_io_uring_register fs/io_uring.c:7201 [inline]
#0: ffff88808f392320 (&ctx->uring_lock){+.+.}, at: __se_sys_io_uring_register fs/io_uring.c:7184 [inline]
#0: ffff88808f392320 (&ctx->uring_lock){+.+.}, at: __x64_sys_io_uring_register+0x181/0x560 fs/io_uring.c:7184
=============================================
NMI backtrace for cpu 1
CPU: 1 PID: 1137 Comm: khungtaskd Not tainted 5.6.0-rc6-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x188/0x20d lib/dump_stack.c:118
nmi_cpu_backtrace.cold+0x70/0xb1 lib/nmi_backtrace.c:101
nmi_trigger_cpumask_backtrace+0x231/0x27e lib/nmi_backtrace.c:62
trigger_all_cpu_backtrace include/linux/nmi.h:146 [inline]
check_hung_uninterruptible_tasks kernel/hung_task.c:205 [inline]
watchdog+0xa8c/0x1010 kernel/hung_task.c:289
kthread+0x357/0x430 kernel/kthread.c:255
ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
Sending NMI from CPU 1 to CPUs 0:
NMI backtrace for cpu 0 skipped: idling at native_safe_halt+0xe/0x10 arch/x86/include/asm/irqflags.h:60
---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at [email protected].
syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: INFO: task hung in io_queue_file_removal
[not found] <[email protected]>
@ 2020-03-21 20:03 ` Jens Axboe
[not found] ` <[email protected]>
1 sibling, 0 replies; 3+ messages in thread
From: Jens Axboe @ 2020-03-21 20:03 UTC (permalink / raw)
To: Hillf Danton, syzbot
Cc: io-uring, linux-fsdevel, linux-kernel, syzkaller-bugs, viro
On 3/21/20 6:38 AM, Hillf Danton wrote:
>
> On Fri, 20 Mar 2020 22:50:12 -0700
>> syzbot found the following crash on:
>>
>> HEAD commit: cd607737 Merge tag '5.6-rc6-smb3-fixes' of git://git.samba..
>> git tree: upstream
>> console output: https://syzkaller.appspot.com/x/log.txt?x=1730c023e00000
>> kernel config: https://syzkaller.appspot.com/x/.config?x=9f894bd92023de02
>> dashboard link: https://syzkaller.appspot.com/bug?extid=538d1957ce178382a394
>> compiler: gcc (GCC) 9.0.0 20181231 (experimental)
>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=108ebbe3e00000
>> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=139fb973e00000
>>
>> The bug was bisected to:
>>
>> commit 05f3fb3c5397524feae2e73ee8e150a9090a7da2
>> Author: Jens Axboe <[email protected]>
>> Date: Mon Dec 9 18:22:50 2019 +0000
>>
>> io_uring: avoid ring quiesce for fixed file set unregister and update
>>
>> bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=1237ad73e00000
>> final crash: https://syzkaller.appspot.com/x/report.txt?x=1137ad73e00000
>> console output: https://syzkaller.appspot.com/x/log.txt?x=1637ad73e00000
>>
>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>> Reported-by: [email protected]
>> Fixes: 05f3fb3c5397 ("io_uring: avoid ring quiesce for fixed file set unregister and update")
>>
>> INFO: task syz-executor975:9880 blocked for more than 143 seconds.
>> Not tainted 5.6.0-rc6-syzkaller #0
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> syz-executor975 D27576 9880 9878 0x80004000
>> Call Trace:
>> schedule+0xd0/0x2a0 kernel/sched/core.c:4154
>> schedule_timeout+0x6db/0xba0 kernel/time/timer.c:1871
>> do_wait_for_common kernel/sched/completion.c:83 [inline]
>> __wait_for_common kernel/sched/completion.c:104 [inline]
>> wait_for_common kernel/sched/completion.c:115 [inline]
>> wait_for_completion+0x26a/0x3c0 kernel/sched/completion.c:136
>> io_queue_file_removal+0x1af/0x1e0 fs/io_uring.c:5826
>> __io_sqe_files_update.isra.0+0x3a1/0xb00 fs/io_uring.c:5867
>> io_sqe_files_update fs/io_uring.c:5918 [inline]
>> __io_uring_register+0x377/0x2c00 fs/io_uring.c:7131
>> __do_sys_io_uring_register fs/io_uring.c:7202 [inline]
>> __se_sys_io_uring_register fs/io_uring.c:7184 [inline]
>> __x64_sys_io_uring_register+0x192/0x560 fs/io_uring.c:7184
>> do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:294
>> entry_SYSCALL_64_after_hwframe+0x49/0xbe
>> RIP: 0033:0x440659
>> Code: Bad RIP value.
>> RSP: 002b:00007ffc4689a358 EFLAGS: 00000246 ORIG_RAX: 00000000000001ab
>> RAX: ffffffffffffffda RBX: 00007ffc4689a360 RCX: 0000000000440659
>> RDX: 0000000020000300 RSI: 0000000000000006 RDI: 0000000000000003
>> RBP: 0000000000000005 R08: 0000000000000001 R09: 00007ffc46890031
>> R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000401f40
>> R13: 0000000000401fd0 R14: 0000000000000000 R15: 0000000000000000
>>
>> Showing all locks held in the system:
>> 1 lock held by khungtaskd/1137:
>> #0: ffffffff897accc0 (rcu_read_lock){....}, at: debug_show_all_locks+0x53/0x260 kernel/locking/lockdep.c:5331
>> 1 lock held by rsyslogd/9761:
>> #0: ffff8880a8f3ada0 (&f->f_pos_lock){+.+.}, at: __fdget_pos+0xe3/0x100 fs/file.c:821
>> 2 locks held by getty/9850:
>> #0: ffff88809fad3090 (&tty->ldisc_sem){++++}, at: tty_ldisc_ref_wait+0x22/0x80 drivers/tty/tty_ldisc.c:267
>> #1: ffffc900017bb2e0 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x21d/0x1b30 drivers/tty/n_tty.c:2156
>> 2 locks held by getty/9851:
>> #0: ffff8880a7b96090 (&tty->ldisc_sem){++++}, at: tty_ldisc_ref_wait+0x22/0x80 drivers/tty/tty_ldisc.c:267
>> #1: ffffc900017cb2e0 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x21d/0x1b30 drivers/tty/n_tty.c:2156
>> 2 locks held by getty/9852:
>> #0: ffff88809e41c090 (&tty->ldisc_sem){++++}, at: tty_ldisc_ref_wait+0x22/0x80 drivers/tty/tty_ldisc.c:267
>> #1: ffffc900017eb2e0 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x21d/0x1b30 drivers/tty/n_tty.c:2156
>> 2 locks held by getty/9853:
>> #0: ffff888090392090 (&tty->ldisc_sem){++++}, at: tty_ldisc_ref_wait+0x22/0x80 drivers/tty/tty_ldisc.c:267
>> #1: ffffc900017ab2e0 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x21d/0x1b30 drivers/tty/n_tty.c:2156
>> 2 locks held by getty/9854:
>> #0: ffff88809fb1b090 (&tty->ldisc_sem){++++}, at: tty_ldisc_ref_wait+0x22/0x80 drivers/tty/tty_ldisc.c:267
>> #1: ffffc900017db2e0 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x21d/0x1b30 drivers/tty/n_tty.c:2156
>> 2 locks held by getty/9855:
>> #0: ffff88809a302090 (&tty->ldisc_sem){++++}, at: tty_ldisc_ref_wait+0x22/0x80 drivers/tty/tty_ldisc.c:267
>> #1: ffffc9000178b2e0 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x21d/0x1b30 drivers/tty/n_tty.c:2156
>> 2 locks held by getty/9856:
>> #0: ffff88809d9dc090 (&tty->ldisc_sem){++++}, at: tty_ldisc_ref_wait+0x22/0x80 drivers/tty/tty_ldisc.c:267
>> #1: ffffc9000172b2e0 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x21d/0x1b30 drivers/tty/n_tty.c:2156
>> 1 lock held by syz-executor975/9880:
>> #0: ffff88808f392320 (&ctx->uring_lock){+.+.}, at: __do_sys_io_uring_register fs/io_uring.c:7201 [inline]
>> #0: ffff88808f392320 (&ctx->uring_lock){+.+.}, at: __se_sys_io_uring_register fs/io_uring.c:7184 [inline]
>> #0: ffff88808f392320 (&ctx->uring_lock){+.+.}, at: __x64_sys_io_uring_register+0x181/0x560 fs/io_uring.c:7184
>>
>> =============================================
>>
>> NMI backtrace for cpu 1
>> CPU: 1 PID: 1137 Comm: khungtaskd Not tainted 5.6.0-rc6-syzkaller #0
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
>> Call Trace:
>> __dump_stack lib/dump_stack.c:77 [inline]
>> dump_stack+0x188/0x20d lib/dump_stack.c:118
>> nmi_cpu_backtrace.cold+0x70/0xb1 lib/nmi_backtrace.c:101
>> nmi_trigger_cpumask_backtrace+0x231/0x27e lib/nmi_backtrace.c:62
>> trigger_all_cpu_backtrace include/linux/nmi.h:146 [inline]
>> check_hung_uninterruptible_tasks kernel/hung_task.c:205 [inline]
>> watchdog+0xa8c/0x1010 kernel/hung_task.c:289
>> kthread+0x357/0x430 kernel/kthread.c:255
>> ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
>> Sending NMI from CPU 1 to CPUs 0:
>> NMI backtrace for cpu 0 skipped: idling at native_safe_halt+0xe/0x10 arch/x86/include/asm/irqflags.h:60
>
> Flush work before waiting for completion.
>
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -5823,8 +5823,8 @@ static bool io_queue_file_removal(struct
>
> if (pfile == &pfile_stack) {
> percpu_ref_switch_to_atomic(&data->refs, io_atomic_switch);
> - wait_for_completion(&done);
> flush_work(&data->ref_work);
> + wait_for_completion(&done);
> return false;
> }
>
> --
>
> And perhaps a tiny cleanup: no deed to wait for completion as
> flushing work itself will wait until the work is done.
Care to send this version as a real patch? Seems kind of pointless to
just do the above change with that in mind. And then at the same time
turn ->done into ->do_file_put or something, and make it a bool.
--
Jens Axboe
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: INFO: task hung in io_queue_file_removal
[not found] ` <[email protected]>
@ 2020-03-22 20:17 ` Jens Axboe
0 siblings, 0 replies; 3+ messages in thread
From: Jens Axboe @ 2020-03-22 20:17 UTC (permalink / raw)
To: Hillf Danton
Cc: syzbot, io-uring, linux-fsdevel, linux-kernel, syzkaller-bugs,
viro
On 3/21/20 8:06 PM, Hillf Danton wrote:
>
> On Sat, 21 Mar 2020 14:03:24 -0600 Jens Axboe wrote:
>>
>> On 3/21/20 6:38 AM, Hillf Danton wrote:
>>>
>>> Flush work before waiting for completion.
>>>
>>> --- a/fs/io_uring.c
>>> +++ b/fs/io_uring.c
>>> @@ -5823,8 +5823,8 @@ static bool io_queue_file_removal(struct
>>>
>>> if (pfile == &pfile_stack) {
>>> percpu_ref_switch_to_atomic(&data->refs, io_atomic_switch);
>>> - wait_for_completion(&done);
>>> flush_work(&data->ref_work);
>>> + wait_for_completion(&done);
>>> return false;
>>> }
>>>
>>> --
>>>
>>> And perhaps a tiny cleanup: no deed to wait for completion as
>>> flushing work itself will wait until the work is done.
>>
>> Care to send this version as a real patch? Seems kind of pointless to
>> just do the above change with that in mind. And then at the same time
>> turn ->done into ->do_file_put or something, and make it a bool.
>
> Have trouble making a patch with the ideas in your mind all folded in so
> it may be better that you do it this time leaving me a chance to learn
> a lesson.
Maybe my explanation wasn't quite clear! What I meant was that since
we're no longer using pfile->done, turn that ->done into a ->needs_kfree
or something, and make that a bool. So basically the same patch as the
one you posted, just making that naming (and type) change as well.
Does that help? Would prefer if you sent a patch, you already did
99% of the hard work.
--
Jens Axboe
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2020-03-22 20:17 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2020-03-21 5:50 INFO: task hung in io_queue_file_removal syzbot
[not found] <[email protected]>
2020-03-21 20:03 ` Jens Axboe
[not found] ` <[email protected]>
2020-03-22 20:17 ` Jens Axboe
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox