* INFO: rcu detected stall in io_uring_release
@ 2020-04-18 18:59 syzbot
[not found] ` <[email protected]>
2020-05-12 2:17 ` syzbot
0 siblings, 2 replies; 6+ messages in thread
From: syzbot @ 2020-04-18 18:59 UTC (permalink / raw)
To: axboe, io-uring, linux-fsdevel, linux-kernel, syzkaller-bugs,
viro
Hello,
syzbot found the following crash on:
HEAD commit: 8f3d9f35 Linux 5.7-rc1
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=115720c3e00000
kernel config: https://syzkaller.appspot.com/x/.config?x=5d351a1019ed81a2
dashboard link: https://syzkaller.appspot.com/bug?extid=66243bb7126c410cefe6
compiler: gcc (GCC) 9.0.0 20181231 (experimental)
Unfortunately, I don't have any reproducer for this crash yet.
IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: [email protected]
rcu: INFO: rcu_preempt self-detected stall on CPU
rcu: 0-....: (10500 ticks this GP) idle=57e/1/0x4000000000000002 softirq=44329/44329 fqs=5245
(t=10502 jiffies g=79401 q=2096)
NMI backtrace for cpu 0
CPU: 0 PID: 23184 Comm: syz-executor.5 Not tainted 5.7.0-rc1-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x188/0x20d lib/dump_stack.c:118
nmi_cpu_backtrace.cold+0x70/0xb1 lib/nmi_backtrace.c:101
nmi_trigger_cpumask_backtrace+0x231/0x27e lib/nmi_backtrace.c:62
trigger_single_cpu_backtrace include/linux/nmi.h:164 [inline]
rcu_dump_cpu_stacks+0x19b/0x1e5 kernel/rcu/tree_stall.h:254
print_cpu_stall kernel/rcu/tree_stall.h:475 [inline]
check_cpu_stall kernel/rcu/tree_stall.h:549 [inline]
rcu_pending kernel/rcu/tree.c:3225 [inline]
rcu_sched_clock_irq.cold+0x55d/0xcfa kernel/rcu/tree.c:2296
update_process_times+0x25/0x60 kernel/time/timer.c:1727
tick_sched_handle+0x9b/0x180 kernel/time/tick-sched.c:176
tick_sched_timer+0x4e/0x140 kernel/time/tick-sched.c:1320
__run_hrtimer kernel/time/hrtimer.c:1520 [inline]
__hrtimer_run_queues+0x5ca/0xed0 kernel/time/hrtimer.c:1584
hrtimer_interrupt+0x312/0x770 kernel/time/hrtimer.c:1646
local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1113 [inline]
smp_apic_timer_interrupt+0x15b/0x600 arch/x86/kernel/apic/apic.c:1138
apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:829
</IRQ>
RIP: 0010:io_ring_ctx_wait_and_kill+0x98/0x5a0 fs/io_uring.c:7301
Code: 01 00 00 4d 89 f4 48 b8 00 00 00 00 00 fc ff df 4c 89 ed 49 c1 ec 03 48 c1 ed 03 49 01 c4 48 01 c5 eb 1c e8 3a ea 9d ff f3 90 <41> 80 3c 24 00 0f 85 53 04 00 00 48 83 bb 10 01 00 00 00 74 21 e8
RSP: 0018:ffffc9000897fdf0 EFLAGS: 00000293 ORIG_RAX: ffffffffffffff13
RAX: ffff888024082080 RBX: ffff88808df8e000 RCX: 1ffff9200112ffab
RDX: 0000000000000000 RSI: ffffffff81d549c6 RDI: ffff88808df8e300
RBP: ffffed1011bf1c2c R08: 0000000000000001 R09: ffffed1011bf1c61
R10: ffff88808df8e307 R11: ffffed1011bf1c60 R12: ffffed1011bf1c22
R13: ffff88808df8e160 R14: ffff88808df8e110 R15: ffffffff81d54ed0
io_uring_release+0x3e/0x50 fs/io_uring.c:7324
__fput+0x33e/0x880 fs/file_table.c:280
task_work_run+0xf4/0x1b0 kernel/task_work.c:123
tracehook_notify_resume include/linux/tracehook.h:188 [inline]
exit_to_usermode_loop+0x2fa/0x360 arch/x86/entry/common.c:165
prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
syscall_return_slowpath arch/x86/entry/common.c:279 [inline]
do_syscall_64+0x6b1/0x7d0 arch/x86/entry/common.c:305
entry_SYSCALL_64_after_hwframe+0x49/0xb3
RIP: 0033:0x416421
Code: 75 14 b8 03 00 00 00 0f 05 48 3d 01 f0 ff ff 0f 83 04 1b 00 00 c3 48 83 ec 08 e8 0a fc ff ff 48 89 04 24 b8 03 00 00 00 0f 05 <48> 8b 3c 24 48 89 c2 e8 53 fc ff ff 48 89 d0 48 83 c4 08 48 3d 01
RSP: 002b:00007ffc3c9f63d0 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
RAX: 0000000000000000 RBX: 0000000000000004 RCX: 0000000000416421
RDX: 00000000000f4240 RSI: 0000000000000081 RDI: 0000000000000003
RBP: 0000000000000000 R08: 0000000000000000 R09: 01ffffffffffffff
R10: 0000000000770b20 R11: 0000000000000293 R12: 000000000076bfa0
R13: 0000000000770b30 R14: 0000000000000001 R15: 000000000076bfac
---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at [email protected].
syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
^ permalink raw reply [flat|nested] 6+ messages in thread
[parent not found: <[email protected]>]
* Re: INFO: rcu detected stall in io_uring_release [not found] ` <[email protected]> @ 2020-04-19 19:57 ` Jens Axboe 2020-04-20 11:47 ` Dan Carpenter 1 sibling, 0 replies; 6+ messages in thread From: Jens Axboe @ 2020-04-19 19:57 UTC (permalink / raw) To: Hillf Danton, syzbot Cc: io-uring, linux-fsdevel, linux-kernel, syzkaller-bugs, viro On 4/18/20 10:06 PM, Hillf Danton wrote: > > Sat, 18 Apr 2020 11:59:13 -0700 >> >> syzbot found the following crash on: >> >> HEAD commit: 8f3d9f35 Linux 5.7-rc1 >> git tree: upstream >> console output: https://syzkaller.appspot.com/x/log.txt?x=115720c3e00000 >> kernel config: https://syzkaller.appspot.com/x/.config?x=5d351a1019ed81a2 >> dashboard link: https://syzkaller.appspot.com/bug?extid=66243bb7126c410cefe6 >> compiler: gcc (GCC) 9.0.0 20181231 (experimental) >> >> Unfortunately, I don't have any reproducer for this crash yet. >> >> IMPORTANT: if you fix the bug, please add the following tag to the commit: >> Reported-by: [email protected] >> >> rcu: INFO: rcu_preempt self-detected stall on CPU >> rcu: 0-....: (10500 ticks this GP) idle=57e/1/0x4000000000000002 softirq=44329/44329 fqs=5245 >> (t=10502 jiffies g=79401 q=2096) >> NMI backtrace for cpu 0 >> CPU: 0 PID: 23184 Comm: syz-executor.5 Not tainted 5.7.0-rc1-syzkaller #0 >> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 >> Call Trace: >> <IRQ> >> __dump_stack lib/dump_stack.c:77 [inline] >> dump_stack+0x188/0x20d lib/dump_stack.c:118 >> nmi_cpu_backtrace.cold+0x70/0xb1 lib/nmi_backtrace.c:101 >> nmi_trigger_cpumask_backtrace+0x231/0x27e lib/nmi_backtrace.c:62 >> trigger_single_cpu_backtrace include/linux/nmi.h:164 [inline] >> rcu_dump_cpu_stacks+0x19b/0x1e5 kernel/rcu/tree_stall.h:254 >> print_cpu_stall kernel/rcu/tree_stall.h:475 [inline] >> check_cpu_stall kernel/rcu/tree_stall.h:549 [inline] >> rcu_pending kernel/rcu/tree.c:3225 [inline] >> rcu_sched_clock_irq.cold+0x55d/0xcfa kernel/rcu/tree.c:2296 >> update_process_times+0x25/0x60 kernel/time/timer.c:1727 >> tick_sched_handle+0x9b/0x180 kernel/time/tick-sched.c:176 >> tick_sched_timer+0x4e/0x140 kernel/time/tick-sched.c:1320 >> __run_hrtimer kernel/time/hrtimer.c:1520 [inline] >> __hrtimer_run_queues+0x5ca/0xed0 kernel/time/hrtimer.c:1584 >> hrtimer_interrupt+0x312/0x770 kernel/time/hrtimer.c:1646 >> local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1113 [inline] >> smp_apic_timer_interrupt+0x15b/0x600 arch/x86/kernel/apic/apic.c:1138 >> apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:829 >> </IRQ> >> RIP: 0010:io_ring_ctx_wait_and_kill+0x98/0x5a0 fs/io_uring.c:7301 >> Code: 01 00 00 4d 89 f4 48 b8 00 00 00 00 00 fc ff df 4c 89 ed 49 c1 ec 03 48 c1 ed 03 49 01 c4 48 01 c5 eb 1c e8 3a ea 9d ff f3 90 <41> 80 3c 24 00 0f 85 53 04 00 00 48 83 bb 10 01 00 00 00 74 21 e8 >> RSP: 0018:ffffc9000897fdf0 EFLAGS: 00000293 ORIG_RAX: ffffffffffffff13 >> RAX: ffff888024082080 RBX: ffff88808df8e000 RCX: 1ffff9200112ffab >> RDX: 0000000000000000 RSI: ffffffff81d549c6 RDI: ffff88808df8e300 >> RBP: ffffed1011bf1c2c R08: 0000000000000001 R09: ffffed1011bf1c61 >> R10: ffff88808df8e307 R11: ffffed1011bf1c60 R12: ffffed1011bf1c22 >> R13: ffff88808df8e160 R14: ffff88808df8e110 R15: ffffffff81d54ed0 >> io_uring_release+0x3e/0x50 fs/io_uring.c:7324 >> __fput+0x33e/0x880 fs/file_table.c:280 >> task_work_run+0xf4/0x1b0 kernel/task_work.c:123 >> tracehook_notify_resume include/linux/tracehook.h:188 [inline] >> exit_to_usermode_loop+0x2fa/0x360 arch/x86/entry/common.c:165 >> prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline] >> syscall_return_slowpath arch/x86/entry/common.c:279 [inline] >> do_syscall_64+0x6b1/0x7d0 arch/x86/entry/common.c:305 >> entry_SYSCALL_64_after_hwframe+0x49/0xb3 > > Make io ring ctx's percpu_ref balanced. > > --- a/fs/io_uring.c > +++ b/fs/io_uring.c > @@ -5904,6 +5904,7 @@ static int io_submit_sqes(struct io_ring > fail_req: > io_cqring_add_event(req, err); > io_double_put_req(req); > + --submitted; > break; > } Not sure this is right, need to look closer. But if we post a completion event, the event has been consumed and should be accounted as such. -- Jens Axboe ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: INFO: rcu detected stall in io_uring_release [not found] ` <[email protected]> 2020-04-19 19:57 ` Jens Axboe @ 2020-04-20 11:47 ` Dan Carpenter 2020-04-20 12:57 ` Pavel Begunkov 1 sibling, 1 reply; 6+ messages in thread From: Dan Carpenter @ 2020-04-20 11:47 UTC (permalink / raw) To: Hillf Danton Cc: syzbot, axboe, io-uring, linux-fsdevel, linux-kernel, syzkaller-bugs, viro On Sun, Apr 19, 2020 at 12:06:26PM +0800, Hillf Danton wrote: > > Sat, 18 Apr 2020 11:59:13 -0700 > > > > syzbot found the following crash on: > > > > HEAD commit: 8f3d9f35 Linux 5.7-rc1 > > git tree: upstream > > console output: https://syzkaller.appspot.com/x/log.txt?x=115720c3e00000 > > kernel config: https://syzkaller.appspot.com/x/.config?x=5d351a1019ed81a2 > > dashboard link: https://syzkaller.appspot.com/bug?extid=66243bb7126c410cefe6 > > compiler: gcc (GCC) 9.0.0 20181231 (experimental) > > > > Unfortunately, I don't have any reproducer for this crash yet. > > > > IMPORTANT: if you fix the bug, please add the following tag to the commit: > > Reported-by: [email protected] > > > > rcu: INFO: rcu_preempt self-detected stall on CPU > > rcu: 0-....: (10500 ticks this GP) idle=57e/1/0x4000000000000002 softirq=44329/44329 fqs=5245 > > (t=10502 jiffies g=79401 q=2096) > > NMI backtrace for cpu 0 > > CPU: 0 PID: 23184 Comm: syz-executor.5 Not tainted 5.7.0-rc1-syzkaller #0 > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 > > Call Trace: > > <IRQ> > > __dump_stack lib/dump_stack.c:77 [inline] > > dump_stack+0x188/0x20d lib/dump_stack.c:118 > > nmi_cpu_backtrace.cold+0x70/0xb1 lib/nmi_backtrace.c:101 > > nmi_trigger_cpumask_backtrace+0x231/0x27e lib/nmi_backtrace.c:62 > > trigger_single_cpu_backtrace include/linux/nmi.h:164 [inline] > > rcu_dump_cpu_stacks+0x19b/0x1e5 kernel/rcu/tree_stall.h:254 > > print_cpu_stall kernel/rcu/tree_stall.h:475 [inline] > > check_cpu_stall kernel/rcu/tree_stall.h:549 [inline] > > rcu_pending kernel/rcu/tree.c:3225 [inline] > > rcu_sched_clock_irq.cold+0x55d/0xcfa kernel/rcu/tree.c:2296 > > update_process_times+0x25/0x60 kernel/time/timer.c:1727 > > tick_sched_handle+0x9b/0x180 kernel/time/tick-sched.c:176 > > tick_sched_timer+0x4e/0x140 kernel/time/tick-sched.c:1320 > > __run_hrtimer kernel/time/hrtimer.c:1520 [inline] > > __hrtimer_run_queues+0x5ca/0xed0 kernel/time/hrtimer.c:1584 > > hrtimer_interrupt+0x312/0x770 kernel/time/hrtimer.c:1646 > > local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1113 [inline] > > smp_apic_timer_interrupt+0x15b/0x600 arch/x86/kernel/apic/apic.c:1138 > > apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:829 > > </IRQ> > > RIP: 0010:io_ring_ctx_wait_and_kill+0x98/0x5a0 fs/io_uring.c:7301 > > Code: 01 00 00 4d 89 f4 48 b8 00 00 00 00 00 fc ff df 4c 89 ed 49 c1 ec 03 48 c1 ed 03 49 01 c4 48 01 c5 eb 1c e8 3a ea 9d ff f3 90 <41> 80 3c 24 00 0f 85 53 04 00 00 48 83 bb 10 01 00 00 00 74 21 e8 > > RSP: 0018:ffffc9000897fdf0 EFLAGS: 00000293 ORIG_RAX: ffffffffffffff13 > > RAX: ffff888024082080 RBX: ffff88808df8e000 RCX: 1ffff9200112ffab > > RDX: 0000000000000000 RSI: ffffffff81d549c6 RDI: ffff88808df8e300 > > RBP: ffffed1011bf1c2c R08: 0000000000000001 R09: ffffed1011bf1c61 > > R10: ffff88808df8e307 R11: ffffed1011bf1c60 R12: ffffed1011bf1c22 > > R13: ffff88808df8e160 R14: ffff88808df8e110 R15: ffffffff81d54ed0 > > io_uring_release+0x3e/0x50 fs/io_uring.c:7324 > > __fput+0x33e/0x880 fs/file_table.c:280 > > task_work_run+0xf4/0x1b0 kernel/task_work.c:123 > > tracehook_notify_resume include/linux/tracehook.h:188 [inline] > > exit_to_usermode_loop+0x2fa/0x360 arch/x86/entry/common.c:165 > > prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline] > > syscall_return_slowpath arch/x86/entry/common.c:279 [inline] > > do_syscall_64+0x6b1/0x7d0 arch/x86/entry/common.c:305 > > entry_SYSCALL_64_after_hwframe+0x49/0xb3 > > Make io ring ctx's percpu_ref balanced. > > --- a/fs/io_uring.c > +++ b/fs/io_uring.c > @@ -5904,6 +5904,7 @@ static int io_submit_sqes(struct io_ring > fail_req: > io_cqring_add_event(req, err); > io_double_put_req(req); > + --submitted; > break; > } fs/io_uring.c 5880 for (i = 0; i < nr; i++) { 5881 const struct io_uring_sqe *sqe; 5882 struct io_kiocb *req; 5883 int err; 5884 5885 sqe = io_get_sqe(ctx); 5886 if (unlikely(!sqe)) { 5887 io_consume_sqe(ctx); 5888 break; 5889 } 5890 req = io_alloc_req(ctx, statep); 5891 if (unlikely(!req)) { 5892 if (!submitted) 5893 submitted = -EAGAIN; 5894 break; 5895 } 5896 5897 err = io_init_req(ctx, req, sqe, statep, async); ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ On the success path io_init_req() takes some references like: get_cred(req->work.creds); That one is probably buggy and should be put if the call to: return io_req_set_file(state, req, fd, sqe_flags); fails... But io_req_set_file() takes some other references if it succeeds like percpu_ref_get(req->fixed_file_refs); and it's not clear that those are released if io_submit_sqe() fails. 5898 io_consume_sqe(ctx); 5899 /* will complete beyond this point, count as submitted */ 5900 submitted++; 5901 5902 if (unlikely(err)) { 5903 fail_req: 5904 io_cqring_add_event(req, err); 5905 io_double_put_req(req); 5906 break; 5907 } 5908 5909 trace_io_uring_submit_sqe(ctx, req->opcode, req->user_data, 5910 true, async); 5911 err = io_submit_sqe(req, sqe, statep, &link); ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ here 5912 if (err) 5913 goto fail_req; 5914 } regards, dan carpenter ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: INFO: rcu detected stall in io_uring_release 2020-04-20 11:47 ` Dan Carpenter @ 2020-04-20 12:57 ` Pavel Begunkov 0 siblings, 0 replies; 6+ messages in thread From: Pavel Begunkov @ 2020-04-20 12:57 UTC (permalink / raw) To: Dan Carpenter, Hillf Danton Cc: syzbot, axboe, io-uring, linux-fsdevel, linux-kernel, syzkaller-bugs, viro On 4/20/2020 2:47 PM, Dan Carpenter wrote: > On Sun, Apr 19, 2020 at 12:06:26PM +0800, Hillf Danton wrote: >> >> Sat, 18 Apr 2020 11:59:13 -0700 >>> >>> syzbot found the following crash on: >>> >>> HEAD commit: 8f3d9f35 Linux 5.7-rc1 >>> git tree: upstream >>> console output: https://syzkaller.appspot.com/x/log.txt?x=115720c3e00000 >>> kernel config: https://syzkaller.appspot.com/x/.config?x=5d351a1019ed81a2 >>> dashboard link: https://syzkaller.appspot.com/bug?extid=66243bb7126c410cefe6 >>> compiler: gcc (GCC) 9.0.0 20181231 (experimental) >>> >>> Unfortunately, I don't have any reproducer for this crash yet. >>> >>> IMPORTANT: if you fix the bug, please add the following tag to the commit: >>> Reported-by: [email protected] >>> >>> rcu: INFO: rcu_preempt self-detected stall on CPU >>> rcu: 0-....: (10500 ticks this GP) idle=57e/1/0x4000000000000002 softirq=44329/44329 fqs=5245 >>> (t=10502 jiffies g=79401 q=2096) >>> NMI backtrace for cpu 0 >>> CPU: 0 PID: 23184 Comm: syz-executor.5 Not tainted 5.7.0-rc1-syzkaller #0 >>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 >>> Call Trace: >>> <IRQ> >>> __dump_stack lib/dump_stack.c:77 [inline] >>> dump_stack+0x188/0x20d lib/dump_stack.c:118 >>> nmi_cpu_backtrace.cold+0x70/0xb1 lib/nmi_backtrace.c:101 >>> nmi_trigger_cpumask_backtrace+0x231/0x27e lib/nmi_backtrace.c:62 >>> trigger_single_cpu_backtrace include/linux/nmi.h:164 [inline] >>> rcu_dump_cpu_stacks+0x19b/0x1e5 kernel/rcu/tree_stall.h:254 >>> print_cpu_stall kernel/rcu/tree_stall.h:475 [inline] >>> check_cpu_stall kernel/rcu/tree_stall.h:549 [inline] >>> rcu_pending kernel/rcu/tree.c:3225 [inline] >>> rcu_sched_clock_irq.cold+0x55d/0xcfa kernel/rcu/tree.c:2296 >>> update_process_times+0x25/0x60 kernel/time/timer.c:1727 >>> tick_sched_handle+0x9b/0x180 kernel/time/tick-sched.c:176 >>> tick_sched_timer+0x4e/0x140 kernel/time/tick-sched.c:1320 >>> __run_hrtimer kernel/time/hrtimer.c:1520 [inline] >>> __hrtimer_run_queues+0x5ca/0xed0 kernel/time/hrtimer.c:1584 >>> hrtimer_interrupt+0x312/0x770 kernel/time/hrtimer.c:1646 >>> local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1113 [inline] >>> smp_apic_timer_interrupt+0x15b/0x600 arch/x86/kernel/apic/apic.c:1138 >>> apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:829 >>> </IRQ> >>> RIP: 0010:io_ring_ctx_wait_and_kill+0x98/0x5a0 fs/io_uring.c:7301 >>> Code: 01 00 00 4d 89 f4 48 b8 00 00 00 00 00 fc ff df 4c 89 ed 49 c1 ec 03 48 c1 ed 03 49 01 c4 48 01 c5 eb 1c e8 3a ea 9d ff f3 90 <41> 80 3c 24 00 0f 85 53 04 00 00 48 83 bb 10 01 00 00 00 74 21 e8 >>> RSP: 0018:ffffc9000897fdf0 EFLAGS: 00000293 ORIG_RAX: ffffffffffffff13 >>> RAX: ffff888024082080 RBX: ffff88808df8e000 RCX: 1ffff9200112ffab >>> RDX: 0000000000000000 RSI: ffffffff81d549c6 RDI: ffff88808df8e300 >>> RBP: ffffed1011bf1c2c R08: 0000000000000001 R09: ffffed1011bf1c61 >>> R10: ffff88808df8e307 R11: ffffed1011bf1c60 R12: ffffed1011bf1c22 >>> R13: ffff88808df8e160 R14: ffff88808df8e110 R15: ffffffff81d54ed0 >>> io_uring_release+0x3e/0x50 fs/io_uring.c:7324 >>> __fput+0x33e/0x880 fs/file_table.c:280 >>> task_work_run+0xf4/0x1b0 kernel/task_work.c:123 >>> tracehook_notify_resume include/linux/tracehook.h:188 [inline] >>> exit_to_usermode_loop+0x2fa/0x360 arch/x86/entry/common.c:165 >>> prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline] >>> syscall_return_slowpath arch/x86/entry/common.c:279 [inline] >>> do_syscall_64+0x6b1/0x7d0 arch/x86/entry/common.c:305 >>> entry_SYSCALL_64_after_hwframe+0x49/0xb3 >> >> Make io ring ctx's percpu_ref balanced. >> >> --- a/fs/io_uring.c >> +++ b/fs/io_uring.c >> @@ -5904,6 +5904,7 @@ static int io_submit_sqes(struct io_ring >> fail_req: >> io_cqring_add_event(req, err); >> io_double_put_req(req); >> + --submitted; >> break; >> } > > > fs/io_uring.c > 5880 for (i = 0; i < nr; i++) { > 5881 const struct io_uring_sqe *sqe; > 5882 struct io_kiocb *req; > 5883 int err; > 5884 > 5885 sqe = io_get_sqe(ctx); > 5886 if (unlikely(!sqe)) { > 5887 io_consume_sqe(ctx); > 5888 break; > 5889 } > 5890 req = io_alloc_req(ctx, statep); > 5891 if (unlikely(!req)) { > 5892 if (!submitted) > 5893 submitted = -EAGAIN; > 5894 break; > 5895 } > 5896 > 5897 err = io_init_req(ctx, req, sqe, statep, async); > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > On the success path io_init_req() takes some references like: > > get_cred(req->work.creds); If a req have got into io_init_req(), than it'll be put at some point with io_put_req(). io_req_work_drop_env() called from there will clean up req->work.creds. > > That one is probably buggy and should be put if the call to: > > return io_req_set_file(state, req, fd, sqe_flags); > > fails... But io_req_set_file() takes some other references if it > succeeds like percpu_ref_get(req->fixed_file_refs); and it's not clear > that those are released if io_submit_sqe() fails. The same should happen with req->fixed_file_refs, though I don't remember in details. > > 5898 io_consume_sqe(ctx); > 5899 /* will complete beyond this point, count as submitted */ > 5900 submitted++; Regarding, "--submitted" patch -- we take 1 ctx->refs per request, which is put in io_put_req(). So after a request passes the line above (5900), it's ref will be eventually dropped in io_put_req() and friends. And it's a bit more peculiar because io_submit_sqes() batch-takes N refs first, and then puts unused back at the end. > 5901 > 5902 if (unlikely(err)) { > 5903 fail_req: > 5904 io_cqring_add_event(req, err); > 5905 io_double_put_req(req); > 5906 break; > 5907 } > 5908 > 5909 trace_io_uring_submit_sqe(ctx, req->opcode, req->user_data, > 5910 true, async); > 5911 err = io_submit_sqe(req, sqe, statep, &link); > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > here > > 5912 if (err) > 5913 goto fail_req; > 5914 } > > regards, > dan carpenter > -- Pavel Begunkov ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: INFO: rcu detected stall in io_uring_release 2020-04-18 18:59 INFO: rcu detected stall in io_uring_release syzbot [not found] ` <[email protected]> @ 2020-05-12 2:17 ` syzbot 2020-05-12 15:25 ` Jens Axboe 1 sibling, 1 reply; 6+ messages in thread From: syzbot @ 2020-05-12 2:17 UTC (permalink / raw) To: asml.silence, axboe, dan.carpenter, hdanton, io-uring, linux-fsdevel, linux-kernel, syzkaller-bugs, viro syzbot has found a reproducer for the following crash on: HEAD commit: ac935d22 Add linux-next specific files for 20200415 git tree: linux-next console output: https://syzkaller.appspot.com/x/log.txt?x=13a38f0a100000 kernel config: https://syzkaller.appspot.com/x/.config?x=bc498783097e9019 dashboard link: https://syzkaller.appspot.com/bug?extid=66243bb7126c410cefe6 compiler: gcc (GCC) 9.0.0 20181231 (experimental) syz repro: https://syzkaller.appspot.com/x/repro.syz?x=179b3b32100000 IMPORTANT: if you fix the bug, please add the following tag to the commit: Reported-by: [email protected] rcu: INFO: rcu_preempt self-detected stall on CPU rcu: 0-....: (10533 ticks this GP) idle=aea/1/0x4000000000000002 softirq=10553/10555 fqs=5249 (t=10501 jiffies g=12777 q=556) NMI backtrace for cpu 0 CPU: 0 PID: 8729 Comm: syz-executor.3 Not tainted 5.7.0-rc1-next-20200415-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x188/0x20d lib/dump_stack.c:118 nmi_cpu_backtrace.cold+0x70/0xb1 lib/nmi_backtrace.c:101 nmi_trigger_cpumask_backtrace+0x231/0x27e lib/nmi_backtrace.c:62 trigger_single_cpu_backtrace include/linux/nmi.h:165 [inline] rcu_dump_cpu_stacks+0x19b/0x1e5 kernel/rcu/tree_stall.h:254 print_cpu_stall kernel/rcu/tree_stall.h:475 [inline] check_cpu_stall kernel/rcu/tree_stall.h:549 [inline] rcu_pending kernel/rcu/tree.c:3225 [inline] rcu_sched_clock_irq.cold+0x55d/0xd00 kernel/rcu/tree.c:2296 update_process_times+0x25/0x60 kernel/time/timer.c:1727 tick_sched_handle+0x9b/0x180 kernel/time/tick-sched.c:176 tick_sched_timer+0x4e/0x140 kernel/time/tick-sched.c:1320 __run_hrtimer kernel/time/hrtimer.c:1520 [inline] __hrtimer_run_queues+0x5ca/0xed0 kernel/time/hrtimer.c:1584 hrtimer_interrupt+0x312/0x770 kernel/time/hrtimer.c:1646 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1113 [inline] smp_apic_timer_interrupt+0x15b/0x600 arch/x86/kernel/apic/apic.c:1138 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:829 </IRQ> RIP: 0010:io_ring_ctx_wait_and_kill+0x98/0x5a0 fs/io_uring.c:7301 Code: 01 00 00 4d 89 f4 48 b8 00 00 00 00 00 fc ff df 4c 89 ed 49 c1 ec 03 48 c1 ed 03 49 01 c4 48 01 c5 eb 1c e8 ba 65 9d ff f3 90 <41> 80 3c 24 00 0f 85 53 04 00 00 48 83 bb 10 01 00 00 00 74 21 e8 RSP: 0018:ffffc900054d7a50 EFLAGS: 00000293 ORIG_RAX: ffffffffffffff13 RAX: ffff888097c14400 RBX: ffff8880963a0000 RCX: 1ffff92000a9af37 RDX: 0000000000000000 RSI: ffffffff81d5ced6 RDI: ffff8880963a0300 RBP: ffffed1012c7402c R08: 0000000000000001 R09: ffffed1012c74061 R10: ffff8880963a0307 R11: ffffed1012c74060 R12: ffffed1012c74022 R13: ffff8880963a0160 R14: ffff8880963a0110 R15: ffffffff81d5d3e0 io_uring_release+0x3e/0x50 fs/io_uring.c:7324 __fput+0x33e/0x880 fs/file_table.c:280 task_work_run+0xf4/0x1b0 kernel/task_work.c:123 exit_task_work include/linux/task_work.h:22 [inline] do_exit+0xb53/0x2e10 kernel/exit.c:795 do_group_exit+0x125/0x340 kernel/exit.c:893 get_signal+0x47b/0x24e0 kernel/signal.c:2735 do_signal+0x81/0x2240 arch/x86/kernel/signal.c:784 exit_to_usermode_loop+0x26c/0x360 arch/x86/entry/common.c:161 prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline] syscall_return_slowpath arch/x86/entry/common.c:279 [inline] do_syscall_64+0x6b1/0x7d0 arch/x86/entry/common.c:305 entry_SYSCALL_64_after_hwframe+0x49/0xb3 RIP: 0033:0x45c829 Code: Bad RIP value. RSP: 002b:00007fdd2a37fc78 EFLAGS: 00000246 ORIG_RAX: 00000000000001a9 RAX: 0000000000000003 RBX: 00000000004e0ce0 RCX: 000000000045c829 RDX: 0000000000000000 RSI: 0000000020000580 RDI: 00000000000000f1 RBP: 000000000078bf00 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff R13: 0000000000000204 R14: 00000000004c425f R15: 00007fdd2a3806d4 ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: INFO: rcu detected stall in io_uring_release 2020-05-12 2:17 ` syzbot @ 2020-05-12 15:25 ` Jens Axboe 0 siblings, 0 replies; 6+ messages in thread From: Jens Axboe @ 2020-05-12 15:25 UTC (permalink / raw) To: syzbot, asml.silence, dan.carpenter, hdanton, io-uring, linux-fsdevel, linux-kernel, syzkaller-bugs, viro On 5/11/20 8:17 PM, syzbot wrote: > syzbot has found a reproducer for the following crash on: > > HEAD commit: ac935d22 Add linux-next specific files for 20200415 > git tree: linux-next > console output: https://syzkaller.appspot.com/x/log.txt?x=13a38f0a100000 > kernel config: https://syzkaller.appspot.com/x/.config?x=bc498783097e9019 > dashboard link: https://syzkaller.appspot.com/bug?extid=66243bb7126c410cefe6 > compiler: gcc (GCC) 9.0.0 20181231 (experimental) > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=179b3b32100000 > > IMPORTANT: if you fix the bug, please add the following tag to the commit: > Reported-by: [email protected] > > rcu: INFO: rcu_preempt self-detected stall on CPU > rcu: 0-....: (10533 ticks this GP) idle=aea/1/0x4000000000000002 softirq=10553/10555 fqs=5249 > (t=10501 jiffies g=12777 q=556) > NMI backtrace for cpu 0 > CPU: 0 PID: 8729 Comm: syz-executor.3 Not tainted 5.7.0-rc1-next-20200415-syzkaller #0 > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 > Call Trace: > <IRQ> > __dump_stack lib/dump_stack.c:77 [inline] > dump_stack+0x188/0x20d lib/dump_stack.c:118 > nmi_cpu_backtrace.cold+0x70/0xb1 lib/nmi_backtrace.c:101 > nmi_trigger_cpumask_backtrace+0x231/0x27e lib/nmi_backtrace.c:62 > trigger_single_cpu_backtrace include/linux/nmi.h:165 [inline] > rcu_dump_cpu_stacks+0x19b/0x1e5 kernel/rcu/tree_stall.h:254 > print_cpu_stall kernel/rcu/tree_stall.h:475 [inline] > check_cpu_stall kernel/rcu/tree_stall.h:549 [inline] > rcu_pending kernel/rcu/tree.c:3225 [inline] > rcu_sched_clock_irq.cold+0x55d/0xd00 kernel/rcu/tree.c:2296 > update_process_times+0x25/0x60 kernel/time/timer.c:1727 > tick_sched_handle+0x9b/0x180 kernel/time/tick-sched.c:176 > tick_sched_timer+0x4e/0x140 kernel/time/tick-sched.c:1320 > __run_hrtimer kernel/time/hrtimer.c:1520 [inline] > __hrtimer_run_queues+0x5ca/0xed0 kernel/time/hrtimer.c:1584 > hrtimer_interrupt+0x312/0x770 kernel/time/hrtimer.c:1646 > local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1113 [inline] > smp_apic_timer_interrupt+0x15b/0x600 arch/x86/kernel/apic/apic.c:1138 > apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:829 > </IRQ> > RIP: 0010:io_ring_ctx_wait_and_kill+0x98/0x5a0 fs/io_uring.c:7301 > Code: 01 00 00 4d 89 f4 48 b8 00 00 00 00 00 fc ff df 4c 89 ed 49 c1 ec 03 48 c1 ed 03 49 01 c4 48 01 c5 eb 1c e8 ba 65 9d ff f3 90 <41> 80 3c 24 00 0f 85 53 04 00 00 48 83 bb 10 01 00 00 00 74 21 e8 > RSP: 0018:ffffc900054d7a50 EFLAGS: 00000293 ORIG_RAX: ffffffffffffff13 > RAX: ffff888097c14400 RBX: ffff8880963a0000 RCX: 1ffff92000a9af37 > RDX: 0000000000000000 RSI: ffffffff81d5ced6 RDI: ffff8880963a0300 > RBP: ffffed1012c7402c R08: 0000000000000001 R09: ffffed1012c74061 > R10: ffff8880963a0307 R11: ffffed1012c74060 R12: ffffed1012c74022 > R13: ffff8880963a0160 R14: ffff8880963a0110 R15: ffffffff81d5d3e0 > io_uring_release+0x3e/0x50 fs/io_uring.c:7324 > __fput+0x33e/0x880 fs/file_table.c:280 > task_work_run+0xf4/0x1b0 kernel/task_work.c:123 > exit_task_work include/linux/task_work.h:22 [inline] > do_exit+0xb53/0x2e10 kernel/exit.c:795 > do_group_exit+0x125/0x340 kernel/exit.c:893 > get_signal+0x47b/0x24e0 kernel/signal.c:2735 > do_signal+0x81/0x2240 arch/x86/kernel/signal.c:784 > exit_to_usermode_loop+0x26c/0x360 arch/x86/entry/common.c:161 > prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline] > syscall_return_slowpath arch/x86/entry/common.c:279 [inline] > do_syscall_64+0x6b1/0x7d0 arch/x86/entry/common.c:305 > entry_SYSCALL_64_after_hwframe+0x49/0xb3 > RIP: 0033:0x45c829 > Code: Bad RIP value. > RSP: 002b:00007fdd2a37fc78 EFLAGS: 00000246 ORIG_RAX: 00000000000001a9 > RAX: 0000000000000003 RBX: 00000000004e0ce0 RCX: 000000000045c829 > RDX: 0000000000000000 RSI: 0000000020000580 RDI: 00000000000000f1 > RBP: 000000000078bf00 R08: 0000000000000000 R09: 0000000000000000 > R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff > R13: 0000000000000204 R14: 00000000004c425f R15: 00007fdd2a3806d4 #syz fix: io_uring: use cond_resched() in io_ring_ctx_wait_and_kill() -- Jens Axboe ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2020-05-12 15:26 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2020-04-18 18:59 INFO: rcu detected stall in io_uring_release syzbot
[not found] ` <[email protected]>
2020-04-19 19:57 ` Jens Axboe
2020-04-20 11:47 ` Dan Carpenter
2020-04-20 12:57 ` Pavel Begunkov
2020-05-12 2:17 ` syzbot
2020-05-12 15:25 ` Jens Axboe
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox