* possible deadlock in __io_queue_deferred
@ 2020-08-10 15:36 syzbot
  2020-08-10 15:55 ` Jens Axboe
  0 siblings, 1 reply; 6+ messages in thread
From: syzbot @ 2020-08-10 15:36 UTC (permalink / raw)
  To: axboe, io-uring, linux-fsdevel, linux-kernel, syzkaller-bugs, viro

Hello,

syzbot found the following issue on:

HEAD commit:    449dc8c9 Merge tag 'for-v5.9' of git://git.kernel.org/pub/..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=14d41e02900000
kernel config:  https://syzkaller.appspot.com/x/.config?x=9d25235bf0162fbc
dashboard link: https://syzkaller.appspot.com/bug?extid=996f91b6ec3812c48042
compiler:       clang version 10.0.0 (https://github.com/llvm/llvm-project/c2443155a0fb245c8f17f2c1c72b6ea391e86e81)
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=133c9006900000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1191cb1a900000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: [email protected]

============================================
WARNING: possible recursive locking detected
5.8.0-syzkaller #0 Not tainted
--------------------------------------------
syz-executor287/6816 is trying to acquire lock:
ffff888093cdb4d8 (&ctx->completion_lock){....}-{2:2}, at: spin_lock_irq include/linux/spinlock.h:379 [inline]
ffff888093cdb4d8 (&ctx->completion_lock){....}-{2:2}, at: io_queue_linked_timeout fs/io_uring.c:5928 [inline]
ffff888093cdb4d8 (&ctx->completion_lock){....}-{2:2}, at: __io_queue_async_work fs/io_uring.c:1192 [inline]
ffff888093cdb4d8 (&ctx->completion_lock){....}-{2:2}, at: __io_queue_deferred+0x36a/0x790 fs/io_uring.c:1237

but task is already holding lock:
ffff888093cdb4d8 (&ctx->completion_lock){....}-{2:2}, at: io_cqring_overflow_flush+0xc6/0xab0 fs/io_uring.c:1333

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&ctx->completion_lock);
  lock(&ctx->completion_lock);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

1 lock held by syz-executor287/6816:
 #0: ffff888093cdb4d8 (&ctx->completion_lock){....}-{2:2}, at: io_cqring_overflow_flush+0xc6/0xab0 fs/io_uring.c:1333

stack backtrace:
CPU: 1 PID: 6816 Comm: syz-executor287 Not tainted 5.8.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1f0/0x31e lib/dump_stack.c:118
 print_deadlock_bug kernel/locking/lockdep.c:2391 [inline]
 check_deadlock kernel/locking/lockdep.c:2432 [inline]
 validate_chain+0x69a4/0x88a0 kernel/locking/lockdep.c:3202
 __lock_acquire+0x1161/0x2ab0 kernel/locking/lockdep.c:4426
 lock_acquire+0x160/0x730 kernel/locking/lockdep.c:5005
 __raw_spin_lock_irq include/linux/spinlock_api_smp.h:128 [inline]
 _raw_spin_lock_irq+0x67/0x80 kernel/locking/spinlock.c:167
 spin_lock_irq include/linux/spinlock.h:379 [inline]
 io_queue_linked_timeout fs/io_uring.c:5928 [inline]
 __io_queue_async_work fs/io_uring.c:1192 [inline]
 __io_queue_deferred+0x36a/0x790 fs/io_uring.c:1237
 io_cqring_overflow_flush+0x774/0xab0 fs/io_uring.c:1359
 io_ring_ctx_wait_and_kill+0x2a1/0x570 fs/io_uring.c:7808
 io_uring_release+0x59/0x70 fs/io_uring.c:7829
 __fput+0x34f/0x7b0 fs/file_table.c:281
 task_work_run+0x137/0x1c0 kernel/task_work.c:135
 exit_task_work include/linux/task_work.h:25 [inline]
 do_exit+0x5f3/0x1f20 kernel/exit.c:806
 do_group_exit+0x161/0x2d0 kernel/exit.c:903
 __do_sys_exit_group+0x13/0x20 kernel/exit.c:914
 __se_sys_exit_group+0x10/0x10 kernel/exit.c:912
 __x64_sys_exit_group+0x37/0x40 kernel/exit.c:912
 do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x43f598
Code: Bad RIP value.
RSP: 002b:00007fffdac2bf58 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 000000000043f598
RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000
RBP: 00000000004beda8 R08: 00000000000000e7 R09: ffffffffffffffd0
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
R13: 00000000006d11a0 R14: 0000000000000000 R15: 0000000000000000

---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at [email protected].

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches

^ permalink raw reply	[flat|nested] 6+ messages in thread
* Re: possible deadlock in __io_queue_deferred
  2020-08-10 15:36 possible deadlock in __io_queue_deferred syzbot
@ 2020-08-10 15:55 ` Jens Axboe
  2020-08-11 14:00   ` Stefano Garzarella
  0 siblings, 1 reply; 6+ messages in thread
From: Jens Axboe @ 2020-08-10 15:55 UTC (permalink / raw)
  To: syzbot, io-uring, linux-fsdevel, linux-kernel, syzkaller-bugs, viro

On 8/10/20 9:36 AM, syzbot wrote:
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit:    449dc8c9 Merge tag 'for-v5.9' of git://git.kernel.org/pub/..
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=14d41e02900000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=9d25235bf0162fbc
> dashboard link: https://syzkaller.appspot.com/bug?extid=996f91b6ec3812c48042
> compiler:       clang version 10.0.0 (https://github.com/llvm/llvm-project/c2443155a0fb245c8f17f2c1c72b6ea391e86e81)
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=133c9006900000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1191cb1a900000
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: [email protected]

Thanks, the below should fix this one.
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 443eecdfeda9..f9be665d1c5e 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -898,6 +898,7 @@ static void io_put_req(struct io_kiocb *req);
 static void io_double_put_req(struct io_kiocb *req);
 static void __io_double_put_req(struct io_kiocb *req);
 static struct io_kiocb *io_prep_linked_timeout(struct io_kiocb *req);
+static void __io_queue_linked_timeout(struct io_kiocb *req);
 static void io_queue_linked_timeout(struct io_kiocb *req);
 static int __io_sqe_files_update(struct io_ring_ctx *ctx,
 				 struct io_uring_files_update *ip,
@@ -1179,7 +1180,7 @@ static void io_prep_async_link(struct io_kiocb *req)
 		io_prep_async_work(cur);
 }
 
-static void __io_queue_async_work(struct io_kiocb *req)
+static struct io_kiocb *__io_queue_async_work(struct io_kiocb *req)
 {
 	struct io_ring_ctx *ctx = req->ctx;
 	struct io_kiocb *link = io_prep_linked_timeout(req);
@@ -1187,16 +1188,19 @@ static void __io_queue_async_work(struct io_kiocb *req)
 	trace_io_uring_queue_async_work(ctx, io_wq_is_hashed(&req->work), req,
 					&req->work, req->flags);
 	io_wq_enqueue(ctx->io_wq, &req->work);
-
-	if (link)
-		io_queue_linked_timeout(link);
+	return link;
 }
 
 static void io_queue_async_work(struct io_kiocb *req)
 {
+	struct io_kiocb *link;
+
 	/* init ->work of the whole link before punting */
 	io_prep_async_link(req);
-	__io_queue_async_work(req);
+	link = __io_queue_async_work(req);
+
+	if (link)
+		io_queue_linked_timeout(link);
 }
 
 static void io_kill_timeout(struct io_kiocb *req)
@@ -1229,12 +1233,19 @@ static void __io_queue_deferred(struct io_ring_ctx *ctx)
 	do {
 		struct io_defer_entry *de = list_first_entry(&ctx->defer_list,
 						struct io_defer_entry, list);
+		struct io_kiocb *link;
 
 		if (req_need_defer(de->req, de->seq))
 			break;
 		list_del_init(&de->list);
 		/* punt-init is done before queueing for defer */
-		__io_queue_async_work(de->req);
+		link = __io_queue_async_work(de->req);
+		if (link) {
+			__io_queue_linked_timeout(link);
+			/* drop submission reference */
+			link->flags |= REQ_F_COMP_LOCKED;
+			io_put_req(link);
+		}
 		kfree(de);
 	} while (!list_empty(&ctx->defer_list));
 }
@@ -5945,15 +5956,12 @@ static enum hrtimer_restart io_link_timeout_fn(struct hrtimer *timer)
 	return HRTIMER_NORESTART;
 }
 
-static void io_queue_linked_timeout(struct io_kiocb *req)
+static void __io_queue_linked_timeout(struct io_kiocb *req)
 {
-	struct io_ring_ctx *ctx = req->ctx;
-
 	/*
 	 * If the list is now empty, then our linked request finished before
 	 * we got a chance to setup the timer
 	 */
-	spin_lock_irq(&ctx->completion_lock);
 	if (!list_empty(&req->link_list)) {
 		struct io_timeout_data *data = &req->io->timeout;
 
@@ -5961,6 +5969,14 @@ static void io_queue_linked_timeout(struct io_kiocb *req)
 		hrtimer_start(&data->timer, timespec64_to_ktime(data->ts),
 				data->mode);
 	}
+}
+
+static void io_queue_linked_timeout(struct io_kiocb *req)
+{
+	struct io_ring_ctx *ctx = req->ctx;
+
+	spin_lock_irq(&ctx->completion_lock);
+	__io_queue_linked_timeout(req);
 	spin_unlock_irq(&ctx->completion_lock);
 
 	/* drop submission reference */

-- 
Jens Axboe
* Re: possible deadlock in __io_queue_deferred
  2020-08-10 15:55 ` Jens Axboe
@ 2020-08-11 14:00   ` Stefano Garzarella
  2020-08-11 14:21     ` Jens Axboe
  0 siblings, 1 reply; 6+ messages in thread
From: Stefano Garzarella @ 2020-08-11 14:00 UTC (permalink / raw)
  To: Jens Axboe
  Cc: syzbot, io-uring, linux-fsdevel, linux-kernel, syzkaller-bugs, viro

On Mon, Aug 10, 2020 at 09:55:17AM -0600, Jens Axboe wrote:
> On 8/10/20 9:36 AM, syzbot wrote:
> > Hello,
> >
> > syzbot found the following issue on:
> > [...]
> >
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: [email protected]
>
> Thanks, the below should fix this one.

Yeah, it seems right to me, since only __io_queue_deferred() (invoked by
io_commit_cqring()) can be called with 'completion_lock' held.

Just out of curiosity, while exploring the code I noticed that we call
io_commit_cqring() always with the 'completion_lock' held, except in the
io_poll_* functions.

That's because then there can't be any concurrency?
Thanks,
Stefano

> [patch quoted in full above, trimmed]

^ permalink raw reply	[flat|nested] 6+ messages in thread
* Re: possible deadlock in __io_queue_deferred
  2020-08-11 14:00   ` Stefano Garzarella
@ 2020-08-11 14:21     ` Jens Axboe
  2020-08-11 14:44       ` Stefano Garzarella
  0 siblings, 1 reply; 6+ messages in thread
From: Jens Axboe @ 2020-08-11 14:21 UTC (permalink / raw)
  To: Stefano Garzarella
  Cc: syzbot, io-uring, linux-fsdevel, linux-kernel, syzkaller-bugs, viro

On 8/11/20 8:00 AM, Stefano Garzarella wrote:
> On Mon, Aug 10, 2020 at 09:55:17AM -0600, Jens Axboe wrote:
>> On 8/10/20 9:36 AM, syzbot wrote:
>>> [...]
>>
>> Thanks, the below should fix this one.
>
> Yeah, it seems right to me, since only __io_queue_deferred() (invoked by
> io_commit_cqring()) can be called with 'completion_lock' held.

Right

> Just out of curiosity, while exploring the code I noticed that we call
> io_commit_cqring() always with the 'completion_lock' held, except in the
> io_poll_* functions.
>
> That's because then there can't be any concurrency?

Do you mean the iopoll functions? Because we're definitely holding it
for the io_poll_* functions.

-- 
Jens Axboe
* Re: possible deadlock in __io_queue_deferred
  2020-08-11 14:21     ` Jens Axboe
@ 2020-08-11 14:44       ` Stefano Garzarella
  2020-08-11 14:45         ` Jens Axboe
  0 siblings, 1 reply; 6+ messages in thread
From: Stefano Garzarella @ 2020-08-11 14:44 UTC (permalink / raw)
  To: Jens Axboe
  Cc: syzbot, io-uring, linux-fsdevel, linux-kernel, syzkaller-bugs, viro

On Tue, Aug 11, 2020 at 08:21:12AM -0600, Jens Axboe wrote:
> On 8/11/20 8:00 AM, Stefano Garzarella wrote:
> > On Mon, Aug 10, 2020 at 09:55:17AM -0600, Jens Axboe wrote:
> > > On 8/10/20 9:36 AM, syzbot wrote:
> > > > [...]
> > >
> > > Thanks, the below should fix this one.
> >
> > Yeah, it seems right to me, since only __io_queue_deferred() (invoked by
> > io_commit_cqring()) can be called with 'completion_lock' held.
>
> Right
>
> > Just out of curiosity, while exploring the code I noticed that we call
> > io_commit_cqring() always with the 'completion_lock' held, except in the
> > io_poll_* functions.
> >
> > That's because then there can't be any concurrency?
>
> Do you mean the iopoll functions? Because we're definitely holding it
> for the io_poll_* functions.

Right, the only one seems io_iopoll_complete().

So, IIUC, in this case we are actively polling the level below,
so there shouldn't be any asynchronous events, is it right?

Thanks,
Stefano
* Re: possible deadlock in __io_queue_deferred
  2020-08-11 14:44       ` Stefano Garzarella
@ 2020-08-11 14:45         ` Jens Axboe
  0 siblings, 0 replies; 6+ messages in thread
From: Jens Axboe @ 2020-08-11 14:45 UTC (permalink / raw)
  To: Stefano Garzarella
  Cc: syzbot, io-uring, linux-fsdevel, linux-kernel, syzkaller-bugs, viro

On 8/11/20 8:44 AM, Stefano Garzarella wrote:
> On Tue, Aug 11, 2020 at 08:21:12AM -0600, Jens Axboe wrote:
>> On 8/11/20 8:00 AM, Stefano Garzarella wrote:
>>> [...]
>>>
>>> Just out of curiosity, while exploring the code I noticed that we call
>>> io_commit_cqring() always with the 'completion_lock' held, except in the
>>> io_poll_* functions.
>>>
>>> That's because then there can't be any concurrency?
>>
>> Do you mean the iopoll functions? Because we're definitely holding it
>> for the io_poll_* functions.
>
> Right, the only one seems io_iopoll_complete().
>
> So, IIUC, in this case we are actively polling the level below,
> so there shouldn't be any asynchronous events, is it right?

Right, that's serialized by itself.

-- 
Jens Axboe
end of thread, other threads: [~2020-08-11 14:45 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-10 15:36 possible deadlock in __io_queue_deferred syzbot
2020-08-10 15:55 ` Jens Axboe
2020-08-11 14:00   ` Stefano Garzarella
2020-08-11 14:21     ` Jens Axboe
2020-08-11 14:44       ` Stefano Garzarella
2020-08-11 14:45         ` Jens Axboe