* [PATCH v2] io_uring: fix possible deadlock in io_uring_poll
From: Hao Xu @ 2021-02-05 7:49 UTC
To: Jens Axboe; +Cc: io-uring, Pavel Begunkov, Joseph Qi
Abaci reported the following issue:
[ 30.615891] ======================================================
[ 30.616648] WARNING: possible circular locking dependency detected
[ 30.617423] 5.11.0-rc3-next-20210115 #1 Not tainted
[ 30.618035] ------------------------------------------------------
[ 30.618914] a.out/1128 is trying to acquire lock:
[ 30.619520] ffff88810b063868 (&ep->mtx){+.+.}-{3:3}, at: __ep_eventpoll_poll+0x9f/0x220
[ 30.620505]
[ 30.620505] but task is already holding lock:
[ 30.621218] ffff88810e952be8 (&ctx->uring_lock){+.+.}-{3:3}, at: __x64_sys_io_uring_enter+0x3f0/0x5b0
[ 30.622349]
[ 30.622349] which lock already depends on the new lock.
[ 30.622349]
[ 30.623289]
[ 30.623289] the existing dependency chain (in reverse order) is:
[ 30.624243]
[ 30.624243] -> #1 (&ctx->uring_lock){+.+.}-{3:3}:
[ 30.625263] lock_acquire+0x2c7/0x390
[ 30.625868] __mutex_lock+0xae/0x9f0
[ 30.626451] io_cqring_overflow_flush.part.95+0x6d/0x70
[ 30.627278] io_uring_poll+0xcb/0xd0
[ 30.627890] ep_item_poll.isra.14+0x4e/0x90
[ 30.628531] do_epoll_ctl+0xb7e/0x1120
[ 30.629122] __x64_sys_epoll_ctl+0x70/0xb0
[ 30.629770] do_syscall_64+0x2d/0x40
[ 30.630332] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 30.631187]
[ 30.631187] -> #0 (&ep->mtx){+.+.}-{3:3}:
[ 30.631985] check_prevs_add+0x226/0xb00
[ 30.632584] __lock_acquire+0x1237/0x13a0
[ 30.633207] lock_acquire+0x2c7/0x390
[ 30.633740] __mutex_lock+0xae/0x9f0
[ 30.634258] __ep_eventpoll_poll+0x9f/0x220
[ 30.634879] __io_arm_poll_handler+0xbf/0x220
[ 30.635462] io_issue_sqe+0xa6b/0x13e0
[ 30.635982] __io_queue_sqe+0x10b/0x550
[ 30.636648] io_queue_sqe+0x235/0x470
[ 30.637281] io_submit_sqes+0xcce/0xf10
[ 30.637839] __x64_sys_io_uring_enter+0x3fb/0x5b0
[ 30.638465] do_syscall_64+0x2d/0x40
[ 30.638999] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 30.639643]
[ 30.639643] other info that might help us debug this:
[ 30.639643]
[ 30.640618] Possible unsafe locking scenario:
[ 30.640618]
[ 30.641402]        CPU0                    CPU1
[ 30.641938]        ----                    ----
[ 30.642664]   lock(&ctx->uring_lock);
[ 30.643425]                                lock(&ep->mtx);
[ 30.644498]                                lock(&ctx->uring_lock);
[ 30.645668]   lock(&ep->mtx);
[ 30.646321]
[ 30.646321] *** DEADLOCK ***
[ 30.646321]
[ 30.647642] 1 lock held by a.out/1128:
[ 30.648424] #0: ffff88810e952be8 (&ctx->uring_lock){+.+.}-{3:3}, at: __x64_sys_io_uring_enter+0x3f0/0x5b0
[ 30.649954]
[ 30.649954] stack backtrace:
[ 30.650592] CPU: 1 PID: 1128 Comm: a.out Not tainted 5.11.0-rc3-next-20210115 #1
[ 30.651554] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[ 30.652290] Call Trace:
[ 30.652688] dump_stack+0xac/0xe3
[ 30.653164] check_noncircular+0x11e/0x130
[ 30.653747] ? check_prevs_add+0x226/0xb00
[ 30.654303] check_prevs_add+0x226/0xb00
[ 30.654845] ? add_lock_to_list.constprop.49+0xac/0x1d0
[ 30.655564] __lock_acquire+0x1237/0x13a0
[ 30.656262] lock_acquire+0x2c7/0x390
[ 30.656788] ? __ep_eventpoll_poll+0x9f/0x220
[ 30.657379] ? __io_queue_proc.isra.88+0x180/0x180
[ 30.658014] __mutex_lock+0xae/0x9f0
[ 30.658524] ? __ep_eventpoll_poll+0x9f/0x220
[ 30.659112] ? mark_held_locks+0x5a/0x80
[ 30.659648] ? __ep_eventpoll_poll+0x9f/0x220
[ 30.660229] ? _raw_spin_unlock_irqrestore+0x2d/0x40
[ 30.660885] ? trace_hardirqs_on+0x46/0x110
[ 30.661471] ? __io_queue_proc.isra.88+0x180/0x180
[ 30.662102] ? __ep_eventpoll_poll+0x9f/0x220
[ 30.662696] __ep_eventpoll_poll+0x9f/0x220
[ 30.663273] ? __ep_eventpoll_poll+0x220/0x220
[ 30.663875] __io_arm_poll_handler+0xbf/0x220
[ 30.664463] io_issue_sqe+0xa6b/0x13e0
[ 30.664984] ? __lock_acquire+0x782/0x13a0
[ 30.665544] ? __io_queue_proc.isra.88+0x180/0x180
[ 30.666170] ? __io_queue_sqe+0x10b/0x550
[ 30.666725] __io_queue_sqe+0x10b/0x550
[ 30.667252] ? __fget_files+0x131/0x260
[ 30.667791] ? io_req_prep+0xd8/0x1090
[ 30.668316] ? io_queue_sqe+0x235/0x470
[ 30.668868] io_queue_sqe+0x235/0x470
[ 30.669398] io_submit_sqes+0xcce/0xf10
[ 30.669931] ? xa_load+0xe4/0x1c0
[ 30.670425] __x64_sys_io_uring_enter+0x3fb/0x5b0
[ 30.671051] ? lockdep_hardirqs_on_prepare+0xde/0x180
[ 30.671719] ? syscall_enter_from_user_mode+0x2b/0x80
[ 30.672380] do_syscall_64+0x2d/0x40
[ 30.672901] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 30.673503] RIP: 0033:0x7fd89c813239
[ 30.673962] Code: 01 00 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 3d 01 f0 ff ff 73 01 c3 48 8b 0d 27 ec 2c 00 f7 d8 64 89 01 48
[ 30.675920] RSP: 002b:00007ffc65a7c628 EFLAGS: 00000217 ORIG_RAX: 00000000000001aa
[ 30.676791] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fd89c813239
[ 30.677594] RDX: 0000000000000000 RSI: 0000000000000014 RDI: 0000000000000003
[ 30.678678] RBP: 00007ffc65a7c720 R08: 0000000000000000 R09: 0000000003000000
[ 30.679492] R10: 0000000000000000 R11: 0000000000000217 R12: 0000000000400ff0
[ 30.680282] R13: 00007ffc65a7c840 R14: 0000000000000000 R15: 0000000000000000
This might happen if we do epoll_wait()/epoll_ctl() on an io_uring fd
while an sqe in that same io_uring instance is reading/writing (polling)
the epoll fd: the epoll side takes ep->mtx and then ctx->uring_lock when
it polls the uring fd and flushes the overflow list, while io_uring
submission holds ctx->uring_lock and takes ep->mtx when arming its poll
handler. So let's not flush the cqring overflow list in io_uring_poll();
just do a simple check instead.
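For reference, a reduced userspace sequence along these lines can set up
both lock orders. This is only a sketch using liburing calls, not the
exact reproducer Abaci ran; the thread split and the POLL_ADD opcode are
illustrative assumptions:

/* Sketch only: one io_uring instance and one epoll instance, two threads. */
#include <liburing.h>
#include <poll.h>
#include <pthread.h>
#include <sys/epoll.h>

static struct io_uring ring;
static int epfd;

static void *epoll_side(void *arg)
{
	struct epoll_event ev = { .events = EPOLLIN };

	/*
	 * ep_item_poll() on the uring fd: takes ep->mtx, then (before this
	 * patch) ctx->uring_lock via io_cqring_overflow_flush().
	 */
	epoll_ctl(epfd, EPOLL_CTL_ADD, ring.ring_fd, &ev);
	return NULL;
}

static void *uring_side(void *arg)
{
	struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);

	/*
	 * io_uring_enter() holds ctx->uring_lock for submission; arming the
	 * poll handler for the epoll fd then takes ep->mtx.
	 */
	io_uring_prep_poll_add(sqe, epfd, POLLIN);
	io_uring_submit(&ring);
	return NULL;
}

int main(void)
{
	pthread_t a, b;

	io_uring_queue_init(8, &ring, 0);
	epfd = epoll_create1(0);

	pthread_create(&a, NULL, epoll_side, NULL);
	pthread_create(&b, NULL, uring_side, NULL);
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	return 0;
}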
Reported-by: Abaci <[email protected]>
Fixes: 6c503150ae33 ("io_uring: patch up IOPOLL overflow_flush sync")
Signed-off-by: Hao Xu <[email protected]>
---
I think we should put a note somewhere; otherwise users may get
EPOLLIN | EPOLLRDNORM, see nothing in the CQ ring, take that as
"there is definitely nothing", and not be aware that they still need
to trigger an overflow flush.
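
To make that note concrete, the userspace pattern it would describe looks
roughly like this. A sketch with liburing: the helper name is mine, and
io_uring_wait_cqe() is just one way to enter the kernel with GETEVENTS so
the overflowed CQEs get flushed into the ring:

/* Sketch: what an application can do when the uring fd reports EPOLLIN. */
#include <liburing.h>

static void handle_ring_epollin(struct io_uring *ring)
{
	struct io_uring_cqe *cqe;

	if (io_uring_peek_cqe(ring, &cqe) != 0) {
		/*
		 * EPOLLIN was reported but the mmap'd CQ ring looks empty:
		 * completions may be parked on the kernel-side overflow
		 * list.  Enter the kernel (GETEVENTS) to flush them.
		 */
		if (io_uring_wait_cqe(ring, &cqe) != 0)
			return;
	}

	/* Now reap everything that is visible in the CQ ring. */
	while (io_uring_peek_cqe(ring, &cqe) == 0) {
		/* ... process cqe->user_data / cqe->res ... */
		io_uring_cqe_seen(ring, cqe);
	}
}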
fs/io_uring.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 38c6cbe1ab38..fd7ea089a4c0 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -8718,8 +8718,15 @@ static __poll_t io_uring_poll(struct file *file, poll_table *wait)
smp_rmb();
if (!io_sqring_full(ctx))
mask |= EPOLLOUT | EPOLLWRNORM;
- io_cqring_overflow_flush(ctx, false, NULL, NULL);
- if (io_cqring_events(ctx))
+
+ /*
+ * Don't flush the cqring overflow list here, just do a simple check.
+ * Otherwise there could possibly be an ABBA deadlock.
+ * Users may get EPOLLIN while seeing nothing in the cqring; this
+ * pushes them to do the flush.
+ * More info in the commit log.
+ */
+ if (io_cqring_events(ctx) || test_bit(0, &ctx->cq_check_overflow))
mask |= EPOLLIN | EPOLLRDNORM;
return mask;
--
1.8.3.1
* [PATCH v3] io_uring: fix possible deadlock in io_uring_poll
From: Hao Xu @ 2021-02-05 8:34 UTC
To: Jens Axboe; +Cc: io-uring, Pavel Begunkov, Joseph Qi
Abaci reported the following issue:
[ 30.615891] ======================================================
[ 30.616648] WARNING: possible circular locking dependency detected
[ 30.617423] 5.11.0-rc3-next-20210115 #1 Not tainted
[ 30.618035] ------------------------------------------------------
[ 30.618914] a.out/1128 is trying to acquire lock:
[ 30.619520] ffff88810b063868 (&ep->mtx){+.+.}-{3:3}, at: __ep_eventpoll_poll+0x9f/0x220
[ 30.620505]
[ 30.620505] but task is already holding lock:
[ 30.621218] ffff88810e952be8 (&ctx->uring_lock){+.+.}-{3:3}, at: __x64_sys_io_uring_enter+0x3f0/0x5b0
[ 30.622349]
[ 30.622349] which lock already depends on the new lock.
[ 30.622349]
[ 30.623289]
[ 30.623289] the existing dependency chain (in reverse order) is:
[ 30.624243]
[ 30.624243] -> #1 (&ctx->uring_lock){+.+.}-{3:3}:
[ 30.625263] lock_acquire+0x2c7/0x390
[ 30.625868] __mutex_lock+0xae/0x9f0
[ 30.626451] io_cqring_overflow_flush.part.95+0x6d/0x70
[ 30.627278] io_uring_poll+0xcb/0xd0
[ 30.627890] ep_item_poll.isra.14+0x4e/0x90
[ 30.628531] do_epoll_ctl+0xb7e/0x1120
[ 30.629122] __x64_sys_epoll_ctl+0x70/0xb0
[ 30.629770] do_syscall_64+0x2d/0x40
[ 30.630332] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 30.631187]
[ 30.631187] -> #0 (&ep->mtx){+.+.}-{3:3}:
[ 30.631985] check_prevs_add+0x226/0xb00
[ 30.632584] __lock_acquire+0x1237/0x13a0
[ 30.633207] lock_acquire+0x2c7/0x390
[ 30.633740] __mutex_lock+0xae/0x9f0
[ 30.634258] __ep_eventpoll_poll+0x9f/0x220
[ 30.634879] __io_arm_poll_handler+0xbf/0x220
[ 30.635462] io_issue_sqe+0xa6b/0x13e0
[ 30.635982] __io_queue_sqe+0x10b/0x550
[ 30.636648] io_queue_sqe+0x235/0x470
[ 30.637281] io_submit_sqes+0xcce/0xf10
[ 30.637839] __x64_sys_io_uring_enter+0x3fb/0x5b0
[ 30.638465] do_syscall_64+0x2d/0x40
[ 30.638999] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 30.639643]
[ 30.639643] other info that might help us debug this:
[ 30.639643]
[ 30.640618] Possible unsafe locking scenario:
[ 30.640618]
[ 30.641402]        CPU0                    CPU1
[ 30.641938]        ----                    ----
[ 30.642664]   lock(&ctx->uring_lock);
[ 30.643425]                                lock(&ep->mtx);
[ 30.644498]                                lock(&ctx->uring_lock);
[ 30.645668]   lock(&ep->mtx);
[ 30.646321]
[ 30.646321] *** DEADLOCK ***
[ 30.646321]
[ 30.647642] 1 lock held by a.out/1128:
[ 30.648424] #0: ffff88810e952be8 (&ctx->uring_lock){+.+.}-{3:3}, at: __x64_sys_io_uring_enter+0x3f0/0x5b0
[ 30.649954]
[ 30.649954] stack backtrace:
[ 30.650592] CPU: 1 PID: 1128 Comm: a.out Not tainted 5.11.0-rc3-next-20210115 #1
[ 30.651554] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[ 30.652290] Call Trace:
[ 30.652688] dump_stack+0xac/0xe3
[ 30.653164] check_noncircular+0x11e/0x130
[ 30.653747] ? check_prevs_add+0x226/0xb00
[ 30.654303] check_prevs_add+0x226/0xb00
[ 30.654845] ? add_lock_to_list.constprop.49+0xac/0x1d0
[ 30.655564] __lock_acquire+0x1237/0x13a0
[ 30.656262] lock_acquire+0x2c7/0x390
[ 30.656788] ? __ep_eventpoll_poll+0x9f/0x220
[ 30.657379] ? __io_queue_proc.isra.88+0x180/0x180
[ 30.658014] __mutex_lock+0xae/0x9f0
[ 30.658524] ? __ep_eventpoll_poll+0x9f/0x220
[ 30.659112] ? mark_held_locks+0x5a/0x80
[ 30.659648] ? __ep_eventpoll_poll+0x9f/0x220
[ 30.660229] ? _raw_spin_unlock_irqrestore+0x2d/0x40
[ 30.660885] ? trace_hardirqs_on+0x46/0x110
[ 30.661471] ? __io_queue_proc.isra.88+0x180/0x180
[ 30.662102] ? __ep_eventpoll_poll+0x9f/0x220
[ 30.662696] __ep_eventpoll_poll+0x9f/0x220
[ 30.663273] ? __ep_eventpoll_poll+0x220/0x220
[ 30.663875] __io_arm_poll_handler+0xbf/0x220
[ 30.664463] io_issue_sqe+0xa6b/0x13e0
[ 30.664984] ? __lock_acquire+0x782/0x13a0
[ 30.665544] ? __io_queue_proc.isra.88+0x180/0x180
[ 30.666170] ? __io_queue_sqe+0x10b/0x550
[ 30.666725] __io_queue_sqe+0x10b/0x550
[ 30.667252] ? __fget_files+0x131/0x260
[ 30.667791] ? io_req_prep+0xd8/0x1090
[ 30.668316] ? io_queue_sqe+0x235/0x470
[ 30.668868] io_queue_sqe+0x235/0x470
[ 30.669398] io_submit_sqes+0xcce/0xf10
[ 30.669931] ? xa_load+0xe4/0x1c0
[ 30.670425] __x64_sys_io_uring_enter+0x3fb/0x5b0
[ 30.671051] ? lockdep_hardirqs_on_prepare+0xde/0x180
[ 30.671719] ? syscall_enter_from_user_mode+0x2b/0x80
[ 30.672380] do_syscall_64+0x2d/0x40
[ 30.672901] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 30.673503] RIP: 0033:0x7fd89c813239
[ 30.673962] Code: 01 00 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 3d 01 f0 ff ff 73 01 c3 48 8b 0d 27 ec 2c 00 f7 d8 64 89 01 48
[ 30.675920] RSP: 002b:00007ffc65a7c628 EFLAGS: 00000217 ORIG_RAX: 00000000000001aa
[ 30.676791] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fd89c813239
[ 30.677594] RDX: 0000000000000000 RSI: 0000000000000014 RDI: 0000000000000003
[ 30.678678] RBP: 00007ffc65a7c720 R08: 0000000000000000 R09: 0000000003000000
[ 30.679492] R10: 0000000000000000 R11: 0000000000000217 R12: 0000000000400ff0
[ 30.680282] R13: 00007ffc65a7c840 R14: 0000000000000000 R15: 0000000000000000
This might happen if we do epoll_wait()/epoll_ctl() on an io_uring fd
while an sqe in that same io_uring instance is reading/writing (polling)
the epoll fd: the epoll side takes ep->mtx and then ctx->uring_lock when
it polls the uring fd and flushes the overflow list, while io_uring
submission holds ctx->uring_lock and takes ep->mtx when arming its poll
handler. So let's not flush the cqring overflow list in io_uring_poll();
just do a simple check instead.
Reported-by: Abaci <[email protected]>
Fixes: 6c503150ae33 ("io_uring: patch up IOPOLL overflow_flush sync")
Signed-off-by: Hao Xu <[email protected]>
---
fs/io_uring.c | 17 +++++++++++++++--
1 file changed, 15 insertions(+), 2 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 38c6cbe1ab38..5f42ad6f2155 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -8718,8 +8718,21 @@ static __poll_t io_uring_poll(struct file *file, poll_table *wait)
smp_rmb();
if (!io_sqring_full(ctx))
mask |= EPOLLOUT | EPOLLWRNORM;
- io_cqring_overflow_flush(ctx, false, NULL, NULL);
- if (io_cqring_events(ctx))
+
+ /*
+ * Don't flush the cqring overflow list here, just do a simple check.
+ * Otherwise there could possibly be an ABBA deadlock:
+ *      CPU0                    CPU1
+ *      ----                    ----
+ * lock(&ctx->uring_lock);
+ *                              lock(&ep->mtx);
+ *                              lock(&ctx->uring_lock);
+ * lock(&ep->mtx);
+ *
+ * Users may get EPOLLIN while seeing nothing in the cqring; this
+ * pushes them to do the flush.
+ */
+ if (io_cqring_events(ctx) || test_bit(0, &ctx->cq_check_overflow))
mask |= EPOLLIN | EPOLLRDNORM;
return mask;
--
1.8.3.1
* Re: [PATCH v3] io_uring: fix possible deadlock in io_uring_poll
From: Pavel Begunkov @ 2021-02-05 12:58 UTC
To: Hao Xu, Jens Axboe; +Cc: io-uring, Joseph Qi
On 05/02/2021 08:34, Hao Xu wrote:
> This might happen if we do epoll_wait()/epoll_ctl() on an io_uring fd
> while an sqe in that same io_uring instance is reading/writing (polling)
> the epoll fd: the epoll side takes ep->mtx and then ctx->uring_lock when
> it polls the uring fd and flushes the overflow list, while io_uring
> submission holds ctx->uring_lock and takes ep->mtx when arming its poll
> handler. So let's not flush the cqring overflow list in io_uring_poll();
> just do a simple check instead.
I haven't tested it, but I tried out an identical change before.
Reviewed-by: Pavel Begunkov <[email protected]>
Cc: [email protected] # 5.5+
>
> Reported-by: Abaci <[email protected]>
> Fixes: 6c503150ae33 ("io_uring: patch up IOPOLL overflow_flush sync")
> Signed-off-by: Hao Xu <[email protected]>
> ---
> fs/io_uring.c | 17 +++++++++++++++--
> 1 file changed, 15 insertions(+), 2 deletions(-)
>
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 38c6cbe1ab38..5f42ad6f2155 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -8718,8 +8718,21 @@ static __poll_t io_uring_poll(struct file *file, poll_table *wait)
> smp_rmb();
> if (!io_sqring_full(ctx))
> mask |= EPOLLOUT | EPOLLWRNORM;
> - io_cqring_overflow_flush(ctx, false, NULL, NULL);
> - if (io_cqring_events(ctx))
> +
> + /*
> + * Don't flush the cqring overflow list here, just do a simple check.
> + * Otherwise there could possibly be an ABBA deadlock:
> + *      CPU0                    CPU1
> + *      ----                    ----
> + * lock(&ctx->uring_lock);
> + *                              lock(&ep->mtx);
> + *                              lock(&ctx->uring_lock);
> + * lock(&ep->mtx);
> + *
> + * Users may get EPOLLIN while seeing nothing in the cqring; this
> + * pushes them to do the flush.
> + */
> + if (io_cqring_events(ctx) || test_bit(0, &ctx->cq_check_overflow))
> mask |= EPOLLIN | EPOLLRDNORM;
>
> return mask;
>
--
Pavel Begunkov
* Re: [PATCH v3] io_uring: fix possible deadlock in io_uring_poll
From: Pavel Begunkov @ 2021-02-09 23:30 UTC
To: Hao Xu, Jens Axboe; +Cc: io-uring, Joseph Qi
On 05/02/2021 12:58, Pavel Begunkov wrote:
> On 05/02/2021 08:34, Hao Xu wrote:
>> This might happen if we do epoll_wait()/epoll_ctl() on an io_uring fd
>> while an sqe in that same io_uring instance is reading/writing (polling)
>> the epoll fd: the epoll side takes ep->mtx and then ctx->uring_lock when
>> it polls the uring fd and flushes the overflow list, while io_uring
>> submission holds ctx->uring_lock and takes ep->mtx when arming its poll
>> handler. So let's not flush the cqring overflow list in io_uring_poll();
>> just do a simple check instead.
>
> I haven't tested it but tried out identical before.
>
> Reviewed-by: Pavel Begunkov <[email protected]>
> Cc: [email protected] # 5.5+
I don't see it queued, so bumping this in case it was forgotten.
>
>>
>> Reported-by: Abaci <[email protected]>
>> Fixes: 6c503150ae33 ("io_uring: patch up IOPOLL overflow_flush sync")
>> Signed-off-by: Hao Xu <[email protected]>
>> ---
>> fs/io_uring.c | 17 +++++++++++++++--
>> 1 file changed, 15 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>> index 38c6cbe1ab38..5f42ad6f2155 100644
>> --- a/fs/io_uring.c
>> +++ b/fs/io_uring.c
>> @@ -8718,8 +8718,21 @@ static __poll_t io_uring_poll(struct file *file, poll_table *wait)
>> smp_rmb();
>> if (!io_sqring_full(ctx))
>> mask |= EPOLLOUT | EPOLLWRNORM;
>> - io_cqring_overflow_flush(ctx, false, NULL, NULL);
>> - if (io_cqring_events(ctx))
>> +
>> + /*
>> + * Don't flush the cqring overflow list here, just do a simple check.
>> + * Otherwise there could possibly be an ABBA deadlock:
>> + *      CPU0                    CPU1
>> + *      ----                    ----
>> + * lock(&ctx->uring_lock);
>> + *                              lock(&ep->mtx);
>> + *                              lock(&ctx->uring_lock);
>> + * lock(&ep->mtx);
>> + *
>> + * Users may get EPOLLIN while seeing nothing in the cqring; this
>> + * pushes them to do the flush.
>> + */
>> + if (io_cqring_events(ctx) || test_bit(0, &ctx->cq_check_overflow))
>> mask |= EPOLLIN | EPOLLRDNORM;
>>
>> return mask;
>>
>
--
Pavel Begunkov
* Re: [PATCH v3] io_uring: fix possible deadlock in io_uring_poll
From: Jens Axboe @ 2021-02-10 2:27 UTC
To: Hao Xu; +Cc: io-uring, Pavel Begunkov, Joseph Qi
On 2/5/21 1:34 AM, Hao Xu wrote:
> Abaci reported the following issue:
>
> [ 30.615891] ======================================================
> [ 30.616648] WARNING: possible circular locking dependency detected
> [ 30.617423] 5.11.0-rc3-next-20210115 #1 Not tainted
> [ 30.618035] ------------------------------------------------------
> [ 30.618914] a.out/1128 is trying to acquire lock:
> [ 30.619520] ffff88810b063868 (&ep->mtx){+.+.}-{3:3}, at: __ep_eventpoll_poll+0x9f/0x220
> [ 30.620505]
> [ 30.620505] but task is already holding lock:
> [ 30.621218] ffff88810e952be8 (&ctx->uring_lock){+.+.}-{3:3}, at: __x64_sys_io_uring_enter+0x3f0/0x5b0
> [ 30.622349]
> [ 30.622349] which lock already depends on the new lock.
> [ 30.622349]
> [ 30.623289]
> [ 30.623289] the existing dependency chain (in reverse order) is:
> [ 30.624243]
> [ 30.624243] -> #1 (&ctx->uring_lock){+.+.}-{3:3}:
> [ 30.625263] lock_acquire+0x2c7/0x390
> [ 30.625868] __mutex_lock+0xae/0x9f0
> [ 30.626451] io_cqring_overflow_flush.part.95+0x6d/0x70
> [ 30.627278] io_uring_poll+0xcb/0xd0
> [ 30.627890] ep_item_poll.isra.14+0x4e/0x90
> [ 30.628531] do_epoll_ctl+0xb7e/0x1120
> [ 30.629122] __x64_sys_epoll_ctl+0x70/0xb0
> [ 30.629770] do_syscall_64+0x2d/0x40
> [ 30.630332] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [ 30.631187]
> [ 30.631187] -> #0 (&ep->mtx){+.+.}-{3:3}:
> [ 30.631985] check_prevs_add+0x226/0xb00
> [ 30.632584] __lock_acquire+0x1237/0x13a0
> [ 30.633207] lock_acquire+0x2c7/0x390
> [ 30.633740] __mutex_lock+0xae/0x9f0
> [ 30.634258] __ep_eventpoll_poll+0x9f/0x220
> [ 30.634879] __io_arm_poll_handler+0xbf/0x220
> [ 30.635462] io_issue_sqe+0xa6b/0x13e0
> [ 30.635982] __io_queue_sqe+0x10b/0x550
> [ 30.636648] io_queue_sqe+0x235/0x470
> [ 30.637281] io_submit_sqes+0xcce/0xf10
> [ 30.637839] __x64_sys_io_uring_enter+0x3fb/0x5b0
> [ 30.638465] do_syscall_64+0x2d/0x40
> [ 30.638999] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [ 30.639643]
> [ 30.639643] other info that might help us debug this:
> [ 30.639643]
> [ 30.640618] Possible unsafe locking scenario:
> [ 30.640618]
> [ 30.641402]        CPU0                    CPU1
> [ 30.641938]        ----                    ----
> [ 30.642664]   lock(&ctx->uring_lock);
> [ 30.643425]                                lock(&ep->mtx);
> [ 30.644498]                                lock(&ctx->uring_lock);
> [ 30.645668]   lock(&ep->mtx);
> [ 30.646321]
> [ 30.646321] *** DEADLOCK ***
> [ 30.646321]
> [ 30.647642] 1 lock held by a.out/1128:
> [ 30.648424] #0: ffff88810e952be8 (&ctx->uring_lock){+.+.}-{3:3}, at: __x64_sys_io_uring_enter+0x3f0/0x5b0
> [ 30.649954]
> [ 30.649954] stack backtrace:
> [ 30.650592] CPU: 1 PID: 1128 Comm: a.out Not tainted 5.11.0-rc3-next-20210115 #1
> [ 30.651554] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
> [ 30.652290] Call Trace:
> [ 30.652688] dump_stack+0xac/0xe3
> [ 30.653164] check_noncircular+0x11e/0x130
> [ 30.653747] ? check_prevs_add+0x226/0xb00
> [ 30.654303] check_prevs_add+0x226/0xb00
> [ 30.654845] ? add_lock_to_list.constprop.49+0xac/0x1d0
> [ 30.655564] __lock_acquire+0x1237/0x13a0
> [ 30.656262] lock_acquire+0x2c7/0x390
> [ 30.656788] ? __ep_eventpoll_poll+0x9f/0x220
> [ 30.657379] ? __io_queue_proc.isra.88+0x180/0x180
> [ 30.658014] __mutex_lock+0xae/0x9f0
> [ 30.658524] ? __ep_eventpoll_poll+0x9f/0x220
> [ 30.659112] ? mark_held_locks+0x5a/0x80
> [ 30.659648] ? __ep_eventpoll_poll+0x9f/0x220
> [ 30.660229] ? _raw_spin_unlock_irqrestore+0x2d/0x40
> [ 30.660885] ? trace_hardirqs_on+0x46/0x110
> [ 30.661471] ? __io_queue_proc.isra.88+0x180/0x180
> [ 30.662102] ? __ep_eventpoll_poll+0x9f/0x220
> [ 30.662696] __ep_eventpoll_poll+0x9f/0x220
> [ 30.663273] ? __ep_eventpoll_poll+0x220/0x220
> [ 30.663875] __io_arm_poll_handler+0xbf/0x220
> [ 30.664463] io_issue_sqe+0xa6b/0x13e0
> [ 30.664984] ? __lock_acquire+0x782/0x13a0
> [ 30.665544] ? __io_queue_proc.isra.88+0x180/0x180
> [ 30.666170] ? __io_queue_sqe+0x10b/0x550
> [ 30.666725] __io_queue_sqe+0x10b/0x550
> [ 30.667252] ? __fget_files+0x131/0x260
> [ 30.667791] ? io_req_prep+0xd8/0x1090
> [ 30.668316] ? io_queue_sqe+0x235/0x470
> [ 30.668868] io_queue_sqe+0x235/0x470
> [ 30.669398] io_submit_sqes+0xcce/0xf10
> [ 30.669931] ? xa_load+0xe4/0x1c0
> [ 30.670425] __x64_sys_io_uring_enter+0x3fb/0x5b0
> [ 30.671051] ? lockdep_hardirqs_on_prepare+0xde/0x180
> [ 30.671719] ? syscall_enter_from_user_mode+0x2b/0x80
> [ 30.672380] do_syscall_64+0x2d/0x40
> [ 30.672901] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [ 30.673503] RIP: 0033:0x7fd89c813239
> [ 30.673962] Code: 01 00 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 3d 01 f0 ff ff 73 01 c3 48 8b 0d 27 ec 2c 00 f7 d8 64 89 01 48
> [ 30.675920] RSP: 002b:00007ffc65a7c628 EFLAGS: 00000217 ORIG_RAX: 00000000000001aa
> [ 30.676791] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fd89c813239
> [ 30.677594] RDX: 0000000000000000 RSI: 0000000000000014 RDI: 0000000000000003
> [ 30.678678] RBP: 00007ffc65a7c720 R08: 0000000000000000 R09: 0000000003000000
> [ 30.679492] R10: 0000000000000000 R11: 0000000000000217 R12: 0000000000400ff0
> [ 30.680282] R13: 00007ffc65a7c840 R14: 0000000000000000 R15: 0000000000000000
>
> This might happen if we do epoll_wait()/epoll_ctl() on an io_uring fd
> while an sqe in that same io_uring instance is reading/writing (polling)
> the epoll fd: the epoll side takes ep->mtx and then ctx->uring_lock when
> it polls the uring fd and flushes the overflow list, while io_uring
> submission holds ctx->uring_lock and takes ep->mtx when arming its poll
> handler. So let's not flush the cqring overflow list in io_uring_poll();
> just do a simple check instead.
Applied, thanks.
--
Jens Axboe