* [PATCH 1/3] io_uring: don't take uring_lock during iowq cancel
2021-02-18 22:32 [PATCH 0/3] fix nested uring_lock Pavel Begunkov
@ 2021-02-18 22:32 ` Pavel Begunkov
2021-02-18 22:32 ` [PATCH 2/3] io_uring: fail io-wq submission from a task_work Pavel Begunkov
From: Pavel Begunkov @ 2021-02-18 22:32 UTC
To: Jens Axboe, io-uring; +Cc: stable, Abaci, Hao Xu
[ 97.866748] a.out/2890 is trying to acquire lock:
[ 97.867829] ffff8881046763e8 (&ctx->uring_lock){+.+.}-{3:3}, at: io_wq_submit_work+0x155/0x240
[ 97.869735]
[ 97.869735] but task is already holding lock:
[ 97.871033] ffff88810dfe0be8 (&ctx->uring_lock){+.+.}-{3:3}, at: __x64_sys_io_uring_enter+0x3f0/0x5b0
[ 97.873074]
[ 97.873074] other info that might help us debug this:
[ 97.874520] Possible unsafe locking scenario:
[ 97.874520]
[ 97.875845] CPU0
[ 97.876440] ----
[ 97.877048] lock(&ctx->uring_lock);
[ 97.877961] lock(&ctx->uring_lock);
[ 97.878881]
[ 97.878881] *** DEADLOCK ***
[ 97.878881]
[ 97.880341] May be due to missing lock nesting notation
[ 97.880341]
[ 97.881952] 1 lock held by a.out/2890:
[ 97.882873] #0: ffff88810dfe0be8 (&ctx->uring_lock){+.+.}-{3:3}, at: __x64_sys_io_uring_enter+0x3f0/0x5b0
[ 97.885108]
[ 97.885108] stack backtrace:
[ 97.890457] Call Trace:
[ 97.891121] dump_stack+0xac/0xe3
[ 97.891972] __lock_acquire+0xab6/0x13a0
[ 97.892940] lock_acquire+0x2c3/0x390
[ 97.894894] __mutex_lock+0xae/0x9f0
[ 97.901101] io_wq_submit_work+0x155/0x240
[ 97.902112] io_wq_cancel_cb+0x162/0x490
[ 97.904126] io_async_find_and_cancel+0x3b/0x140
[ 97.905247] io_issue_sqe+0x86d/0x13e0
[ 97.909122] __io_queue_sqe+0x10b/0x550
[ 97.913971] io_queue_sqe+0x235/0x470
[ 97.914894] io_submit_sqes+0xcce/0xf10
[ 97.917872] __x64_sys_io_uring_enter+0x3fb/0x5b0
[ 97.921424] do_syscall_64+0x2d/0x40
[ 97.922329] entry_SYSCALL_64_after_hwframe+0x44/0xa9
While holding uring_lock, e.g. from inline execution, an async cancel
request may attempt cancellations through io_wq_submit_work(), which in
turn tries to grab the same lock. Delay the cancellation to task_work,
so we do it from a clean context and don't have to worry about locking.
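For illustration only (not part of the patch, and with purely
hypothetical names), a minimal userspace sketch of the same hazard and
the same style of fix: a cancel callback that needs the lock must not
be invoked while the lock is already held, so it is queued and run
later from a clean context, much like task_work:

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t ring_lock = PTHREAD_MUTEX_INITIALIZER;

struct deferred_work {
	void (*fn)(void *arg);
	void *arg;
	struct deferred_work *next;
};
static struct deferred_work *pending;	/* drained after ring_lock is dropped */

/* Needs ring_lock itself; calling it inline while ring_lock is held
 * would self-deadlock, like io_wq_submit_work() taking uring_lock. */
static void cancel_cb(void *arg)
{
	pthread_mutex_lock(&ring_lock);
	printf("cancelled %s\n", (const char *)arg);
	pthread_mutex_unlock(&ring_lock);
}

static void defer(struct deferred_work *w, void (*fn)(void *), void *arg)
{
	w->fn = fn;
	w->arg = arg;
	w->next = pending;
	pending = w;
}

int main(void)
{
	struct deferred_work w;

	pthread_mutex_lock(&ring_lock);
	/* submission path: queue the cancellation instead of running it */
	defer(&w, cancel_cb, "req0");
	pthread_mutex_unlock(&ring_lock);

	while (pending) {	/* the "task_work" point: no locks held */
		struct deferred_work *cur = pending;

		pending = cur->next;
		cur->fn(cur->arg);
	}
	return 0;
}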
Cc: <[email protected]> # 5.5+
Fixes: c07e6719511e ("io_uring: hold uring_lock while completing failed polled io in io_wq_submit_work()")
Reported-by: Abaci <[email protected]>
Reported-by: Hao Xu <[email protected]>
Signed-off-by: Pavel Begunkov <[email protected]>
---
fs/io_uring.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 2fdfe5fa00b0..8dab07f42b34 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2337,7 +2337,9 @@ static void io_req_task_cancel(struct callback_head *cb)
 	struct io_kiocb *req = container_of(cb, struct io_kiocb, task_work);
 	struct io_ring_ctx *ctx = req->ctx;
 
+	mutex_lock(&ctx->uring_lock);
 	__io_req_task_cancel(req, -ECANCELED);
+	mutex_unlock(&ctx->uring_lock);
 	percpu_ref_put(&ctx->refs);
 }
@@ -6426,8 +6428,13 @@ static void io_wq_submit_work(struct io_wq_work *work)
 	if (timeout)
 		io_queue_linked_timeout(timeout);
 
-	if (work->flags & IO_WQ_WORK_CANCEL)
-		ret = -ECANCELED;
+	if (work->flags & IO_WQ_WORK_CANCEL) {
+		/* io-wq is going to take down one */
+		refcount_inc(&req->refs);
+		percpu_ref_get(&req->ctx->refs);
+		io_req_task_work_add_fallback(req, io_req_task_cancel);
+		return;
+	}
 
 	if (!ret) {
 		do {
--
2.24.0
* [PATCH 2/3] io_uring: fail io-wq submission from a task_work
2021-02-18 22:32 [PATCH 0/3] fix nested uring_lock Pavel Begunkov
2021-02-18 22:32 ` [PATCH 1/3] io_uring: don't take uring_lock during iowq cancel Pavel Begunkov
@ 2021-02-18 22:32 ` Pavel Begunkov
2021-02-21 1:01 ` Pavel Begunkov
2021-02-18 22:32 ` [PATCH 3/3] io_uring: avoid taking ctx refs for task-cancel Pavel Begunkov
2021-02-18 23:16 ` [PATCH 0/3] fix nested uring_lock Jens Axboe
From: Pavel Begunkov @ 2021-02-18 22:32 UTC
To: Jens Axboe, io-uring
In case of failure io_wq_submit_work() needs to post a CQE and so
potentially take uring_lock. The safest way to deal with it is to do
that from under task_work, where we can safely take the lock.

Also, as io_iopoll_check() holds the lock tight and releases it
reluctantly, this will play nicer in the future with notifying an
iopolling task about such new pending failed requests.
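As a rough sketch of the flow this patch switches to (simplified, with
hypothetical names, not the real kernel interfaces): the failure code
is stashed in the request itself, and a single callback completes the
request later under the lock, from task context:

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t ring_lock = PTHREAD_MUTEX_INITIALIZER;

struct request {
	int result;			/* stashed failure code */
	void (*task_work_fn)(struct request *req);
};

/* Runs later from task context, where taking ring_lock is safe. */
static void req_task_cancel(struct request *req)
{
	pthread_mutex_lock(&ring_lock);
	printf("completing request, res=%d\n", req->result);
	pthread_mutex_unlock(&ring_lock);
}

/* Analogue of io_req_task_queue_fail(): remember why the request
 * failed and arrange for completion from a clean context. */
static void req_task_queue_fail(struct request *req, int ret)
{
	req->result = ret;
	req->task_work_fn = req_task_cancel;
	/* the real code queues this to the submitting task here and
	 * falls back to a helper thread if the task is exiting */
}

int main(void)
{
	struct request req;

	req_task_queue_fail(&req, -125 /* -ECANCELED on Linux */);
	req.task_work_fn(&req);		/* simulate the task_work run */
	return 0;
}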
Signed-off-by: Pavel Begunkov <[email protected]>
---
fs/io_uring.c | 48 ++++++++++++++++++------------------------------
1 file changed, 18 insertions(+), 30 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 8dab07f42b34..1cb5e40d9822 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2338,7 +2338,7 @@ static void io_req_task_cancel(struct callback_head *cb)
 	struct io_ring_ctx *ctx = req->ctx;
 
 	mutex_lock(&ctx->uring_lock);
-	__io_req_task_cancel(req, -ECANCELED);
+	__io_req_task_cancel(req, req->result);
 	mutex_unlock(&ctx->uring_lock);
 	percpu_ref_put(&ctx->refs);
 }
@@ -2371,11 +2371,22 @@ static void io_req_task_queue(struct io_kiocb *req)
 	req->task_work.func = io_req_task_submit;
 	ret = io_req_task_work_add(req);
 	if (unlikely(ret)) {
+		ret = -ECANCELED;
 		percpu_ref_get(&req->ctx->refs);
 		io_req_task_work_add_fallback(req, io_req_task_cancel);
 	}
 }
 
+static void io_req_task_queue_fail(struct io_kiocb *req, int ret)
+{
+	percpu_ref_get(&req->ctx->refs);
+	req->result = ret;
+	req->task_work.func = io_req_task_cancel;
+
+	if (unlikely(io_req_task_work_add(req)))
+		io_req_task_work_add_fallback(req, io_req_task_cancel);
+}
+
 static inline void io_queue_next(struct io_kiocb *req)
 {
 	struct io_kiocb *nxt = io_req_find_next(req);
@@ -6428,13 +6439,8 @@ static void io_wq_submit_work(struct io_wq_work *work)
 	if (timeout)
 		io_queue_linked_timeout(timeout);
 
-	if (work->flags & IO_WQ_WORK_CANCEL) {
-		/* io-wq is going to take down one */
-		refcount_inc(&req->refs);
-		percpu_ref_get(&req->ctx->refs);
-		io_req_task_work_add_fallback(req, io_req_task_cancel);
-		return;
-	}
+	if (work->flags & IO_WQ_WORK_CANCEL)
+		ret = -ECANCELED;
 
 	if (!ret) {
 		do {
@@ -6450,29 +6456,11 @@ static void io_wq_submit_work(struct io_wq_work *work)
 		} while (1);
 	}
 
+	/* avoid locking problems by failing it from a clean context */
 	if (ret) {
-		struct io_ring_ctx *lock_ctx = NULL;
-
-		if (req->ctx->flags & IORING_SETUP_IOPOLL)
-			lock_ctx = req->ctx;
-
-		/*
-		 * io_iopoll_complete() does not hold completion_lock to
-		 * complete polled io, so here for polled io, we can not call
-		 * io_req_complete() directly, otherwise there maybe concurrent
-		 * access to cqring, defer_list, etc, which is not safe. Given
-		 * that io_iopoll_complete() is always called under uring_lock,
-		 * so here for polled io, we also get uring_lock to complete
-		 * it.
-		 */
-		if (lock_ctx)
-			mutex_lock(&lock_ctx->uring_lock);
-
-		req_set_fail_links(req);
-		io_req_complete(req, ret);
-
-		if (lock_ctx)
-			mutex_unlock(&lock_ctx->uring_lock);
+		/* io-wq is going to take one down */
+		refcount_inc(&req->refs);
+		io_req_task_queue_fail(req, ret);
 	}
 }
--
2.24.0
* Re: [PATCH 2/3] io_uring: fail io-wq submission from a task_work
2021-02-18 22:32 ` [PATCH 2/3] io_uring: fail io-wq submission from a task_work Pavel Begunkov
@ 2021-02-21 1:01 ` Pavel Begunkov
2021-02-21 2:03 ` Jens Axboe
From: Pavel Begunkov @ 2021-02-21 1:01 UTC
To: Jens Axboe, io-uring
On 18/02/2021 22:32, Pavel Begunkov wrote:
> In case of failure io_wq_submit_work() needs to post a CQE and so
> potentially take uring_lock. The safest way to deal with it is to do
> that from under task_work, where we can safely take the lock.
>
> Also, as io_iopoll_check() holds the lock tight and releases it
> reluctantly, this will play nicer in the future with notifying an
> iopolling task about such new pending failed requests.
>
> Signed-off-by: Pavel Begunkov <[email protected]>
> ---
> @@ -2371,11 +2371,22 @@ static void io_req_task_queue(struct io_kiocb *req)
>  	req->task_work.func = io_req_task_submit;
>  	ret = io_req_task_work_add(req);
>  	if (unlikely(ret)) {
> +		ret = -ECANCELED;

That's a stupid mistake. Jens, any chance you can fold in the diff below?

>  		percpu_ref_get(&req->ctx->refs);
>  		io_req_task_work_add_fallback(req, io_req_task_cancel);
>  	}
>  }
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 1cb5e40d9822..582306b1dfd1 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2371,7 +2371,7 @@ static void io_req_task_queue(struct io_kiocb *req)
 	req->task_work.func = io_req_task_submit;
 	ret = io_req_task_work_add(req);
 	if (unlikely(ret)) {
-		ret = -ECANCELED;
+		req->result = -ECANCELED;
 		percpu_ref_get(&req->ctx->refs);
 		io_req_task_work_add_fallback(req, io_req_task_cancel);
 	}
* Re: [PATCH 2/3] io_uring: fail io-wq submission from a task_work
2021-02-21 1:01 ` Pavel Begunkov
@ 2021-02-21 2:03 ` Jens Axboe
From: Jens Axboe @ 2021-02-21 2:03 UTC
To: Pavel Begunkov, io-uring
On 2/20/21 6:01 PM, Pavel Begunkov wrote:
> On 18/02/2021 22:32, Pavel Begunkov wrote:
>> In case of failure io_wq_submit_work() needs to post a CQE and so
>> potentially take uring_lock. The safest way to deal with it is to do
>> that from under task_work, where we can safely take the lock.
>>
>> Also, as io_iopoll_check() holds the lock tight and releases it
>> reluctantly, this will play nicer in the future with notifying an
>> iopolling task about such new pending failed requests.
>>
>> Signed-off-by: Pavel Begunkov <[email protected]>
>> ---
>> @@ -2371,11 +2371,22 @@ static void io_req_task_queue(struct io_kiocb *req)
>>  	req->task_work.func = io_req_task_submit;
>>  	ret = io_req_task_work_add(req);
>>  	if (unlikely(ret)) {
>> +		ret = -ECANCELED;
>
> That's a stupid mistake. Jens, any chance you can fold in the diff below?
Yup done.
--
Jens Axboe
* [PATCH 3/3] io_uring: avoid taking ctx refs for task-cancel
2021-02-18 22:32 [PATCH 0/3] fix nested uring_lock Pavel Begunkov
2021-02-18 22:32 ` [PATCH 1/3] io_uring: don't take uring_lock during iowq cancel Pavel Begunkov
2021-02-18 22:32 ` [PATCH 2/3] io_uring: fail io-wq submission from a task_work Pavel Begunkov
@ 2021-02-18 22:32 ` Pavel Begunkov
2021-02-18 23:14 ` Jens Axboe
2021-02-18 23:16 ` [PATCH 0/3] fix nested uring_lock Jens Axboe
From: Pavel Begunkov @ 2021-02-18 22:32 UTC
To: Jens Axboe, io-uring
Don't bother to take ctx->refs for io_req_task_cancel(), because it
takes uring_lock before putting a request, and the context is
guaranteed to stay alive until the unlock happens.
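The lifetime argument can be sketched as follows (illustrative only,
hypothetical names): as long as every teardown path takes the same
mutex before freeing the object, a task already holding that mutex
needs no extra reference to keep the object alive:

#include <pthread.h>
#include <stdlib.h>

struct ctx {
	pthread_mutex_t uring_lock;
};

/* Teardown takes uring_lock before freeing, so nobody can still be
 * inside a locked section when the memory goes away. */
static void ctx_free(struct ctx *ctx)
{
	pthread_mutex_lock(&ctx->uring_lock);
	pthread_mutex_unlock(&ctx->uring_lock);
	pthread_mutex_destroy(&ctx->uring_lock);
	free(ctx);
}

/* Analogue of io_req_task_cancel(): holding uring_lock pins ctx,
 * which is what makes the percpu ref get/put pair unnecessary. */
static void req_task_cancel(struct ctx *ctx)
{
	pthread_mutex_lock(&ctx->uring_lock);
	/* ... complete the request; ctx is guaranteed alive here ... */
	pthread_mutex_unlock(&ctx->uring_lock);
}

int main(void)
{
	struct ctx *ctx = malloc(sizeof(*ctx));

	pthread_mutex_init(&ctx->uring_lock, NULL);
	req_task_cancel(ctx);
	ctx_free(ctx);
	return 0;
}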
Signed-off-by: Pavel Begunkov <[email protected]>
---
fs/io_uring.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 1cb5e40d9822..1bd25d4711cd 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2337,10 +2337,10 @@ static void io_req_task_cancel(struct callback_head *cb)
 	struct io_kiocb *req = container_of(cb, struct io_kiocb, task_work);
 	struct io_ring_ctx *ctx = req->ctx;
 
+	/* ctx is guaranteed to stay alive while we hold uring_lock */
 	mutex_lock(&ctx->uring_lock);
 	__io_req_task_cancel(req, req->result);
 	mutex_unlock(&ctx->uring_lock);
-	percpu_ref_put(&ctx->refs);
 }
 
 static void __io_req_task_submit(struct io_kiocb *req)
@@ -2372,14 +2372,12 @@ static void io_req_task_queue(struct io_kiocb *req)
 	ret = io_req_task_work_add(req);
 	if (unlikely(ret)) {
 		ret = -ECANCELED;
-		percpu_ref_get(&req->ctx->refs);
 		io_req_task_work_add_fallback(req, io_req_task_cancel);
 	}
 }
 
 static void io_req_task_queue_fail(struct io_kiocb *req, int ret)
 {
-	percpu_ref_get(&req->ctx->refs);
 	req->result = ret;
 	req->task_work.func = io_req_task_cancel;
--
2.24.0
* Re: [PATCH 0/3] fix nested uring_lock
2021-02-18 22:32 [PATCH 0/3] fix nested uring_lock Pavel Begunkov
2021-02-18 22:32 ` [PATCH 3/3] io_uring: avoid taking ctx refs for task-cancel Pavel Begunkov
@ 2021-02-18 23:16 ` Jens Axboe
From: Jens Axboe @ 2021-02-18 23:16 UTC
To: Pavel Begunkov, io-uring
On 2/18/21 3:32 PM, Pavel Begunkov wrote:
> 1/3 is for backporting; the other two are more churny but do it right,
> including trying to punt to the original task instead of the fallback,
> aka the io-wq manager.
Thanks, applied 1-2 for 5.12.
--
Jens Axboe