public inbox for [email protected]
* [PATCH 0/3] fix nested uring_lock
@ 2021-02-18 22:32 Pavel Begunkov
  2021-02-18 22:32 ` [PATCH 1/3] io_uring: don't take uring_lock during iowq cancel Pavel Begunkov
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Pavel Begunkov @ 2021-02-18 22:32 UTC (permalink / raw)
  To: Jens Axboe, io-uring

1/3 is for backporting; the other two are more churny but do it right,
including trying to punt it to the original task first instead of the
fallback, aka the io-wq manager.
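
For reference, the punt-then-fallback shape looks roughly like this (a
sketch assembled from the helpers used in the patches below, not code
quoted verbatim from the tree):

	/* try the task that submitted the request first ... */
	ret = io_req_task_work_add(req);
	if (unlikely(ret)) {
		/* ... and hand it to the io-wq manager only if that fails */
		io_req_task_work_add_fallback(req, io_req_task_cancel);
	}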

Pavel Begunkov (3):
  io_uring: don't take uring_lock during iowq cancel
  io_uring: fail io-wq submission from a task_work
  io_uring: avoid taking ctx refs for task-cancel

 fs/io_uring.c | 43 ++++++++++++++++++-------------------------
 1 file changed, 18 insertions(+), 25 deletions(-)

-- 
2.24.0



* [PATCH 1/3] io_uring: don't take uring_lock during iowq cancel
  2021-02-18 22:32 [PATCH 0/3] fix nested uring_lock Pavel Begunkov
@ 2021-02-18 22:32 ` Pavel Begunkov
  2021-02-18 22:32 ` [PATCH 2/3] io_uring: fail io-wq submission from a task_work Pavel Begunkov
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: Pavel Begunkov @ 2021-02-18 22:32 UTC (permalink / raw)
  To: Jens Axboe, io-uring; +Cc: stable, Abaci, Hao Xu

[   97.866748] a.out/2890 is trying to acquire lock:
[   97.867829] ffff8881046763e8 (&ctx->uring_lock){+.+.}-{3:3}, at: io_wq_submit_work+0x155/0x240
[   97.869735]
[   97.869735] but task is already holding lock:
[   97.871033] ffff88810dfe0be8 (&ctx->uring_lock){+.+.}-{3:3}, at: __x64_sys_io_uring_enter+0x3f0/0x5b0
[   97.873074]
[   97.873074] other info that might help us debug this:
[   97.874520]  Possible unsafe locking scenario:
[   97.874520]
[   97.875845]        CPU0
[   97.876440]        ----
[   97.877048]   lock(&ctx->uring_lock);
[   97.877961]   lock(&ctx->uring_lock);
[   97.878881]
[   97.878881]  *** DEADLOCK ***
[   97.878881]
[   97.880341]  May be due to missing lock nesting notation
[   97.880341]
[   97.881952] 1 lock held by a.out/2890:
[   97.882873]  #0: ffff88810dfe0be8 (&ctx->uring_lock){+.+.}-{3:3}, at: __x64_sys_io_uring_enter+0x3f0/0x5b0
[   97.885108]
[   97.885108] stack backtrace:
[   97.890457] Call Trace:
[   97.891121]  dump_stack+0xac/0xe3
[   97.891972]  __lock_acquire+0xab6/0x13a0
[   97.892940]  lock_acquire+0x2c3/0x390
[   97.894894]  __mutex_lock+0xae/0x9f0
[   97.901101]  io_wq_submit_work+0x155/0x240
[   97.902112]  io_wq_cancel_cb+0x162/0x490
[   97.904126]  io_async_find_and_cancel+0x3b/0x140
[   97.905247]  io_issue_sqe+0x86d/0x13e0
[   97.909122]  __io_queue_sqe+0x10b/0x550
[   97.913971]  io_queue_sqe+0x235/0x470
[   97.914894]  io_submit_sqes+0xcce/0xf10
[   97.917872]  __x64_sys_io_uring_enter+0x3fb/0x5b0
[   97.921424]  do_syscall_64+0x2d/0x40
[   97.922329]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

While holding uring_lock, e.g. from inline execution, an async cancel
request may attempt cancellations through io_wq_submit_work, which then
tries to grab the same lock again, as in the lockdep splat above. Delay
the cancellation to task_work, so we do it from a clean context and
don't have to worry about locking.
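
A sketch of the fallback helper used here, assuming the implementation
of that era (io_req_task_work_add_fallback() itself is not part of this
diff, so treat the body below as a reconstruction, not a quote):

	static void io_req_task_work_add_fallback(struct io_kiocb *req,
						  void (*cb)(struct callback_head *))
	{
		/* the io-wq manager kthread always runs its task works */
		struct task_struct *tsk = io_wq_get_task(req->ctx->io_wq);

		init_task_work(&req->task_work, cb);
		task_work_add(tsk, &req->task_work, TWA_NONE);
		wake_up_process(tsk);
	}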

Cc: <[email protected]> # 5.5+
Fixes: c07e6719511e ("io_uring: hold uring_lock while completing failed polled io in io_wq_submit_work()")
Reported-by: Abaci <[email protected]>
Reported-by: Hao Xu <[email protected]>
Signed-off-by: Pavel Begunkov <[email protected]>
---
 fs/io_uring.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 2fdfe5fa00b0..8dab07f42b34 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2337,7 +2337,9 @@ static void io_req_task_cancel(struct callback_head *cb)
 	struct io_kiocb *req = container_of(cb, struct io_kiocb, task_work);
 	struct io_ring_ctx *ctx = req->ctx;
 
+	mutex_lock(&ctx->uring_lock);
 	__io_req_task_cancel(req, -ECANCELED);
+	mutex_unlock(&ctx->uring_lock);
 	percpu_ref_put(&ctx->refs);
 }
 
@@ -6426,8 +6428,13 @@ static void io_wq_submit_work(struct io_wq_work *work)
 	if (timeout)
 		io_queue_linked_timeout(timeout);
 
-	if (work->flags & IO_WQ_WORK_CANCEL)
-		ret = -ECANCELED;
+	if (work->flags & IO_WQ_WORK_CANCEL) {
+		/* io-wq is going to take down one */
+		refcount_inc(&req->refs);
+		percpu_ref_get(&req->ctx->refs);
+		io_req_task_work_add_fallback(req, io_req_task_cancel);
+		return;
+	}
 
 	if (!ret) {
 		do {
-- 
2.24.0



* [PATCH 2/3] io_uring: fail io-wq submission from a task_work
  2021-02-18 22:32 [PATCH 0/3] fix nested uring_lock Pavel Begunkov
  2021-02-18 22:32 ` [PATCH 1/3] io_uring: don't take uring_lock during iowq cancel Pavel Begunkov
@ 2021-02-18 22:32 ` Pavel Begunkov
  2021-02-21  1:01   ` Pavel Begunkov
  2021-02-18 22:32 ` [PATCH 3/3] io_uring: avoid taking ctx refs for task-cancel Pavel Begunkov
  2021-02-18 23:16 ` [PATCH 0/3] fix nested uring_lock Jens Axboe
  3 siblings, 1 reply; 8+ messages in thread
From: Pavel Begunkov @ 2021-02-18 22:32 UTC (permalink / raw)
  To: Jens Axboe, io-uring

In case of failure io_wq_submit_work() needs to post a CQE and so
potentially take uring_lock. The safest way to deal with it is to do
that from under task_work, where we can safely take the lock.

Also, as io_iopoll_check() holds the lock tight and releases it
reluctantly, this will play nicer in the future with notifying an
iopolling task about such new pending failed requests.
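
For context, __io_req_task_cancel() is what posts the failure CQE from
the task_work callback; it is not shown in these diffs, so the sketch
below assumes the implementation of that era:

	static void __io_req_task_cancel(struct io_kiocb *req, int error)
	{
		struct io_ring_ctx *ctx = req->ctx;

		/* post a CQE carrying the error code */
		spin_lock_irq(&ctx->completion_lock);
		io_cqring_fill_event(req, error);
		io_commit_cqring(ctx);
		spin_unlock_irq(&ctx->completion_lock);

		io_cqring_ev_posted(ctx);
		req_set_fail_links(req);
		io_double_put_req(req);
	}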

Signed-off-by: Pavel Begunkov <[email protected]>
---
 fs/io_uring.c | 48 ++++++++++++++++++------------------------------
 1 file changed, 18 insertions(+), 30 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 8dab07f42b34..1cb5e40d9822 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2338,7 +2338,7 @@ static void io_req_task_cancel(struct callback_head *cb)
 	struct io_ring_ctx *ctx = req->ctx;
 
 	mutex_lock(&ctx->uring_lock);
-	__io_req_task_cancel(req, -ECANCELED);
+	__io_req_task_cancel(req, req->result);
 	mutex_unlock(&ctx->uring_lock);
 	percpu_ref_put(&ctx->refs);
 }
@@ -2371,11 +2371,22 @@ static void io_req_task_queue(struct io_kiocb *req)
 	req->task_work.func = io_req_task_submit;
 	ret = io_req_task_work_add(req);
 	if (unlikely(ret)) {
+		ret = -ECANCELED;
 		percpu_ref_get(&req->ctx->refs);
 		io_req_task_work_add_fallback(req, io_req_task_cancel);
 	}
 }
 
+static void io_req_task_queue_fail(struct io_kiocb *req, int ret)
+{
+	percpu_ref_get(&req->ctx->refs);
+	req->result = ret;
+	req->task_work.func = io_req_task_cancel;
+
+	if (unlikely(io_req_task_work_add(req)))
+		io_req_task_work_add_fallback(req, io_req_task_cancel);
+}
+
 static inline void io_queue_next(struct io_kiocb *req)
 {
 	struct io_kiocb *nxt = io_req_find_next(req);
@@ -6428,13 +6439,8 @@ static void io_wq_submit_work(struct io_wq_work *work)
 	if (timeout)
 		io_queue_linked_timeout(timeout);
 
-	if (work->flags & IO_WQ_WORK_CANCEL) {
-		/* io-wq is going to take down one */
-		refcount_inc(&req->refs);
-		percpu_ref_get(&req->ctx->refs);
-		io_req_task_work_add_fallback(req, io_req_task_cancel);
-		return;
-	}
+	if (work->flags & IO_WQ_WORK_CANCEL)
+		ret = -ECANCELED;
 
 	if (!ret) {
 		do {
@@ -6450,29 +6456,11 @@ static void io_wq_submit_work(struct io_wq_work *work)
 		} while (1);
 	}
 
+	/* avoid locking problems by failing it from a clean context */
 	if (ret) {
-		struct io_ring_ctx *lock_ctx = NULL;
-
-		if (req->ctx->flags & IORING_SETUP_IOPOLL)
-			lock_ctx = req->ctx;
-
-		/*
-		 * io_iopoll_complete() does not hold completion_lock to
-		 * complete polled io, so here for polled io, we can not call
-		 * io_req_complete() directly, otherwise there maybe concurrent
-		 * access to cqring, defer_list, etc, which is not safe. Given
-		 * that io_iopoll_complete() is always called under uring_lock,
-		 * so here for polled io, we also get uring_lock to complete
-		 * it.
-		 */
-		if (lock_ctx)
-			mutex_lock(&lock_ctx->uring_lock);
-
-		req_set_fail_links(req);
-		io_req_complete(req, ret);
-
-		if (lock_ctx)
-			mutex_unlock(&lock_ctx->uring_lock);
+		/* io-wq is going to take one down */
+		refcount_inc(&req->refs);
+		io_req_task_queue_fail(req, ret);
 	}
 }
 
-- 
2.24.0



* [PATCH 3/3] io_uring: avoid taking ctx refs for task-cancel
  2021-02-18 22:32 [PATCH 0/3] fix nested uring_lock Pavel Begunkov
  2021-02-18 22:32 ` [PATCH 1/3] io_uring: don't take uring_lock during iowq cancel Pavel Begunkov
  2021-02-18 22:32 ` [PATCH 2/3] io_uring: fail io-wq submission from a task_work Pavel Begunkov
@ 2021-02-18 22:32 ` Pavel Begunkov
  2021-02-18 23:14   ` Jens Axboe
  2021-02-18 23:16 ` [PATCH 0/3] fix nested uring_lock Jens Axboe
  3 siblings, 1 reply; 8+ messages in thread
From: Pavel Begunkov @ 2021-02-18 22:32 UTC (permalink / raw)
  To: Jens Axboe, io-uring

Don't bother taking a ctx->refs reference for io_req_task_cancel(),
because it takes uring_lock before putting a request, and the context
is guaranteed to stay alive until the unlock happens.
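
After this patch the cancel callback reduces to the following (the net
result of the diff below):

	static void io_req_task_cancel(struct callback_head *cb)
	{
		struct io_kiocb *req = container_of(cb, struct io_kiocb, task_work);
		struct io_ring_ctx *ctx = req->ctx;

		/* ctx is guaranteed to stay alive while we hold uring_lock */
		mutex_lock(&ctx->uring_lock);
		__io_req_task_cancel(req, req->result);
		mutex_unlock(&ctx->uring_lock);
	}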

Signed-off-by: Pavel Begunkov <[email protected]>
---
 fs/io_uring.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 1cb5e40d9822..1bd25d4711cd 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2337,10 +2337,10 @@ static void io_req_task_cancel(struct callback_head *cb)
 	struct io_kiocb *req = container_of(cb, struct io_kiocb, task_work);
 	struct io_ring_ctx *ctx = req->ctx;
 
+	/* ctx is guaranteed to stay alive while we hold uring_lock */
 	mutex_lock(&ctx->uring_lock);
 	__io_req_task_cancel(req, req->result);
 	mutex_unlock(&ctx->uring_lock);
-	percpu_ref_put(&ctx->refs);
 }
 
 static void __io_req_task_submit(struct io_kiocb *req)
@@ -2372,14 +2372,12 @@ static void io_req_task_queue(struct io_kiocb *req)
 	ret = io_req_task_work_add(req);
 	if (unlikely(ret)) {
 		ret = -ECANCELED;
-		percpu_ref_get(&req->ctx->refs);
 		io_req_task_work_add_fallback(req, io_req_task_cancel);
 	}
 }
 
 static void io_req_task_queue_fail(struct io_kiocb *req, int ret)
 {
-	percpu_ref_get(&req->ctx->refs);
 	req->result = ret;
 	req->task_work.func = io_req_task_cancel;
 
-- 
2.24.0



* Re: [PATCH 3/3] io_uring: avoid taking ctx refs for task-cancel
  2021-02-18 22:32 ` [PATCH 3/3] io_uring: avoid taking ctx refs for task-cancel Pavel Begunkov
@ 2021-02-18 23:14   ` Jens Axboe
  0 siblings, 0 replies; 8+ messages in thread
From: Jens Axboe @ 2021-02-18 23:14 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring

On 2/18/21 3:32 PM, Pavel Begunkov wrote:
> Don't bother taking a ctx->refs reference for io_req_task_cancel(),
> because it takes uring_lock before putting a request, and the context
> is guaranteed to stay alive until the unlock happens.

I agree this is fine, as we don't destroy the ctx ref before grabbing
the lock. I'll defer this one to 5.13 though, which is another way of
saying "please resend it" when the merge window is over ;-)

-- 
Jens Axboe



* Re: [PATCH 0/3] fix nested uring_lock
  2021-02-18 22:32 [PATCH 0/3] fix nested uring_lock Pavel Begunkov
                   ` (2 preceding siblings ...)
  2021-02-18 22:32 ` [PATCH 3/3] io_uring: avoid taking ctx refs for task-cancel Pavel Begunkov
@ 2021-02-18 23:16 ` Jens Axboe
  3 siblings, 0 replies; 8+ messages in thread
From: Jens Axboe @ 2021-02-18 23:16 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring

On 2/18/21 3:32 PM, Pavel Begunkov wrote:
> 1/3 is for backporting; the other two are more churny but do it right,
> including trying to punt it to the original task first instead of the
> fallback, aka the io-wq manager.

Thanks, applied 1-2 for 5.12.

-- 
Jens Axboe



* Re: [PATCH 2/3] io_uring: fail io-wq submission from a task_work
  2021-02-18 22:32 ` [PATCH 2/3] io_uring: fail io-wq submission from a task_work Pavel Begunkov
@ 2021-02-21  1:01   ` Pavel Begunkov
  2021-02-21  2:03     ` Jens Axboe
  0 siblings, 1 reply; 8+ messages in thread
From: Pavel Begunkov @ 2021-02-21  1:01 UTC (permalink / raw)
  To: Jens Axboe, io-uring

On 18/02/2021 22:32, Pavel Begunkov wrote:
> In case of failure io_wq_submit_work() needs to post a CQE and so
> potentially take uring_lock. The safest way to deal with it is to do
> that from under task_work, where we can safely take the lock.
> 
> Also, as io_iopoll_check() holds the lock tight and releases it
> reluctantly, this will play nicer in the future with notifying an
> iopolling task about such new pending failed requests.
> 
> Signed-off-by: Pavel Begunkov <[email protected]>
> ---
> @@ -2371,11 +2371,22 @@ static void io_req_task_queue(struct io_kiocb *req)
>  	req->task_work.func = io_req_task_submit;
>  	ret = io_req_task_work_add(req);
>  	if (unlikely(ret)) {
> +		ret = -ECANCELED;

That's a stupid mistake. Jens, any chance you can fold in the diff below?

>  		percpu_ref_get(&req->ctx->refs);
>  		io_req_task_work_add_fallback(req, io_req_task_cancel);
>  	}
>  }

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 1cb5e40d9822..582306b1dfd1 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2371,7 +2371,7 @@ static void io_req_task_queue(struct io_kiocb *req)
 	req->task_work.func = io_req_task_submit;
 	ret = io_req_task_work_add(req);
 	if (unlikely(ret)) {
-		ret = -ECANCELED;
+		req->result = -ECANCELED;
 		percpu_ref_get(&req->ctx->refs);
 		io_req_task_work_add_fallback(req, io_req_task_cancel);
 	}
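
With that folded in, the failure path of io_req_task_queue() ends up as
follows (net result reconstructed from the two diffs; the local
declaration is assumed):

	static void io_req_task_queue(struct io_kiocb *req)
	{
		int ret;

		req->task_work.func = io_req_task_submit;
		ret = io_req_task_work_add(req);
		if (unlikely(ret)) {
			/* req->result is consumed by io_req_task_cancel() */
			req->result = -ECANCELED;
			percpu_ref_get(&req->ctx->refs);
			io_req_task_work_add_fallback(req, io_req_task_cancel);
		}
	}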


* Re: [PATCH 2/3] io_uring: fail io-wq submission from a task_work
  2021-02-21  1:01   ` Pavel Begunkov
@ 2021-02-21  2:03     ` Jens Axboe
  0 siblings, 0 replies; 8+ messages in thread
From: Jens Axboe @ 2021-02-21  2:03 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring

On 2/20/21 6:01 PM, Pavel Begunkov wrote:
> On 18/02/2021 22:32, Pavel Begunkov wrote:
>> In case of failure io_wq_submit_work() needs to post a CQE and so
>> potentially take uring_lock. The safest way to deal with it is to do
>> that from under task_work, where we can safely take the lock.
>>
>> Also, as io_iopoll_check() holds the lock tight and releases it
>> reluctantly, this will play nicer in the future with notifying an
>> iopolling task about such new pending failed requests.
>>
>> Signed-off-by: Pavel Begunkov <[email protected]>
>> ---
>> @@ -2371,11 +2371,22 @@ static void io_req_task_queue(struct io_kiocb *req)
>>  	req->task_work.func = io_req_task_submit;
>>  	ret = io_req_task_work_add(req);
>>  	if (unlikely(ret)) {
>> +		ret = -ECANCELED;
> 
> That's a stupid mistake. Jens, any chance you can fold in a diff below?

Yup done.

-- 
Jens Axboe



end of thread

Thread overview: 8+ messages
2021-02-18 22:32 [PATCH 0/3] fix nested uring_lock Pavel Begunkov
2021-02-18 22:32 ` [PATCH 1/3] io_uring: don't take uring_lock during iowq cancel Pavel Begunkov
2021-02-18 22:32 ` [PATCH 2/3] io_uring: fail io-wq submission from a task_work Pavel Begunkov
2021-02-21  1:01   ` Pavel Begunkov
2021-02-21  2:03     ` Jens Axboe
2021-02-18 22:32 ` [PATCH 3/3] io_uring: avoid taking ctx refs for task-cancel Pavel Begunkov
2021-02-18 23:14   ` Jens Axboe
2021-02-18 23:16 ` [PATCH 0/3] fix nested uring_lock Jens Axboe
