* [PATCH] io_uring: fix io_try_cancel_userdata race for iowq
@ 2021-08-23 12:30 Pavel Begunkov
  2021-08-23 16:17 ` Jens Axboe
  0 siblings, 1 reply; 2+ messages in thread
From: Pavel Begunkov @ 2021-08-23 12:30 UTC
  To: Jens Axboe, io-uring; +Cc: asml.silence, syzbot+b0c9d1588ae92866515f

WARNING: CPU: 1 PID: 5870 at fs/io_uring.c:5975 io_try_cancel_userdata+0x30f/0x540 fs/io_uring.c:5975
CPU: 0 PID: 5870 Comm: iou-wrk-5860 Not tainted 5.14.0-rc6-next-20210820-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:io_try_cancel_userdata+0x30f/0x540 fs/io_uring.c:5975
Call Trace:
 io_async_cancel fs/io_uring.c:6014 [inline]
 io_issue_sqe+0x22d5/0x65a0 fs/io_uring.c:6407
 io_wq_submit_work+0x1dc/0x300 fs/io_uring.c:6511
 io_worker_handle_work+0xa45/0x1840 fs/io-wq.c:533
 io_wqe_worker+0x2cc/0xbb0 fs/io-wq.c:582
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295

io_try_cancel_userdata() can be called from io_async_cancel() while
executing in io-wq context, which trips the warning. The warning is
there to alert anyone accessing task->io_uring->io_wq in a racy way.
However, io_wq_put_and_exit() always waits for all worker threads to
complete first, so the only detail left is to zero tctx->io_wq after
the context is removed.

Note: this assumes that when IO_WQ_WORK_CANCEL is set, the executor
won't touch ->io_wq, because io_wq_destroy() may cancel leftover
pending requests that way.

Cc: [email protected]
Reported-by: [email protected]
Signed-off-by: Pavel Begunkov <[email protected]>
---
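A rough illustration of the ordering argument, not kernel code: the
single-threaded model below drains "worker" steps inside a stand-in for
io_wq_put_and_exit() and counts how often a step finds the io_wq pointer
already cleared. All fake_* names are hypothetical and exist only for
this sketch; real workers run concurrently, which this model does not
attempt to reproduce.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct io_wq and struct io_uring_task. */
struct fake_wq {
	int pending;		/* work items the workers have not finished */
};

struct fake_tctx {
	struct fake_wq *io_wq;
	int null_seen;		/* times a worker found io_wq already NULL */
};

/* One unit of worker activity that looks at tctx->io_wq. */
static void fake_worker_step(struct fake_tctx *tctx)
{
	if (!tctx->io_wq)
		tctx->null_seen++;
}

/* Models io_wq_put_and_exit(): returns only once all work has run. */
static void fake_wq_put_and_exit(struct fake_tctx *tctx, struct fake_wq *wq)
{
	while (wq->pending > 0) {
		fake_worker_step(tctx);
		wq->pending--;
	}
}

/* Tear down with either ordering; returns how often a worker saw NULL. */
int fake_clean_tctx(bool clear_before_exit, int pending)
{
	struct fake_wq wq = { .pending = pending };
	struct fake_tctx tctx = { .io_wq = &wq, .null_seen = 0 };
	struct fake_wq *saved = tctx.io_wq;

	assert(pending >= 0);

	if (clear_before_exit)
		tctx.io_wq = NULL;	/* order before this patch */
	fake_wq_put_and_exit(&tctx, saved);
	tctx.io_wq = NULL;		/* order after this patch */

	return tctx.null_seen;
}
```

In this model, fake_clean_tctx(false, n) never lets a step observe NULL,
while fake_clean_tctx(true, n) lets every one of the n steps observe it.
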
 fs/io_uring.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
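
For reference, the relaxed warning condition can be modelled in plain C
as below. This is a sketch under assumptions: struct fake_task and its
is_io_worker field are hypothetical and merely stand in for current and
for whatever io_wq_current_is_worker() reports in the kernel.

```c
#include <stdbool.h>

/* Hypothetical stand-in for task_struct; only the fields used here. */
struct fake_task {
	int pid;
	bool is_io_worker;	/* models io_wq_current_is_worker() */
};

/* Old check: warn whenever the caller is not the request's owner. */
bool old_check_warns(const struct fake_task *current_task,
		     const struct fake_task *req_task)
{
	return req_task != current_task;
}

/* New check: io-wq workers are exempt, everyone else must be the owner. */
bool new_check_warns(const struct fake_task *current_task,
		     const struct fake_task *req_task)
{
	return !current_task->is_io_worker && req_task != current_task;
}
```

With an io-wq worker as the caller, as in the trace above, the old
predicate is true and the new one is false; an unrelated non-worker task
still triggers both.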

diff --git a/fs/io_uring.c b/fs/io_uring.c
index d9534c72dc4b..027afe2f55d4 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -5804,7 +5804,7 @@ static int io_try_cancel_userdata(struct io_kiocb *req, u64 sqe_addr)
 	struct io_ring_ctx *ctx = req->ctx;
 	int ret;
 
-	WARN_ON_ONCE(req->task != current);
+	WARN_ON_ONCE(!io_wq_current_is_worker() && req->task != current);
 
 	ret = io_async_cancel_one(req->task->io_uring, sqe_addr, ctx);
 	if (ret != -ENOENT)
@@ -6309,6 +6309,7 @@ static void io_wq_submit_work(struct io_wq_work *work)
 	if (timeout)
 		io_queue_linked_timeout(timeout);
 
+	/* either cancelled or io-wq is dying, so don't touch tctx->iowq */
 	if (work->flags & IO_WQ_WORK_CANCEL)
 		ret = -ECANCELED;
 
@@ -9124,8 +9125,8 @@ static void io_uring_clean_tctx(struct io_uring_task *tctx)
 		 * Must be after io_uring_del_task_file() (removes nodes under
 		 * uring_lock) to avoid race with io_uring_try_cancel_iowq().
 		 */
-		tctx->io_wq = NULL;
 		io_wq_put_and_exit(wq);
+		tctx->io_wq = NULL;
 	}
 }
 
-- 
2.32.0



* Re: [PATCH] io_uring: fix io_try_cancel_userdata race for iowq
  2021-08-23 12:30 [PATCH] io_uring: fix io_try_cancel_userdata race for iowq Pavel Begunkov
@ 2021-08-23 16:17 ` Jens Axboe
  0 siblings, 0 replies; 2+ messages in thread
From: Jens Axboe @ 2021-08-23 16:17 UTC
  To: Pavel Begunkov, io-uring; +Cc: syzbot+b0c9d1588ae92866515f

On 8/23/21 6:30 AM, Pavel Begunkov wrote:
> WARNING: CPU: 1 PID: 5870 at fs/io_uring.c:5975 io_try_cancel_userdata+0x30f/0x540 fs/io_uring.c:5975
> CPU: 0 PID: 5870 Comm: iou-wrk-5860 Not tainted 5.14.0-rc6-next-20210820-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> RIP: 0010:io_try_cancel_userdata+0x30f/0x540 fs/io_uring.c:5975
> Call Trace:
>  io_async_cancel fs/io_uring.c:6014 [inline]
>  io_issue_sqe+0x22d5/0x65a0 fs/io_uring.c:6407
>  io_wq_submit_work+0x1dc/0x300 fs/io_uring.c:6511
>  io_worker_handle_work+0xa45/0x1840 fs/io-wq.c:533
>  io_wqe_worker+0x2cc/0xbb0 fs/io-wq.c:582
>  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
> 
> io_try_cancel_userdata() can be called from io_async_cancel() executing
> in the io-wq context, so the warning fires, which is there to alert
> anyone accessing task->io_uring->io_wq in a racy way. However,
> io_wq_put_and_exit() always first waits for all threads to complete,
> so the only detail left is to zero tctx->io_wq after the context is
> removed.
> 
> note: one little assumption is that when IO_WQ_WORK_CANCEL, the executor
> won't touch ->io_wq, because io_wq_destroy() might cancel left pending
> requests in such a way.

Applied, thanks.

-- 
Jens Axboe

