public inbox for [email protected]
* [PATCH 0/2] for-next fixes
@ 2023-09-07 12:50 Pavel Begunkov
  2023-09-07 12:50 ` [PATCH 1/2] io_uring: break out of iowq iopoll on teardown Pavel Begunkov
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Pavel Begunkov @ 2023-09-07 12:50 UTC (permalink / raw)
  To: io-uring; +Cc: Jens Axboe, asml.silence

Patch 1 fixes a potential iopoll/iowq live lock
Patch 2 fixes a recent problem in overflow locking

Pavel Begunkov (2):
  io_uring: break out of iowq iopoll on teardown
  io_uring: fix unprotected iopoll overflow

 io_uring/io-wq.c    | 10 ++++++++++
 io_uring/io-wq.h    |  1 +
 io_uring/io_uring.c |  6 ++++--
 3 files changed, 15 insertions(+), 2 deletions(-)

-- 
2.41.0



* [PATCH 1/2] io_uring: break out of iowq iopoll on teardown
  2023-09-07 12:50 [PATCH 0/2] for-next fixes Pavel Begunkov
@ 2023-09-07 12:50 ` Pavel Begunkov
  2023-09-07 12:50 ` [PATCH 2/2] io_uring: fix unprotected iopoll overflow Pavel Begunkov
  2023-09-07 15:02 ` [PATCH 0/2] for-next fixes Jens Axboe
  2 siblings, 0 replies; 4+ messages in thread
From: Pavel Begunkov @ 2023-09-07 12:50 UTC (permalink / raw)
  To: io-uring; +Cc: Jens Axboe, asml.silence

io-wq will retry iopoll even when it failed with -EAGAIN. If that
races with task exit, which sets TIF_NOTIFY_SIGNAL for all its workers,
such workers may spin indefinitely, retrying iopoll again and again and
failing each time on some allocation / waiting / etc. Don't keep
spinning if io-wq is dying.

Fixes: 561fb04a6a225 ("io_uring: replace workqueue usage with io-wq")
Cc: [email protected]
Signed-off-by: Pavel Begunkov <[email protected]>
---
 io_uring/io-wq.c    | 10 ++++++++++
 io_uring/io-wq.h    |  1 +
 io_uring/io_uring.c |  2 ++
 3 files changed, 13 insertions(+)

diff --git a/io_uring/io-wq.c b/io_uring/io-wq.c
index 62f345587df5..1ecc8c748768 100644
--- a/io_uring/io-wq.c
+++ b/io_uring/io-wq.c
@@ -174,6 +174,16 @@ static void io_worker_ref_put(struct io_wq *wq)
 		complete(&wq->worker_done);
 }
 
+bool io_wq_worker_stopped(void)
+{
+	struct io_worker *worker = current->worker_private;
+
+	if (WARN_ON_ONCE(!io_wq_current_is_worker()))
+		return true;
+
+	return test_bit(IO_WQ_BIT_EXIT, &worker->wq->state);
+}
+
 static void io_worker_cancel_cb(struct io_worker *worker)
 {
 	struct io_wq_acct *acct = io_wq_get_acct(worker);
diff --git a/io_uring/io-wq.h b/io_uring/io-wq.h
index 06d9ca90c577..2b2a6406dd8e 100644
--- a/io_uring/io-wq.h
+++ b/io_uring/io-wq.h
@@ -52,6 +52,7 @@ void io_wq_hash_work(struct io_wq_work *work, void *val);
 
 int io_wq_cpu_affinity(struct io_uring_task *tctx, cpumask_var_t mask);
 int io_wq_max_workers(struct io_wq *wq, int *new_count);
+bool io_wq_worker_stopped(void);
 
 static inline bool io_wq_is_hashed(struct io_wq_work *work)
 {
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 88599852af82..4674203c1cac 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -1942,6 +1942,8 @@ void io_wq_submit_work(struct io_wq_work *work)
 		if (!needs_poll) {
 			if (!(req->ctx->flags & IORING_SETUP_IOPOLL))
 				break;
+			if (io_wq_worker_stopped())
+				break;
 			cond_resched();
 			continue;
 		}
-- 
2.41.0


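The retry loop that patch 1 changes can be modelled in userspace. This is a minimal sketch, not the kernel code: toy_wq, toy_issue() and toy_submit_work() are hypothetical names, the exiting flag stands in for IO_WQ_BIT_EXIT, and the issue function always fails with -EAGAIN to mimic the situation described in the commit message. Without the exiting check the loop below would never terminate.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the io-wq state that a worker can inspect. */
struct toy_wq {
	bool exiting;        /* models IO_WQ_BIT_EXIT */
	unsigned exit_after; /* teardown "arrives" after this many attempts */
	unsigned calls;      /* number of issue attempts made so far */
};

/* Always fails with -EAGAIN; flips the exit flag once enough calls happened,
 * simulating teardown racing with the retry loop. */
static int toy_issue(struct toy_wq *wq)
{
	wq->calls++;
	if (wq->calls >= wq->exit_after)
		wq->exiting = true;
	return -EAGAIN;
}

/* Retry on -EAGAIN, but stop retrying once the queue is going away. */
static int toy_submit_work(struct toy_wq *wq)
{
	int ret;

	assert(wq && "caller must pass a queue");
	for (;;) {
		ret = toy_issue(wq);
		if (ret != -EAGAIN)
			break;
		if (wq->exiting) /* the check the patch adds, in spirit */
			break;
	}
	return ret;
}
```

With exit_after set to 3, toy_submit_work() makes three attempts and returns -EAGAIN to the caller instead of spinning.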

* [PATCH 2/2] io_uring: fix unprotected iopoll overflow
  2023-09-07 12:50 [PATCH 0/2] for-next fixes Pavel Begunkov
  2023-09-07 12:50 ` [PATCH 1/2] io_uring: break out of iowq iopoll on teardown Pavel Begunkov
@ 2023-09-07 12:50 ` Pavel Begunkov
  2023-09-07 15:02 ` [PATCH 0/2] for-next fixes Jens Axboe
  2 siblings, 0 replies; 4+ messages in thread
From: Pavel Begunkov @ 2023-09-07 12:50 UTC (permalink / raw)
  To: io-uring; +Cc: Jens Axboe, asml.silence

[   71.490669] WARNING: CPU: 3 PID: 17070 at io_uring/io_uring.c:769
io_cqring_event_overflow+0x47b/0x6b0
[   71.498381] Call Trace:
[   71.498590]  <TASK>
[   71.501858]  io_req_cqe_overflow+0x105/0x1e0
[   71.502194]  __io_submit_flush_completions+0x9f9/0x1090
[   71.503537]  io_submit_sqes+0xebd/0x1f00
[   71.503879]  __do_sys_io_uring_enter+0x8c5/0x2380
[   71.507360]  do_syscall_64+0x39/0x80

We decoupled CQ locking from ->task_complete but haven't fixed up the
places that force locking for CQ overflows.

Fixes: ec26c225f06f5 ("io_uring: merge iopoll and normal completion paths")
Signed-off-by: Pavel Begunkov <[email protected]>
---
 io_uring/io_uring.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 4674203c1cac..6cce8948bddf 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -883,7 +883,7 @@ static void __io_flush_post_cqes(struct io_ring_ctx *ctx)
 		struct io_uring_cqe *cqe = &ctx->completion_cqes[i];
 
 		if (!io_fill_cqe_aux(ctx, cqe->user_data, cqe->res, cqe->flags)) {
-			if (ctx->task_complete) {
+			if (ctx->lockless_cq) {
 				spin_lock(&ctx->completion_lock);
 				io_cqring_event_overflow(ctx, cqe->user_data,
 							cqe->res, cqe->flags, 0, 0);
@@ -1541,7 +1541,7 @@ void __io_submit_flush_completions(struct io_ring_ctx *ctx)
 
 		if (!(req->flags & REQ_F_CQE_SKIP) &&
 		    unlikely(!io_fill_cqe_req(ctx, req))) {
-			if (ctx->task_complete) {
+			if (ctx->lockless_cq) {
 				spin_lock(&ctx->completion_lock);
 				io_req_cqe_overflow(req);
 				spin_unlock(&ctx->completion_lock);
-- 
2.41.0


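The locking rule that patch 2 restores can be illustrated with a small userspace model. This is a sketch under stated assumptions, not the kernel implementation: toy_ctx and the toy_* helpers are hypothetical, a boolean stands in for completion_lock, and the unprotected counter records overflow calls made without the lock. The point it shows is that the decision to take the lock on the overflow path must key off the same flag that made the fast path lockless.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the relevant ring context state. */
struct toy_ctx {
	bool lockless_cq;   /* fast path posts CQEs without taking the lock */
	bool lock_held;     /* stand-in for completion_lock */
	size_t overflowed;  /* CQEs pushed to the overflow list */
	size_t unprotected; /* overflow calls made without the lock held */
};

static void toy_lock(struct toy_ctx *ctx)
{
	assert(!ctx->lock_held);
	ctx->lock_held = true;
}

static void toy_unlock(struct toy_ctx *ctx)
{
	assert(ctx->lock_held);
	ctx->lock_held = false;
}

/* The overflow helper expects the lock to be held by its caller. */
static void toy_overflow(struct toy_ctx *ctx)
{
	if (!ctx->lock_held)
		ctx->unprotected++;
	ctx->overflowed++;
}

/* Post one completion; fall back to overflow when the CQ is full. */
static void toy_flush_one(struct toy_ctx *ctx, bool cq_full)
{
	if (!cq_full)
		return;
	if (ctx->lockless_cq) {
		/* lockless fast path: take the lock just for the overflow */
		toy_lock(ctx);
		toy_overflow(ctx);
		toy_unlock(ctx);
	} else {
		/* caller already holds the lock on this path */
		toy_overflow(ctx);
	}
}
```

If the branch tested a different flag that can be false while the fast path is lockless, toy_overflow() would run without the lock, which is the kind of unprotected overflow the patch title refers to.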

* Re: [PATCH 0/2] for-next fixes
  2023-09-07 12:50 [PATCH 0/2] for-next fixes Pavel Begunkov
  2023-09-07 12:50 ` [PATCH 1/2] io_uring: break out of iowq iopoll on teardown Pavel Begunkov
  2023-09-07 12:50 ` [PATCH 2/2] io_uring: fix unprotected iopoll overflow Pavel Begunkov
@ 2023-09-07 15:02 ` Jens Axboe
  2 siblings, 0 replies; 4+ messages in thread
From: Jens Axboe @ 2023-09-07 15:02 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring

On 9/7/23 6:50 AM, Pavel Begunkov wrote:
> Patch 1 fixes a potential iopoll/iowq live lock
> Patch 2 fixes a recent problem in overflow locking
> 
> Pavel Begunkov (2):
>   io_uring: break out of iowq iopoll on teardown
>   io_uring: fix unprotected iopoll overflow
> 
>  io_uring/io-wq.c    | 10 ++++++++++
>  io_uring/io-wq.h    |  1 +
>  io_uring/io_uring.c |  6 ++++--
>  3 files changed, 15 insertions(+), 2 deletions(-)

Thanks - applied manually, as lore is lagging for hours again...

-- 
Jens Axboe


