[PATCH 0/2] Pol fixups

public inbox for [email protected]
 help / color / mirror / Atom feed

* [PATCH 0/2] Pol fixups
@ 2023-01-10 18:00 Jens Axboe
  2023-01-10 18:00 ` [PATCH 1/2] io_uring/fdinfo: include locked hash table in fdinfo output Jens Axboe
  2023-01-10 18:00 ` [PATCH 2/2] io_uring/poll: attempt request issue after racy poll wakeup Jens Axboe
  0 siblings, 2 replies; 3+ messages in thread
From: Jens Axboe @ 2023-01-10 18:00 UTC (permalink / raw)
  To: io-uring; +Cc: asml.silence

Hi,

Patch 1 just adds support for fdinfo outputting state of the locked
poll hash table, something which was forgotten when that support was
added earlier.

Patch 2 fixes a race where multiple requests on the same poll waitqueue
may get woken and not properly retried. This causes those requests to
potentially never get woken again, if we race.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 3+ messages in thread

* [PATCH 1/2] io_uring/fdinfo: include locked hash table in fdinfo output
  2023-01-10 18:00 [PATCH 0/2] Pol fixups Jens Axboe
@ 2023-01-10 18:00 ` Jens Axboe
  2023-01-10 18:00 ` [PATCH 2/2] io_uring/poll: attempt request issue after racy poll wakeup Jens Axboe
  1 sibling, 0 replies; 3+ messages in thread
From: Jens Axboe @ 2023-01-10 18:00 UTC (permalink / raw)
  To: io-uring; +Cc: asml.silence, Jens Axboe

A previous commit split the hash table for polled requests into two
parts, but didn't get the fdinfo output updated. This means that it's
less useful for debugging, as we may think a given request is not pending
poll.

Fix this up by dumping the locked hash table contents too.

Fixes: 9ca9fb24d5fe ("io_uring: mutex locked poll hashing")
Signed-off-by: Jens Axboe <[email protected]>
---
 io_uring/fdinfo.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/io_uring/fdinfo.c b/io_uring/fdinfo.c
index 2e04850a657b..882bd56b01ed 100644
--- a/io_uring/fdinfo.c
+++ b/io_uring/fdinfo.c
@@ -170,12 +170,11 @@ static __cold void __io_uring_show_fdinfo(struct io_ring_ctx *ctx,
 		xa_for_each(&ctx->personalities, index, cred)
 			io_uring_show_cred(m, index, cred);
 	}
-	if (has_lock)
-		mutex_unlock(&ctx->uring_lock);
 
 	seq_puts(m, "PollList:\n");
 	for (i = 0; i < (1U << ctx->cancel_table.hash_bits); i++) {
 		struct io_hash_bucket *hb = &ctx->cancel_table.hbs[i];
+		struct io_hash_bucket *hbl = &ctx->cancel_table_locked.hbs[i];
 		struct io_kiocb *req;
 
 		spin_lock(&hb->lock);
@@ -183,8 +182,17 @@ static __cold void __io_uring_show_fdinfo(struct io_ring_ctx *ctx,
 			seq_printf(m, "  op=%d, task_works=%d\n", req->opcode,
 					task_work_pending(req->task));
 		spin_unlock(&hb->lock);
+
+		if (!has_lock)
+			continue;
+		hlist_for_each_entry(req, &hbl->list, hash_node)
+			seq_printf(m, "  op=%d, task_works=%d\n", req->opcode,
+					task_work_pending(req->task));
 	}
 
+	if (has_lock)
+		mutex_unlock(&ctx->uring_lock);
+
 	seq_puts(m, "CqOverflowList:\n");
 	spin_lock(&ctx->completion_lock);
 	list_for_each_entry(ocqe, &ctx->cq_overflow_list, list) {
-- 
2.39.0


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* [PATCH 2/2] io_uring/poll: attempt request issue after racy poll wakeup
  2023-01-10 18:00 [PATCH 0/2] Pol fixups Jens Axboe
  2023-01-10 18:00 ` [PATCH 1/2] io_uring/fdinfo: include locked hash table in fdinfo output Jens Axboe
@ 2023-01-10 18:00 ` Jens Axboe
  1 sibling, 0 replies; 3+ messages in thread
From: Jens Axboe @ 2023-01-10 18:00 UTC (permalink / raw)
  To: io-uring; +Cc: asml.silence, Jens Axboe

If we have multiple requests waiting on the same target poll waitqueue,
then it's quite possible to get a request triggered and get disappointed
in not being able to make any progress with it. If we race in doing so,
we'll potentially leave the poll request on the internal tables, but
removed from the waitqueue. That means that any subsequent trigger of
the poll waitqueue will not kick that request into action, causing an
application to potentially wait for completion of a request that will
never happen.

Fix this by adding a new poll return state, IOU_POLL_REISSUE. Rather
than have complicated logic for how to re-arm a given type of request,
just punt it for a reissue.

While in there, move the 'ret' variable to the only section where it
gets used. This avoids confusion the scope of it.

Fixes: 49f1c68e048f ("io_uring: optimise submission side poll_refs")
Signed-off-by: Jens Axboe <[email protected]>
---
 io_uring/poll.c | 30 +++++++++++++++++++-----------
 1 file changed, 19 insertions(+), 11 deletions(-)

diff --git a/io_uring/poll.c b/io_uring/poll.c
index cf6a70bd54e0..0aca4ee6142b 100644
--- a/io_uring/poll.c
+++ b/io_uring/poll.c
@@ -223,21 +223,22 @@ enum {
 	IOU_POLL_DONE = 0,
 	IOU_POLL_NO_ACTION = 1,
 	IOU_POLL_REMOVE_POLL_USE_RES = 2,
+	IOU_POLL_REISSUE = 3,
 };
 
 /*
  * All poll tw should go through this. Checks for poll events, manages
  * references, does rewait, etc.
  *
- * Returns a negative error on failure. IOU_POLL_NO_ACTION when no action require,
- * which is either spurious wakeup or multishot CQE is served.
- * IOU_POLL_DONE when it's done with the request, then the mask is stored in req->cqe.res.
- * IOU_POLL_REMOVE_POLL_USE_RES indicates to remove multishot poll and that the result
- * is stored in req->cqe.
+ * Returns a negative error on failure. IOU_POLL_NO_ACTION when no action
+ * require, which is either spurious wakeup or multishot CQE is served.
+ * IOU_POLL_DONE when it's done with the request, then the mask is stored in
+ * req->cqe.res. IOU_POLL_REMOVE_POLL_USE_RES indicates to remove multishot
+ * poll and that the result is stored in req->cqe.
  */
 static int io_poll_check_events(struct io_kiocb *req, bool *locked)
 {
-	int v, ret;
+	int v;
 
 	/* req->task == current here, checking PF_EXITING is safe */
 	if (unlikely(req->task->flags & PF_EXITING))
@@ -276,10 +277,15 @@ static int io_poll_check_events(struct io_kiocb *req, bool *locked)
 		if (!req->cqe.res) {
 			struct poll_table_struct pt = { ._key = req->apoll_events };
 			req->cqe.res = vfs_poll(req->file, &pt) & req->apoll_events;
+			/*
+			 * We got woken with a mask, but someone else got to
+			 * it first. The above vfs_poll() doesn't add us back
+			 * to the waitqueue, so if we get nothing back, we
+			 * should be safe and attempt a reissue.
+			 */
+			if ((unlikely(!req->cqe.res)))
+				return IOU_POLL_REISSUE;
 		}
-
-		if ((unlikely(!req->cqe.res)))
-			continue;
 		if (req->apoll_events & EPOLLONESHOT)
 			return IOU_POLL_DONE;
 
@@ -294,7 +300,7 @@ static int io_poll_check_events(struct io_kiocb *req, bool *locked)
 				return IOU_POLL_REMOVE_POLL_USE_RES;
 			}
 		} else {
-			ret = io_poll_issue(req, locked);
+			int ret = io_poll_issue(req, locked);
 			if (ret == IOU_STOP_MULTISHOT)
 				return IOU_POLL_REMOVE_POLL_USE_RES;
 			if (ret < 0)
@@ -330,6 +336,8 @@ static void io_poll_task_func(struct io_kiocb *req, bool *locked)
 
 			poll = io_kiocb_to_cmd(req, struct io_poll);
 			req->cqe.res = mangle_poll(req->cqe.res & poll->events);
+		} else if (ret == IOU_POLL_REISSUE) {
+			io_req_task_submit(req, locked);
 		} else if (ret != IOU_POLL_REMOVE_POLL_USE_RES) {
 			req->cqe.res = ret;
 			req_set_fail(req);
@@ -342,7 +350,7 @@ static void io_poll_task_func(struct io_kiocb *req, bool *locked)
 
 		if (ret == IOU_POLL_REMOVE_POLL_USE_RES)
 			io_req_task_complete(req, locked);
-		else if (ret == IOU_POLL_DONE)
+		else if (ret == IOU_POLL_DONE || ret == IOU_POLL_REISSUE)
 			io_req_task_submit(req, locked);
 		else
 			io_req_defer_failed(req, ret);
-- 
2.39.0


^ permalink raw reply related	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2023-01-10 18:03 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-01-10 18:00 [PATCH 0/2] Pol fixups Jens Axboe
2023-01-10 18:00 ` [PATCH 1/2] io_uring/fdinfo: include locked hash table in fdinfo output Jens Axboe
2023-01-10 18:00 ` [PATCH 2/2] io_uring/poll: attempt request issue after racy poll wakeup Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox