public inbox for [email protected]
* [PATCH for-next 0/6] poll cleanups and optimisations
@ 2022-06-23  9:34 Pavel Begunkov
  2022-06-23  9:34 ` [PATCH for-next 1/6] io_uring: clean poll ->private flagging Pavel Begunkov
                   ` (5 more replies)
  0 siblings, 6 replies; 9+ messages in thread
From: Pavel Begunkov @ 2022-06-23  9:34 UTC (permalink / raw)
  To: io-uring; +Cc: Jens Axboe, asml.silence

Patches 1-5 are cleanups and can be considered separately.

Patch 6 optimises the final atomic_dec() in __io_arm_poll_handler().
Jens measured an almost identical patch improving some of the tests
(netbench?) by ~1-1.5%.

Pavel Begunkov (6):
  io_uring: clean poll ->private flagging
  io_uring: remove events caching atavisms
  io_uring: add a helper for apoll alloc
  io_uring: change arm poll return values
  io_uring: refactor poll arm error handling
  io_uring: optimise submission side poll_refs

 io_uring/poll.c | 213 ++++++++++++++++++++++++++++++++----------------
 1 file changed, 142 insertions(+), 71 deletions(-)

-- 
2.36.1



* [PATCH for-next 1/6] io_uring: clean poll ->private flagging
  2022-06-23  9:34 [PATCH for-next 0/6] poll cleanups and optimisations Pavel Begunkov
@ 2022-06-23  9:34 ` Pavel Begunkov
  2022-06-23  9:34 ` [PATCH for-next 2/6] io_uring: remove events caching atavisms Pavel Begunkov
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Pavel Begunkov @ 2022-06-23  9:34 UTC (permalink / raw)
  To: io-uring; +Cc: Jens Axboe, asml.silence

We store a req pointer in wqe->private, but we also use one of its bits
to mark double poll entries. Replace the macro helpers with inline
functions for better type checking, and give the double poll flag a
proper name.
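
For illustration only (not part of the change itself): the req pointer
is always at least 2-byte aligned, so bit 0 of ->private is free to use
as a flag. With the helpers added below, the round trip looks roughly
like this:

    /* queueing a double poll entry tags the stored pointer */
    wqe_private = (unsigned long)req | IO_WQE_F_DOUBLE;
    poll->wait.private = (void *)wqe_private;

    /* the wakeup path later recovers both pieces */
    req = wqe_to_req(wait);         /* private & ~IO_WQE_F_DOUBLE */
    if (wqe_is_double(wait))        /* private & IO_WQE_F_DOUBLE */
            req->flags &= ~REQ_F_DOUBLE_POLL;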

Signed-off-by: Pavel Begunkov <[email protected]>
---
 io_uring/poll.c | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/io_uring/poll.c b/io_uring/poll.c
index bd3110750cfa..210b174b155b 100644
--- a/io_uring/poll.c
+++ b/io_uring/poll.c
@@ -39,6 +39,22 @@ struct io_poll_table {
 #define IO_POLL_CANCEL_FLAG	BIT(31)
 #define IO_POLL_REF_MASK	GENMASK(30, 0)
 
+#define IO_WQE_F_DOUBLE		1
+
+static inline struct io_kiocb *wqe_to_req(struct wait_queue_entry *wqe)
+{
+	unsigned long priv = (unsigned long)wqe->private;
+
+	return (struct io_kiocb *)(priv & ~IO_WQE_F_DOUBLE);
+}
+
+static inline bool wqe_is_double(struct wait_queue_entry *wqe)
+{
+	unsigned long priv = (unsigned long)wqe->private;
+
+	return priv & IO_WQE_F_DOUBLE;
+}
+
 /*
  * If refs part of ->poll_refs (see IO_POLL_REF_MASK) is 0, it's free. We can
  * bump it and acquire ownership. It's disallowed to modify requests while not
@@ -306,8 +322,6 @@ static void io_poll_cancel_req(struct io_kiocb *req)
 	io_poll_execute(req, 0, 0);
 }
 
-#define wqe_to_req(wait)	((void *)((unsigned long) (wait)->private & ~1))
-#define wqe_is_double(wait)	((unsigned long) (wait)->private & 1)
 #define IO_ASYNC_POLL_COMMON	(EPOLLONESHOT | EPOLLPRI)
 
 static int io_poll_wake(struct wait_queue_entry *wait, unsigned mode, int sync,
@@ -392,7 +406,7 @@ static void __io_queue_proc(struct io_poll *poll, struct io_poll_table *pt,
 			return;
 		}
 		/* mark as double wq entry */
-		wqe_private |= 1;
+		wqe_private |= IO_WQE_F_DOUBLE;
 		req->flags |= REQ_F_DOUBLE_POLL;
 		io_init_poll_iocb(poll, first->events, first->wait.func);
 		*poll_ptr = poll;
-- 
2.36.1



* [PATCH for-next 2/6] io_uring: remove events caching atavisms
  2022-06-23  9:34 [PATCH for-next 0/6] poll cleanups and optimisations Pavel Begunkov
  2022-06-23  9:34 ` [PATCH for-next 1/6] io_uring: clean poll ->private flagging Pavel Begunkov
@ 2022-06-23  9:34 ` Pavel Begunkov
  2022-06-23  9:34 ` [PATCH for-next 3/6] io_uring: add a helper for apoll alloc Pavel Begunkov
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Pavel Begunkov @ 2022-06-23  9:34 UTC (permalink / raw)
  To: io-uring; +Cc: Jens Axboe, asml.silence

Remove the events argument from *io_poll_execute(); it's neither needed
nor used.

Signed-off-by: Pavel Begunkov <[email protected]>
---
 io_uring/poll.c | 18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/io_uring/poll.c b/io_uring/poll.c
index 210b174b155b..7de8c52793cd 100644
--- a/io_uring/poll.c
+++ b/io_uring/poll.c
@@ -289,8 +289,7 @@ static void io_apoll_task_func(struct io_kiocb *req, bool *locked)
 		io_req_complete_failed(req, ret);
 }
 
-static void __io_poll_execute(struct io_kiocb *req, int mask,
-			      __poll_t __maybe_unused events)
+static void __io_poll_execute(struct io_kiocb *req, int mask)
 {
 	io_req_set_res(req, mask, 0);
 	/*
@@ -308,18 +307,17 @@ static void __io_poll_execute(struct io_kiocb *req, int mask,
 	io_req_task_work_add(req);
 }
 
-static inline void io_poll_execute(struct io_kiocb *req, int res,
-		__poll_t events)
+static inline void io_poll_execute(struct io_kiocb *req, int res)
 {
 	if (io_poll_get_ownership(req))
-		__io_poll_execute(req, res, events);
+		__io_poll_execute(req, res);
 }
 
 static void io_poll_cancel_req(struct io_kiocb *req)
 {
 	io_poll_mark_cancelled(req);
 	/* kick tw, which should complete the request */
-	io_poll_execute(req, 0, 0);
+	io_poll_execute(req, 0);
 }
 
 #define IO_ASYNC_POLL_COMMON	(EPOLLONESHOT | EPOLLPRI)
@@ -334,7 +332,7 @@ static int io_poll_wake(struct wait_queue_entry *wait, unsigned mode, int sync,
 	if (unlikely(mask & POLLFREE)) {
 		io_poll_mark_cancelled(req);
 		/* we have to kick tw in case it's not already */
-		io_poll_execute(req, 0, poll->events);
+		io_poll_execute(req, 0);
 
 		/*
 		 * If the waitqueue is being freed early but someone is already
@@ -369,7 +367,7 @@ static int io_poll_wake(struct wait_queue_entry *wait, unsigned mode, int sync,
 			else
 				req->flags &= ~REQ_F_SINGLE_POLL;
 		}
-		__io_poll_execute(req, mask, poll->events);
+		__io_poll_execute(req, mask);
 	}
 	return 1;
 }
@@ -487,7 +485,7 @@ static int __io_arm_poll_handler(struct io_kiocb *req,
 			req->apoll_events |= EPOLLONESHOT;
 			ipt->error = 0;
 		}
-		__io_poll_execute(req, mask, poll->events);
+		__io_poll_execute(req, mask);
 		return 0;
 	}
 
@@ -497,7 +495,7 @@ static int __io_arm_poll_handler(struct io_kiocb *req,
 	 */
 	v = atomic_dec_return(&req->poll_refs);
 	if (unlikely(v & IO_POLL_REF_MASK))
-		__io_poll_execute(req, 0, poll->events);
+		__io_poll_execute(req, 0);
 	return 0;
 }
 
-- 
2.36.1



* [PATCH for-next 3/6] io_uring: add a helper for apoll alloc
  2022-06-23  9:34 [PATCH for-next 0/6] poll cleanups and optimisations Pavel Begunkov
  2022-06-23  9:34 ` [PATCH for-next 1/6] io_uring: clean poll ->private flagging Pavel Begunkov
  2022-06-23  9:34 ` [PATCH for-next 2/6] io_uring: remove events caching atavisms Pavel Begunkov
@ 2022-06-23  9:34 ` Pavel Begunkov
  2022-06-23  9:34 ` [PATCH for-next 4/6] io_uring: change arm poll return values Pavel Begunkov
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Pavel Begunkov @ 2022-06-23  9:34 UTC (permalink / raw)
  To: io-uring; +Cc: Jens Axboe, asml.silence

Extract a helper function for apoll allocation; it makes the code
easier to read.

Signed-off-by: Pavel Begunkov <[email protected]>
---
 io_uring/poll.c | 44 ++++++++++++++++++++++++++++----------------
 1 file changed, 28 insertions(+), 16 deletions(-)

diff --git a/io_uring/poll.c b/io_uring/poll.c
index 7de8c52793cd..aef77f2a8a9a 100644
--- a/io_uring/poll.c
+++ b/io_uring/poll.c
@@ -508,10 +508,33 @@ static void io_async_queue_proc(struct file *file, struct wait_queue_head *head,
 	__io_queue_proc(&apoll->poll, pt, head, &apoll->double_poll);
 }
 
+static struct async_poll *io_req_alloc_apoll(struct io_kiocb *req,
+					     unsigned issue_flags)
+{
+	struct io_ring_ctx *ctx = req->ctx;
+	struct async_poll *apoll;
+
+	if (req->flags & REQ_F_POLLED) {
+		apoll = req->apoll;
+		kfree(apoll->double_poll);
+	} else if (!(issue_flags & IO_URING_F_UNLOCKED) &&
+		   !list_empty(&ctx->apoll_cache)) {
+		apoll = list_first_entry(&ctx->apoll_cache, struct async_poll,
+						poll.wait.entry);
+		list_del_init(&apoll->poll.wait.entry);
+	} else {
+		apoll = kmalloc(sizeof(*apoll), GFP_ATOMIC);
+		if (unlikely(!apoll))
+			return NULL;
+	}
+	apoll->double_poll = NULL;
+	req->apoll = apoll;
+	return apoll;
+}
+
 int io_arm_poll_handler(struct io_kiocb *req, unsigned issue_flags)
 {
 	const struct io_op_def *def = &io_op_defs[req->opcode];
-	struct io_ring_ctx *ctx = req->ctx;
 	struct async_poll *apoll;
 	struct io_poll_table ipt;
 	__poll_t mask = POLLPRI | POLLERR | EPOLLET;
@@ -546,21 +569,10 @@ int io_arm_poll_handler(struct io_kiocb *req, unsigned issue_flags)
 	}
 	if (def->poll_exclusive)
 		mask |= EPOLLEXCLUSIVE;
-	if (req->flags & REQ_F_POLLED) {
-		apoll = req->apoll;
-		kfree(apoll->double_poll);
-	} else if (!(issue_flags & IO_URING_F_UNLOCKED) &&
-		   !list_empty(&ctx->apoll_cache)) {
-		apoll = list_first_entry(&ctx->apoll_cache, struct async_poll,
-						poll.wait.entry);
-		list_del_init(&apoll->poll.wait.entry);
-	} else {
-		apoll = kmalloc(sizeof(*apoll), GFP_ATOMIC);
-		if (unlikely(!apoll))
-			return IO_APOLL_ABORTED;
-	}
-	apoll->double_poll = NULL;
-	req->apoll = apoll;
+
+	apoll = io_req_alloc_apoll(req, issue_flags);
+	if (!apoll)
+		return IO_APOLL_ABORTED;
 	req->flags |= REQ_F_POLLED;
 	ipt.pt._qproc = io_async_queue_proc;
 
-- 
2.36.1



* [PATCH for-next 4/6] io_uring: change arm poll return values
  2022-06-23  9:34 [PATCH for-next 0/6] poll cleanups and optimisations Pavel Begunkov
                   ` (2 preceding siblings ...)
  2022-06-23  9:34 ` [PATCH for-next 3/6] io_uring: add a helper for apoll alloc Pavel Begunkov
@ 2022-06-23  9:34 ` Pavel Begunkov
  2022-06-23  9:34 ` [PATCH for-next 5/6] io_uring: refactor poll arm error handling Pavel Begunkov
  2022-06-23  9:34 ` [PATCH for-next 6/6] io_uring: optimise submission side poll_refs Pavel Begunkov
  5 siblings, 0 replies; 9+ messages in thread
From: Pavel Begunkov @ 2022-06-23  9:34 UTC (permalink / raw)
  To: io-uring; +Cc: Jens Axboe, asml.silence

The rules for parsing __io_arm_poll_handler()'s result are complicated.
As a first step, don't pass back a mask; instead return a positive
return code and fill ipt->result_mask.

Signed-off-by: Pavel Begunkov <[email protected]>
---
 io_uring/poll.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/io_uring/poll.c b/io_uring/poll.c
index aef77f2a8a9a..80113b036c88 100644
--- a/io_uring/poll.c
+++ b/io_uring/poll.c
@@ -34,6 +34,8 @@ struct io_poll_table {
 	struct io_kiocb *req;
 	int nr_entries;
 	int error;
+	/* output value, set only if arm poll returns >0 */
+	__poll_t result_mask;
 };
 
 #define IO_POLL_CANCEL_FLAG	BIT(31)
@@ -462,8 +464,9 @@ static int __io_arm_poll_handler(struct io_kiocb *req,
 	if (mask &&
 	   ((poll->events & (EPOLLET|EPOLLONESHOT)) == (EPOLLET|EPOLLONESHOT))) {
 		io_poll_remove_entries(req);
+		ipt->result_mask = mask;
 		/* no one else has access to the req, forget about the ref */
-		return mask;
+		return 1;
 	}
 
 	if (!mask && unlikely(ipt->error || !ipt->nr_entries)) {
@@ -813,7 +816,7 @@ int io_poll_add(struct io_kiocb *req, unsigned int issue_flags)
 
 	ret = __io_arm_poll_handler(req, poll, &ipt, poll->events);
 	if (ret) {
-		io_req_set_res(req, ret, 0);
+		io_req_set_res(req, ipt.result_mask, 0);
 		return IOU_OK;
 	}
 	if (ipt.error) {
-- 
2.36.1



* [PATCH for-next 5/6] io_uring: refactor poll arm error handling
  2022-06-23  9:34 [PATCH for-next 0/6] poll cleanups and optimisations Pavel Begunkov
                   ` (3 preceding siblings ...)
  2022-06-23  9:34 ` [PATCH for-next 4/6] io_uring: change arm poll return values Pavel Begunkov
@ 2022-06-23  9:34 ` Pavel Begunkov
  2022-06-23 12:09   ` Jens Axboe
  2022-06-23  9:34 ` [PATCH for-next 6/6] io_uring: optimise submission side poll_refs Pavel Begunkov
  5 siblings, 1 reply; 9+ messages in thread
From: Pavel Begunkov @ 2022-06-23  9:34 UTC (permalink / raw)
  To: io-uring; +Cc: Jens Axboe, asml.silence

__io_arm_poll_handler()'s error parsing is a horror: when it fails it
returns 0 and the caller is expected to look at ipt.error, which has
already led us to a number of problems before.

When it returns a valid mask, leave it as it is now, i.e. return 1 and
store the mask in ipt.result_mask. In case of a failure that can be
handled inline, return an error code (a negative value), and return 0
if __io_arm_poll_handler() took ownership of the request and will
complete it.
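
To make the convention concrete, the io_poll_add() caller ends up
looking roughly like this (a sketch of the hunk below; all names are
taken from this patch):

    ret = __io_arm_poll_handler(req, poll, &ipt, poll->events);
    if (ret > 0) {
            /* armed and completed inline, mask is in ipt.result_mask */
            io_req_set_res(req, ipt.result_mask, 0);
            return IOU_OK;
    }
    /* ret < 0: inline failure; ret == 0: handed over, don't touch req */
    return ret ?: IOU_ISSUE_SKIP_COMPLETE;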

Signed-off-by: Pavel Begunkov <[email protected]>
---
 io_uring/poll.c | 44 +++++++++++++++++++++-----------------------
 1 file changed, 21 insertions(+), 23 deletions(-)

diff --git a/io_uring/poll.c b/io_uring/poll.c
index 80113b036c88..149205eae418 100644
--- a/io_uring/poll.c
+++ b/io_uring/poll.c
@@ -435,6 +435,12 @@ static void io_poll_queue_proc(struct file *file, struct wait_queue_head *head,
 			(struct io_poll **) &pt->req->async_data);
 }
 
+/*
+ * Returns 0 when it's handed over for polling. The caller owns the request if
+ * it returns non-zero, but otherwise should not touch it. Negative values
+ * contain an error code. When the result is >0, the polling has completed
+ * inline and ipt.result_mask is set to the mask.
+ */
 static int __io_arm_poll_handler(struct io_kiocb *req,
 				 struct io_poll *poll,
 				 struct io_poll_table *ipt, __poll_t mask)
@@ -461,6 +467,17 @@ static int __io_arm_poll_handler(struct io_kiocb *req,
 	atomic_set(&req->poll_refs, 1);
 	mask = vfs_poll(req->file, &ipt->pt) & poll->events;
 
+	if (unlikely(ipt->error || !ipt->nr_entries)) {
+		io_poll_remove_entries(req);
+
+		if (mask && !(poll->events & EPOLLET)) {
+			ipt->result_mask = mask;
+			return 1;
+		} else {
+			return ipt->error ?: -EINVAL;
+		}
+	}
+
 	if (mask &&
 	   ((poll->events & (EPOLLET|EPOLLONESHOT)) == (EPOLLET|EPOLLONESHOT))) {
 		io_poll_remove_entries(req);
@@ -469,25 +486,12 @@ static int __io_arm_poll_handler(struct io_kiocb *req,
 		return 1;
 	}
 
-	if (!mask && unlikely(ipt->error || !ipt->nr_entries)) {
-		io_poll_remove_entries(req);
-		if (!ipt->error)
-			ipt->error = -EINVAL;
-		return 0;
-	}
-
 	if (req->flags & REQ_F_HASH_LOCKED)
 		io_poll_req_insert_locked(req);
 	else
 		io_poll_req_insert(req);
 
 	if (mask && (poll->events & EPOLLET)) {
-		/* can't multishot if failed, just queue the event we've got */
-		if (unlikely(ipt->error || !ipt->nr_entries)) {
-			poll->events |= EPOLLONESHOT;
-			req->apoll_events |= EPOLLONESHOT;
-			ipt->error = 0;
-		}
 		__io_poll_execute(req, mask);
 		return 0;
 	}
@@ -582,9 +586,8 @@ int io_arm_poll_handler(struct io_kiocb *req, unsigned issue_flags)
 	io_kbuf_recycle(req, issue_flags);
 
 	ret = __io_arm_poll_handler(req, &apoll->poll, &ipt, mask);
-	if (ret || ipt.error)
-		return ret ? IO_APOLL_READY : IO_APOLL_ABORTED;
-
+	if (ret)
+		return ret > 0 ? IO_APOLL_READY : IO_APOLL_ABORTED;
 	trace_io_uring_poll_arm(req, mask, apoll->poll.events);
 	return IO_APOLL_OK;
 }
@@ -815,16 +818,11 @@ int io_poll_add(struct io_kiocb *req, unsigned int issue_flags)
 		req->flags &= ~REQ_F_HASH_LOCKED;
 
 	ret = __io_arm_poll_handler(req, poll, &ipt, poll->events);
-	if (ret) {
+	if (ret > 0) {
 		io_req_set_res(req, ipt.result_mask, 0);
 		return IOU_OK;
 	}
-	if (ipt.error) {
-		req_set_fail(req);
-		return ipt.error;
-	}
-
-	return IOU_ISSUE_SKIP_COMPLETE;
+	return ret ?: IOU_ISSUE_SKIP_COMPLETE;
 }
 
 int io_poll_remove(struct io_kiocb *req, unsigned int issue_flags)
-- 
2.36.1



* [PATCH for-next 6/6] io_uring: optimise submission side poll_refs
  2022-06-23  9:34 [PATCH for-next 0/6] poll cleanups and optimisations Pavel Begunkov
                   ` (4 preceding siblings ...)
  2022-06-23  9:34 ` [PATCH for-next 5/6] io_uring: refactor poll arm error handling Pavel Begunkov
@ 2022-06-23  9:34 ` Pavel Begunkov
  5 siblings, 0 replies; 9+ messages in thread
From: Pavel Begunkov @ 2022-06-23  9:34 UTC (permalink / raw)
  To: io-uring; +Cc: Jens Axboe, asml.silence

The final poll_refs put in __io_arm_poll_handler() takes quite some
cycles. When we're arming from the original task context, task_work
won't be run, so in this case we can assume that we won't race with
task_works and skip taking the initial ownership ref.

One caveat is that after arming a poll we may race with it, so we have
to add io_poll_get_ownership() calls, hidden inside
io_poll_can_finish_inline(), whenever we want to complete arming inline.
For the same reason we can't just set REQ_F_DOUBLE_POLL in
__io_queue_proc() and need to sync with the first poll entry by taking
its wq head lock.
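
A rough sketch of the resulting arming flow (condensed from the diff
below, error paths trimmed; want_inline_completion is a placeholder for
the mask/error checks, everything else is from this patch):

    /* only io-wq arming needs the ref that holds off task_work */
    ipt->owning = issue_flags & IO_URING_F_UNLOCKED;
    atomic_set(&req->poll_refs, (int)ipt->owning);

    mask = vfs_poll(req->file, &ipt->pt) & poll->events;

    /* every inline-completion branch has to win ownership first */
    if (want_inline_completion && !io_poll_can_finish_inline(req, ipt))
            return 0;       /* io_poll_wake() raced us, tw completes it */

    if (ipt->owning) {
            /* io-wq only: drop the ref, kick tw if someone queued up */
            v = atomic_dec_return(&req->poll_refs);
            if (unlikely(v & IO_POLL_REF_MASK))
                    __io_poll_execute(req, 0);
    }
    return 0;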

Signed-off-by: Pavel Begunkov <[email protected]>
---
 io_uring/poll.c | 88 +++++++++++++++++++++++++++++++++++++------------
 1 file changed, 67 insertions(+), 21 deletions(-)

diff --git a/io_uring/poll.c b/io_uring/poll.c
index 149205eae418..69b2f4bab3b2 100644
--- a/io_uring/poll.c
+++ b/io_uring/poll.c
@@ -34,6 +34,7 @@ struct io_poll_table {
 	struct io_kiocb *req;
 	int nr_entries;
 	int error;
+	bool owning;
 	/* output value, set only if arm poll returns >0 */
 	__poll_t result_mask;
 };
@@ -374,6 +375,27 @@ static int io_poll_wake(struct wait_queue_entry *wait, unsigned mode, int sync,
 	return 1;
 }
 
+static void io_poll_double_prepare(struct io_kiocb *req)
+{
+	struct wait_queue_head *head;
+	struct io_poll *poll = io_poll_get_single(req);
+
+	/* head is RCU protected, see io_poll_remove_entries() comments */
+	rcu_read_lock();
+	head = smp_load_acquire(&poll->head);
+	if (head) {
+		/*
+		 * poll arm may not hold ownership and so race with
+		 * io_poll_wake() by modifying req->flags. There is only one
+		 * poll entry queued, serialise with it by taking its head lock.
+		 */
+		spin_lock_irq(&head->lock);
+		req->flags |= REQ_F_DOUBLE_POLL;
+		spin_unlock_irq(&head->lock);
+	}
+	rcu_read_unlock();
+}
+
 static void __io_queue_proc(struct io_poll *poll, struct io_poll_table *pt,
 			    struct wait_queue_head *head,
 			    struct io_poll **poll_ptr)
@@ -405,16 +427,19 @@ static void __io_queue_proc(struct io_poll *poll, struct io_poll_table *pt,
 			pt->error = -ENOMEM;
 			return;
 		}
+
+		io_poll_double_prepare(req);
 		/* mark as double wq entry */
 		wqe_private |= IO_WQE_F_DOUBLE;
-		req->flags |= REQ_F_DOUBLE_POLL;
 		io_init_poll_iocb(poll, first->events, first->wait.func);
 		*poll_ptr = poll;
 		if (req->opcode == IORING_OP_POLL_ADD)
 			req->flags |= REQ_F_ASYNC_DATA;
+	} else {
+		/* fine to modify, there is no poll queued to race with us */
+		req->flags |= REQ_F_SINGLE_POLL;
 	}
 
-	req->flags |= REQ_F_SINGLE_POLL;
 	pt->nr_entries++;
 	poll->head = head;
 	poll->wait.private = (void *) wqe_private;
@@ -435,6 +460,12 @@ static void io_poll_queue_proc(struct file *file, struct wait_queue_head *head,
 			(struct io_poll **) &pt->req->async_data);
 }
 
+static bool io_poll_can_finish_inline(struct io_kiocb *req,
+				      struct io_poll_table *pt)
+{
+	return pt->owning || io_poll_get_ownership(req);
+}
+
 /*
  * Returns 0 when it's handed over for polling. The caller owns the request if
  * it returns non-zero, but otherwise should not touch it. Negative values
@@ -443,7 +474,8 @@ static void io_poll_queue_proc(struct file *file, struct wait_queue_head *head,
  */
 static int __io_arm_poll_handler(struct io_kiocb *req,
 				 struct io_poll *poll,
-				 struct io_poll_table *ipt, __poll_t mask)
+				 struct io_poll_table *ipt, __poll_t mask,
+				 unsigned issue_flags)
 {
 	struct io_ring_ctx *ctx = req->ctx;
 	int v;
@@ -452,34 +484,45 @@ static int __io_arm_poll_handler(struct io_kiocb *req,
 	req->work.cancel_seq = atomic_read(&ctx->cancel_seq);
 	io_init_poll_iocb(poll, mask, io_poll_wake);
 	poll->file = req->file;
-
 	req->apoll_events = poll->events;
 
 	ipt->pt._key = mask;
 	ipt->req = req;
 	ipt->error = 0;
 	ipt->nr_entries = 0;
-
 	/*
-	 * Take the ownership to delay any tw execution up until we're done
-	 * with poll arming. see io_poll_get_ownership().
+	 * Polling is either completed here or via task_work, so if we're in the
+	 * task context we're naturally serialised with tw by merit of running
+	 * the same task. When it's io-wq, take the ownership to prevent tw
+	 * from running. However, when we're in the task context, skip taking
+	 * it as an optimisation.
+	 *
+	 * Note: even though the request won't be completed/freed, without
+	 * ownership we still can race with io_poll_wake().
+	 * io_poll_can_finish_inline() tries to deal with that.
 	 */
-	atomic_set(&req->poll_refs, 1);
+	ipt->owning = issue_flags & IO_URING_F_UNLOCKED;
+
+	atomic_set(&req->poll_refs, (int)ipt->owning);
 	mask = vfs_poll(req->file, &ipt->pt) & poll->events;
 
 	if (unlikely(ipt->error || !ipt->nr_entries)) {
 		io_poll_remove_entries(req);
 
-		if (mask && !(poll->events & EPOLLET)) {
+		if (!io_poll_can_finish_inline(req, ipt)) {
+			io_poll_mark_cancelled(req);
+			return 0;
+		} else if (mask && !(poll->events & EPOLLET)) {
 			ipt->result_mask = mask;
 			return 1;
-		} else {
-			return ipt->error ?: -EINVAL;
 		}
+		return ipt->error ?: -EINVAL;
 	}
 
 	if (mask &&
 	   ((poll->events & (EPOLLET|EPOLLONESHOT)) == (EPOLLET|EPOLLONESHOT))) {
+		if (!io_poll_can_finish_inline(req, ipt))
+			return 0;
 		io_poll_remove_entries(req);
 		ipt->result_mask = mask;
 		/* no one else has access to the req, forget about the ref */
@@ -491,18 +534,21 @@ static int __io_arm_poll_handler(struct io_kiocb *req,
 	else
 		io_poll_req_insert(req);
 
-	if (mask && (poll->events & EPOLLET)) {
+	if (mask && (poll->events & EPOLLET) &&
+	    io_poll_can_finish_inline(req, ipt)) {
 		__io_poll_execute(req, mask);
 		return 0;
 	}
 
-	/*
-	 * Release ownership. If someone tried to queue a tw while it was
-	 * locked, kick it off for them.
-	 */
-	v = atomic_dec_return(&req->poll_refs);
-	if (unlikely(v & IO_POLL_REF_MASK))
-		__io_poll_execute(req, 0);
+	if (ipt->owning) {
+		/*
+		 * Release ownership. If someone tried to queue a tw while it was
+		 * locked, kick it off for them.
+		 */
+		v = atomic_dec_return(&req->poll_refs);
+		if (unlikely(v & IO_POLL_REF_MASK))
+			__io_poll_execute(req, 0);
+	}
 	return 0;
 }
 
@@ -585,7 +631,7 @@ int io_arm_poll_handler(struct io_kiocb *req, unsigned issue_flags)
 
 	io_kbuf_recycle(req, issue_flags);
 
-	ret = __io_arm_poll_handler(req, &apoll->poll, &ipt, mask);
+	ret = __io_arm_poll_handler(req, &apoll->poll, &ipt, mask, issue_flags);
 	if (ret)
 		return ret > 0 ? IO_APOLL_READY : IO_APOLL_ABORTED;
 	trace_io_uring_poll_arm(req, mask, apoll->poll.events);
@@ -817,7 +863,7 @@ int io_poll_add(struct io_kiocb *req, unsigned int issue_flags)
 	else
 		req->flags &= ~REQ_F_HASH_LOCKED;
 
-	ret = __io_arm_poll_handler(req, poll, &ipt, poll->events);
+	ret = __io_arm_poll_handler(req, poll, &ipt, poll->events, issue_flags);
 	if (ret > 0) {
 		io_req_set_res(req, ipt.result_mask, 0);
 		return IOU_OK;
-- 
2.36.1



* Re: [PATCH for-next 5/6] io_uring: refactor poll arm error handling
  2022-06-23  9:34 ` [PATCH for-next 5/6] io_uring: refactor poll arm error handling Pavel Begunkov
@ 2022-06-23 12:09   ` Jens Axboe
  2022-06-23 12:16     ` Pavel Begunkov
  0 siblings, 1 reply; 9+ messages in thread
From: Jens Axboe @ 2022-06-23 12:09 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring

On 6/23/22 3:34 AM, Pavel Begunkov wrote:
> __io_arm_poll_handler()'s error parsing is a horror: when it fails it
> returns 0 and the caller is expected to look at ipt.error, which has
> already led us to a number of problems before.
> 
> When it returns a valid mask, leave it as it is now, i.e. return 1 and
> store the mask in ipt.result_mask. In case of a failure that can be
> handled inline, return an error code (a negative value), and return 0
> if __io_arm_poll_handler() took ownership of the request and will
> complete it.

Haven't looked at it yet, but this causes a consistent failure of one of
the poll based test cases:

axboe@m1pro-kvm ~/g/liburing (master)> test/poll-v-poll.t
do_fd_test: res 2a/1 differ
fd test IN failed

-- 
Jens Axboe



* Re: [PATCH for-next 5/6] io_uring: refactor poll arm error handling
  2022-06-23 12:09   ` Jens Axboe
@ 2022-06-23 12:16     ` Pavel Begunkov
  0 siblings, 0 replies; 9+ messages in thread
From: Pavel Begunkov @ 2022-06-23 12:16 UTC (permalink / raw)
  To: Jens Axboe, io-uring

On 6/23/22 13:09, Jens Axboe wrote:
> On 6/23/22 3:34 AM, Pavel Begunkov wrote:
>> __io_arm_poll_handler()'s error parsing is a horror: when it fails it
>> returns 0 and the caller is expected to look at ipt.error, which has
>> already led us to a number of problems before.
>>
>> When it returns a valid mask, leave it as it is now, i.e. return 1 and
>> store the mask in ipt.result_mask. In case of a failure that can be
>> handled inline, return an error code (a negative value), and return 0
>> if __io_arm_poll_handler() took ownership of the request and will
>> complete it.
> 
> Haven't looked at it yet, but this causes a consistent failure of one of
> the poll based test cases:
> 
> axboe@m1pro-kvm ~/g/liburing (master)> test/poll-v-poll.t
> do_fd_test: res 2a/1 differ
> fd test IN failed

Ok, it worked fine on the previous 5.20 rebase for me. I could've
screwed it up while rebasing today, as I couldn't test it because
io_uring/5.20 is failing. Let me respin once it's fixed.

-- 
Pavel Begunkov


end of thread (newest: 2022-06-23 12:17 UTC)

Thread overview: 9+ messages
2022-06-23  9:34 [PATCH for-next 0/6] poll cleanups and optimisations Pavel Begunkov
2022-06-23  9:34 ` [PATCH for-next 1/6] io_uring: clean poll ->private flagging Pavel Begunkov
2022-06-23  9:34 ` [PATCH for-next 2/6] io_uring: remove events caching atavisms Pavel Begunkov
2022-06-23  9:34 ` [PATCH for-next 3/6] io_uring: add a helper for apoll alloc Pavel Begunkov
2022-06-23  9:34 ` [PATCH for-next 4/6] io_uring: change arm poll return values Pavel Begunkov
2022-06-23  9:34 ` [PATCH for-next 5/6] io_uring: refactor poll arm error handling Pavel Begunkov
2022-06-23 12:09   ` Jens Axboe
2022-06-23 12:16     ` Pavel Begunkov
2022-06-23  9:34 ` [PATCH for-next 6/6] io_uring: optimise submission side poll_refs Pavel Begunkov
