public inbox for [email protected]
* [PATCH for-6.1 v2] io_uring: make poll refs more robust
@ 2022-11-18 20:20 Pavel Begunkov
  2022-11-18 21:08 ` Jens Axboe
  2022-11-18 22:39 ` Pavel Begunkov
  0 siblings, 2 replies; 4+ messages in thread
From: Pavel Begunkov @ 2022-11-18 20:20 UTC (permalink / raw)
  To: io-uring; +Cc: Jens Axboe, asml.silence, Lin Horse

poll_refs serves two purposes: the first is ownership over the request.
The second is notifying io_poll_check_events() that there was an event
but the wake up couldn't grab the ownership, so io_poll_check_events()
should retry.

We want to make poll_refs more robust against overflows. Instead of
always incrementing it, which covers both purposes with one atomic,
check whether poll_refs is large and, if so, set a retry flag without
attempting to grab ownership. The gap between the bias check and the
following atomics may seem racy, but we don't need it to be strict.
Moreover, there can be at most 4 parallel updates: by the first and the
second poll entries, __io_arm_poll_handler() and cancellation. Of those
four, only poll wake ups may be executed multiple times, but they're
protected by a spinlock.

Cc: [email protected]
Reported-by: Lin Horse <[email protected]>
Fixes: aa43477b04025 ("io_uring: poll rework")
Signed-off-by: Pavel Begunkov <[email protected]>
---

v2: clear the retry flag before vfs_poll()

 io_uring/poll.c | 42 ++++++++++++++++++++++++++++++++++++++----
 1 file changed, 38 insertions(+), 4 deletions(-)

diff --git a/io_uring/poll.c b/io_uring/poll.c
index 055632e9092a..c831a7b6b468 100644
--- a/io_uring/poll.c
+++ b/io_uring/poll.c
@@ -40,7 +40,15 @@ struct io_poll_table {
 };
 
 #define IO_POLL_CANCEL_FLAG	BIT(31)
-#define IO_POLL_REF_MASK	GENMASK(30, 0)
+#define IO_POLL_RETRY_FLAG	BIT(30)
+#define IO_POLL_REF_MASK	GENMASK(29, 0)
+#define IO_POLL_RETRY_MASK	(IO_POLL_REF_MASK | IO_POLL_RETRY_FLAG)
+
+/*
+ * We usually have 1-2 refs taken, 128 is more than enough and we want to
+ * maximise the margin between this amount and the moment when it overflows.
+ */
+#define IO_POLL_REF_BIAS	128
 
 #define IO_WQE_F_DOUBLE		1
 
@@ -58,6 +66,21 @@ static inline bool wqe_is_double(struct wait_queue_entry *wqe)
 	return priv & IO_WQE_F_DOUBLE;
 }
 
+static bool io_poll_get_ownership_slowpath(struct io_kiocb *req)
+{
+	int v;
+
+	/*
+	 * poll_refs are already elevated and we don't have much hope for
+	 * grabbing the ownership. Instead of incrementing set a retry flag
+	 * to notify the loop that there might have been some change.
+	 */
+	v = atomic_fetch_or(IO_POLL_RETRY_FLAG, &req->poll_refs);
+	if (!(v & IO_POLL_REF_MASK))
+		return !(atomic_fetch_inc(&req->poll_refs) & IO_POLL_REF_MASK);
+	return false;
+}
+
 /*
  * If refs part of ->poll_refs (see IO_POLL_REF_MASK) is 0, it's free. We can
  * bump it and acquire ownership. It's disallowed to modify requests while not
@@ -66,6 +89,8 @@ static inline bool wqe_is_double(struct wait_queue_entry *wqe)
  */
 static inline bool io_poll_get_ownership(struct io_kiocb *req)
 {
+	if (unlikely(atomic_read(&req->poll_refs) >= IO_POLL_REF_BIAS))
+		return io_poll_get_ownership_slowpath(req);
 	return !(atomic_fetch_inc(&req->poll_refs) & IO_POLL_REF_MASK);
 }
 
@@ -233,8 +258,17 @@ static int io_poll_check_events(struct io_kiocb *req, bool *locked)
 		 * and all others are be lost. Redo vfs_poll() to get
 		 * up to date state.
 		 */
-		if ((v & IO_POLL_REF_MASK) != 1)
+		if ((v & IO_POLL_RETRY_MASK) != 1)
 			req->cqe.res = 0;
+		if (v & IO_POLL_RETRY_FLAG) {
+			/*
+			 * We won't find new events that came in between of
+			 * vfs_poll and the ref put unless we clear the flag
+			 * in advance.
+			 */
+			atomic_andnot(IO_POLL_RETRY_FLAG, &req->poll_refs);
+			v &= ~IO_POLL_RETRY_FLAG;
+		}
 
 		/* the mask was stashed in __io_poll_execute */
 		if (!req->cqe.res) {
@@ -274,7 +308,7 @@ static int io_poll_check_events(struct io_kiocb *req, bool *locked)
 		 * Release all references, retry if someone tried to restart
 		 * task_work while we were executing it.
 		 */
-	} while (atomic_sub_return(v & IO_POLL_REF_MASK, &req->poll_refs));
+	} while (atomic_sub_return(v & IO_POLL_RETRY_MASK, &req->poll_refs));
 
 	return IOU_POLL_NO_ACTION;
 }
@@ -590,7 +624,7 @@ static int __io_arm_poll_handler(struct io_kiocb *req,
 		 * locked, kick it off for them.
 		 */
 		v = atomic_dec_return(&req->poll_refs);
-		if (unlikely(v & IO_POLL_REF_MASK))
+		if (unlikely(v & IO_POLL_RETRY_MASK))
 			__io_poll_execute(req, 0);
 	}
 	return 0;
-- 
2.38.1



* Re: [PATCH for-6.1 v2] io_uring: make poll refs more robust
  2022-11-18 20:20 [PATCH for-6.1 v2] io_uring: make poll refs more robust Pavel Begunkov
@ 2022-11-18 21:08 ` Jens Axboe
  2022-11-18 22:39 ` Pavel Begunkov
  1 sibling, 0 replies; 4+ messages in thread
From: Jens Axboe @ 2022-11-18 21:08 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring; +Cc: Lin Horse

On Fri, 18 Nov 2022 20:20:29 +0000, Pavel Begunkov wrote:
> poll_refs serves two purposes: the first is ownership over the request.
> The second is notifying io_poll_check_events() that there was an event
> but the wake up couldn't grab the ownership, so io_poll_check_events()
> should retry.
> 
> We want to make poll_refs more robust against overflows. Instead of
> always incrementing it, which covers both purposes with one atomic,
> check whether poll_refs is large and, if so, set a retry flag without
> attempting to grab ownership. The gap between the bias check and the
> following atomics may seem racy, but we don't need it to be strict.
> Moreover, there can be at most 4 parallel updates: by the first and the
> second poll entries, __io_arm_poll_handler() and cancellation. Of those
> four, only poll wake ups may be executed multiple times, but they're
> protected by a spinlock.
> 
> [...]

Applied, thanks!

[1/1] io_uring: make poll refs more robust
      (no commit info)

Best regards,
-- 
Jens Axboe




* Re: [PATCH for-6.1 v2] io_uring: make poll refs more robust
  2022-11-18 20:20 [PATCH for-6.1 v2] io_uring: make poll refs more robust Pavel Begunkov
  2022-11-18 21:08 ` Jens Axboe
@ 2022-11-18 22:39 ` Pavel Begunkov
  2022-11-18 22:43   ` Jens Axboe
  1 sibling, 1 reply; 4+ messages in thread
From: Pavel Begunkov @ 2022-11-18 22:39 UTC (permalink / raw)
  To: io-uring; +Cc: Jens Axboe, Lin Horse

On 11/18/22 20:20, Pavel Begunkov wrote:
> poll_refs serves two purposes: the first is ownership over the request.
> The second is notifying io_poll_check_events() that there was an event
> but the wake up couldn't grab the ownership, so io_poll_check_events()
> should retry.
> 
> We want to make poll_refs more robust against overflows. Instead of
> always incrementing it, which covers both purposes with one atomic,
> check whether poll_refs is large and, if so, set a retry flag without
> attempting to grab ownership. The gap between the bias check and the
> following atomics may seem racy, but we don't need it to be strict.
> Moreover, there can be at most 4 parallel updates: by the first and the
> second poll entries, __io_arm_poll_handler() and cancellation. Of those
> four, only poll wake ups may be executed multiple times, but they're
> protected by a spinlock.
> 
> Cc: [email protected]
> Reported-by: Lin Horse <[email protected]>
> Fixes: aa43477b04025 ("io_uring: poll rework")
> Signed-off-by: Pavel Begunkov <[email protected]>
> ---
[...]
> @@ -590,7 +624,7 @@ static int __io_arm_poll_handler(struct io_kiocb *req,
>   		 * locked, kick it off for them.
>   		 */
>   		v = atomic_dec_return(&req->poll_refs);
> -		if (unlikely(v & IO_POLL_REF_MASK))
> +		if (unlikely(v & IO_POLL_RETRY_MASK))

Still no good: this chunk is about ownership and so should use
IO_POLL_REF_MASK as before. Jens, please drop the patch, it needs
more testing.

>   			__io_poll_execute(req, 0);
>   	}
>   	return 0;

-- 
Pavel Begunkov


* Re: [PATCH for-6.1 v2] io_uring: make poll refs more robust
  2022-11-18 22:39 ` Pavel Begunkov
@ 2022-11-18 22:43   ` Jens Axboe
  0 siblings, 0 replies; 4+ messages in thread
From: Jens Axboe @ 2022-11-18 22:43 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring; +Cc: Lin Horse

On 11/18/22 3:39 PM, Pavel Begunkov wrote:
> On 11/18/22 20:20, Pavel Begunkov wrote:
>> poll_refs serves two purposes: the first is ownership over the request.
>> The second is notifying io_poll_check_events() that there was an event
>> but the wake up couldn't grab the ownership, so io_poll_check_events()
>> should retry.
>>
>> We want to make poll_refs more robust against overflows. Instead of
>> always incrementing it, which covers both purposes with one atomic,
>> check whether poll_refs is large and, if so, set a retry flag without
>> attempting to grab ownership. The gap between the bias check and the
>> following atomics may seem racy, but we don't need it to be strict.
>> Moreover, there can be at most 4 parallel updates: by the first and the
>> second poll entries, __io_arm_poll_handler() and cancellation. Of those
>> four, only poll wake ups may be executed multiple times, but they're
>> protected by a spinlock.
>>
>> Cc: [email protected]
>> Reported-by: Lin Horse <[email protected]>
>> Fixes: aa43477b04025 ("io_uring: poll rework")
>> Signed-off-by: Pavel Begunkov <[email protected]>
>> ---
> [...]
>> @@ -590,7 +624,7 @@ static int __io_arm_poll_handler(struct io_kiocb *req,
>>            * locked, kick it off for them.
>>            */
>>           v = atomic_dec_return(&req->poll_refs);
>> -        if (unlikely(v & IO_POLL_REF_MASK))
>> +        if (unlikely(v & IO_POLL_RETRY_MASK))
> 
> Still no good, this chunk is about ownership and so should use
> IO_POLL_REF_MASK as before. Jens, please drop the patch, needs
> more testing

Bummed - I dropped it for now.

-- 
Jens Axboe



