* [PATCH v3 0/2] poll_refs armoring
@ 2022-11-20 16:57 Pavel Begunkov
2022-11-20 16:57 ` [PATCH v3 1/2] io_uring: cmpxchg for poll arm refs release Pavel Begunkov
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Pavel Begunkov @ 2022-11-20 16:57 UTC (permalink / raw)
To: io-uring; +Cc: Jens Axboe, asml.silence
Make poll_refs more robust and protect them from overflows. The
mechanism is described in 2/2; 1/2 makes the second patch a little
cleaner by funnelling all edge cases from arming into a tw.

A good way to test is to set IO_POLL_REF_BIAS to 0 or 1: 0 makes the
slowpath hit every single time, and 1 triggers it only in case of
races.
v2: clear the retry flag before vfs_poll()
v3: fix not handling arm_poll* refs release edge cases with patch 1/2
Pavel Begunkov (2):
io_uring: cmpxchg for poll arm refs release
io_uring: make poll refs more robust
io_uring/poll.c | 44 ++++++++++++++++++++++++++++++++++++++------
1 file changed, 38 insertions(+), 6 deletions(-)
--
2.38.1
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH v3 1/2] io_uring: cmpxchg for poll arm refs release
2022-11-20 16:57 [PATCH v3 0/2] poll_refs armoring Pavel Begunkov
@ 2022-11-20 16:57 ` Pavel Begunkov
2022-11-25 13:51 ` Pavel Begunkov
2022-11-20 16:57 ` [PATCH v3 2/2] io_uring: make poll refs more robust Pavel Begunkov
2022-11-21 17:10 ` [PATCH v3 0/2] poll_refs armoring Jens Axboe
2 siblings, 1 reply; 6+ messages in thread
From: Pavel Begunkov @ 2022-11-20 16:57 UTC (permalink / raw)
To: io-uring; +Cc: Jens Axboe, asml.silence
Replace atomically subtracting the ownership reference at the end of
arming a poll with a cmpxchg. We try to release ownership by setting 0,
assuming that poll_refs didn't change while we were arming. If it did
change, we keep the ownership and use it to queue a tw, which is fully
capable of processing all events and even tolerates spurious wake ups.

It's a bit more elegant as this release reduces races b/w setting the
cancellation flag and getting refs, and with that we don't have to
worry about any kind of underflow. The performance difference b/w
cmpxchg and an atomic dec is usually negligible, and this is not the
fastest path for polling anyway.
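The release scheme above can be modelled with plain C11 atomics (a
userspace sketch with illustrative names, not the kernel code):

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace model of the ownership release at the end of poll arming.
 * poll_refs == 1 means we hold the sole (ownership) reference.
 */
static bool try_release_ownership(atomic_int *poll_refs)
{
	int expected = 1;

	/* Release by swapping 1 -> 0.  The swap succeeds only if nothing
	 * touched poll_refs while we were arming; any concurrent change
	 * (wake up, cancellation) makes it fail, and the caller keeps
	 * ownership and queues task_work instead.
	 */
	return atomic_compare_exchange_strong(poll_refs, &expected, 0);
}
```

Unlike a blind decrement, a failed swap leaves the counter untouched, so
there is no state in which it can be driven below zero.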
Signed-off-by: Pavel Begunkov <[email protected]>
---
io_uring/poll.c | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/io_uring/poll.c b/io_uring/poll.c
index 055632e9092a..1b78b527075d 100644
--- a/io_uring/poll.c
+++ b/io_uring/poll.c
@@ -518,7 +518,6 @@ static int __io_arm_poll_handler(struct io_kiocb *req,
unsigned issue_flags)
{
struct io_ring_ctx *ctx = req->ctx;
- int v;
INIT_HLIST_NODE(&req->hash_node);
req->work.cancel_seq = atomic_read(&ctx->cancel_seq);
@@ -586,11 +585,10 @@ static int __io_arm_poll_handler(struct io_kiocb *req,
if (ipt->owning) {
/*
- * Release ownership. If someone tried to queue a tw while it was
- * locked, kick it off for them.
+ * Try to release ownership. If we see a change of state, e.g.
+ * poll was woken up, queue up a tw, it'll deal with it.
*/
- v = atomic_dec_return(&req->poll_refs);
- if (unlikely(v & IO_POLL_REF_MASK))
+ if (atomic_cmpxchg(&req->poll_refs, 1, 0) != 1)
__io_poll_execute(req, 0);
}
return 0;
--
2.38.1
* [PATCH v3 2/2] io_uring: make poll refs more robust
2022-11-20 16:57 [PATCH v3 0/2] poll_refs armoring Pavel Begunkov
2022-11-20 16:57 ` [PATCH v3 1/2] io_uring: cmpxchg for poll arm refs release Pavel Begunkov
@ 2022-11-20 16:57 ` Pavel Begunkov
2022-11-21 17:10 ` [PATCH v3 0/2] poll_refs armoring Jens Axboe
2 siblings, 0 replies; 6+ messages in thread
From: Pavel Begunkov @ 2022-11-20 16:57 UTC (permalink / raw)
To: io-uring; +Cc: Jens Axboe, asml.silence, Lin Ma
poll_refs carries two functions: the first is ownership over the
request; the second is notifying io_poll_check_events() that there was
an event but the wake up couldn't grab the ownership, so
io_poll_check_events() should retry.

We want to make poll_refs more robust against overflows. Instead of
always incrementing it, which covers two purposes with one atomic,
check if poll_refs is elevated enough and, if so, set a retry flag
without attempting to grab ownership. The gap between the bias check
and the following atomics may seem racy, but we don't need it to be
strict. Moreover, there can be at most four parallel updates: by the
first and the second poll entries, __io_arm_poll_handler() and
cancellation. Of those four, only poll wake ups may execute multiple
times, but they're protected by a spinlock.
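The biased acquisition described above can be sketched with C11 atomics
(a userspace model; the constants mirror the patch, but the names and
the standalone helpers are illustrative, not the kernel API):

```c
#include <stdatomic.h>
#include <stdbool.h>

#define REF_MASK	((1 << 30) - 1)	/* IO_POLL_REF_MASK analogue */
#define RETRY_FLAG	(1 << 30)	/* IO_POLL_RETRY_FLAG analogue */
#define REF_BIAS	128		/* IO_POLL_REF_BIAS analogue */

static bool get_ownership_slow(atomic_int *refs)
{
	/* refs look elevated: set the retry flag instead of incrementing,
	 * so the count cannot creep toward overflow no matter how many
	 * wake ups race here.
	 */
	int v = atomic_fetch_or(refs, RETRY_FLAG);

	if (v & REF_MASK)
		return false;
	return !(atomic_fetch_add(refs, 1) & REF_MASK);
}

static bool get_ownership(atomic_int *refs)
{
	if (atomic_load(refs) >= REF_BIAS)
		return get_ownership_slow(refs);
	/* Fast path: bump the count; we own the request iff it was 0. */
	return !(atomic_fetch_add(refs, 1) & REF_MASK);
}
```

Note the slowpath still takes a reference when the masked count turned
out to be 0, so an unowned request is never left with only a flag set.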
Cc: [email protected]
Reported-by: Lin Ma <[email protected]>
Fixes: aa43477b04025 ("io_uring: poll rework")
Signed-off-by: Pavel Begunkov <[email protected]>
---
io_uring/poll.c | 36 +++++++++++++++++++++++++++++++++++-
1 file changed, 35 insertions(+), 1 deletion(-)
diff --git a/io_uring/poll.c b/io_uring/poll.c
index 1b78b527075d..b444b7d87697 100644
--- a/io_uring/poll.c
+++ b/io_uring/poll.c
@@ -40,7 +40,14 @@ struct io_poll_table {
};
#define IO_POLL_CANCEL_FLAG BIT(31)
-#define IO_POLL_REF_MASK GENMASK(30, 0)
+#define IO_POLL_RETRY_FLAG BIT(30)
+#define IO_POLL_REF_MASK GENMASK(29, 0)
+
+/*
+ * We usually have 1-2 refs taken, 128 is more than enough and we want to
+ * maximise the margin between this amount and the moment when it overflows.
+ */
+#define IO_POLL_REF_BIAS 128
#define IO_WQE_F_DOUBLE 1
@@ -58,6 +65,21 @@ static inline bool wqe_is_double(struct wait_queue_entry *wqe)
return priv & IO_WQE_F_DOUBLE;
}
+static bool io_poll_get_ownership_slowpath(struct io_kiocb *req)
+{
+ int v;
+
+ /*
+ * poll_refs are already elevated and we don't have much hope for
+ * grabbing the ownership. Instead of incrementing set a retry flag
+ * to notify the loop that there might have been some change.
+ */
+ v = atomic_fetch_or(IO_POLL_RETRY_FLAG, &req->poll_refs);
+ if (v & IO_POLL_REF_MASK)
+ return false;
+ return !(atomic_fetch_inc(&req->poll_refs) & IO_POLL_REF_MASK);
+}
+
/*
* If refs part of ->poll_refs (see IO_POLL_REF_MASK) is 0, it's free. We can
* bump it and acquire ownership. It's disallowed to modify requests while not
@@ -66,6 +88,8 @@ static inline bool wqe_is_double(struct wait_queue_entry *wqe)
*/
static inline bool io_poll_get_ownership(struct io_kiocb *req)
{
+ if (unlikely(atomic_read(&req->poll_refs) >= IO_POLL_REF_BIAS))
+ return io_poll_get_ownership_slowpath(req);
return !(atomic_fetch_inc(&req->poll_refs) & IO_POLL_REF_MASK);
}
@@ -235,6 +259,16 @@ static int io_poll_check_events(struct io_kiocb *req, bool *locked)
*/
if ((v & IO_POLL_REF_MASK) != 1)
req->cqe.res = 0;
+ if (v & IO_POLL_RETRY_FLAG) {
+ req->cqe.res = 0;
+ /*
+ * We won't find new events that came in between
+ * vfs_poll and the ref put unless we clear the flag
+ * in advance.
+ */
+ atomic_andnot(IO_POLL_RETRY_FLAG, &req->poll_refs);
+ v &= ~IO_POLL_RETRY_FLAG;
+ }
/* the mask was stashed in __io_poll_execute */
if (!req->cqe.res) {
--
2.38.1
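The retry-flag consumption added to io_poll_check_events() above can
likewise be modelled in userspace terms (the helper and its names are
illustrative, not kernel API):

```c
#include <stdatomic.h>
#include <stdbool.h>

#define RETRY_FLAG	(1 << 30)	/* IO_POLL_RETRY_FLAG analogue */

/* One step of the owner's check loop: if a waker set the retry flag
 * because it couldn't take a reference, clear the flag *before*
 * re-polling -- otherwise an event arriving between vfs_poll and the
 * final ref put would be missed -- and zero the stashed result so the
 * loop re-polls instead of trusting a stale mask.
 */
static bool consume_retry_flag(atomic_int *refs, int *v, int *res)
{
	if (!(*v & RETRY_FLAG))
		return false;
	*res = 0;
	atomic_fetch_and(refs, ~RETRY_FLAG);
	*v &= ~RETRY_FLAG;
	return true;
}
```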
* Re: [PATCH v3 0/2] poll_refs armoring
2022-11-20 16:57 [PATCH v3 0/2] poll_refs armoring Pavel Begunkov
2022-11-20 16:57 ` [PATCH v3 1/2] io_uring: cmpxchg for poll arm refs release Pavel Begunkov
2022-11-20 16:57 ` [PATCH v3 2/2] io_uring: make poll refs more robust Pavel Begunkov
@ 2022-11-21 17:10 ` Jens Axboe
2 siblings, 0 replies; 6+ messages in thread
From: Jens Axboe @ 2022-11-21 17:10 UTC (permalink / raw)
To: Pavel Begunkov, io-uring
On Sun, 20 Nov 2022 16:57:40 +0000, Pavel Begunkov wrote:
> Make poll_refs more robust and protect them from overflows. The
> mechanism is described in 2/2; 1/2 makes the second patch a little
> cleaner by funnelling all edge cases from arming into a tw.
>
> A good way to test is to set IO_POLL_REF_BIAS to 0 or 1: 0 makes the
> slowpath hit every single time, and 1 triggers it only in case of
> races.
>
> [...]
Applied, thanks!
[1/2] io_uring: cmpxchg for poll arm refs release
(no commit info)
[2/2] io_uring: make poll refs more robust
(no commit info)
Best regards,
--
Jens Axboe
* Re: [PATCH v3 1/2] io_uring: cmpxchg for poll arm refs release
2022-11-20 16:57 ` [PATCH v3 1/2] io_uring: cmpxchg for poll arm refs release Pavel Begunkov
@ 2022-11-25 13:51 ` Pavel Begunkov
2022-11-25 13:55 ` Jens Axboe
0 siblings, 1 reply; 6+ messages in thread
From: Pavel Begunkov @ 2022-11-25 13:51 UTC (permalink / raw)
To: io-uring; +Cc: Jens Axboe
On 11/20/22 16:57, Pavel Begunkov wrote:
> Replace atomically subtracting the ownership reference at the end of
> arming a poll with a cmpxchg. We try to release ownership by setting 0,
> assuming that poll_refs didn't change while we were arming. If it did
> change, we keep the ownership and use it to queue a tw, which is fully
> capable of processing all events and even tolerates spurious wake ups.
>
> It's a bit more elegant as this release reduces races b/w setting the
> cancellation flag and getting refs, and with that we don't have to
> worry about any kind of underflow. The performance difference b/w
> cmpxchg and an atomic dec is usually negligible, and this is not the
> fastest path for polling anyway.
Jens, can you add a couple of tags? It's not a fix by itself, but the
second patch depends on it while still applying cleanly without 1/2,
which may mess things up.
Cc: [email protected]
Fixes: aa43477b04025 ("io_uring: poll rework")
> Signed-off-by: Pavel Begunkov <[email protected]>
> ---
> io_uring/poll.c | 8 +++-----
> 1 file changed, 3 insertions(+), 5 deletions(-)
>
> diff --git a/io_uring/poll.c b/io_uring/poll.c
> index 055632e9092a..1b78b527075d 100644
> --- a/io_uring/poll.c
> +++ b/io_uring/poll.c
> @@ -518,7 +518,6 @@ static int __io_arm_poll_handler(struct io_kiocb *req,
> unsigned issue_flags)
> {
> struct io_ring_ctx *ctx = req->ctx;
> - int v;
>
> INIT_HLIST_NODE(&req->hash_node);
> req->work.cancel_seq = atomic_read(&ctx->cancel_seq);
> @@ -586,11 +585,10 @@ static int __io_arm_poll_handler(struct io_kiocb *req,
>
> if (ipt->owning) {
> /*
> - * Release ownership. If someone tried to queue a tw while it was
> - * locked, kick it off for them.
> + * Try to release ownership. If we see a change of state, e.g.
> + * poll was woken up, queue up a tw, it'll deal with it.
> */
> - v = atomic_dec_return(&req->poll_refs);
> - if (unlikely(v & IO_POLL_REF_MASK))
> + if (atomic_cmpxchg(&req->poll_refs, 1, 0) != 1)
> __io_poll_execute(req, 0);
> }
> return 0;
--
Pavel Begunkov
* Re: [PATCH v3 1/2] io_uring: cmpxchg for poll arm refs release
2022-11-25 13:51 ` Pavel Begunkov
@ 2022-11-25 13:55 ` Jens Axboe
0 siblings, 0 replies; 6+ messages in thread
From: Jens Axboe @ 2022-11-25 13:55 UTC (permalink / raw)
To: Pavel Begunkov, io-uring
On 11/25/22 6:51 AM, Pavel Begunkov wrote:
> On 11/20/22 16:57, Pavel Begunkov wrote:
>> Replace atomically subtracting the ownership reference at the end of
>> arming a poll with a cmpxchg. We try to release ownership by setting 0,
>> assuming that poll_refs didn't change while we were arming. If it did
>> change, we keep the ownership and use it to queue a tw, which is fully
>> capable of processing all events and even tolerates spurious wake ups.
>>
>> It's a bit more elegant as this release reduces races b/w setting the
>> cancellation flag and getting refs, and with that we don't have to
>> worry about any kind of underflow. The performance difference b/w
>> cmpxchg and an atomic dec is usually negligible, and this is not the
>> fastest path for polling anyway.
>
> Jens, can you add a couple of tags? Not a fix but the second patch
> depends on it but applies cleanly even without 1/2, which may mess
> things up.
>
> Cc: [email protected]
> Fixes: aa43477b04025 ("io_uring: poll rework")
Sure, done.
--
Jens Axboe