* [PATCH 1/1] io_uring: fix missing mb() before waitqueue_active
From: Pavel Begunkov @ 2021-09-08 19:49 UTC
To: Jens Axboe, io-uring
In case of !SQPOLL, io_cqring_ev_posted_iopoll() doesn't provide a
memory barrier required by waitqueue_active(&ctx->poll_wait). There is
a wq_has_sleeper(), which does smp_mb() inside, but it's called only for
SQPOLL.
Fixes: 5fd4617840596 ("io_uring: be smarter about waking multiple CQ ring waiters")
Signed-off-by: Pavel Begunkov <[email protected]>
---
fs/io_uring.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index d816c09c88a5..d80d8359501f 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1619,8 +1619,11 @@ static void io_cqring_ev_posted(struct io_ring_ctx *ctx)
 static void io_cqring_ev_posted_iopoll(struct io_ring_ctx *ctx)
 {
+	/* see waitqueue_active() comment */
+	smp_mb();
+
 	if (ctx->flags & IORING_SETUP_SQPOLL) {
-		if (wq_has_sleeper(&ctx->cq_wait))
+		if (waitqueue_active(&ctx->cq_wait))
 			wake_up_all(&ctx->cq_wait);
 	}
 	if (io_should_trigger_evfd(ctx))
--
2.33.0
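
For readers unfamiliar with the pairing the commit message relies on: a lockless waitqueue_active() check is only safe if the waker issues a full barrier between publishing the condition and reading the wait queue, while the waiter issues one between enqueueing itself and re-checking the condition. In the kernel, wq_has_sleeper() is smp_mb() followed by waitqueue_active(), which is why the patch can hoist a single smp_mb() and use the plain check. Below is a minimal userspace sketch of that pattern using C11 atomics. It is a model, not kernel code: struct model_wq, model_prepare_to_wait() and model_post_event() are hypothetical names, and atomic_thread_fence(memory_order_seq_cst) merely stands in for smp_mb().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel API. */
struct model_wq {
	atomic_int nr_waiters;	/* stands in for "wait queue is non-empty" */
	atomic_int cond;	/* stands in for "a completion was posted" */
	atomic_int wakeups;	/* counts wake-ups the poster decided to issue */
};

/*
 * Waiter side: enqueue, full barrier, then re-check the condition.
 * Returns true if the waiter still has to sleep.
 */
static bool model_prepare_to_wait(struct model_wq *wq)
{
	atomic_fetch_add_explicit(&wq->nr_waiters, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&wq->cond, memory_order_relaxed) == 0;
}

/*
 * Waker side: publish the condition, full barrier, then do the lockless
 * "is anyone waiting?" check. Without the fence, the load of nr_waiters
 * could be ordered before the store to cond, and both sides could miss
 * each other (a lost wake-up).
 */
static void model_post_event(struct model_wq *wq)
{
	atomic_store_explicit(&wq->cond, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* role of smp_mb() */
	if (atomic_load_explicit(&wq->nr_waiters, memory_order_relaxed) > 0)
		atomic_fetch_add_explicit(&wq->wakeups, 1, memory_order_relaxed);
}
```

With both fences in place, at least one side observes the other's store: either the waiter sees the condition already set and does not sleep, or the poster sees the waiter and wakes it.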
* Re: [PATCH 1/1] io_uring: fix missing mb() before waitqueue_active
From: Jens Axboe @ 2021-09-08 19:57 UTC
To: Pavel Begunkov, io-uring
On 9/8/21 1:49 PM, Pavel Begunkov wrote:
> In case of !SQPOLL, io_cqring_ev_posted_iopoll() doesn't provide a
> memory barrier required by waitqueue_active(&ctx->poll_wait). There is
> a wq_has_sleeper(), which does smp_mb() inside, but it's called only for
> SQPOLL.
We can probably get rid of the need to even do so by having the slow
path (eg someone waiting on cq_wait or poll_wait) a bit more expensive,
but this should do for now.
--
Jens Axboe
* Re: [PATCH 1/1] io_uring: fix missing mb() before waitqueue_active
From: Pavel Begunkov @ 2021-09-08 20:09 UTC
To: Jens Axboe, io-uring
On 9/8/21 8:57 PM, Jens Axboe wrote:
> On 9/8/21 1:49 PM, Pavel Begunkov wrote:
>> In case of !SQPOLL, io_cqring_ev_posted_iopoll() doesn't provide a
>> memory barrier required by waitqueue_active(&ctx->poll_wait). There is
>> a wq_has_sleeper(), which does smp_mb() inside, but it's called only for
>> SQPOLL.
>
> We can probably get rid of the need to even do so by having the slow
> path (eg someone waiting on cq_wait or poll_wait) a bit more expensive,
> but this should do for now.
You have probably seen smp_mb__after_spin_unlock() trick [1], easy way
to get rid of it for !IOPOLL. Haven't figured it out for IOPOLL, though
[1] https://github.com/isilence/linux/commit/bb391b10d0555ba2d55aa8ee0a08dff8701a6a57
--
Pavel Begunkov
* Re: [PATCH 1/1] io_uring: fix missing mb() before waitqueue_active
From: Jens Axboe @ 2021-09-08 20:15 UTC
To: Pavel Begunkov, io-uring
On 9/8/21 2:09 PM, Pavel Begunkov wrote:
> On 9/8/21 8:57 PM, Jens Axboe wrote:
>> On 9/8/21 1:49 PM, Pavel Begunkov wrote:
>>> In case of !SQPOLL, io_cqring_ev_posted_iopoll() doesn't provide a
>>> memory barrier required by waitqueue_active(&ctx->poll_wait). There is
>>> a wq_has_sleeper(), which does smp_mb() inside, but it's called only for
>>> SQPOLL.
>>
>> We can probably get rid of the need to even do so by having the slow
>> path (eg someone waiting on cq_wait or poll_wait) a bit more expensive,
>> but this should do for now.
>
> You have probably seen smp_mb__after_spin_unlock() trick [1], easy way
> to get rid of it for !IOPOLL. Haven't figured it out for IOPOLL, though
>
> [1] https://github.com/isilence/linux/commit/bb391b10d0555ba2d55aa8ee0a08dff8701a6a57
We can just synchronize the poll_wait() with a spinlock. It's kind of silly,
and it's especially silly since I bet nobody does poll(2) on the ring fd for
IOPOLL, but...
--
Jens Axboe
* Re: [PATCH 1/1] io_uring: fix missing mb() before waitqueue_active
From: Pavel Begunkov @ 2021-09-08 20:22 UTC
To: Jens Axboe, io-uring
On 9/8/21 9:15 PM, Jens Axboe wrote:
> On 9/8/21 2:09 PM, Pavel Begunkov wrote:
>> On 9/8/21 8:57 PM, Jens Axboe wrote:
>>> On 9/8/21 1:49 PM, Pavel Begunkov wrote:
>>>> In case of !SQPOLL, io_cqring_ev_posted_iopoll() doesn't provide a
>>>> memory barrier required by waitqueue_active(&ctx->poll_wait). There is
>>>> a wq_has_sleeper(), which does smp_mb() inside, but it's called only for
>>>> SQPOLL.
>>>
>>> We can probably get rid of the need to even do so by having the slow
>>> path (eg someone waiting on cq_wait or poll_wait) a bit more expensive,
>>> but this should do for now.
>>
>> You have probably seen smp_mb__after_spin_unlock() trick [1], easy way
>> to get rid of it for !IOPOLL. Haven't figured it out for IOPOLL, though
>>
>> [1] https://github.com/isilence/linux/commit/bb391b10d0555ba2d55aa8ee0a08dff8701a6a57
>
> We can just synchronize the poll_wait() with a spinlock. It's kind of silly,
> and it's especially silly since I bet nobody does poll(2) on the ring fd for
> IOPOLL, but...
fwiw, for the ebpf cat, ev_posted() -> smp_mb() was taking ~3-5%.
And there are non-bpf cases that may benefit from it.
On my list to publish a refined version of the patch.
--
Pavel Begunkov
* Re: [PATCH 1/1] io_uring: fix missing mb() before waitqueue_active
From: Jens Axboe @ 2021-09-08 20:24 UTC
To: Pavel Begunkov, io-uring
On 9/8/21 2:22 PM, Pavel Begunkov wrote:
> On 9/8/21 9:15 PM, Jens Axboe wrote:
>> On 9/8/21 2:09 PM, Pavel Begunkov wrote:
>>> On 9/8/21 8:57 PM, Jens Axboe wrote:
>>>> On 9/8/21 1:49 PM, Pavel Begunkov wrote:
>>>>> In case of !SQPOLL, io_cqring_ev_posted_iopoll() doesn't provide a
>>>>> memory barrier required by waitqueue_active(&ctx->poll_wait). There is
>>>>> a wq_has_sleeper(), which does smp_mb() inside, but it's called only for
>>>>> SQPOLL.
>>>>
>>>> We can probably get rid of the need to even do so by having the slow
>>>> path (eg someone waiting on cq_wait or poll_wait) a bit more expensive,
>>>> but this should do for now.
>>>
>>> You have probably seen smp_mb__after_spin_unlock() trick [1], easy way
>>> to get rid of it for !IOPOLL. Haven't figured it out for IOPOLL, though
>>>
>>> [1] https://github.com/isilence/linux/commit/bb391b10d0555ba2d55aa8ee0a08dff8701a6a57
>>
>> We can just synchronize the poll_wait() with a spinlock. It's kind of silly,
>> and it's especially silly since I bet nobody does poll(2) on the ring fd for
>> IOPOLL, but...
>
> fwiw, for the ebpf cat, ev_posted() -> smp_mb() was taking ~3-5%.
> And there are non-bpf cases that may benefit from it.
>
> On my list to publish a refined version of the patch.
Maybe let's postpone this patch then and see if we can't do better...
--
Jens Axboe