From: Jens Axboe <[email protected]>
To: Pavel Begunkov <[email protected]>, [email protected]
Cc: Dylan Yudaken <[email protected]>
Subject: Re: [PATCH for-next] io_uring: fix CQE reordering
Date: Fri, 23 Sep 2022 08:35:17 -0600
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>
On 9/23/22 8:26 AM, Pavel Begunkov wrote:
> On 9/23/22 15:19, Jens Axboe wrote:
>> On 9/23/22 7:53 AM, Pavel Begunkov wrote:
>>> Overflowing CQEs may result in reordering, which is buggy in case of
>>> links, F_MORE and so on.
>>>
>>> Reported-by: Dylan Yudaken <[email protected]>
>>> Signed-off-by: Pavel Begunkov <[email protected]>
>>> ---
>>> io_uring/io_uring.c | 12 ++++++++++--
>>> io_uring/io_uring.h | 12 +++++++++---
>>> 2 files changed, 19 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
>>> index f359e24b46c3..62d1f55fde55 100644
>>> --- a/io_uring/io_uring.c
>>> +++ b/io_uring/io_uring.c
>>> @@ -609,7 +609,7 @@ static bool __io_cqring_overflow_flush(struct io_ring_ctx *ctx, bool force)
>>>  	io_cq_lock(ctx);
>>>  	while (!list_empty(&ctx->cq_overflow_list)) {
>>> -		struct io_uring_cqe *cqe = io_get_cqe(ctx);
>>> +		struct io_uring_cqe *cqe = io_get_cqe_overflow(ctx, true);
>>>  		struct io_overflow_cqe *ocqe;
>>>  		if (!cqe && !force)
>>> @@ -736,12 +736,19 @@ bool io_req_cqe_overflow(struct io_kiocb *req)
>>>   * control dependency is enough as we're using WRITE_ONCE to
>>>   * fill the cq entry
>>>   */
>>> -struct io_uring_cqe *__io_get_cqe(struct io_ring_ctx *ctx)
>>> +struct io_uring_cqe *__io_get_cqe(struct io_ring_ctx *ctx, bool overflow)
>>>  {
>>>  	struct io_rings *rings = ctx->rings;
>>>  	unsigned int off = ctx->cached_cq_tail & (ctx->cq_entries - 1);
>>>  	unsigned int free, queued, len;
>>> +	/*
>>> +	 * Posting into the CQ when there are pending overflowed CQEs may break
>>> +	 * ordering guarantees, which will affect links, F_MORE users and more.
>>> +	 * Force overflow the completion.
>>> +	 */
>>> +	if (!overflow && (ctx->check_cq & BIT(IO_CHECK_CQ_OVERFLOW_BIT)))
>>> +		return NULL;
>>
>> Rather than pass this bool around for the hot path, why not add a helper
>> for the case where 'overflow' isn't known? That can leave the regular
>> io_get_cqe() avoiding this altogether.
>
> I was choosing between two ugly-ish solutions. io_get_cqe() should be
> inline, so the bool shouldn't really matter, though that's only the
> case in theory. If someone cleans up the CQE32 part and puts it into a
> separate non-inline function, it'll actually be inlined.

Yes, in theory the current one will be fine as it's known at compile
time. In theory... I didn't check if practice agrees with that, and I'd
prefer if we didn't leave this to the compiler. Fiddling with some other
bits, will check in a bit if I have a better idea.

--
Jens Axboe
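
For readers following along, here is a rough user-space sketch of the
ordering rule the patch enforces. It is not the kernel code: every toy_*
name, the 4-entry ring and the fixed-size overflow array are assumptions
made up for illustration. The regular path refuses a CQ slot while
overflowed entries are pending, and only the flush path passes
overflow=true. Whether a compiler really folds the constant bool away in
the inline wrapper is exactly the open question above, and this sketch
does not settle it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_CQ_ENTRIES 4u	/* power of two, so masking works */
#define TOY_OVERFLOW_MAX 16u

/* Hypothetical stand-in for struct io_ring_ctx. */
struct toy_ctx {
	int cq[TOY_CQ_ENTRIES];
	unsigned int head, tail;
	bool overflow_pending;	/* models IO_CHECK_CQ_OVERFLOW_BIT */
	int overflow[TOY_OVERFLOW_MAX];
	unsigned int nr_overflow;
};

/* Return a free CQ slot, or NULL if the ring is full. */
static int *toy_get_cqe_slow(struct toy_ctx *ctx)
{
	if (ctx->tail - ctx->head == TOY_CQ_ENTRIES)
		return NULL;
	return &ctx->cq[ctx->tail++ & (TOY_CQ_ENTRIES - 1)];
}

/*
 * Callers pass a compile-time constant for 'overflow'. Only the flush
 * path (overflow == true) may post while overflowed entries are pending.
 */
static inline int *toy_get_cqe_overflow(struct toy_ctx *ctx, bool overflow)
{
	if (!overflow && ctx->overflow_pending)
		return NULL;
	return toy_get_cqe_slow(ctx);
}

/* Regular posting path: never bypasses pending overflow. */
static inline int *toy_get_cqe(struct toy_ctx *ctx)
{
	return toy_get_cqe_overflow(ctx, false);
}

/* Post a completion, falling back to the overflow list. */
static void toy_post(struct toy_ctx *ctx, int val)
{
	int *cqe = toy_get_cqe(ctx);

	if (cqe) {
		*cqe = val;
		return;
	}
	assert(ctx->nr_overflow < TOY_OVERFLOW_MAX);
	ctx->overflow[ctx->nr_overflow++] = val;
	ctx->overflow_pending = true;
}

/* Move as many overflowed completions into the ring as will fit. */
static void toy_flush(struct toy_ctx *ctx)
{
	unsigned int done = 0, left;

	while (done < ctx->nr_overflow) {
		int *cqe = toy_get_cqe_overflow(ctx, true);

		if (!cqe)
			break;
		*cqe = ctx->overflow[done++];
	}
	left = ctx->nr_overflow - done;
	memmove(ctx->overflow, ctx->overflow + done, left * sizeof(int));
	ctx->nr_overflow = left;
	ctx->overflow_pending = left != 0;
}

/* Consume the oldest completion from the ring. */
static int toy_reap(struct toy_ctx *ctx)
{
	assert(ctx->head != ctx->tail);
	return ctx->cq[ctx->head++ & (TOY_CQ_ENTRIES - 1)];
}
```

With a zero-initialized toy_ctx, posting 1..6 leaves 5 and 6 on the
overflow list. If the consumer then reaps one entry and 7 is posted, 7
is queued behind 5 and 6 rather than taking the freed ring slot, so
alternating toy_flush() and toy_reap() yields the completions in
posting order.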