public inbox for [email protected]
From: Pavel Begunkov <[email protected]>
To: Jens Axboe <[email protected]>, [email protected]
Cc: [email protected], Kanchan Joshi <[email protected]>,
	Ming Lei <[email protected]>
Subject: Re: [PATCH 11/11] io_uring: get rid of intermediate aux cqe caches
Date: Fri, 15 Mar 2024 17:26:43 +0000
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>

On 3/15/24 16:49, Jens Axboe wrote:
> On 3/15/24 10:44 AM, Pavel Begunkov wrote:
>> On 3/15/24 16:27, Jens Axboe wrote:
>>> On 3/15/24 10:25 AM, Jens Axboe wrote:
>>>> On 3/15/24 10:23 AM, Pavel Begunkov wrote:
>>>>> On 3/15/24 16:20, Jens Axboe wrote:
>>>>>> On 3/15/24 9:30 AM, Pavel Begunkov wrote:
>>>>>>> io_post_aux_cqe(), which is used for multishot requests, delays
>>>>>>> completions by putting CQEs into a temporary array for the purpose
>>>>>>> of completion lock/flush batching.
>>>>>>>
>>>>>>> DEFER_TASKRUN doesn't need any locking, so for it we can put completions
>>>>>>> directly into the CQ and defer post completion handling with a flag.
>>>>>>> That leaves !DEFER_TASKRUN, which is not that interesting / hot for
>>>>>>> multishot requests, so have conditional locking with deferred flush
>>>>>>> for them.
>>>>>>
>>>>>> This breaks the read-mshot test case, looking into what is going on
>>>>>> there.
>>>>>
>>>>> I forgot to mention, yes it does, the test makes odd assumptions about
>>>>> overflows, IIRC it expects that the kernel allows one and only one aux
>>>>> CQE to be overflown. Let me double check
>>>>
>>>> Yeah this is very possible, the overflow checking could be broken in
>>>> there. I'll poke at it and report back.
>>>
>>> It does, this should fix it:
>>>
>>>
>>> diff --git a/test/read-mshot.c b/test/read-mshot.c
>>> index 8fcb79857bf0..501ca69a98dc 100644
>>> --- a/test/read-mshot.c
>>> +++ b/test/read-mshot.c
>>> @@ -236,7 +236,7 @@ static int test(int first_good, int async, int overflow)
>>>            }
>>>            if (!(cqe->flags & IORING_CQE_F_MORE)) {
>>>                /* we expect this on overflow */
>>> -            if (overflow && (i - 1 == NR_OVERFLOW))
>>> +            if (overflow && i >= NR_OVERFLOW)
>>
>> Which is not ideal either, e.g. I wouldn't mind if the kernel stops
>> one entry before the CQ is full, so that the request can complete w/o
>> overflowing. Not proposing the change because it's a marginal
>> case, but we shouldn't limit ourselves.
> 
> But if the event keeps triggering we have to keep posting CQEs,
> otherwise we could get stuck. 

Or we can complete the request, then the user consumes CQEs
and restarts it as usual.

> As far as I'm concerned, the behavior with
> the patch looks correct. The last CQE is overflown, and that terminates
> it, and it doesn't have MORE set. The one before that has MORE set, but
> it has to, unless you aborted it early. But that seems impossible,
> because what if that was indeed the last current CQE, and we reap CQEs
> before the next one is posted.
> 
> So unless I'm missing something, I don't think we can be doing any
> better.

You can opportunistically (if unreliably) try to avoid overflows:

bool io_post_cqe(res) {
	// Not enough space left in the CQ, so if there is another
	// completion pending we'd have to overflow. Avoid that by
	// terminating the request now.
	//
	// If there are no more CQEs after this one, we might
	// terminate a bit earlier than necessary, but that's better
	// because overflows are so expensive and unwieldy.
	if (cq_space_left() <= 1)
		return false;
	fill_cqe(res);
	return true;
}

some_multishot_function(req) {
	if (!io_post_cqe(res))
		complete_req(req, res);
}

Again, not suggesting the change for all the obvious reasons, but
I think semantically we should be able to do it.
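
To make the "stop one entry early" idea concrete, here is a minimal
user-space sketch in C. It is a toy model and not kernel code: struct
toy_cq, toy_post_aux_cqe(), toy_multishot_run() and the `more` field
(standing in for IORING_CQE_F_MORE) are all hypothetical names, the ring
has no overflow list, and the sketch assumes at least one free slot when
the request starts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy CQ; this is NOT the kernel's ring layout. */
#define TOY_CQ_ENTRIES 8u

struct toy_cqe {
	int32_t res;
	bool more;	/* stands in for IORING_CQE_F_MORE */
};

struct toy_cq {
	struct toy_cqe cqes[TOY_CQ_ENTRIES];
	uint32_t head;	/* advanced by the consumer */
	uint32_t tail;	/* advanced by the producer */
};

static uint32_t toy_cq_space_left(const struct toy_cq *cq)
{
	return TOY_CQ_ENTRIES - (cq->tail - cq->head);
}

static void toy_fill_cqe(struct toy_cq *cq, int32_t res, bool more)
{
	/* Assumption of this sketch: callers never fill a full ring. */
	assert(toy_cq_space_left(cq) > 0);
	cq->cqes[cq->tail % TOY_CQ_ENTRIES] =
		(struct toy_cqe){ .res = res, .more = more };
	cq->tail++;
}

/*
 * Post an aux CQE with "more" set, unless only one slot is left.
 * That last slot is kept for the terminating CQE, so the request
 * can finish without overflowing.
 */
static bool toy_post_aux_cqe(struct toy_cq *cq, int32_t res)
{
	if (toy_cq_space_left(cq) <= 1)
		return false;
	toy_fill_cqe(cq, res, true);
	return true;
}

/*
 * Simulate one multishot request that observes nr_events events.
 * Returns the number of CQEs it posted, including the terminating one.
 */
static uint32_t toy_multishot_run(struct toy_cq *cq, uint32_t nr_events)
{
	uint32_t posted = 0;

	for (uint32_t i = 0; i < nr_events; i++) {
		if (!toy_post_aux_cqe(cq, (int32_t)i)) {
			/* Terminate early: final CQE goes out without "more". */
			toy_fill_cqe(cq, (int32_t)i, false);
			return posted + 1;
		}
		posted++;
	}
	return posted;
}
```

Calling toy_multishot_run() on an empty ring with more events than
entries fills the ring exactly, with the last CQE lacking `more`; the
user would then reap CQEs and re-arm the request. The cost is the one
mentioned in the comment above: if the event stream happened to end right
there, the request was terminated one CQE earlier than strictly needed.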

-- 
Pavel Begunkov
