public inbox for [email protected]
From: Avi Kivity <[email protected]>
To: Pavel Begunkov <[email protected]>,
	Jens Axboe <[email protected]>,
	[email protected]
Subject: Re: IORING_OP_POLL_ADD slower than linux-aio IOCB_CMD_POLL
Date: Wed, 15 Jun 2022 14:36:48 +0300	[thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>


On 15/06/2022 14.30, Pavel Begunkov wrote:
> On 6/15/22 12:04, Avi Kivity wrote:
>>
>> On 15/06/2022 13.48, Pavel Begunkov wrote:
>>> On 6/15/22 11:12, Avi Kivity wrote:
>>>>
>>>> On 19/04/2022 20.14, Jens Axboe wrote:
>>>>> On 4/19/22 9:21 AM, Jens Axboe wrote:
>>>>>> On 4/19/22 6:31 AM, Jens Axboe wrote:
>>>>>>> On 4/19/22 6:21 AM, Avi Kivity wrote:
>>>>>>>> On 19/04/2022 15.04, Jens Axboe wrote:
>>>>>>>>> On 4/19/22 5:57 AM, Avi Kivity wrote:
>>>>>>>>>> On 19/04/2022 14.38, Jens Axboe wrote:
>>>>>>>>>>> On 4/19/22 5:07 AM, Avi Kivity wrote:
>>>>>>>>>>>> A simple webserver shows about 5% loss compared to linux-aio.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I expect the loss is due to an optimization that io_uring 
>>>>>>>>>>>> lacks -
>>>>>>>>>>>> inline completion vs workqueue completion:
>>>>>>>>>>> I don't think that's it, io_uring never punts to a workqueue 
>>>>>>>>>>> for
>>>>>>>>>>> completions.
>>>>>>>>>> I measured this:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>    Performance counter stats for 'system wide':
>>>>>>>>>>
>>>>>>>>>>            1,273,756 io_uring:io_uring_task_add
>>>>>>>>>>
>>>>>>>>>>         12.288597765 seconds time elapsed
>>>>>>>>>>
>>>>>>>>>> Which exactly matches with the number of requests sent. If 
>>>>>>>>>> that's the
>>>>>>>>>> wrong counter to measure, I'm happy to try again with the 
>>>>>>>>>> correct
>>>>>>>>>> counter.
>>>>>>>>> io_uring_task_add() isn't a workqueue, it's task_work. So that is
>>>>>>>>> expected.
>>>>>> Might actually be implicated. Not because it's an async worker, but
>>>>>> because I think we might be losing some affinity in this case.
>>>>>> Looking at traces, we're definitely bouncing between the poll
>>>>>> completion side and then executing the completion.
>>>>>>
>>>>>> Can you try this hack? It's against -git + for-5.19/io_uring. If 
>>>>>> you let
>>>>>> me know what base you prefer, I can do a version against that. I see
>>>>>> about a 3% win with io_uring with this, and was slower before 
>>>>>> against
>>>>>> linux-aio as you saw as well.
>>>>> Another thing to try - get rid of the IPI for TWA_SIGNAL, which I
>>>>> believe may be the underlying cause of it.
>>>>>
>>>>
>>>> Resurrecting an old thread. I have a question about timeliness of 
>>>> completions. Let's assume a request has completed. From the patch, 
>>>> it appears that io_uring will only guarantee that a completion 
>>>> appears on the completion ring if the thread has entered kernel 
>>>> mode since the completion happened. So user-space polling of the 
>>>> completion ring can cause unbounded delays.
>>>
>>> Right, but polling the CQ is a bad pattern, 
>>> io_uring_{wait,peek}_cqe/etc.
>>> will do the polling vs syscalling dance for you.
>>
>>
>> Can you be more explicit?
>>
>>
>> I don't think peek is enough. If there is a cqe pending, it will 
>> return it, but will not cause completed-but-unqueued events to 
>> generate completions.
>>
>>
>> And wait won't enter the kernel if a cqe is pending, IIUC.
>
> Right, usually it won't, but works if you eventually end up
> waiting, e.g. by waiting for all expected cqes.
>
>
>>> For larger audience, I'll remind that it's an opt-in feature
>>>
>>
>> I don't understand - what is an opt-in feature?
>
> The behaviour that you worry about when CQEs are not posted until
> you do syscall, it's only so if you set IORING_SETUP_COOP_TASKRUN.
>

Ah! I wasn't aware of this new flag. This is exactly what I want - 
either ask for timely completions, or optimize for throughput.


Of course, it puts me in a dilemma because I want both, but that's my 
problem.


Thanks!




Thread overview: 23+ messages
2022-04-19 11:07 IORING_OP_POLL_ADD slower than linux-aio IOCB_CMD_POLL Avi Kivity
2022-04-19 11:38 ` Jens Axboe
2022-04-19 11:57   ` Avi Kivity
2022-04-19 12:04     ` Jens Axboe
2022-04-19 12:21       ` Avi Kivity
2022-04-19 12:31         ` Jens Axboe
2022-04-19 15:21           ` Jens Axboe
2022-04-19 15:51             ` Avi Kivity
2022-04-19 17:14             ` Jens Axboe
2022-04-19 19:41               ` Avi Kivity
2022-04-19 19:58                 ` Jens Axboe
2022-04-20 11:55                   ` Avi Kivity
2022-04-20 12:09                     ` Jens Axboe
2022-04-21  9:05                       ` Avi Kivity
2022-06-15 10:12               ` Avi Kivity
2022-06-15 10:48                 ` Pavel Begunkov
2022-06-15 11:04                   ` Avi Kivity
2022-06-15 11:07                     ` Avi Kivity
2022-06-15 11:38                       ` Pavel Begunkov
2022-06-15 12:21                         ` Jens Axboe
2022-06-15 13:43                           ` Avi Kivity
2022-06-15 11:30                     ` Pavel Begunkov
2022-06-15 11:36                       ` Avi Kivity [this message]
