public inbox for [email protected]
 help / color / mirror / Atom feed
From: Jens Axboe <[email protected]>
To: Pavel Begunkov <[email protected]>, [email protected]
Subject: Re: [PATCH 4/4] io_uring: optimise compl locking for non-shared rings
Date: Fri, 18 Mar 2022 09:21:08 -0600	[thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>

On 3/18/22 9:13 AM, Pavel Begunkov wrote:
> On 3/18/22 14:54, Jens Axboe wrote:
>> On 3/18/22 7:52 AM, Pavel Begunkov wrote:
>>> When only one task submits requests, most of CQEs are expected to be
>>> filled from that task context so we have natural serialisation. That
>>> would mean that in those cases we don't need spinlocking around CQE
>>> posting. One downside is that it also mean that io-wq workers can't emit
>>> CQEs directly but should do it through the original task context using
>>> task_works. That may hurt latency and performance and might matter much
>>> to some workloads, but it's not a huge deal in general as io-wq is a
>>> slow path and there is some additional merit from tw completion
>>> batching.
>>
>> Not too worried about io-wq task_work for cq filling, it is the slower
>> path after all. And I think we can get away with doing notifications as
>> it's just for CQ filling. If the task is currently waiting in
>> cqring_wait, then it'll get woken anyway and it will process task work.
>> If it's in userspace, it doesn't need a notification. That should make
>> it somewhat lighter than requiring using TIF_NOTIFY_SIGNAL for that.
>>
>>> The feature should be opted-in by the userspace by setting a new
>>> IORING_SETUP_PRIVATE_CQ flag. It doesn't work with IOPOLL, and also for
>>> now only the task that created a ring can submit requests to it.
>>
>> I know this is a WIP, but why do we need CQ_PRIVATE? And this needs to
> 
> One reason is because of the io-wq -> tw punting, which is not optimal
> for e.g. active users of IOSQE_ASYNC. The second is because the
> fundamental requirement is that only one task should be submitting
> requests. Was thinking about automating it, e.g. when we register
> a second tctx we go through a slow path waiting for all current tw
> to complete and then removing an internal and not userspace visible
> CQ_PRIVATE flag.

Was thinking something along those lines too. The alternative is setting
up the ring with SETUP_SINGLE_ISSUER or something like that, having the
application tell us that it is a single issuer and no submits are
shared across threads. Serves the same kind of purpose as CQ_PRIVATE,
but enables us to simply fail things if the task violates those
constraints. Would also be a better name I believe as it might enable
further optimizations in the future, like for example the mutex
reduction for submits.

> Also, as SQPOLL task is by definition the only one submitting SQEs,
> was thinking about enabling it by default for them, but didn't do
> because of the io-wq / IOSQE_ASYNC.

Gotcha.

>> work with registered files (and ring fd) as that is probably a bigger
>> win than skipping the completion_lock if you're not shared anyway.
> 
> It does work with fixed/registered files and registered io_uring fds.

t/io_uring fails for me with registered files or rings, getting EINVAL.
Might be user error, but that's simply just setting CQ_PRIVATE for
setup.

> In regards of "a bigger win", probably in many cases, but if you submit
> a good batch at once, and completion tw batching doesn't kick in (e.g.
> direct bdev read of not too high intensity), it might save
> N spinlock/unlock when registered ring fd would kill only one pair of
> fdget/fdput.

Definitely, various cases where one would be a bigger win than the
other, agree on that. But let's just ensure that both work together :-)

-- 
Jens Axboe


  reply	other threads:[~2022-03-18 15:21 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-18 13:52 [RFC 0/4] completion locking optimisation feature Pavel Begunkov
2022-03-18 13:52 ` [PATCH 1/4] io_uring: get rid of raw fill cqe in kill_timeout Pavel Begunkov
2022-03-18 13:52 ` [PATCH 2/4] io_uring: get rid of raw fill_cqe in io_fail_links Pavel Begunkov
2022-03-18 13:52 ` [PATCH 3/4] io_uring: remove raw fill_cqe from linked timeout Pavel Begunkov
2022-03-18 13:52 ` [PATCH 4/4] io_uring: optimise compl locking for non-shared rings Pavel Begunkov
2022-03-18 14:54   ` Jens Axboe
2022-03-18 15:13     ` Pavel Begunkov
2022-03-18 15:21       ` Jens Axboe [this message]
2022-03-18 15:32         ` Pavel Begunkov
2022-03-18 16:06           ` Jens Axboe
2022-03-18 14:42 ` [RFC 0/4] completion locking optimisation feature Pavel Begunkov
2022-03-18 14:52   ` Jens Axboe
2022-03-18 15:00     ` Pavel Begunkov
2022-03-18 15:22       ` Jens Axboe
2022-03-18 15:34         ` Pavel Begunkov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox