From: Jens Axboe <[email protected]>
To: Pavel Begunkov <[email protected]>, [email protected]
Subject: Re: [PATCH 4/4] io_uring: optimise compl locking for non-shared rings
Date: Fri, 18 Mar 2022 09:21:08 -0600 [thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
On 3/18/22 9:13 AM, Pavel Begunkov wrote:
> On 3/18/22 14:54, Jens Axboe wrote:
>> On 3/18/22 7:52 AM, Pavel Begunkov wrote:
>>> When only one task submits requests, most of CQEs are expected to be
>>> filled from that task context so we have natural serialisation. That
>>> would mean that in those cases we don't need spinlocking around CQE
>>> posting. One downside is that it also mean that io-wq workers can't emit
>>> CQEs directly but should do it through the original task context using
>>> task_works. That may hurt latency and performance and might matter much
>>> to some workloads, but it's not a huge deal in general as io-wq is a
>>> slow path and there is some additional merit from tw completion
>>> batching.
>>
>> Not too worried about io-wq task_work for cq filling, it is the slower
>> path after all. And I think we can get away with doing notifications as
>> it's just for CQ filling. If the task is currently waiting in
>> cqring_wait, then it'll get woken anyway and it will process task work.
>> If it's in userspace, it doesn't need a notification. That should make
>> it somewhat lighter than requiring using TIF_NOTIFY_SIGNAL for that.
>>
>>> The feature should be opted-in by the userspace by setting a new
>>> IORING_SETUP_PRIVATE_CQ flag. It doesn't work with IOPOLL, and also for
>>> now only the task that created a ring can submit requests to it.
>>
>> I know this is a WIP, but why do we need CQ_PRIVATE? And this needs to
>
> One reason is because of the io-wq -> tw punting, which is not optimal
> for e.g. active users of IOSQE_ASYNC. The second is because the
> fundamental requirement is that only one task should be submitting
> requests. Was thinking about automating it, e.g. when we register
> a second tctx we go through a slow path waiting for all current tw
> to complete and then removing an internal and not userspace visible
> CQ_PRIVATE flag.
Was thinking something along those lines too. The alternative is setting
up the ring with SETUP_SINGLE_ISSUER or something like that, having the
application tell us that it is a single issuer and no submits are
shared across threads. Serves the same kind of purpose as CQ_PRIVATE,
but enables us to simply fail things if the task violates those
constraints. Would also be a better name I believe as it might enable
further optimizations in the future, like for example the mutex
reduction for submits.
> Also, as SQPOLL task is by definition the only one submitting SQEs,
> was thinking about enabling it by default for them, but didn't do
> because of the io-wq / IOSQE_ASYNC.
Gotcha.
>> work with registered files (and ring fd) as that is probably a bigger
>> win than skipping the completion_lock if you're not shared anyway.
>
> It does work with fixed/registered files and registered io_uring fds.
t/io_uring fails for me with registered files or rings, getting EINVAL.
Might be user error, but that's simply just setting CQ_PRIVATE for
setup.
> In regards of "a bigger win", probably in many cases, but if you submit
> a good batch at once, and completion tw batching doesn't kick in (e.g.
> direct bdev read of not too high intensity), it might save
> N spinlock/unlock when registered ring fd would kill only one pair of
> fdget/fdput.
Definitely, various cases where one would be a bigger win than the
other, agree on that. But let's just ensure that both work together :-)
--
Jens Axboe
next prev parent reply other threads:[~2022-03-18 15:21 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-03-18 13:52 [RFC 0/4] completion locking optimisation feature Pavel Begunkov
2022-03-18 13:52 ` [PATCH 1/4] io_uring: get rid of raw fill cqe in kill_timeout Pavel Begunkov
2022-03-18 13:52 ` [PATCH 2/4] io_uring: get rid of raw fill_cqe in io_fail_links Pavel Begunkov
2022-03-18 13:52 ` [PATCH 3/4] io_uring: remove raw fill_cqe from linked timeout Pavel Begunkov
2022-03-18 13:52 ` [PATCH 4/4] io_uring: optimise compl locking for non-shared rings Pavel Begunkov
2022-03-18 14:54 ` Jens Axboe
2022-03-18 15:13 ` Pavel Begunkov
2022-03-18 15:21 ` Jens Axboe [this message]
2022-03-18 15:32 ` Pavel Begunkov
2022-03-18 16:06 ` Jens Axboe
2022-03-18 14:42 ` [RFC 0/4] completion locking optimisation feature Pavel Begunkov
2022-03-18 14:52 ` Jens Axboe
2022-03-18 15:00 ` Pavel Begunkov
2022-03-18 15:22 ` Jens Axboe
2022-03-18 15:34 ` Pavel Begunkov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox