From: Jens Axboe <[email protected]>
To: Jann Horn <[email protected]>
Cc: io-uring <[email protected]>
Subject: Re: [PATCH RFC] io_uring/rsrc: add last-lookup cache hit to io_rsrc_node_lookup()
Date: Wed, 30 Oct 2024 14:52:46 -0600
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>
On 10/30/24 2:25 PM, Jens Axboe wrote:
> On 10/30/24 11:20 AM, Jann Horn wrote:
>> On Wed, Oct 30, 2024 at 5:58 PM Jens Axboe <[email protected]> wrote:
>>> This avoids array_index_nospec() for repeated lookups on the same node,
>>> which can be quite common (and costly). If a cached node is removed from
>>
>> You're saying array_index_nospec() can be quite costly - which
>> architecture is this on? Is this the cost of the compare+subtract+and
>> making the critical path longer?
>
> Tested this on arm64, in a VM to be specific. Let me try and generate
> some numbers/profiles on x86-64 as well. It's noticeable there as well,
> though not quite as bad as the below example. For arm64, with the patch,
> we get roughly 8.7% of the time spent getting a resource - without it,
> it's 66% of the time. This is just doing a microbenchmark, but it clearly
> shows that anything following the barrier on arm64 is very costly:
>
>   0.98 │      ldr    x21, [x0, #96]
>        │    ↓ tbnz   w2, #1, b8
>   1.04 │      ldr    w1, [x21, #144]
>        │      cmp    w1, w19
>        │    ↓ b.ls   a0
>        │30:   mov    w1, w1
>        │      sxtw   x0, w19
>        │      cmp    x0, x1
>        │      ngc    x0, xzr
>        │      csdb
>        │      ldr    x1, [x21, #160]
>        │      and    w19, w19, w0
>  93.98 │      ldr    x19, [x1, w19, sxtw #3]
>
> and accounts for most of that 66% of the total cost of the micro bench,
> even though it's doing a ton more stuff than simply getting this node
> via a lookup.
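
For reference, the cmp/ngc/and sequence in the profile above is the index
mask: all ones when the index is in bounds, zero otherwise, and the index
is ANDed with it before the load. A rough user-space sketch of that mask,
modeled on the generic C fallback for array_index_mask_nospec(), could
look like the below. The name index_mask() is made up for illustration,
and this portable form has no speculation barrier, so it says nothing
about the csdb cost seen on arm64:

```c
#include <assert.h>
#include <limits.h>

/*
 * Illustrative only: returns ~0UL if index < size, 0 otherwise, without
 * a conditional branch. Assumes index and size are both <= LONG_MAX and
 * that right-shifting a negative long is an arithmetic shift (true for
 * gcc and clang, implementation-defined in ISO C).
 */
static unsigned long index_mask(unsigned long index, unsigned long size)
{
	long v = ~(long)(index | (size - 1UL - index));

	return (unsigned long)(v >> (sizeof(long) * CHAR_BIT - 1));
}

/* Clamp an index to 0 when it is out of bounds. */
static unsigned long index_clamp(unsigned long index, unsigned long size)
{
	return index & index_mask(index, size);
}
```

With size 8, index_clamp(3, 8) stays 3 while index_clamp(9, 8) becomes 0,
so an out-of-bounds index can only ever reach slot 0.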
Ran some x86-64 testing, and there's no such effect there. So the cache
is mostly useful on archs with a more expensive array_index_nospec().
On x86-64 there's obviously still a cost associated with it, but it's
more of an even trade-off between having the extra branch and doing the
nospec indexing. Which means that on x86-64 you may as well not add the
extra cache: this particular test always hits the cache, so it's a
best-case kind of test.
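
To make the trade-off concrete, here is a minimal user-space sketch of
the last-lookup idea: one extra compare-and-branch up front, and the
masked index path only on a miss. This is not the kernel patch. The
struct layout, the names table_lookup()/table_remove(), and dropping the
cached entry on removal are all assumptions made for illustration:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

struct node { int id; };

/* Hypothetical table with a one-entry "last lookup" cache. */
struct table {
	struct node **nodes;
	unsigned int nr;
	unsigned int last_index;	/* valid only if last_node != NULL */
	struct node *last_node;
};

/* ~0UL if index < size, else 0 (same assumptions as the generic mask). */
static unsigned long index_mask(unsigned long index, unsigned long size)
{
	long v = ~(long)(index | (size - 1UL - index));

	return (unsigned long)(v >> (sizeof(long) * CHAR_BIT - 1));
}

static struct node *table_lookup(struct table *t, unsigned int index)
{
	struct node *n;

	/* Fast path: same index as last time, skip bounds check and mask */
	if (t->last_node && index == t->last_index)
		return t->last_node;
	if (index >= t->nr)
		return NULL;
	/* Slow path: clamp the index before using it for the load */
	index &= index_mask(index, t->nr);
	n = t->nodes[index];
	if (n) {
		t->last_index = index;
		t->last_node = n;
	}
	return n;
}

static void table_remove(struct table *t, unsigned int index)
{
	if (index >= t->nr)
		return;
	/* Assumed invalidation rule: drop the cache if it points here */
	if (t->last_node == t->nodes[index])
		t->last_node = NULL;
	t->nodes[index] = NULL;
}
```

A benchmark that looks up the same index in a loop takes the fast path
every time after the first call, which is why it only shows the best
case for the cache.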
--
Jens Axboe