From: Pavel Begunkov <[email protected]>
To: David Wei <[email protected]>, Jens Axboe <[email protected]>,
[email protected]
Subject: Re: [PATCH next v1 2/2] io_uring: limit local tw done
Date: Fri, 22 Nov 2024 15:57:52 +0000
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>
On 11/21/24 17:53, David Wei wrote:
> On 2024-11-21 07:07, Pavel Begunkov wrote:
>> On 11/21/24 14:31, Jens Axboe wrote:
>>> On 11/21/24 7:25 AM, Pavel Begunkov wrote:
>>>> On 11/21/24 01:12, Jens Axboe wrote:
>>>>> On 11/20/24 4:56 PM, Pavel Begunkov wrote:
>>>>>> On 11/20/24 22:14, David Wei wrote:
>> ...
>>>>> I think that can only work if we change work_llist to be a regular list
>>>>> with regular locking. Otherwise it's a bit of a mess with the list being
>>>>
>>>> Dylan once measured the overhead of locks vs atomics in this
>>>> path for some artificial case; we can pull the numbers up.
>>>
>>> I did it more recently if you'll remember, actually posted a patch I
>>> think a few months ago changing it to that. But even that approach adds
>>
>> Right, and it'd be a separate topic from this set.
>>
>>> extra overhead: if you want to add it to the same list as now, you need
>>
>> Extra overhead on the retry path, which is not the hot path,
>> and how cold it really is remains uncertain.
>>
>>> to re-grab (and re-disable interrupts) the lock to add it back. My gut
>>> says that would be _worse_ than the current approach. And if you keep a
>>> separate list instead, well then you're back to identical overhead in
>>> terms of now needing to check both when needing to know if anything is
>>> pending, and checking both when running it.
>>>
>>>>> reordered, and then you're spending extra cycles on potentially
>>>>> reordering all the entries again.
>>>>
>>>> That sucks, I agree, but then it's the same question of how often
>>>> it happens.
>>>
>>> At least for now, there's a real issue reported and we should fix it. I
>>> think the current patches are fine in that regard. That doesn't mean we
>>> can't potentially make it better, we should certainly investigate that.
>>> But I don't see the current patches as being suboptimal really, they are
>>> definitely good enough as-is for solving the issue.
>>
>> That's fair enough, but I still would love to know how frequent
>> it is. There is no point in optimising it as a hot/slow path split
>> if it triggers as often as every fifth run. David, how easy is it
>> to get some stats? We could hack up some bpftrace script.
>>
>
> Here is a sample distribution of how many task work items are run per
> __io_run_local_work():
>
> @work_done:
> [1] 15385954 |@ |
> [2, 4) 33424809 |@@@@ |
> [4, 8) 196055270 |@@@@@@@@@@@@@@@@@@@@@@@@ |
> [8, 16) 419060191 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
> [16, 32) 48395043 |@@@@@@ |
> [32, 64) 1573469 | |
> [64, 128) 98151 | |
> [128, 256) 14288 | |
> [256, 512) 2035 | |
> [512, 1K) 268 | |
> [1K, 2K) 13 | |
Nice
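
For reference, a histogram like the one above can be gathered with a
one-liner along these lines. This is only a sketch, assuming
__io_run_local_work() hasn't been inlined away and that its return value
is the number of task work items it ran; the probe actually used to
collect the numbers above may have differed:

  # sketch: power-of-2 histogram of task work items run per
  # __io_run_local_work() invocation, keyed as @work_done like above
  bpftrace -e 'kretprobe:__io_run_local_work { @work_done = hist(retval); }'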
> This workload had wait_nr set to 20 and the timeout set to 500 µs.
>
> Empirically, I know that running more than 50 task work items will
> violate the latency limit for this workload. In those cases, all the
> requests must be dropped. So even if excessive task work happens only
> a small % of the time, the impact is far larger than that percentage
> suggests.
So you've got a long tail, which spikes your nines; that makes sense.
On the other hand it's perhaps 5-10% of the total (the [16,32) and
higher buckets sum to roughly 50M out of ~714M samples, about 7%),
though it's hard to judge precisely since the [16,32) bucket is split
by the constant 20. My guess would be that a small optimisation for
the normal case may well be worth it even if it adds a bit to the
requeue path, but that depends on how sharp the skew within that
bucket is.
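
If it helps to pin down that skew, a linear histogram around the
wait_nr boundary should show directly how much of the [16,32) bucket
actually exceeds 20. Same caveats as the sketch above: this assumes
__io_run_local_work() is not inlined and that retval is the number of
items run.

  # sketch: per-value buckets from 0 to 64 instead of power-of-2 ones
  bpftrace -e 'kretprobe:__io_run_local_work { @work_done = lhist(retval, 0, 64, 1); }'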
--
Pavel Begunkov