From: Jens Axboe <[email protected]>
To: Pavel Begunkov <[email protected]>,
David Wei <[email protected]>,
[email protected]
Subject: Re: [PATCH next v1 2/2] io_uring: limit local tw done
Date: Thu, 21 Nov 2024 08:02:54 -0700 [thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
On 11/21/24 7:58 AM, Pavel Begunkov wrote:
> On 11/21/24 14:34, Jens Axboe wrote:
>> On 11/21/24 7:29 AM, Pavel Begunkov wrote:
>>> On 11/21/24 00:52, David Wei wrote:
>>>> On 2024-11-20 15:56, Pavel Begunkov wrote:
>>>>> On 11/20/24 22:14, David Wei wrote:
> ...
>>>> There is a Memcache-like workload that has load shedding based on the
>>>> time spent doing work. With epoll, the work of reading sockets and
>>>> processing a request is done by user, which can decide after some amount
>>>> of time to drop the remaining work if it takes too long. With io_uring,
>>>> the work of reading sockets is done eagerly inside of task work. If
>>>> there is a burst of work, then so much time is spent in task work
>>>> reading from sockets that, by the time control returns to user the
>>>> timeout has already elapsed.
>>>
>>> Interesting, it also sounds like instead of an arbitrary 20 we
>>> might want the user to feed it to us. Might be easier to do it
>>> with the bpf toy not to carve another argument.
>>
>> David and I did discuss that, and I was not in favor of having an extra
>> argument. We really just need some kind of limit to prevent it
>> over-running. Arguably that should always be min_events, which we
>> already have, but that kind of runs afoul of applications just doing
>> io_uring_wait_cqe() and hence asking for 1. That's why the hand wavy
>> number exists, which is really no different than other hand wavy numbers
>> we have to limit running of "something" - eg other kinds of retries.
>>
>> Adding another argument to this just again doubles wait logic complexity
>> in terms of the API. If it's needed down the line for whatever reason,
>> then yeah we can certainly do it, probably via the wait regions. But
>> adding it to the generic wait path would be a mistake imho.
>
> Right, I don't like the idea of a wait argument either, messy and
> too advanced of a tuning for it. BPF would be fine as it could be
> made as a hint and easily removed if needed. It could also be a per
> ring REGISTER based hint, a bit worse but with the same deprecation
> argument. Obviously would need a good user / use case first.
Sure I agree on that, if you already have BPF integration for other
things, then yeah you could add tighter control for it. Even with that,
I doubt it'd be useful or meaningful really. Current case seems like a
worst case kind of thing, recv task_work is arguably one of the more
expensive things that can be done out of task_work.
We can revisit down the line if it ever becomes needed.
--
Jens Axboe
next prev parent reply other threads:[~2024-11-21 15:03 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-11-20 22:14 [PATCH next v1 0/2] limit local tw done David Wei
2024-11-20 22:14 ` [PATCH next v1 1/2] io_uring: add io_local_work_pending() David Wei
2024-11-20 23:45 ` Pavel Begunkov
2024-11-20 22:14 ` [PATCH next v1 2/2] io_uring: limit local tw done David Wei
2024-11-20 23:56 ` Pavel Begunkov
2024-11-21 0:52 ` David Wei
2024-11-21 14:29 ` Pavel Begunkov
2024-11-21 14:34 ` Jens Axboe
2024-11-21 14:58 ` Pavel Begunkov
2024-11-21 15:02 ` Jens Axboe [this message]
2024-11-21 1:12 ` Jens Axboe
2024-11-21 14:25 ` Pavel Begunkov
2024-11-21 14:31 ` Jens Axboe
2024-11-21 15:07 ` Pavel Begunkov
2024-11-21 15:15 ` Jens Axboe
2024-11-21 15:22 ` Jens Axboe
2024-11-21 16:00 ` Pavel Begunkov
2024-11-21 16:05 ` Jens Axboe
2024-11-21 16:18 ` Pavel Begunkov
2024-11-21 16:20 ` Jens Axboe
2024-11-21 16:43 ` Pavel Begunkov
2024-11-21 16:57 ` Jens Axboe
2024-11-21 17:05 ` Jens Axboe
2024-11-21 17:53 ` David Wei
2024-11-21 1:12 ` [PATCH next v1 0/2] " Jens Axboe
2024-11-21 14:16 ` Jens Axboe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox