From: Pavel Begunkov <[email protected]>
To: Jens Axboe <[email protected]>, io-uring <[email protected]>
Subject: Re: [PATCH] io_uring/poll: use IOU_F_TWQ_LAZY_WAKE for wakeups
Date: Tue, 7 Nov 2023 17:47:40 +0000 [thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
On 10/19/23 14:01, Jens Axboe wrote:
> With poll triggered retries, each event trigger will cause a task_work
> item to be added for processing. If the ring is setup with
> IORING_SETUP_DEFER_TASKRUN and a task is waiting on multiple events to
> complete, any task_work addition will wake the task for processing these
> items. This can cause more context switches than we would like, if the
> application is deliberately waiting on multiple items to increase
> efficiency.
I'm a bit late here. The reason why I didn't enable it for polling is
because it changes the behaviour. Let's think of a situation where we
want to accept 2 sockets, so we send a multishot accept and do
cq_wait(nr=2). It was perfectly fine before, but now it'll hung as
there's only 1 request and so 1 tw queued. And same would happen with
multishot recv even though it's more relevant to packet based protocols
like UDP.
It might be not specific to multishots:
listen(backlog=1), queue N oneshot accepts and cq_wait(N).
Now we get the first connection in the queue to accept.
[IORING_OP_ACCEPT] = {
.poll_exclusive = 1,
}
Due to poll_exclusive (I assume) it wakes only one accept. That
will try to queue up a tw for it, but it'll not be executed
because it's just one item. No other connection can be queued
up because of the backlog limit => presumably no other request
will be woken up => that first tw never runs. It's more subtle
and timing specific than the previous example, but nevertheless
it's concerning we might step on sth like that.
> For example, if an application has receive multishot armed for sockets
> and wants to wait for N to complete within M usec of time, we should not
> be waking up and processing these items until we have all the events we
> asked for. By switching the poll trigger to lazy wake, we'll process
> them when they are all ready, in one swoop, rather than wake multiple
> times only to process one and then go back to sleep.
>
> At some point we probably want to look at just making the lazy wake
> the default, but for now, let's just selectively enable it where it
> makes sense.
>
> Signed-off-by: Jens Axboe <[email protected]>
>
> ---
>
> diff --git a/io_uring/poll.c b/io_uring/poll.c
> index 4c360ba8793a..d38d05edb4fa 100644
> --- a/io_uring/poll.c
> +++ b/io_uring/poll.c
> @@ -370,7 +370,7 @@ static void __io_poll_execute(struct io_kiocb *req, int mask)
> req->io_task_work.func = io_poll_task_func;
>
> trace_io_uring_task_add(req, mask);
> - io_req_task_work_add(req);
> + __io_req_task_work_add(req, IOU_F_TWQ_LAZY_WAKE);
> }
>
> static inline void io_poll_execute(struct io_kiocb *req, int res)
>
--
Pavel Begunkov
next prev parent reply other threads:[~2023-11-07 17:48 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-10-19 13:01 [PATCH] io_uring/poll: use IOU_F_TWQ_LAZY_WAKE for wakeups Jens Axboe
2023-11-07 17:47 ` Pavel Begunkov [this message]
2023-11-07 19:57 ` Jens Axboe
2023-11-08 16:14 ` Jens Axboe
2023-11-09 13:07 ` Pavel Begunkov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox