From: David Wei <[email protected]>
To: Jens Axboe <[email protected]>, [email protected]
Cc: Pavel Begunkov <[email protected]>
Subject: Re: [PATCH liburing v1] Add io_uring_iowait_toggle()
Date: Fri, 16 Aug 2024 16:13:19 -0700 [thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
On 2024-08-16 15:49, Jens Axboe wrote:
> On 8/16/24 4:40 PM, David Wei wrote:
>> diff --git a/man/io_uring_enter.2 b/man/io_uring_enter.2
>> index 5e4121b..c06050a 100644
>> --- a/man/io_uring_enter.2
>> +++ b/man/io_uring_enter.2
>> @@ -104,6 +104,11 @@ If the ring file descriptor has been registered through use of
>> then setting this flag will tell the kernel that the
>> .I ring_fd
>> passed in is the registered ring offset rather than a normal file descriptor.
>> +.TP
>> +.B IORING_ENTER_NO_IOWAIT
>> +If this flag is set, then in_iowait will not be set for the current task if
>> +.BR io_uring_enter (2)
>> +results in waiting.
>
> I'd probably would say something ala:
>
> If this flag is set, then waiting on events will not be accounted as
> iowait for the task if
> .BR io_uring_enter (2)
> results in waiting.
>
> or something like that. io_iowait is an in-kernel thing, iowait is what
> application writers will know about.
>
>> +.B #include <liburing.h>
>> +.PP
>> +.BI "int io_uring_iowait_toggle(struct io_uring *" ring ",
>> +.BI " bool " enabled ");"
>> +.BI "
>> +.fi
>> +.SH DESCRIPTION
>> +.PP
>> +The
>> +.BR io_uring_iowait_toggle (3)
>> +function toggles for a given
>> +.I ring
>> +whether in_iowait is set for the current task while waiting for completions.
>> +When in_iowait is set, time spent waiting is accounted as iowait time;
>> +otherwise, it is accounted as idle time. The default behavior is to always set
>> +in_iowait to true.
>
> And ditto here
>
>> +Setting in_iowait achieves two things:
>> +
>> +1. Account time spent waiting as iowait time
>> +
>> +2. Enable cpufreq optimisations, setting SCHED_CPUFREQ_IOWAIT on the rq
>
> This should probably be something ala:
>
> Setting in_iowait achieves two things:
> .TP
> .B Account time spent waiting as iowait time
> .TP
> .B Enable cpufreq optimisations, setting SCHED_CPUFREQ_IOWAIT on the rq
> .PP
>
> to make that format like a man page.
>
>> +Some user tooling attributes iowait time as CPU utilization time, so high
>> +iowait time can look like apparent high CPU utilization, even though the task
>> +is not scheduled and the CPU is free to run other tasks. This function
>> +provides a way to disable this behavior where it makes sense to do so.
>
> And here. Since this is the main man page, maybe also add something
> about how iowait is a relic from the old days of only having one CPU,
> and it indicates that the task is block uninterruptibly waiting for IO
> and hence cannot do other things. These days it's mostly a bogus
> accounting value, but it does help with the cpufreq boost for certain
> high frequency waits. Rephrase as needed :-)
>
>> diff --git a/src/queue.c b/src/queue.c
>> index c436061..bd2e6af 100644
>> --- a/src/queue.c
>> +++ b/src/queue.c
>> @@ -110,6 +110,8 @@ static int _io_uring_get_cqe(struct io_uring *ring,
>>
>> if (ring->int_flags & INT_FLAG_REG_RING)
>> flags |= IORING_ENTER_REGISTERED_RING;
>> + if (ring->int_flags & INT_FLAG_NO_IOWAIT)
>> + flags |= IORING_ENTER_NO_IOWAIT;
>> ret = __sys_io_uring_enter2(ring->enter_ring_fd, data->submit,
>> data->wait_nr, flags, data->arg,
>> data->sz);
>
> Not strictly related, and we can always do that after, but now we have
> two branches here. Since the INT flags are purely internal, we can
> renumber them so that INT_FLAG_REG_RING matches
> IORING_ENTER_REGISTERED_RING and INT_FLAG_NO_IOWAIT matches
> IORING_ENTER_NO_IOWAIT. With that, you could kill the above two branches
> and simply do:
>
> flags |= (ring->int_flags & INT_TO_ENTER_MASK);
>
> which I think would be a nice thing to do.
SG, I'll do this in a follow up.
>
> Rest looks good, no further comments.
>
prev parent reply other threads:[~2024-08-16 23:13 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-08-16 22:40 [PATCH liburing v1] Add io_uring_iowait_toggle() David Wei
2024-08-16 22:49 ` Jens Axboe
2024-08-16 23:13 ` David Wei [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox