* io_uring_prep_timeout() leading to an IO pressure close to 100
From: Fiona Ebner @ 2026-04-01 14:59 UTC (permalink / raw)
To: linux-kernel; +Cc: hannes, surenb, peterz, io-uring, Jens Axboe
Dear maintainers,
I'm currently investigating an issue with QEMU causing an IO pressure
value of nearly 100 when io_uring is used for the event loop of a QEMU
iothread (which is the case since QEMU 10.2 if io_uring is enabled
during configuration and available).
The cause seems to be the io_uring_prep_timeout() call that is used for
blocking wait. I attached a minimal reproducer below, which exposes the
issue [0].
This was observed on a kernel based on 7.0-rc6 as well as 6.17.13. I
haven't investigated what happens inside the kernel yet, so I don't know
if it is an accounting issue or within io_uring.
Let me know if you need more information or if I should test something
specific.
Best Regards,
Fiona
[0]:
#include <errno.h>
#include <stdio.h>
#include <liburing.h>

int main(void) {
    int ret;
    struct io_uring ring;
    struct __kernel_timespec ts;
    struct io_uring_sqe *sqe;

    ret = io_uring_queue_init(128, &ring, 0);
    if (ret != 0) {
        printf("Failed to initialize io_uring\n");
        return ret;
    }

    // 60 second timeout, standing in for the event loop's blocking wait
    ts = (struct __kernel_timespec){
        .tv_sec = 60,
    };

    sqe = io_uring_get_sqe(&ring);
    if (!sqe) {
        printf("Full sq\n");
        return -1;
    }
    io_uring_prep_timeout(sqe, &ts, 1, 0);
    io_uring_sqe_set_data(sqe, NULL);

    // blocks until the timeout fires; IO pressure rises while waiting here
    do {
        ret = io_uring_submit_and_wait(&ring, 1);
        printf("got ret %d\n", ret);
    } while (ret == -EINTR);

    io_uring_queue_exit(&ring);
    return 0;
}
* Re: io_uring_prep_timeout() leading to an IO pressure close to 100
From: Jens Axboe @ 2026-04-01 15:03 UTC (permalink / raw)
To: Fiona Ebner, linux-kernel; +Cc: hannes, surenb, peterz, io-uring
On 4/1/26 8:59 AM, Fiona Ebner wrote:
> Dear maintainers,
>
> I'm currently investigating an issue with QEMU causing an IO pressure
> value of nearly 100 when io_uring is used for the event loop of a QEMU
> iothread (which is the case since QEMU 10.2 if io_uring is enabled
> during configuration and available).
It's not "IO pressure", it's the useless iowait metric...
> The cause seems to be the io_uring_prep_timeout() call that is used for
> blocking wait. I attached a minimal reproducer below, which exposes the
> issue [0].
>
> This was observed on a kernel based on 7.0-rc6 as well as 6.17.13. I
> haven't investigated what happens inside the kernel yet, so I don't know
> if it is an accounting issue or within io_uring.
>
> Let me know if you need more information or if I should test something
> specific.
If you don't want it, just turn it off with io_uring_set_iowait().
--
Jens Axboe
* Re: io_uring_prep_timeout() leading to an IO pressure close to 100
From: Fiona Ebner @ 2026-04-02 9:12 UTC (permalink / raw)
To: Jens Axboe, linux-kernel
Cc: hannes, surenb, peterz, io-uring, Thomas Lamprecht
On 01.04.26 at 5:02 PM, Jens Axboe wrote:
> On 4/1/26 8:59 AM, Fiona Ebner wrote:
>> I'm currently investigating an issue with QEMU causing an IO pressure
>> value of nearly 100 when io_uring is used for the event loop of a QEMU
>> iothread (which is the case since QEMU 10.2 if io_uring is enabled
>> during configuration and available).
>
> It's not "IO pressure", it's the useless iowait metric...
But it is reported as IO pressure by the kernel, i.e. /proc/pressure/io
(and for a cgroup, /sys/fs/cgroup/foo.slice/bar.scope/io.pressure).
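As an illustration of where that number comes from, here is a small
sketch that parses the lines of a PSI file such as /proc/pressure/io.
It assumes the usual "some"/"full" line layout with avg10, avg60,
avg300 and total fields; the struct and function names are invented
for this example.

```c
#include <stdio.h>

/* One parsed line of a PSI file (names are hypothetical). */
struct psi_line {
    char kind[8];                 /* "some" or "full" */
    double avg10, avg60, avg300;  /* share of time stalled, in percent */
    unsigned long long total_us;  /* cumulative stall time, microseconds */
};

/* Parse one PSI line; returns 0 on success, -1 if it does not match. */
static int parse_psi_line(const char *line, struct psi_line *out)
{
    int n = sscanf(line, "%7s avg10=%lf avg60=%lf avg300=%lf total=%llu",
                   out->kind, &out->avg10, &out->avg60, &out->avg300,
                   &out->total_us);

    return n == 5 ? 0 : -1;
}

/* Print the averages from every line of the PSI file at 'path'. */
static int print_psi_file(const char *path)
{
    char buf[256];
    struct psi_line p;
    FILE *f = fopen(path, "r");

    if (!f)
        return -1;
    while (fgets(buf, sizeof(buf), f)) {
        if (parse_psi_line(buf, &p) == 0)
            printf("%s: avg10=%.2f avg60=%.2f avg300=%.2f\n",
                   p.kind, p.avg10, p.avg60, p.avg300);
    }
    fclose(f);
    return 0;
}
```

Running print_psi_file("/proc/pressure/io") in a loop while the
reproducer blocks is one way to watch the avg10 value climb; the same
function should work on a cgroup's io.pressure file.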
>> The cause seems to be the io_uring_prep_timeout() call that is used for
>> blocking wait. I attached a minimal reproducer below, which exposes the
>> issue [0].
>>
>> This was observed on a kernel based on 7.0-rc6 as well as 6.17.13. I
>> haven't investigated what happens inside the kernel yet, so I don't know
>> if it is an accounting issue or within io_uring.
>>
>> Let me know if you need more information or if I should test something
>> specific.
>
> If you don't want it, just turn it off with io_uring_set_iowait().
QEMU does submit actual IO requests on the same ring and I suppose iowait
should still be used for those?
Maybe setting the IORING_ENTER_NO_IOWAIT flag if only the timeout
request is being submitted and no actual IO requests is an option? But
even then, if a request is submitted later via another thread, iowait
for that new request won't be accounted for, right?
Is there a way to say "I don't want IO wait for timeout submissions"?
Wouldn't that even make sense by default?
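To make the idea above concrete, here is a rough sketch of the
bookkeeping an application could do on its own: count real IO requests
in flight and only ask for iowait accounting while that count is
non-zero. Everything named io_tracker is hypothetical and not part of
liburing or QEMU, and the sketch does not solve the case where another
thread submits IO after the wait has already started.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical helper: counts real IO requests that are still in flight. */
struct io_tracker {
    atomic_uint inflight_io;
};

static void io_tracker_init(struct io_tracker *t)
{
    atomic_init(&t->inflight_io, 0);
}

/* Call when a real IO request (not a timeout or nop) is queued. */
static void io_tracker_submitted(struct io_tracker *t)
{
    atomic_fetch_add(&t->inflight_io, 1);
}

/* Call when the completion for such a request has been reaped. */
static void io_tracker_completed(struct io_tracker *t)
{
    atomic_fetch_sub(&t->inflight_io, 1);
}

/* True if the next blocking wait should be accounted as iowait. */
static bool io_tracker_want_iowait(struct io_tracker *t)
{
    return atomic_load(&t->inflight_io) > 0;
}
```

Before each blocking wait, the event loop could then call
io_uring_set_iowait(&ring, io_tracker_want_iowait(&tracker)), assuming
a liburing that provides that helper.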
Best Regards,
Fiona
* Re: io_uring_prep_timeout() leading to an IO pressure close to 100
From: Fiona Ebner @ 2026-04-02 12:31 UTC (permalink / raw)
To: Jens Axboe, linux-kernel
Cc: hannes, surenb, peterz, io-uring, Thomas Lamprecht
On 02.04.26 at 11:12 AM, Fiona Ebner wrote:
> On 01.04.26 at 5:02 PM, Jens Axboe wrote:
>> On 4/1/26 8:59 AM, Fiona Ebner wrote:
>>> I'm currently investigating an issue with QEMU causing an IO pressure
>>> value of nearly 100 when io_uring is used for the event loop of a QEMU
>>> iothread (which is the case since QEMU 10.2 if io_uring is enabled
>>> during configuration and available).
>>
>> It's not "IO pressure", it's the useless iowait metric...
>
> But it is reported as IO pressure by the kernel, i.e. /proc/pressure/io
> (and for a cgroup, /sys/fs/cgroup/foo.slice/bar.scope/io.pressure).
>
>>> The cause seems to be the io_uring_prep_timeout() call that is used for
>>> blocking wait. I attached a minimal reproducer below, which exposes the
>>> issue [0].
>>>
>>> This was observed on a kernel based on 7.0-rc6 as well as 6.17.13. I
>>> haven't investigated what happens inside the kernel yet, so I don't know
>>> if it is an accounting issue or within io_uring.
>>>
>>> Let me know if you need more information or if I should test something
>>> specific.
>>
>> If you don't want it, just turn it off with io_uring_set_iowait().
>
> QEMU does submit actual IO requests on the same ring and I suppose iowait
> should still be used for those?
>
> Maybe setting the IORING_ENTER_NO_IOWAIT flag if only the timeout
> request is being submitted and no actual IO requests is an option? But
> even then, if a request is submitted later via another thread, iowait
> for that new request won't be accounted for, right?
>
> Is there a way to say "I don't want IO wait for timeout submissions"?
> Wouldn't that even make sense by default?
It turns out that in my QEMU instances, the branch doing the
io_uring_prep_timeout() call is not actually taken, so while the issue
could arise like that too, the practical case is different.
What I'm actually seeing is io_uring_submit_and_wait() being called with
wait_nr=1 while there is nothing else going on. So a more accurate
reproducer for the scenario is attached below [0]. Note that it does not
happen without submitting+completing a single request first.
Best Regards,
Fiona
[0]:
#include <errno.h>
#include <stdio.h>
#include <liburing.h>

int main(void) {
    int ret;
    struct io_uring ring;
    struct io_uring_sqe *sqe;

    ret = io_uring_queue_init(128, &ring, 0);
    if (ret != 0) {
        printf("Failed to initialize io_uring\n");
        return ret;
    }

    // before submitting+advancing the issue does not happen
    // ret = io_uring_submit_and_wait(&ring, 1);
    // printf("got ret %d\n", ret);

    // submit a single nop and wait for its completion
    sqe = io_uring_get_sqe(&ring);
    if (!sqe) {
        printf("Full sq\n");
        return -1;
    }
    io_uring_prep_nop(sqe);
    do {
        ret = io_uring_submit_and_wait(&ring, 1);
    } while (ret == -EINTR);
    if (ret != 1) {
        printf("Expected to submit one\n");
        return -1;
    }

    // using peek+seen has the same effect
    // struct io_uring_cqe* cqe;
    // io_uring_peek_cqe(&ring, &cqe);
    // io_uring_cqe_seen(&ring, cqe);
    io_uring_cq_advance(&ring, 1);

    // nothing is pending now; this wait is where the issue shows up
    ret = io_uring_submit_and_wait(&ring, 1);
    printf("got ret %d\n", ret);

    io_uring_queue_exit(&ring);
    return 0;
}