* [PATCH] io_uring: check sqring and iopoll_list before shedule
From: Hao Xu @ 2021-04-21 15:19 UTC
To: Jens Axboe; +Cc: io-uring, Pavel Begunkov, Joseph Qi
Do this to avoid the race below:

  userspace                        kernel

                               |  check sqring and iopoll_list
  submit sqe                   |
  check IORING_SQ_NEED_WAKEUP  |
  (which is not set)           |
                               |  set IORING_SQ_NEED_WAKEUP
  wait cqe                     |  schedule() (never woken up again)

Signed-off-by: Hao Xu <[email protected]>
---
Hi all,
I'm working on reducing CPU usage under low I/O pressure. To test this
with fio-3.26, I removed the timeout logic in io_sq_thread() and found
that fio hangs in getevents, infinitely trying to get a cqe while the
sq-thread is sleeping. There seems to be a race, and it is still there
even after I fix the issue described in the commit message above. I
suspect it has something to do with the memory barrier logic between
userspace and the kernel. I'm trying to track it down, but I don't have
many clues for now.
I'll send the fio config and the kernel modification I used for testing
in a follow-up mail soon.
Thanks,
Hao
fs/io_uring.c | 36 +++++++++++++++++++-----------------
1 file changed, 19 insertions(+), 17 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index dff34975d86b..042f1149db51 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -6802,27 +6802,29 @@ static int io_sq_thread(void *data)
 			continue;
 		}
 
-		needs_sched = true;
 		prepare_to_wait(&sqd->wait, &wait, TASK_INTERRUPTIBLE);
-		list_for_each_entry(ctx, &sqd->ctx_list, sqd_list) {
-			if ((ctx->flags & IORING_SETUP_IOPOLL) &&
-			    !list_empty_careful(&ctx->iopoll_list)) {
-				needs_sched = false;
-				break;
-			}
-			if (io_sqring_entries(ctx)) {
-				needs_sched = false;
-				break;
-			}
-		}
-
-		if (needs_sched && !test_bit(IO_SQ_THREAD_SHOULD_PARK, &sqd->state)) {
+		if (!test_bit(IO_SQ_THREAD_SHOULD_PARK, &sqd->state)) {
 			list_for_each_entry(ctx, &sqd->ctx_list, sqd_list)
 				io_ring_set_wakeup_flag(ctx);
 
-			mutex_unlock(&sqd->lock);
-			schedule();
-			mutex_lock(&sqd->lock);
+			needs_sched = true;
+			list_for_each_entry(ctx, &sqd->ctx_list, sqd_list) {
+				if ((ctx->flags & IORING_SETUP_IOPOLL) &&
+				    !list_empty_careful(&ctx->iopoll_list)) {
+					needs_sched = false;
+					break;
+				}
+				if (io_sqring_entries(ctx)) {
+					needs_sched = false;
+					break;
+				}
+			}
+
+			if (needs_sched) {
+				mutex_unlock(&sqd->lock);
+				schedule();
+				mutex_lock(&sqd->lock);
+			}
 			list_for_each_entry(ctx, &sqd->ctx_list, sqd_list)
 				io_ring_clear_wakeup_flag(ctx);
 		}
--
1.8.3.1
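
The ordering change in the patch can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `struct toy_ring` and every `toy_*` function are hypothetical names invented for illustration, sequentially consistent C11 atomics stand in for whatever barriers the kernel and userspace actually use, and the interleaving is replayed deterministically in one thread rather than raced for real.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define TOY_NEED_WAKEUP 1u	/* plays the role of IORING_SQ_NEED_WAKEUP */

/* Hypothetical stand-in for the state shared by submitter and poller. */
struct toy_ring {
	atomic_uint pending;	/* entries queued but not yet consumed */
	atomic_uint flags;	/* TOY_NEED_WAKEUP is set before the poller sleeps */
};

static void toy_ring_init(struct toy_ring *r)
{
	atomic_init(&r->pending, 0);
	atomic_init(&r->flags, 0);
}

/* Submitter: queue one entry, then report whether it saw the wakeup flag. */
static bool toy_submit(struct toy_ring *r)
{
	atomic_fetch_add(&r->pending, 1);
	return (atomic_load(&r->flags) & TOY_NEED_WAKEUP) != 0;
}

/* Patched ordering: set the flag first, then check; true means "may sleep". */
static bool toy_new_may_sleep(struct toy_ring *r)
{
	atomic_fetch_or(&r->flags, TOY_NEED_WAKEUP);
	if (atomic_load(&r->pending) != 0) {
		atomic_fetch_and(&r->flags, ~TOY_NEED_WAKEUP);
		return false;
	}
	return true;
}

/* Old ordering, replayed as in the commit message; true means the entry is stranded. */
static bool toy_replay_old(void)
{
	struct toy_ring r;
	bool idle, woke;

	toy_ring_init(&r);
	idle = atomic_load(&r.pending) == 0;	/* poller: queue looks empty */
	woke = toy_submit(&r);			/* submitter: flag is not set yet */
	if (idle)				/* poller: sets flag and sleeps anyway */
		atomic_fetch_or(&r.flags, TOY_NEED_WAKEUP);
	return idle && !woke && atomic_load(&r.pending) != 0;
}

/* Patched ordering with the submit placed before or after the poller's step. */
static bool toy_replay_new(bool submit_first)
{
	struct toy_ring r;
	bool woke = false, sleeps;

	toy_ring_init(&r);
	if (submit_first)
		woke = toy_submit(&r);
	sleeps = toy_new_may_sleep(&r);
	if (!submit_first)
		woke = toy_submit(&r);
	return sleeps && !woke && atomic_load(&r.pending) != 0;
}
```

In this model the old ordering strands the entry, while with the patched ordering either the poller sees the pending entry and stays awake, or the submitter sees the flag and knows it has to issue a wakeup.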
* Re: [PATCH] io_uring: check sqring and iopoll_list before shedule
From: Hao Xu @ 2021-04-21 15:46 UTC
To: Jens Axboe; +Cc: io-uring, Pavel Begunkov, Joseph Qi
On 2021/4/21 11:19 PM, Hao Xu wrote:
> Do this to avoid the race below:
>
>   userspace                        kernel
>
>                                |  check sqring and iopoll_list
>   submit sqe                   |
>   check IORING_SQ_NEED_WAKEUP  |
>   (which is not set)           |
>                                |  set IORING_SQ_NEED_WAKEUP
>   wait cqe                     |  schedule() (never woken up again)
>
> Signed-off-by: Hao Xu <[email protected]>
> ---
>
> Hi all,
> I'm working on reducing CPU usage under low I/O pressure. To test this
> with fio-3.26, I removed the timeout logic in io_sq_thread() and found
> that fio hangs in getevents, infinitely trying to get a cqe while the
> sq-thread is sleeping. There seems to be a race, and it is still there
> even after I fix the issue described in the commit message above. I
> suspect it has something to do with the memory barrier logic between
> userspace and the kernel. I'm trying to track it down, but I don't have
> many clues for now.
> I'll send the fio config and the kernel modification I used for testing
> in a follow-up mail soon.
>
fio test config:
[global]
ioengine=io_uring
sqthread_poll=1
hipri=1
thread=1
bs=4k
direct=1
rw=randread
time_based=1
runtime=30
group_reporting=1
filename=/dev/nvme1n1
sqthread_poll_cpu=30
[job0]
iodepth=1
The issue mainly occurs with iodepth=1 in my tests.
I removed the timeout logic in io_sq_thread() like this:
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 042f1149db51..dd9c95016f7f 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -6739,7 +6739,6 @@ static int io_sq_thread(void *data)
 {
 	struct io_sq_data *sqd = data;
 	struct io_ring_ctx *ctx;
-	unsigned long timeout = 0;
 	char buf[TASK_COMM_LEN];
 	DEFINE_WAIT(wait);
 
@@ -6777,7 +6776,6 @@ static int io_sq_thread(void *data)
 			io_run_task_work_head(&sqd->park_task_work);
 			if (did_sig)
 				break;
-			timeout = jiffies + sqd->sq_thread_idle;
 			continue;
 		}
 		sqt_spin = false;
@@ -6794,11 +6792,9 @@ static int io_sq_thread(void *data)
 				sqt_spin = true;
 		}
 
-		if (sqt_spin || !time_after(jiffies, timeout)) {
+		if (sqt_spin) {
 			io_run_task_work();
 			cond_resched();
-			if (sqt_spin)
-				timeout = jiffies + sqd->sq_thread_idle;
 			continue;
 		}
 
@@ -6831,7 +6827,6 @@ static int io_sq_thread(void *data)
 
 		finish_wait(&sqd->wait, &wait);
 		io_run_task_work_head(&sqd->park_task_work);
-		timeout = jiffies + sqd->sq_thread_idle;
 	}
 
 	list_for_each_entry(ctx, &sqd->ctx_list, sqd_list)
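
To make the effect of this test modification concrete, here is a minimal user-space sketch of the "keep spinning" decision before and after the timeout is removed. The `toy_*` names are hypothetical and exist only for illustration; `toy_time_after()` mirrors the wraparound-safe comparison idea behind the kernel's `time_after()`, and plain `unsigned long` values stand in for `jiffies`.

```c
#include <assert.h>
#include <stdbool.h>

/* Wraparound-safe "a is after b", the same idea as the kernel's time_after(). */
static bool toy_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/*
 * Hypothetical model of the original condition: keep spinning while there
 * was recent work (sqt_spin) or the idle timeout has not expired yet.
 */
static bool toy_keep_spinning(bool sqt_spin, unsigned long now,
			      unsigned long timeout)
{
	return sqt_spin || !toy_time_after(now, timeout);
}

/* With the timeout removed, only pending work keeps the thread awake. */
static bool toy_keep_spinning_no_timeout(bool sqt_spin)
{
	return sqt_spin;
}
```

Under this model the modified thread heads for the sleep path as soon as one loop iteration finds no work, instead of idling for `sq_thread_idle` first, so the window around setting the wakeup flag is reached far more often. That would fit the observation that the hang shows up mainly at iodepth=1, though the sketch itself does not prove it.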
* Re: [PATCH] io_uring: check sqring and iopoll_list before shedule
From: Pavel Begunkov @ 2021-04-23 14:11 UTC
To: Hao Xu, Jens Axboe; +Cc: io-uring, Joseph Qi
On 4/21/21 4:19 PM, Hao Xu wrote:
> Do this to avoid the race below:
>
>   userspace                        kernel
>
>                                |  check sqring and iopoll_list
>   submit sqe                   |
>   check IORING_SQ_NEED_WAKEUP  |
>   (which is not set)           |
>                                |  set IORING_SQ_NEED_WAKEUP
>   wait cqe                     |  schedule() (never woken up again)
Agreed, the flag should be set first.
I haven't tried it, but the patch looks reasonable.
--
Pavel Begunkov
* Re: [PATCH] io_uring: check sqring and iopoll_list before shedule
From: Jens Axboe @ 2021-04-23 14:27 UTC
To: Hao Xu; +Cc: io-uring, Pavel Begunkov, Joseph Qi
On 4/21/21 9:19 AM, Hao Xu wrote:
> Do this to avoid the race below:
>
>   userspace                        kernel
>
>                                |  check sqring and iopoll_list
>   submit sqe                   |
>   check IORING_SQ_NEED_WAKEUP  |
>   (which is not set)           |
>                                |  set IORING_SQ_NEED_WAKEUP
>   wait cqe                     |  schedule() (never woken up again)
Applied, thanks.
--
Jens Axboe