From: Hao Xu <[email protected]>
To: Jens Axboe <[email protected]>
Cc: [email protected], Pavel Begunkov <[email protected]>,
Joseph Qi <[email protected]>
Subject: Re: [PATCH] io_uring: check sqring and iopoll_list before shedule
Date: Wed, 21 Apr 2021 23:46:57 +0800
Message-ID: <a7c70456-5c1d-b300-3449-00f822dda193@linux.alibaba.com>
In-Reply-To: <[email protected]>
On 2021/4/21 at 11:19 PM, Hao Xu wrote:
> do this to avoid race below:
>
>          userspace                    kernel
>
>                                |  check sqring and iopoll_list
>   submit sqe                   |
>   check IORING_SQ_NEED_WAKEUP  |
>   (which is not set)           |
>                                |  set IORING_SQ_NEED_WAKEUP
>   wait cqe                     |  schedule(never wakeup again)
>
> Signed-off-by: Hao Xu <[email protected]>
> ---
>
> Hi all,
> I'm working on reducing CPU usage under low IO pressure, and I
> removed the timeout logic in io_sq_thread() to run some tests with
> fio-3.26. I found that fio hangs in getevents, infinitely trying to
> get a cqe while the sq-thread is sleeping. There seems to be a race,
> and it is still there even after I fix the issue described above in
> the commit message. I suspect it has something to do with the memory
> barrier logic between userspace and kernel. I'm trying to track it
> down, but I don't have many clues for now.
> I'll send the fio config and the kernel modification I used for
> testing in a following mail soon.
>
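The userspace half of the handshake described above can be sketched as follows. This is a minimal illustration under stated assumptions, not code taken from fio or liburing: the function name and the `_Atomic` parameters are hypothetical stand-ins for the mmap'ed SQ ring fields, and only the two flag values match `IORING_SQ_NEED_WAKEUP` and `IORING_ENTER_SQ_WAKEUP` from the uapi header. The point is the full fence between publishing the new tail and reading the flags, which is the barrier the cover note suspects.

```c
#include <assert.h>
#include <stdatomic.h>

#define SQ_NEED_WAKEUP  (1U << 0)  /* same value as IORING_SQ_NEED_WAKEUP */
#define ENTER_SQ_WAKEUP (1U << 1)  /* same value as IORING_ENTER_SQ_WAKEUP */

/*
 * Hypothetical helper: publish a new SQ tail, then return the flags that
 * a following io_uring_enter() call would need (0 means no call needed).
 */
static unsigned int enter_flags_after_publish(_Atomic unsigned int *sq_tail,
					      unsigned int new_tail,
					      _Atomic unsigned int *sq_flags)
{
	/* Make the new SQE visible to the SQ thread. */
	atomic_store_explicit(sq_tail, new_tail, memory_order_release);
	/* Full fence: the tail store must not be reordered after the flags load. */
	atomic_thread_fence(memory_order_seq_cst);
	if (atomic_load_explicit(sq_flags, memory_order_relaxed) & SQ_NEED_WAKEUP)
		return ENTER_SQ_WAKEUP;
	return 0;
}

/* Usage: without the flag no syscall is needed; with it, a wakeup is. */
static void submit_demo(void)
{
	_Atomic unsigned int tail = 0, flags = 0;

	assert(enter_flags_after_publish(&tail, 1, &flags) == 0);
	atomic_store(&flags, SQ_NEED_WAKEUP);
	assert(enter_flags_after_publish(&tail, 2, &flags) == ENTER_SQ_WAKEUP);
}
```

If the fence were missing, the flags load could be satisfied before the tail store becomes visible, which would reopen the window shown in the diagram even with a correctly ordered kernel side.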
fio test config:
[global]
ioengine=io_uring
sqthread_poll=1
hipri=1
thread=1
bs=4k
direct=1
rw=randread
time_based=1
runtime=30
group_reporting=1
filename=/dev/nvme1n1
sqthread_poll_cpu=30
[job0]
iodepth=1
The issue mainly occurs when iodepth=1 in my tests.
I removed the timeout logic in io_sq_thread() like this:
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 042f1149db51..dd9c95016f7f 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -6739,7 +6739,6 @@ static int io_sq_thread(void *data)
 {
 	struct io_sq_data *sqd = data;
 	struct io_ring_ctx *ctx;
-	unsigned long timeout = 0;
 	char buf[TASK_COMM_LEN];
 	DEFINE_WAIT(wait);
@@ -6777,7 +6776,6 @@ static int io_sq_thread(void *data)
 			io_run_task_work_head(&sqd->park_task_work);
 			if (did_sig)
 				break;
-			timeout = jiffies + sqd->sq_thread_idle;
 			continue;
 		}
 		sqt_spin = false;
@@ -6794,11 +6792,9 @@ static int io_sq_thread(void *data)
 				sqt_spin = true;
 		}
-		if (sqt_spin || !time_after(jiffies, timeout)) {
+		if (sqt_spin) {
 			io_run_task_work();
 			cond_resched();
-			if (sqt_spin)
-				timeout = jiffies + sqd->sq_thread_idle;
 			continue;
 		}
@@ -6831,7 +6827,6 @@ static int io_sq_thread(void *data)
 		finish_wait(&sqd->wait, &wait);
 		io_run_task_work_head(&sqd->park_task_work);
-		timeout = jiffies + sqd->sq_thread_idle;
 	}
 	list_for_each_entry(ctx, &sqd->ctx_list, sqd_list)
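The effect of the test modification on the spin decision can be sketched in userspace as below. This is an illustration only: `time_after_ul()` mirrors the wraparound-safe comparison behind the kernel's `time_after()`, and the two `keep_spinning_*` helpers are hypothetical names for the conditions on the `-` and `+` lines of the diff.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Userspace copy of the wraparound-safe test: true if a is after b. */
static bool time_after_ul(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Stock condition: keep spinning while busy or still inside the idle window. */
static bool keep_spinning_stock(bool sqt_spin, unsigned long now,
				unsigned long timeout)
{
	return sqt_spin || !time_after_ul(now, timeout);
}

/* Test-patch condition: keep spinning only if the last pass found work. */
static bool keep_spinning_patched(bool sqt_spin)
{
	return sqt_spin;
}

/* Usage: an idle pass inside the idle window spins on stock, not when patched. */
static void spin_policy_demo(void)
{
	assert(keep_spinning_stock(false, 100, 200));
	assert(!keep_spinning_stock(false, 300, 200));
	/* The comparison still holds when the timeout has wrapped past zero. */
	assert(keep_spinning_stock(false, ULONG_MAX - 4, 5));
	assert(!keep_spinning_patched(false));
}
```

With the idle window gone, the SQ thread heads for the sleep path after every pass that finds no work, so the check-then-sleep window is hit far more often. That fits the observation that the hang shows up mainly at iodepth=1.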
> Thanks,
> Hao
>
> fs/io_uring.c | 36 +++++++++++++++++++-----------------
> 1 file changed, 19 insertions(+), 17 deletions(-)
>
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index dff34975d86b..042f1149db51 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -6802,27 +6802,29 @@ static int io_sq_thread(void *data)
>  			continue;
>  		}
>
> -		needs_sched = true;
>  		prepare_to_wait(&sqd->wait, &wait, TASK_INTERRUPTIBLE);
> -		list_for_each_entry(ctx, &sqd->ctx_list, sqd_list) {
> -			if ((ctx->flags & IORING_SETUP_IOPOLL) &&
> -			    !list_empty_careful(&ctx->iopoll_list)) {
> -				needs_sched = false;
> -				break;
> -			}
> -			if (io_sqring_entries(ctx)) {
> -				needs_sched = false;
> -				break;
> -			}
> -		}
> -
> -		if (needs_sched && !test_bit(IO_SQ_THREAD_SHOULD_PARK, &sqd->state)) {
> +		if (!test_bit(IO_SQ_THREAD_SHOULD_PARK, &sqd->state)) {
>  			list_for_each_entry(ctx, &sqd->ctx_list, sqd_list)
>  				io_ring_set_wakeup_flag(ctx);
>
> -			mutex_unlock(&sqd->lock);
> -			schedule();
> -			mutex_lock(&sqd->lock);
> +			needs_sched = true;
> +			list_for_each_entry(ctx, &sqd->ctx_list, sqd_list) {
> +				if ((ctx->flags & IORING_SETUP_IOPOLL) &&
> +				    !list_empty_careful(&ctx->iopoll_list)) {
> +					needs_sched = false;
> +					break;
> +				}
> +				if (io_sqring_entries(ctx)) {
> +					needs_sched = false;
> +					break;
> +				}
> +			}
> +
> +			if (needs_sched) {
> +				mutex_unlock(&sqd->lock);
> +				schedule();
> +				mutex_lock(&sqd->lock);
> +			}
>  			list_for_each_entry(ctx, &sqd->ctx_list, sqd_list)
>  				io_ring_clear_wakeup_flag(ctx);
>  		}
>
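The reordering in the quoted patch can be replayed deterministically with a small single-threaded model. This is a sketch under stated assumptions, not kernel code: `struct ring_model`, `user_submit()` and `lost_wakeup()` are hypothetical names, and the model is sequentially consistent, so it only covers the ordering of the check against the flag update and says nothing about memory barriers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the state shared by userspace and the SQ thread. */
struct ring_model {
	unsigned int sq_entries;  /* SQEs queued but not yet consumed */
	bool need_wakeup;         /* models IORING_SQ_NEED_WAKEUP */
};

/* Userspace: queue one SQE and report whether the wakeup flag was seen. */
static bool user_submit(struct ring_model *r)
{
	r->sq_entries++;
	return r->need_wakeup;
}

/*
 * Replay one interleaving. The kernel performs two steps before sleeping:
 * "set flag" and "check for work", in the order chosen by flag_first.
 * Userspace submits before step 0, before step 1, or right before
 * schedule() (submit_at == 2). Returns true if the SQ thread goes to sleep
 * with work pending and userspace never saw the flag.
 */
static bool lost_wakeup(bool flag_first, int submit_at)
{
	struct ring_model r = { 0, false };
	bool needs_sched = false, user_saw_flag = false;

	assert(submit_at >= 0 && submit_at <= 2);
	for (int step = 0; step <= 2; step++) {
		if (step == submit_at)
			user_saw_flag = user_submit(&r);
		if (step == 2)
			break;  /* the kernel would call schedule() here */
		if ((step == 0) == flag_first)
			r.need_wakeup = true;
		else
			needs_sched = (r.sq_entries == 0);
	}
	return needs_sched && !user_saw_flag && r.sq_entries > 0;
}
```

In this model `lost_wakeup(false, 1)` reproduces the hang from the commit message (check, submit, set flag, sleep), while `lost_wakeup(true, n)` is false for every submission point: either the kernel's check sees the new entry or userspace sees the flag and issues a wakeup.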