public inbox for [email protected]
From: Hao Xu <[email protected]>
To: Jens Axboe <[email protected]>
Cc: [email protected], Pavel Begunkov <[email protected]>,
	Joseph Qi <[email protected]>
Subject: Re: [PATCH] io_uring: check sqring and iopoll_list before shedule
Date: Wed, 21 Apr 2021 23:46:57 +0800
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>

On 2021/4/21 11:19 PM, Hao Xu wrote:
> Do this to avoid the race below:
> 
>           userspace                         kernel
> 
>                                |  check sqring and iopoll_list
> submit sqe                     |
> check IORING_SQ_NEED_WAKEUP    |
> (which is not set)             |
>                                |  set IORING_SQ_NEED_WAKEUP
> wait cqe                       |  schedule() (never woken up again)
> 
> Signed-off-by: Hao Xu <[email protected]>
> ---
> 
> Hi all,
> I'm doing some work to reduce CPU usage under low IO pressure. I
> removed the timeout logic in io_sq_thread() to run some tests with
> fio-3.26, and found that fio hangs in getevents, infinitely trying to
> get a cqe while the sq-thread is sleeping. There seems to be a race,
> and it is still there even after I fix the issue described above in
> the commit message. I suspect it has something to do with the memory
> barrier logic between userspace and kernel. I'm trying to track it
> down, but there are not many clues for now.
> I'll send the fio config and the kernel modification I used for
> testing in a following mail soon.
> 
fio test config:
[global]
ioengine=io_uring
sqthread_poll=1
hipri=1
thread=1
bs=4k
direct=1
rw=randread
time_based=1
runtime=30
group_reporting=1
filename=/dev/nvme1n1
sqthread_poll_cpu=30

[job0]
iodepth=1

The issue mainly occurs when iodepth=1 in my tests.
I removed the timeout logic in io_sq_thread() like this:

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 042f1149db51..dd9c95016f7f 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -6739,7 +6739,6 @@ static int io_sq_thread(void *data)
  {
         struct io_sq_data *sqd = data;
         struct io_ring_ctx *ctx;
-       unsigned long timeout = 0;
         char buf[TASK_COMM_LEN];
         DEFINE_WAIT(wait);

@@ -6777,7 +6776,6 @@ static int io_sq_thread(void *data)
                         io_run_task_work_head(&sqd->park_task_work);
                         if (did_sig)
                                 break;
-                       timeout = jiffies + sqd->sq_thread_idle;
                         continue;
                 }
                 sqt_spin = false;
@@ -6794,11 +6792,9 @@ static int io_sq_thread(void *data)
                                 sqt_spin = true;
                 }

-               if (sqt_spin || !time_after(jiffies, timeout)) {
+               if (sqt_spin) {
                         io_run_task_work();
                         cond_resched();
-                       if (sqt_spin)
-                               timeout = jiffies + sqd->sq_thread_idle;
                         continue;
                 }

@@ -6831,7 +6827,6 @@ static int io_sq_thread(void *data)

                 finish_wait(&sqd->wait, &wait);
                 io_run_task_work_head(&sqd->park_task_work);
-               timeout = jiffies + sqd->sq_thread_idle;
         }

         list_for_each_entry(ctx, &sqd->ctx_list, sqd_list)
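For reference, the userspace column of the race diagram can be sketched as below. This is only a model, not fio or liburing code: struct sq_shared, submit_and_pick_enter_flags() and the MODEL_* names are hypothetical, and the constants only mirror the values of IORING_SQ_NEED_WAKEUP, IORING_ENTER_GETEVENTS and IORING_ENTER_SQ_WAKEUP from the uapi header. It assumes that a full fence is wanted between publishing the new SQ tail and reading the SQ flags, which is the ordering the suspected memory barrier problem would be about.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Local names; values mirror include/uapi/linux/io_uring.h. */
#define MODEL_SQ_NEED_WAKEUP   (1U << 0) /* IORING_SQ_NEED_WAKEUP */
#define MODEL_ENTER_GETEVENTS  (1U << 0) /* IORING_ENTER_GETEVENTS */
#define MODEL_ENTER_SQ_WAKEUP  (1U << 1) /* IORING_ENTER_SQ_WAKEUP */

/* Hypothetical stand-in for the SQ ring fields shared with the kernel. */
struct sq_shared {
	_Atomic unsigned tail;   /* written by userspace, read by the sq thread */
	_Atomic unsigned flags;  /* written by the sq thread, read by userspace */
};

/*
 * Publish nr new sqes, then decide which flags a following
 * io_uring_enter() call would need. Returns 0 if no syscall is needed.
 */
unsigned submit_and_pick_enter_flags(struct sq_shared *sq, unsigned nr,
				     bool wait_cqe)
{
	unsigned enter = 0;
	unsigned tail = atomic_load_explicit(&sq->tail, memory_order_relaxed);

	assert(nr > 0);

	/* "submit sqe": make the new tail visible to the sq thread. */
	atomic_store_explicit(&sq->tail, tail + nr, memory_order_release);

	/*
	 * Assumption: the tail store must be ordered before the flags load,
	 * so that either the sq thread sees the new tail or we see the flag.
	 */
	atomic_thread_fence(memory_order_seq_cst);

	/* "check IORING_SQ_NEED_WAKEUP": only wake a sleeping sq thread. */
	if (atomic_load_explicit(&sq->flags, memory_order_relaxed) &
	    MODEL_SQ_NEED_WAKEUP)
		enter |= MODEL_ENTER_SQ_WAKEUP;

	/* "wait cqe": with iodepth=1 the caller always waits for its cqe. */
	if (wait_cqe)
		enter |= MODEL_ENTER_GETEVENTS;

	return enter;
}
```

If the flag is read as clear while the sq thread is in fact about to sleep, the caller only waits for a cqe and never issues the wakeup, which would match the hang seen in getevents.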
> Thanks,
> Hao
> 
>   fs/io_uring.c | 36 +++++++++++++++++++-----------------
>   1 file changed, 19 insertions(+), 17 deletions(-)
> 
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index dff34975d86b..042f1149db51 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -6802,27 +6802,29 @@ static int io_sq_thread(void *data)
>   			continue;
>   		}
>   
> -		needs_sched = true;
>   		prepare_to_wait(&sqd->wait, &wait, TASK_INTERRUPTIBLE);
> -		list_for_each_entry(ctx, &sqd->ctx_list, sqd_list) {
> -			if ((ctx->flags & IORING_SETUP_IOPOLL) &&
> -			    !list_empty_careful(&ctx->iopoll_list)) {
> -				needs_sched = false;
> -				break;
> -			}
> -			if (io_sqring_entries(ctx)) {
> -				needs_sched = false;
> -				break;
> -			}
> -		}
> -
> -		if (needs_sched && !test_bit(IO_SQ_THREAD_SHOULD_PARK, &sqd->state)) {
> +		if (!test_bit(IO_SQ_THREAD_SHOULD_PARK, &sqd->state)) {
>   			list_for_each_entry(ctx, &sqd->ctx_list, sqd_list)
>   				io_ring_set_wakeup_flag(ctx);
>   
> -			mutex_unlock(&sqd->lock);
> -			schedule();
> -			mutex_lock(&sqd->lock);
> +			needs_sched = true;
> +			list_for_each_entry(ctx, &sqd->ctx_list, sqd_list) {
> +				if ((ctx->flags & IORING_SETUP_IOPOLL) &&
> +				    !list_empty_careful(&ctx->iopoll_list)) {
> +					needs_sched = false;
> +					break;
> +				}
> +				if (io_sqring_entries(ctx)) {
> +					needs_sched = false;
> +					break;
> +				}
> +			}
> +
> +			if (needs_sched) {
> +				mutex_unlock(&sqd->lock);
> +				schedule();
> +				mutex_lock(&sqd->lock);
> +			}
>   			list_for_each_entry(ctx, &sqd->ctx_list, sqd_list)
>   				io_ring_clear_wakeup_flag(ctx);
>   		}
> 
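The reordering in the patch above can be checked with a small replay of the interleavings. This is a minimal sketch under strong assumptions: every step is treated as atomic and sequentially consistent, so it says nothing about the memory barrier question, and struct ring_model, user_submit() and lost_wakeup() are hypothetical names, not kernel code. The kernel side has two steps (check the sqring, set the wakeup flag) in one of two orders, and userspace submits and tests the flag before, between, or after them.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the state shared by userspace and sq thread. */
struct ring_model {
	unsigned sq_entries;  /* what io_sqring_entries() would report */
	bool need_wakeup;     /* models IORING_SQ_NEED_WAKEUP */
};

enum order {
	CHECK_THEN_FLAG,  /* before the patch: check sqring, then set flag */
	FLAG_THEN_CHECK,  /* after the patch: set flag, then check sqring */
};

/* Userspace: submit one sqe, then wake the sq thread only if flagged. */
static void user_submit(struct ring_model *r, bool *will_wake)
{
	r->sq_entries++;
	if (r->need_wakeup)
		*will_wake = true;
}

/*
 * user_pos is where userspace runs: 0 = before both kernel steps,
 * 1 = between them, 2 = after both. Returns true if the sq thread
 * decides to sleep with an sqe pending and nobody will wake it.
 */
bool lost_wakeup(enum order order, int user_pos)
{
	struct ring_model r = { 0, false };
	bool needs_sched = false, will_wake = false;

	assert(user_pos >= 0 && user_pos <= 2);

	for (int step = 0; step <= 2; step++) {
		if (step == user_pos)
			user_submit(&r, &will_wake);
		if (step == 2)
			break;  /* only two kernel steps */

		bool do_check = (order == CHECK_THEN_FLAG) ? (step == 0)
							   : (step == 1);
		if (do_check)
			needs_sched = (r.sq_entries == 0);
		else
			r.need_wakeup = true;
	}

	return needs_sched && r.sq_entries > 0 && !will_wake;
}
```

In this model only CHECK_THEN_FLAG with userspace running between the two kernel steps loses the wakeup, which is the interleaving drawn in the commit message. With FLAG_THEN_CHECK, either the recheck sees the new sqe or userspace sees the flag.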


Thread overview: 4+ messages
2021-04-21 15:19 [PATCH] io_uring: check sqring and iopoll_list before shedule Hao Xu
2021-04-21 15:46 ` Hao Xu [this message]
2021-04-23 14:11 ` Pavel Begunkov
2021-04-23 14:27 ` Jens Axboe
