From: Xiaoguang Wang <[email protected]>
To: Jens Axboe <[email protected]>, [email protected]
Cc: [email protected]
Subject: Re: [PATCH] io_uring: export cq overflow status to userspace
Date: Wed, 8 Jul 2020 00:29:39 +0800
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>

hi,

> On 7/7/20 7:24 AM, Xiaoguang Wang wrote:
>> Applications that do not want to call io_uring_enter() themselves to
>> reap and handle cqes may rely entirely on liburing's
>> io_uring_peek_cqe(). But if the cq ring has overflowed,
>> io_uring_peek_cqe() is currently unaware of the overflow, so it won't
>> enter the kernel to flush cqes. The test program below reveals this bug:
>>
>> static void test_cq_overflow(struct io_uring *ring)
>> {
>>          struct io_uring_cqe *cqe;
>>          struct io_uring_sqe *sqe;
>>          int issued = 0;
>>          int ret = 0;
>>
>>          do {
>>                  sqe = io_uring_get_sqe(ring);
>>                  if (!sqe) {
>>                          fprintf(stderr, "get sqe failed\n");
>>                          break;
>>                  }
>>                  ret = io_uring_submit(ring);
>>                  if (ret <= 0) {
>>                          if (ret != -EBUSY)
>>                                  fprintf(stderr, "sqe submit failed: %d\n", ret);
>>                          break;
>>                  }
>>                  issued++;
>>          } while (ret > 0);
>>          assert(ret == -EBUSY);
>>
>>          printf("issued requests: %d\n", issued);
>>
>>          while (issued) {
>>                  ret = io_uring_peek_cqe(ring, &cqe);
>>                  if (ret) {
>>                          if (ret != -EAGAIN) {
>>                                  fprintf(stderr, "peek completion failed: %s\n",
>>                                          strerror(-ret));
>>                                  break;
>>                          }
>>                          printf("left requests: %d\n", issued);
>>                          continue;
>>                  }
>>                  io_uring_cqe_seen(ring, cqe);
>>                  issued--;
>>                  printf("left requests: %d\n", issued);
>>          }
>> }
>>
>> int main(int argc, char *argv[])
>> {
>>          int ret;
>>          struct io_uring ring;
>>
>>          ret = io_uring_queue_init(16, &ring, 0);
>>          if (ret) {
>>                  fprintf(stderr, "ring setup failed: %d\n", ret);
>>                  return 1;
>>          }
>>
>>          test_cq_overflow(&ring);
>>          return 0;
>> }
>>
>> To fix this issue, export cq overflow status to userspace, so that
>> helper functions in liburing, such as io_uring_peek_cqe(), can be
>> aware of the cq overflow and flush accordingly.
> 
> Is there any way we can accomplish the same without exporting
> another set of flags? 
I understand your concerns and will try to find a better method later,
but I'm not sure there is one :)

> Would it be enough for the SQPOLL thread to set
> IORING_SQ_NEED_WAKEUP if we're in overflow condition? That should
> result in the app entering the kernel when it's flushed the user CQ
> side, and then the sqthread could attempt to flush the pending
> events as well.
> 
> Something like this, totally untested...
I haven't tested your patch, but I think it doesn't work for the non-SQPOLL
case: see my test program above, which doesn't have SQPOLL enabled.
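
To make the non-SQPOLL case concrete, here is a small userspace-only model
of the problem. It is only a sketch: every toy_* name is hypothetical, it
uses neither the kernel nor liburing, and the "overflow_flag" field just
stands in for whatever overflow status would be exported. A peek that never
enters the kernel stops at the visible ring size, while a peek that checks
the flag can flush the backlog and reap everything.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy model only: none of these names are kernel or liburing APIs. */
#define TOY_CQ_ENTRIES  4u   /* size of the CQ ring visible to userspace */
#define TOY_BACKLOG_MAX 16u  /* capacity of the kernel-side overflow backlog */

struct toy_ring {
	int cq[TOY_CQ_ENTRIES];       /* completions visible to userspace */
	unsigned head, tail;          /* free-running CQ indices */
	int backlog[TOY_BACKLOG_MAX]; /* completions that did not fit in the CQ */
	unsigned backlog_len;
	bool overflow_flag;           /* hypothetical exported overflow status */
};

/* "Kernel" side: post a completion, or queue it on the backlog if the CQ is
 * full (or the backlog is already in use, to preserve ordering). */
void toy_post_cqe(struct toy_ring *r, int res)
{
	if (r->backlog_len == 0 && r->tail - r->head < TOY_CQ_ENTRIES) {
		r->cq[r->tail++ % TOY_CQ_ENTRIES] = res;
		return;
	}
	if (r->backlog_len < TOY_BACKLOG_MAX) { /* the model drops anything beyond this */
		r->backlog[r->backlog_len++] = res;
		r->overflow_flag = true;
	}
}

/* Stand-in for entering the kernel: move backlog entries into free CQ slots. */
void toy_enter_flush(struct toy_ring *r)
{
	unsigned moved = 0;

	while (moved < r->backlog_len && r->tail - r->head < TOY_CQ_ENTRIES)
		r->cq[r->tail++ % TOY_CQ_ENTRIES] = r->backlog[moved++];
	for (unsigned i = moved; i < r->backlog_len; i++)
		r->backlog[i - moved] = r->backlog[i];
	r->backlog_len -= moved;
	r->overflow_flag = r->backlog_len != 0;
}

/* Peek that only looks at the visible CQ and never flushes. */
int toy_peek_plain(struct toy_ring *r, int *res)
{
	if (r->head == r->tail)
		return -EAGAIN;
	*res = r->cq[r->head % TOY_CQ_ENTRIES];
	return 0;
}

/* Peek that flushes first when the CQ is empty and overflow is flagged. */
int toy_peek_overflow_aware(struct toy_ring *r, int *res)
{
	if (r->head == r->tail && r->overflow_flag)
		toy_enter_flush(r);
	return toy_peek_plain(r, res);
}

/* Reap up to 'want' completions; stop at the first -EAGAIN, where a real
 * peek-only loop would keep spinning. Returns how many were reaped. */
unsigned toy_reap(struct toy_ring *r, unsigned want, bool aware)
{
	unsigned got = 0;
	int res;

	while (got < want) {
		int ret = aware ? toy_peek_overflow_aware(r, &res)
				: toy_peek_plain(r, &res);
		if (ret)
			break;
		r->head++; /* mark the cqe as seen */
		got++;
	}
	return got;
}

/* Usage example: post 10 completions into a 4-entry CQ, then reap both ways. */
int toy_demo(void)
{
	struct toy_ring plain = {0}, aware = {0};

	for (int i = 0; i < 10; i++) {
		toy_post_cqe(&plain, i);
		toy_post_cqe(&aware, i);
	}
	assert(toy_reap(&plain, 10, false) == 4); /* stalls at the ring size */
	assert(toy_reap(&aware, 10, true) == 10); /* drains the backlog too */
	return 0;
}
```

Calling toy_demo() from a main() exercises both paths. The plain reader is
left with six completions it can never see, which is the same stall my test
program hits without SQPOLL.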

Regards,
Xiaoguang Wang
> 
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index d37d7ea5ebe5..d409bd68553f 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -6110,8 +6110,18 @@ static int io_sq_thread(void *data)
>   		}
>   
>   		mutex_lock(&ctx->uring_lock);
> -		if (likely(!percpu_ref_is_dying(&ctx->refs)))
> +		if (likely(!percpu_ref_is_dying(&ctx->refs))) {
> +retry:
>   			ret = io_submit_sqes(ctx, to_submit, NULL, -1);
> +			if (unlikely(ret == -EBUSY)) {
> +				ctx->rings->sq_flags |= IORING_SQ_NEED_WAKEUP;
> +				smp_mb();
> +				if (io_cqring_overflow_flush(ctx, false)) {
> +					ctx->rings->sq_flags &= ~IORING_SQ_NEED_WAKEUP;
> +					goto retry;
> +				}
> +			}
> +		}
>   		mutex_unlock(&ctx->uring_lock);
>   		timeout = jiffies + ctx->sq_thread_idle;
>   	}
> 
