public inbox for [email protected]
From: Pavel Begunkov <[email protected]>
To: Hao Xu <[email protected]>, Jens Axboe <[email protected]>
Cc: [email protected], Joseph Qi <[email protected]>
Subject: Re: [PATCH 2/3] io_uring: maintain drain logic for multishot requests
Date: Wed, 7 Apr 2021 12:41:46 +0100
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>

On 07/04/2021 12:23, Hao Xu wrote:
> Now that we have multishot poll requests, one sqe can emit multiple
> cqes. Consider the example below:
>     sqe0(multishot poll)-->sqe1-->sqe2(drain req)
> sqe2 is designed to issue after sqe0 and sqe1 have completed, but since
> sqe0 is a multishot poll request, sqe2 may be issued after sqe0's event
> has triggered twice and before sqe1 has completed. This isn't what users
> leverage drain requests for.
> Here a simple solution is to ignore all multishot poll cqes, which means
> drain requests won't wait for those requests to be done.
> To achieve this, we should reconsider the req_need_defer equation. The
> original one is:
> 
>     all_sqes(excluding dropped ones) == all_cqes(including dropped ones)
> 
> This means we issue a drain request when all the previously submitted
> sqes have generated their cqes.
> Now we should ignore multishot requests, so:
>     all_sqes - multishot_sqes == all_cqes - multishot_cqes ==>
>     all_sqes + multishot_cqes - multishot_sqes == all_cqes
> 
> Thus we have to track the submission of a multishot request and the cqes
> it generates, including the ECANCELED cqes. Here we introduce
> cq_extra = multishot_cqes - multishot_sqes for it.
> 
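
For illustration only (not part of the patch): a minimal user-space model
of the accounting described above, taking cq_extra as multishot cqes minus
multishot sqes, which is what the increment and decrement in the diff
implement. struct ring_model and the model_* helpers are hypothetical
names invented for this sketch; the real counters live in struct
io_ring_ctx, and cq_overflow is left out for brevity.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of the counters; not kernel code. */
struct ring_model {
	unsigned sq_seq;	/* all submitted sqes */
	unsigned cq_tail;	/* all posted cqes */
	unsigned cq_extra;	/* multishot cqes - multishot sqes (may wrap) */
};

static void model_submit(struct ring_model *r, bool multishot)
{
	r->sq_seq++;
	if (multishot)
		r->cq_extra--;	/* take the multishot sqe out of the balance */
}

static void model_complete(struct ring_model *r, bool multishot)
{
	r->cq_tail++;
	if (multishot)
		r->cq_extra++;	/* take the multishot cqe out of the balance */
}

/* Original check: all_sqes != all_cqes */
static bool drain_must_wait_unpatched(const struct ring_model *r)
{
	return r->sq_seq != r->cq_tail;
}

/* Patched check: all_sqes + cq_extra != all_cqes */
static bool drain_must_wait(const struct ring_model *r)
{
	return r->sq_seq + r->cq_extra != r->cq_tail;
}

/* sqe0(multishot poll)-->sqe1-->sqe2(drain req), sqe0 fires twice first */
static void model_example(void)
{
	struct ring_model r = {0};

	model_submit(&r, true);		/* sqe0 */
	model_submit(&r, false);	/* sqe1 */
	model_complete(&r, true);	/* sqe0, first event */
	model_complete(&r, true);	/* sqe0, second event */

	assert(!drain_must_wait_unpatched(&r));	/* drain issued too early */
	assert(drain_must_wait(&r));		/* still waits for sqe1 */

	model_complete(&r, false);		/* sqe1 done */
	assert(!drain_must_wait(&r));		/* drain may issue now */
}
```

The unsigned wraparound in cq_extra is harmless in this model because the
comparison is done modulo 2^32 on both sides.
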
> There are other solutions like:
>   - just track multishot (non-ECANCELED) cqes, don't track multishot sqes.
>       This way we include multishot sqes in the left end of the equation,
>       which means we have to see multishot sqes as normal ones, and then
>       we have to keep exactly one cqe for each multishot sqe. It's hard
>       to do this since there may be some multishot sqes which triggered
>       several events and then were cancelled, while other multishot
>       sqes just triggered events but weren't cancelled. We would still
>       need to track the number of multishot sqes that haven't been
>       cancelled, which makes things complicated.
> 
> For the implementation, just do the submission tracking in
> io_submit_sqe() --> io_init_req() to make things simple. Otherwise, if
> we do it in the per-opcode issue path, we need to carefully consider
> each caller of io_req_complete_failed() because of tricky cases like
> cancelling multishot reqs in a link.
> 
> Signed-off-by: Hao Xu <[email protected]>
> ---
>  fs/io_uring.c | 16 +++++++++++++++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 192463bb977a..a7bd223ce2cc 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -423,6 +423,7 @@ struct io_ring_ctx {
>  		unsigned		cq_mask;
>  		atomic_t		cq_timeouts;
>  		unsigned		cq_last_tm_flush;
> +		unsigned		cq_extra;
>  		unsigned long		cq_check_overflow;
>  		struct wait_queue_head	cq_wait;
>  		struct fasync_struct	*cq_fasync;
> @@ -879,6 +880,8 @@ struct io_op_def {
>  	unsigned		needs_async_setup : 1;
>  	/* should block plug */
>  	unsigned		plug : 1;
> +	/* set if opcode may generate multiple cqes */
> +	unsigned		multi_cqes : 1;
>  	/* size of async data needed, if any */
>  	unsigned short		async_size;
>  };
> @@ -924,6 +927,7 @@ struct io_op_def {
>  	[IORING_OP_POLL_ADD] = {
>  		.needs_file		= 1,
>  		.unbound_nonreg_file	= 1,
> +		.multi_cqes		= 1,
>  	},
>  	[IORING_OP_POLL_REMOVE] = {},
>  	[IORING_OP_SYNC_FILE_RANGE] = {
> @@ -1186,7 +1190,7 @@ static bool req_need_defer(struct io_kiocb *req, u32 seq)
>  	if (unlikely(req->flags & REQ_F_IO_DRAIN)) {
>  		struct io_ring_ctx *ctx = req->ctx;
>  
> -		return seq != ctx->cached_cq_tail
> +		return seq + ctx->cq_extra != ctx->cached_cq_tail
>  				+ READ_ONCE(ctx->cached_cq_overflow);
>  	}
>  
> @@ -1516,6 +1520,9 @@ static bool __io_cqring_fill_event(struct io_kiocb *req, long res,
>  
>  	trace_io_uring_complete(ctx, req->user_data, res, cflags);
>  
> +	if (req->flags & REQ_F_MULTI_CQES)
> +		req->ctx->cq_extra++;
> +


Here we go: additional overhead that burdens everyone, for the sake of
one small new feature. All of that can be done in poll or in *_prep(),
on an opcode-by-opcode basis.
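
A rough user-space sketch of one possible reading of that suggestion,
purely as an assumption about the shape it could take: the generic
completion path stays free of any multishot check, and only the poll
prep and poll completion helpers touch cq_extra. All model_* names and
both structs are hypothetical and do not exist in fs/io_uring.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for io_ring_ctx / io_kiocb; not kernel code. */
struct model_ctx {
	unsigned cq_tail;
	unsigned cq_extra;
};

struct model_req {
	struct model_ctx *ctx;
	bool multishot;
};

/* Generic completion path: no per-request flag test here. */
static void model_fill_event(struct model_req *req)
{
	req->ctx->cq_tail++;
}

/* Poll-only prep: account for the multishot sqe at prep time. */
static int model_poll_add_prep(struct model_req *req, bool multishot)
{
	req->multishot = multishot;
	if (multishot)
		req->ctx->cq_extra--;
	return 0;
}

/* Poll-only completion: account for the extra cqe, then post it. */
static void model_poll_complete(struct model_req *req)
{
	if (req->multishot)
		req->ctx->cq_extra++;
	model_fill_event(req);
}

/* Any other opcode completes through the generic path untouched. */
static void model_nop_complete(struct model_req *req)
{
	model_fill_event(req);
}

static void model_per_opcode_example(void)
{
	struct model_ctx ctx = {0};
	struct model_req poll = { .ctx = &ctx };
	struct model_req nop = { .ctx = &ctx };

	model_poll_add_prep(&poll, true);
	model_poll_complete(&poll);
	model_poll_complete(&poll);
	model_nop_complete(&nop);

	assert(ctx.cq_tail == 3);
	assert(ctx.cq_extra == 1);	/* 2 multishot cqes - 1 multishot sqe */
}
```

In this model the cost of the extra branch is paid only by poll requests,
while requests of every other opcode complete without touching cq_extra.
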

>  	/*
>  	 * If we can't get a cq entry, userspace overflowed the
>  	 * submission (by quite a lot). Increment the overflow count in
> @@ -6504,6 +6511,13 @@ static int io_init_req(struct io_ring_ctx *ctx, struct io_kiocb *req,
>  	req->result = 0;
>  	req->work.creds = NULL;
>  
> +	if (sqe_flags & IOSQE_MULTI_CQES) {
> +		ctx->cq_extra--;
> +		if (!io_op_defs[req->opcode].multi_cqes) {
> +			return -EOPNOTSUPP;
> +		}
> +	}
> +

see above

>  	/* enforce forwards compatibility on users */
>  	if (unlikely(sqe_flags & ~SQE_VALID_FLAGS)) {
>  		req->flags = 0;
> 

-- 
Pavel Begunkov
