From: Stefan Roesch <[email protected]>
To: Jens Axboe <[email protected]>
Cc: [email protected], [email protected],
	[email protected],
	Olivier Langlois <[email protected]>,
	Jakub Kicinski <[email protected]>
Subject: Re: [PATCH v10 2/5] io-uring: add napi busy poll support
Date: Thu, 27 Apr 2023 09:27:02 -0700
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>


Jens Axboe <[email protected]> writes:

> On 4/25/23 12:18 PM, Stefan Roesch wrote:
>
> Not too much to complain about, just some minor cleanups that would be
> nice to do.
>
>> diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
>> index 1b2a20a42413..2b2ca990ee93 100644
>> --- a/include/linux/io_uring_types.h
>> +++ b/include/linux/io_uring_types.h
>> @@ -277,6 +278,15 @@ struct io_ring_ctx {
>>  	struct xarray		personalities;
>>  	u32			pers_next;
>>
>> +#ifdef CONFIG_NET_RX_BUSY_POLL
>> +	struct list_head	napi_list;	/* track busy poll napi_id */
>> +	spinlock_t		napi_lock;	/* napi_list lock */
>> +
>> +	DECLARE_HASHTABLE(napi_ht, 4);
>> +	unsigned int		napi_busy_poll_to; /* napi busy poll default timeout */
>> +	bool			napi_prefer_busy_poll;
>> +#endif
>> +
>
> I don't mind overly long lines if it's warranted, for a comment it is
> not. This should just go above the variable.
>

Fixed. I was just following what sq_creds was doing a bit earlier in the
file.

>> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
>> index efbd6c9c56e5..fff8f84eb560 100644
>> --- a/io_uring/io_uring.c
>> +++ b/io_uring/io_uring.c
>>  	iowq.cq_tail = READ_ONCE(ctx->rings->cq.head) + min_events;
>>  	iowq.timeout = KTIME_MAX;
>>
>> -	if (uts) {
>> -		struct timespec64 ts;
>> +	if (!io_napi(ctx)) {
>> +		if (uts) {
>> +			struct timespec64 ts;
>>
>> -		if (get_timespec64(&ts, uts))
>> -			return -EFAULT;
>> -		iowq.timeout = ktime_add_ns(timespec64_to_ktime(ts), ktime_get_ns());
>> +			if (get_timespec64(&ts, uts))
>> +				return -EFAULT;
>> +			iowq.timeout = ktime_add_ns(timespec64_to_ktime(ts), ktime_get_ns());
>> +		}
>> +	} else {
>> +		if (uts) {
>> +			struct timespec64 ts;
>> +
>> +			if (get_timespec64(&ts, uts))
>> +				return -EFAULT;
>> +
>> +			io_napi_adjust_timeout(ctx, &iowq, &ts);
>> +			iowq.timeout = ktime_add_ns(timespec64_to_ktime(ts), ktime_get_ns());
>> +		} else {
>> +			io_napi_adjust_timeout(ctx, &iowq, NULL);
>> +		}
>> +		io_napi_busy_loop(ctx, &iowq);
>>  	}
>
> This is a little bit of a mess and has a lot of duplication, that is not
> ideal. I'd do something like the end-of-email incremental to avoid that.
> Note that it's totally untested...
>
>>  	trace_io_uring_cqring_wait(ctx, min_events);
>> +
>>  	do {
>>  		unsigned long check_cq;
>>
>
> Spurious line addition here.
>

Fixed.

>> diff --git a/io_uring/napi.c b/io_uring/napi.c
>> new file mode 100644
>> index 000000000000..bb7d2b6b7e90
>> --- /dev/null
>> +++ b/io_uring/napi.c
>> +static inline void adjust_timeout(unsigned int poll_to, struct timespec64 *ts,
>> +		unsigned int *new_poll_to)
>> +{
>> +	struct timespec64 pollto = ns_to_timespec64(1000 * (s64)poll_to);
>
> There's a bunch of these, but I'll just mention it here - io_uring
> always just aligns a second line of arguments with the first one. We
> should do that here too.
>

Fixed.

>> +	if (timespec64_compare(ts, &pollto) > 0) {
>> +		*ts = timespec64_sub(*ts, pollto);
>> +		*new_poll_to = poll_to;
>> +	} else {
>> +		u64 to = timespec64_to_ns(ts);
>> +
>> +		do_div(to, 1000);
>
> Is this going to complain on 32-bit?
>

My understanding is that this works on 32-bit, since 'to' is a u64 and
do_div() is the helper meant for 64-bit division there.
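
To illustrate the semantics in question: do_div() divides a 64-bit
dividend by a 32-bit divisor in place and returns the remainder. Below is
a minimal user-space sketch of that behavior. The names sketch_do_div()
and sketch_ns_to_usecs() are made up for illustration and are not kernel
code; the real macro lives in the kernel's div64 headers.

```c
#include <assert.h>
#include <stdint.h>

/*
 * User-space stand-in for the do_div() semantics (hypothetical helper, not
 * the kernel macro): divide *n by base in place and return the remainder.
 */
static inline uint32_t sketch_do_div(uint64_t *n, uint32_t base)
{
	uint32_t rem;

	assert(base != 0);
	rem = (uint32_t)(*n % base);
	*n /= base;
	return rem;
}

/* Convert nanoseconds to microseconds the way the quoted hunk does. */
static inline uint64_t sketch_ns_to_usecs(uint64_t ns)
{
	uint64_t to = ns;

	/* The remainder is ignored, as in the patch. */
	(void)sketch_do_div(&to, 1000);
	return to;
}
```

The point of the in-place form is that the dividend stays a 64-bit value
while the divisor is only 32 bits wide, which is the case the patch hits
with a u64 nanosecond count and a divisor of 1000.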

>> +static void io_napi_multi_busy_loop(struct list_head *napi_list,
>> +		struct io_wait_queue *iowq)
>> +{
>> +	unsigned long start_time = busy_loop_current_time();
>> +
>> +	do {
>> +		if (list_is_singular(napi_list))
>> +			break;
>> +		if (!__io_napi_busy_loop(napi_list, iowq->napi_prefer_busy_poll))
>> +			break;
>> +	} while (!io_napi_busy_loop_should_end(iowq, start_time));
>> +}
>
> Do we need to check for an empty list here?
>

This function is only reached from io_cqring_wait() via
io_napi_busy_loop(), and io_cqring_wait() checks that the napi list is
not empty before calling it.

>> +static void io_napi_blocking_busy_loop(struct list_head *napi_list,
>> +		struct io_wait_queue *iowq)
>> +{
>> +	if (!list_is_singular(napi_list))
>> +		io_napi_multi_busy_loop(napi_list, iowq);
>> +
>> +	if (list_is_singular(napi_list)) {
>> +		struct io_napi_ht_entry *ne;
>> +
>> +		ne = list_first_entry(napi_list, struct io_napi_ht_entry, list);
>> +		napi_busy_loop(ne->napi_id, io_napi_busy_loop_should_end, iowq,
>> +			iowq->napi_prefer_busy_poll, BUSY_POLL_BUDGET);
>> +	}
>> +}
>
> Presumably io_napi_multi_busy_loop() can change the state of the list,
> which is why we have if (cond) and then if (!cond) here? Would probably
> warrant a comment as it looks a bit confusing.
>

I added a comment.
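
For readers following the control flow, here is a small user-space sketch
of why the singular check appears twice. It ASSUMES, as the review comment
does, that the multi-entry pass can shrink the list. The list helpers and
the sketch_* functions are hypothetical stand-ins, not the code from the
patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modeled on the kernel's list_head. */
struct list_node {
	struct list_node *next, *prev;
};

static void list_init(struct list_node *head)
{
	head->next = head;
	head->prev = head;
}

static void list_add_tail_node(struct list_node *node, struct list_node *head)
{
	assert(node != head);
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

static void list_del_node(struct list_node *node)
{
	node->prev->next = node->next;
	node->next->prev = node->prev;
	node->next = node;
	node->prev = node;
}

static bool list_empty_nodes(const struct list_node *head)
{
	return head->next == head;
}

/* True when the list holds exactly one entry. */
static bool list_singular(const struct list_node *head)
{
	return !list_empty_nodes(head) && head->next == head->prev;
}

/*
 * Hypothetical multi-entry pass. ASSUMPTION: the pass may drop entries;
 * here it simply drops the last one.
 */
static void sketch_multi_pass(struct list_node *head)
{
	if (!list_empty_nodes(head) && !list_singular(head))
		list_del_node(head->prev);
}

/* Returns true if the single-entry path would be taken. */
static bool sketch_blocking_loop(struct list_node *head)
{
	if (!list_singular(head))
		sketch_multi_pass(head);

	/* Re-check: the pass above may have left exactly one entry. */
	return list_singular(head);
}
```

With two entries, the first check fails, the pass drops one entry, and the
second check then succeeds. With three entries the list is still not
singular after the pass, so the single-entry path is skipped.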

>> +/*
>> + * io_napi_adjust_timeout() - Add napi id to the busy poll list
>> + * @ctx: pointer to io-uring context structure
>> + * @iowq: pointer to io wait queue
>> + * @ts: pointer to timespec or NULL
>> + *
>> + * Adjust the busy loop timeout according to timespec and busy poll timeout.
>> + */
>> +void io_napi_adjust_timeout(struct io_ring_ctx *ctx, struct io_wait_queue *iowq,
>> +		struct timespec64 *ts)
>> +{
>> +	if (ts)
>> +		adjust_timeout(READ_ONCE(ctx->napi_busy_poll_to), ts,
>> +			&iowq->napi_busy_poll_to);
>> +	else
>> +		iowq->napi_busy_poll_to = READ_ONCE(ctx->napi_busy_poll_to);
>> +}
>
> We should probably just pass 'ctx' to adjust_timeout()? Or do
>
> 	unsigned int poll_to = READ_ONCE(ctx->napi_busy_poll_to);
>
> at the top and then use that for both. Would get rid of that overly long
> line too.
>

I think it makes sense to combine the two functions. I'll also add a
variable at the top of the function, as in your example above.
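
A rough user-space sketch of what the combined helper could look like,
with the poll timeout read once at the top. This is not the code from the
next revision. struct sketch_wait_queue, the SKETCH_* constants and the
function names are hypothetical, and the tail of the else branch (storing
the remaining wait time as the poll timeout and zeroing the timespec) is
an ASSUMPTION, since the quoted hunk stops at the do_div() call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#define SKETCH_NSEC_PER_SEC	1000000000LL
#define SKETCH_NSEC_PER_USEC	1000LL

/* Hypothetical stand-in for the wait-queue field touched by the patch. */
struct sketch_wait_queue {
	unsigned int napi_busy_poll_to;	/* effective busy poll time, usec */
};

static int64_t sketch_ts_to_ns(const struct timespec *ts)
{
	return (int64_t)ts->tv_sec * SKETCH_NSEC_PER_SEC + ts->tv_nsec;
}

/* ctx_poll_to plays the role of READ_ONCE(ctx->napi_busy_poll_to). */
static void sketch_adjust_timeout(unsigned int ctx_poll_to,
				  struct sketch_wait_queue *iowq,
				  struct timespec *ts)
{
	unsigned int poll_to = ctx_poll_to;	/* read once, used below */
	int64_t poll_ns = (int64_t)poll_to * SKETCH_NSEC_PER_USEC;
	int64_t wait_ns;

	assert(iowq != NULL);
	if (!ts) {
		/* No wait timeout: use the configured busy poll timeout. */
		iowq->napi_busy_poll_to = poll_to;
		return;
	}

	wait_ns = sketch_ts_to_ns(ts);
	if (wait_ns > poll_ns) {
		/* Busy poll for the full budget, sleep for the remainder. */
		wait_ns -= poll_ns;
		ts->tv_sec = wait_ns / SKETCH_NSEC_PER_SEC;
		ts->tv_nsec = wait_ns % SKETCH_NSEC_PER_SEC;
		iowq->napi_busy_poll_to = poll_to;
	} else {
		/* ASSUMED: spend the whole wait busy polling, then no sleep. */
		iowq->napi_busy_poll_to =
			(unsigned int)(wait_ns / SKETCH_NSEC_PER_USEC);
		ts->tv_sec = 0;
		ts->tv_nsec = 0;
	}
}
```

Folding both cases into one function means the timeout is read a single
time and the long call line from the original goes away.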

> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
> index a4c9a404f631..390f54c546d6 100644
> --- a/io_uring/io_uring.c
> +++ b/io_uring/io_uring.c
> @@ -2617,29 +2617,17 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events,
>  	iowq.cq_tail = READ_ONCE(ctx->rings->cq.head) + min_events;
>  	iowq.timeout = KTIME_MAX;
>
> -	if (!io_napi(ctx)) {
> -		if (uts) {
> -			struct timespec64 ts;
> +	if (uts) {
> +		struct timespec64 ts;
>
> -			if (get_timespec64(&ts, uts))
> -				return -EFAULT;
> -			iowq.timeout = ktime_add_ns(timespec64_to_ktime(ts), ktime_get_ns());
> -		}
> -	} else {
> -		if (uts) {
> -			struct timespec64 ts;
> -
> -			if (get_timespec64(&ts, uts))
> -				return -EFAULT;
> -
> -			io_napi_adjust_timeout(ctx, &iowq, &ts);
> -			iowq.timeout = ktime_add_ns(timespec64_to_ktime(ts), ktime_get_ns());
> -		} else {
> -			io_napi_adjust_timeout(ctx, &iowq, NULL);
> -		}
> -		io_napi_busy_loop(ctx, &iowq);
> +		if (get_timespec64(&ts, uts))
> +			return -EFAULT;
> +		iowq.timeout = ktime_add_ns(timespec64_to_ktime(ts), ktime_get_ns());
> +		io_napi_adjust_timeout(ctx, &iowq, &ts);
>  	}
>
> +	io_napi_busy_loop(ctx, &iowq);
> +
>  	trace_io_uring_cqring_wait(ctx, min_events);
>
>  	do {
> diff --git a/io_uring/napi.c b/io_uring/napi.c
> index ca12ff5f5611..3a0d0317ceec 100644
> --- a/io_uring/napi.c
> +++ b/io_uring/napi.c
> @@ -100,7 +100,8 @@ static bool io_napi_busy_loop_should_end(void *p, unsigned long start_time)
>  	       io_napi_busy_loop_timeout(start_time, iowq->napi_busy_poll_to);
>  }
>
> -static bool __io_napi_busy_loop(struct list_head *napi_list, bool prefer_busy_poll)
> +static bool __io_napi_do_busy_loop(struct list_head *napi_list,
> +				   bool prefer_busy_poll)
>  {
>  	struct io_napi_ht_entry *e;
>  	struct io_napi_ht_entry *n;
> @@ -121,7 +122,7 @@ static void io_napi_multi_busy_loop(struct list_head *napi_list,
>  	do {
>  		if (list_is_singular(napi_list))
>  			break;
> -		if (!__io_napi_busy_loop(napi_list, iowq->napi_prefer_busy_poll))
> +		if (!__io_napi_do_busy_loop(napi_list, iowq->napi_prefer_busy_poll))
>  			break;
>  	} while (!io_napi_busy_loop_should_end(iowq, start_time));
>  }
> @@ -251,16 +252,18 @@ int io_unregister_napi(struct io_ring_ctx *ctx, void __user *arg)
>  }
>
>  /*
> - * io_napi_adjust_timeout() - Add napi id to the busy poll list
> + * __io_napi_adjust_timeout() - Add napi id to the busy poll list
>   * @ctx: pointer to io-uring context structure
>   * @iowq: pointer to io wait queue
>   * @ts: pointer to timespec or NULL
>   *
>   * Adjust the busy loop timeout according to timespec and busy poll timeout.
>   */
> -void io_napi_adjust_timeout(struct io_ring_ctx *ctx, struct io_wait_queue *iowq,
> -		struct timespec64 *ts)
> +void __io_napi_adjust_timeout(struct io_ring_ctx *ctx,
> +			      struct io_wait_queue *iowq, struct timespec64 *ts)
>  {
> +	if (!io_napi(ctx))
> +		return;
>  	if (ts)
>  		adjust_timeout(READ_ONCE(ctx->napi_busy_poll_to), ts,
>  			&iowq->napi_busy_poll_to);
> @@ -269,13 +272,13 @@ void io_napi_adjust_timeout(struct io_ring_ctx *ctx, struct io_wait_queue *iowq,
>  }
>
>  /*
> - * io_napi_busy_loop() - execute busy poll loop
> + * __io_napi_busy_loop() - execute busy poll loop
>   * @ctx: pointer to io-uring context structure
>   * @iowq: pointer to io wait queue
>   *
>   * Execute the busy poll loop and merge the spliced off list.
>   */
> -void io_napi_busy_loop(struct io_ring_ctx *ctx, struct io_wait_queue *iowq)
> +void __io_napi_busy_loop(struct io_ring_ctx *ctx, struct io_wait_queue *iowq)
>  {
>  	iowq->napi_prefer_busy_poll = READ_ONCE(ctx->napi_prefer_busy_poll);
>
> @@ -302,8 +305,8 @@ void io_napi_busy_loop(struct io_ring_ctx *ctx, struct io_wait_queue *iowq)
>   */
>  int io_napi_sqpoll_busy_poll(struct io_ring_ctx *ctx)
>  {
> -	int ret = 0;
>  	LIST_HEAD(napi_list);
> +	int ret;
>
>  	if (!READ_ONCE(ctx->napi_busy_poll_to))
>  		return 0;
> @@ -312,9 +315,7 @@ int io_napi_sqpoll_busy_poll(struct io_ring_ctx *ctx)
>  	list_splice_init(&ctx->napi_list, &napi_list);
>  	spin_unlock(&ctx->napi_lock);
>
> -	if (__io_napi_busy_loop(&napi_list, ctx->napi_prefer_busy_poll))
> -		ret = 1;
> -
> +	ret = __io_napi_do_busy_loop(&napi_list, ctx->napi_prefer_busy_poll);
>  	io_napi_merge_lists(ctx, &napi_list);
>  	return ret;
>  }
> diff --git a/io_uring/napi.h b/io_uring/napi.h
> index 8da8f032a441..b5e93b3777c0 100644
> --- a/io_uring/napi.h
> +++ b/io_uring/napi.h
> @@ -17,9 +17,9 @@ int io_unregister_napi(struct io_ring_ctx *ctx, void __user *arg);
>
>  void __io_napi_add(struct io_ring_ctx *ctx, struct file *file);
>
> -void io_napi_adjust_timeout(struct io_ring_ctx *ctx,
> +void __io_napi_adjust_timeout(struct io_ring_ctx *ctx,
>  		struct io_wait_queue *iowq, struct timespec64 *ts);
> -void io_napi_busy_loop(struct io_ring_ctx *ctx, struct io_wait_queue *iowq);
> +void __io_napi_busy_loop(struct io_ring_ctx *ctx, struct io_wait_queue *iowq);
>  int io_napi_sqpoll_busy_poll(struct io_ring_ctx *ctx);
>
>  static inline bool io_napi(struct io_ring_ctx *ctx)
> @@ -27,6 +27,23 @@ static inline bool io_napi(struct io_ring_ctx *ctx)
>  	return !list_empty(&ctx->napi_list);
>  }
>
> +static inline void io_napi_adjust_timeout(struct io_ring_ctx *ctx,
> +					  struct io_wait_queue *iowq,
> +					  struct timespec64 *ts)
> +{
> +	if (!io_napi(ctx))
> +		return;
> +	__io_napi_adjust_timeout(ctx, iowq, ts);
> +}
> +
> +static inline void io_napi_busy_loop(struct io_ring_ctx *ctx,
> +				     struct io_wait_queue *iowq)
> +{
> +	if (!io_napi(ctx))
> +		return;
> +	__io_napi_busy_loop(ctx, iowq);
> +}
> +
>  /*
>   * io_napi_add() - Add napi id to the busy poll list
>   * @req: pointer to io_kiocb request
>

I'll have a look at the above proposal.
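
The core of the proposal is an inline wrapper that does the cheap "is napi
in use" check and only then calls the out-of-line function, so the call
site in io_cqring_wait() needs no branches of its own. A minimal user-space
sketch of that pattern follows. struct sketch_ctx and all sketch_* names
are hypothetical stand-ins, not the kernel structures.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the ring context. */
struct sketch_ctx {
	unsigned int nr_napi;		/* number of tracked napi ids */
	unsigned int slow_path_calls;	/* how often the slow path ran */
};

/* Cheap check, analogous to testing whether the napi list is empty. */
static inline bool sketch_napi_enabled(const struct sketch_ctx *ctx)
{
	return ctx->nr_napi != 0;
}

/* Out-of-line slow path; only meaningful when napi ids are tracked. */
static void sketch_busy_loop_slow(struct sketch_ctx *ctx)
{
	assert(sketch_napi_enabled(ctx));
	ctx->slow_path_calls++;
}

/* Inline wrapper: callers invoke this unconditionally. */
static inline void sketch_busy_loop(struct sketch_ctx *ctx)
{
	if (!sketch_napi_enabled(ctx))
		return;
	sketch_busy_loop_slow(ctx);
}
```

When no napi ids are registered, the wrapper returns without making the
function call, which keeps the common non-napi wait path cheap.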
