public inbox for [email protected]
* [PATCH 1/2] io_uring: micro optimization of __io_sq_thread() condition
  2024-07-30 21:19 [PATCH 0/2] io_uring: minor sqpoll code refactoring Olivier Langlois
@ 2024-07-30 20:56 ` Olivier Langlois
  2024-08-02 11:17   ` Pavel Begunkov
  2024-07-30 21:10 ` [PATCH 2/2] io_uring: do the sqpoll napi busy poll outside the submission block Olivier Langlois
  2024-08-02 13:11 ` (subset) [PATCH 0/2] io_uring: minor sqpoll code refactoring Jens Axboe
  2 siblings, 1 reply; 8+ messages in thread
From: Olivier Langlois @ 2024-07-30 20:56 UTC (permalink / raw)
  To: Jens Axboe, Pavel Begunkov, io-uring

reverse the order in which the two operands of an if condition are
evaluated.

for the many users that are not using iopoll, the
!wq_list_empty(&ctx->iopoll_list) test always evaluates to false, and
only after having made a memory access, whereas to_submit is very likely
already loaded in a register.

Signed-off-by: Olivier Langlois <[email protected]>
---
 io_uring/sqpoll.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/io_uring/sqpoll.c b/io_uring/sqpoll.c
index b3722e5275e7..cc4a25136030 100644
--- a/io_uring/sqpoll.c
+++ b/io_uring/sqpoll.c
@@ -176,7 +176,7 @@ static int __io_sq_thread(struct io_ring_ctx *ctx, bool cap_entries)
 	if (cap_entries && to_submit > IORING_SQPOLL_CAP_ENTRIES_VALUE)
 		to_submit = IORING_SQPOLL_CAP_ENTRIES_VALUE;
 
-	if (!wq_list_empty(&ctx->iopoll_list) || to_submit) {
+	if (to_submit || !wq_list_empty(&ctx->iopoll_list)) {
 		const struct cred *creds = NULL;
 
 		if (ctx->sq_creds != current_cred())
-- 
2.45.2
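
Editorial sketch of the idea in this patch, in userspace C. It relies only
on the short-circuit rule of the || operator: the right operand is not
evaluated when the left one is already true. The names demo_list,
demo_list_empty() and demo_list_reads are hypothetical stand-ins used to
make the memory access countable; they are not kernel APIs, and the sketch
says nothing about how much time the reordering saves in practice.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel work list; not the real type. */
struct demo_list {
	void *first;
};

/* Counts list head reads so the avoided memory access becomes visible. */
static unsigned int demo_list_reads;

static bool demo_list_empty(const struct demo_list *list)
{
	demo_list_reads++;
	return list->first == NULL;
}

/* Old operand order: the list head is always read first. */
static bool demo_has_work_old(const struct demo_list *iopoll_list,
			      unsigned int to_submit)
{
	return !demo_list_empty(iopoll_list) || to_submit;
}

/* New operand order: the list head is read only when to_submit is zero. */
static bool demo_has_work_new(const struct demo_list *iopoll_list,
			      unsigned int to_submit)
{
	return to_submit || !demo_list_empty(iopoll_list);
}
```

Both helpers return the same result for every input; only the number of
list reads differs when to_submit is non-zero.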



* [PATCH 2/2] io_uring: do the sqpoll napi busy poll outside the submission block
  2024-07-30 21:19 [PATCH 0/2] io_uring: minor sqpoll code refactoring Olivier Langlois
  2024-07-30 20:56 ` [PATCH 1/2] io_uring: micro optimization of __io_sq_thread() condition Olivier Langlois
@ 2024-07-30 21:10 ` Olivier Langlois
  2024-08-02 11:14   ` Pavel Begunkov
  2024-08-02 13:11 ` (subset) [PATCH 0/2] io_uring: minor sqpoll code refactoring Jens Axboe
  2 siblings, 1 reply; 8+ messages in thread
From: Olivier Langlois @ 2024-07-30 21:10 UTC (permalink / raw)
  To: Jens Axboe, Pavel Begunkov, io-uring

there are several small reasons justifying this change.

1. busy poll must be performed even on rings that have no iopoll and no
   new sqe. It is quite possible for a ring configured for inbound
   traffic with multishot to go several hours without receiving new
   request submissions
2. NAPI busy poll does not perform any credential validation
3. If the thread is awakened by task work, processing the task work takes
   priority over the NAPI busy loop. This is why a second loop has been
   created after the io_sq_tw() call instead of doing the busy loop in
   __io_sq_thread() outside its credential acquisition block.

Signed-off-by: Olivier Langlois <[email protected]>
---
 io_uring/napi.h   | 9 +++++++++
 io_uring/sqpoll.c | 7 ++++---
 2 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/io_uring/napi.h b/io_uring/napi.h
index 88f1c21d5548..5506c6af1ff5 100644
--- a/io_uring/napi.h
+++ b/io_uring/napi.h
@@ -101,4 +101,13 @@ static inline int io_napi_sqpoll_busy_poll(struct io_ring_ctx *ctx)
 }
 #endif /* CONFIG_NET_RX_BUSY_POLL */
 
+static inline int io_do_sqpoll_napi(struct io_ring_ctx *ctx)
+{
+	int ret = 0;
+
+	if (io_napi(ctx))
+		ret = io_napi_sqpoll_busy_poll(ctx);
+	return ret;
+}
+
 #endif
diff --git a/io_uring/sqpoll.c b/io_uring/sqpoll.c
index cc4a25136030..ec558daa0331 100644
--- a/io_uring/sqpoll.c
+++ b/io_uring/sqpoll.c
@@ -195,9 +195,6 @@ static int __io_sq_thread(struct io_ring_ctx *ctx, bool cap_entries)
 			ret = io_submit_sqes(ctx, to_submit);
 		mutex_unlock(&ctx->uring_lock);
 
-		if (io_napi(ctx))
-			ret += io_napi_sqpoll_busy_poll(ctx);
-
 		if (to_submit && wq_has_sleeper(&ctx->sqo_sq_wait))
 			wake_up(&ctx->sqo_sq_wait);
 		if (creds)
@@ -322,6 +319,10 @@ static int io_sq_thread(void *data)
 		if (io_sq_tw(&retry_list, IORING_TW_CAP_ENTRIES_VALUE))
 			sqt_spin = true;
 
+		list_for_each_entry(ctx, &sqd->ctx_list, sqd_list) {
+			if (io_do_sqpoll_napi(ctx))
+				sqt_spin = true;
+		}
 		if (sqt_spin || !time_after(jiffies, timeout)) {
 			if (sqt_spin) {
 				io_sq_update_worktime(sqd, &start);
-- 
2.45.2
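
Editorial sketch of reason 1 above, as a toy userspace model rather than
kernel code. struct demo_ring and the demo_* functions are hypothetical;
they only mirror the shape of the condition in __io_sq_thread() before
the patch and the placement of the busy poll after it.

```c
#include <stdbool.h>

/* Hypothetical model of one ring as seen by the SQPOLL thread. */
struct demo_ring {
	unsigned int to_submit;  /* pending SQEs */
	bool iopoll_pending;     /* stand-in for a non-empty iopoll_list */
	bool napi_enabled;       /* stand-in for io_napi(ctx) */
	unsigned int busy_polls; /* how many times busy poll ran */
};

static void demo_busy_poll(struct demo_ring *r)
{
	if (r->napi_enabled)
		r->busy_polls++;
}

/* Before the patch: busy poll is nested inside the submission block. */
static void demo_iteration_before(struct demo_ring *r)
{
	if (r->to_submit || r->iopoll_pending) {
		r->to_submit = 0; /* pretend the SQEs were submitted */
		demo_busy_poll(r);
	}
}

/* After the patch: busy poll runs whether or not there was work. */
static void demo_iteration_after(struct demo_ring *r)
{
	if (r->to_submit || r->iopoll_pending)
		r->to_submit = 0;
	demo_busy_poll(r);
}
```

In this model, a NAPI-enabled ring with no new SQEs and no iopoll
requests is never busy polled by demo_iteration_before(), while
demo_iteration_after() polls it on every pass.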



* [PATCH 0/2] io_uring: minor sqpoll code refactoring
@ 2024-07-30 21:19 Olivier Langlois
  2024-07-30 20:56 ` [PATCH 1/2] io_uring: micro optimization of __io_sq_thread() condition Olivier Langlois
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Olivier Langlois @ 2024-07-30 21:19 UTC (permalink / raw)
  To: Jens Axboe, Pavel Begunkov, io-uring

the first patch is a minor micro-optimization that attempts to avoid a
memory access by first testing a variable that is very likely already in
a register

the second patch is also small, but it is much more serious. Without it,
a ring that is configured to enable NAPI busy polling may NOT perform
busy polling in specific conditions.

Olivier Langlois (2):
  io_uring: micro optimization of __io_sq_thread() condition
  io_uring: do the sqpoll napi busy poll outside the submission block

 io_uring/napi.h   | 9 +++++++++
 io_uring/sqpoll.c | 9 +++++----
 2 files changed, 14 insertions(+), 4 deletions(-)

-- 
2.45.2



* Re: [PATCH 2/2] io_uring: do the sqpoll napi busy poll outside the submission block
  2024-07-30 21:10 ` [PATCH 2/2] io_uring: do the sqpoll napi busy poll outside the submission block Olivier Langlois
@ 2024-08-02 11:14   ` Pavel Begunkov
  2024-08-02 14:22     ` Olivier Langlois
  0 siblings, 1 reply; 8+ messages in thread
From: Pavel Begunkov @ 2024-08-02 11:14 UTC (permalink / raw)
  To: Olivier Langlois, Jens Axboe, io-uring

On 7/30/24 22:10, Olivier Langlois wrote:
> there are several small reasons justifying this change.
> 
> 1. busy poll must be performed even on rings that have no iopoll and no
>     new sqe. It is quite possible for a ring configured for inbound
>     traffic with multishot to go several hours without receiving new
>     request submissions
> 2. NAPI busy poll does not perform any credential validation
> 3. If the thread is awakened by task work, processing the task work takes
>     priority over the NAPI busy loop. This is why a second loop has been
>     created after the io_sq_tw() call instead of doing the busy loop in
>     __io_sq_thread() outside its credential acquisition block.

That patch should be first as it's a fix we care to backport.
It's also

Fixes: 8d0c12a80cdeb ("io-uring: add napi busy poll support")
Cc: [email protected]

And a comment below

> 
> Signed-off-by: Olivier Langlois <[email protected]>
> ---
>   io_uring/napi.h   | 9 +++++++++
>   io_uring/sqpoll.c | 7 ++++---
>   2 files changed, 13 insertions(+), 3 deletions(-)
> 
> diff --git a/io_uring/napi.h b/io_uring/napi.h
> index 88f1c21d5548..5506c6af1ff5 100644
> --- a/io_uring/napi.h
> +++ b/io_uring/napi.h
> @@ -101,4 +101,13 @@ static inline int io_napi_sqpoll_busy_poll(struct io_ring_ctx *ctx)
>   }
>   #endif /* CONFIG_NET_RX_BUSY_POLL */
>   
> +static inline int io_do_sqpoll_napi(struct io_ring_ctx *ctx)
> +{
> +	int ret = 0;
> +
> +	if (io_napi(ctx))
> +		ret = io_napi_sqpoll_busy_poll(ctx);
> +	return ret;
> +}
> +
>   #endif
> diff --git a/io_uring/sqpoll.c b/io_uring/sqpoll.c
> index cc4a25136030..ec558daa0331 100644
> --- a/io_uring/sqpoll.c
> +++ b/io_uring/sqpoll.c
> @@ -195,9 +195,6 @@ static int __io_sq_thread(struct io_ring_ctx *ctx, bool cap_entries)
>   			ret = io_submit_sqes(ctx, to_submit);
>   		mutex_unlock(&ctx->uring_lock);
>   
> -		if (io_napi(ctx))
> -			ret += io_napi_sqpoll_busy_poll(ctx);
> -
>   		if (to_submit && wq_has_sleeper(&ctx->sqo_sq_wait))
>   			wake_up(&ctx->sqo_sq_wait);
>   		if (creds)
> @@ -322,6 +319,10 @@ static int io_sq_thread(void *data)
>   		if (io_sq_tw(&retry_list, IORING_TW_CAP_ENTRIES_VALUE))
>   			sqt_spin = true;
>   
> +		list_for_each_entry(ctx, &sqd->ctx_list, sqd_list) {
> +			if (io_do_sqpoll_napi(ctx))
> +				sqt_spin = true;

io_do_sqpoll_napi() returns 1 as long as there are napis in the list,
iow even if there is no activity it'll spin almost forever (60s is
forever) bypassing sq_thread_idle.

Let's not update sqt_spin here, if the user wants it to poll for
longer it can pass a larger SQPOLL idle timeout value.
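
Editorial sketch of the concern raised above, as a simplified userspace
model of the idle logic in io_sq_thread(). It assumes that setting
sqt_spin pushes the idle deadline out by sq_thread_idle on each pass.
The function name, its parameters and the tick counter are hypothetical,
and submissions and task work are left out of the model on purpose.

```c
#include <stdbool.h>

/*
 * Returns how many loop passes run before the thread would go idle,
 * or max_iters if it never does within the modelled window.
 */
static unsigned long demo_iterations_until_idle(unsigned long idle_ticks,
						bool napi_sets_spin,
						unsigned long max_iters)
{
	unsigned long now = 0;
	unsigned long timeout = now + idle_ticks;
	unsigned long iters;

	for (iters = 0; iters < max_iters; iters++) {
		/* Only NAPI can claim activity in this model. */
		bool sqt_spin = napi_sets_spin;

		/* Idle period expired and nothing claimed activity. */
		if (!sqt_spin && now > timeout)
			return iters;

		/* Claimed activity moves the idle deadline forward. */
		if (sqt_spin)
			timeout = now + idle_ticks;
		now++;
	}
	return max_iters;
}
```

With napi_sets_spin false the model goes idle shortly after idle_ticks
have elapsed; with it true the deadline is renewed on every pass and the
loop only stops at max_iters, which is the bypass of sq_thread_idle
described in the review comment.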


> +		}
>   		if (sqt_spin || !time_after(jiffies, timeout)) {
>   			if (sqt_spin) {
>   				io_sq_update_worktime(sqd, &start);

-- 
Pavel Begunkov


* Re: [PATCH 1/2] io_uring: micro optimization of __io_sq_thread() condition
  2024-07-30 20:56 ` [PATCH 1/2] io_uring: micro optimization of __io_sq_thread() condition Olivier Langlois
@ 2024-08-02 11:17   ` Pavel Begunkov
  0 siblings, 0 replies; 8+ messages in thread
From: Pavel Begunkov @ 2024-08-02 11:17 UTC (permalink / raw)
  To: Olivier Langlois, Jens Axboe, io-uring

On 7/30/24 21:56, Olivier Langlois wrote:
> reverse the order in which the two operands of an if condition are
> evaluated.
> 
> for the many users that are not using iopoll, the
> !wq_list_empty(&ctx->iopoll_list) test always evaluates to false, and
> only after having made a memory access, whereas to_submit is very likely
> already loaded in a register.

doubt it'd make any difference, but it might be useful if sqpoll
submits requests often enough.

Reviewed-by: Pavel Begunkov <[email protected]>


> Signed-off-by: Olivier Langlois <[email protected]>
> ---
>   io_uring/sqpoll.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/io_uring/sqpoll.c b/io_uring/sqpoll.c
> index b3722e5275e7..cc4a25136030 100644
> --- a/io_uring/sqpoll.c
> +++ b/io_uring/sqpoll.c
> @@ -176,7 +176,7 @@ static int __io_sq_thread(struct io_ring_ctx *ctx, bool cap_entries)
>   	if (cap_entries && to_submit > IORING_SQPOLL_CAP_ENTRIES_VALUE)
>   		to_submit = IORING_SQPOLL_CAP_ENTRIES_VALUE;
>   
> -	if (!wq_list_empty(&ctx->iopoll_list) || to_submit) {
> +	if (to_submit || !wq_list_empty(&ctx->iopoll_list)) {
>   		const struct cred *creds = NULL;
>   
>   		if (ctx->sq_creds != current_cred())

-- 
Pavel Begunkov


* Re: (subset) [PATCH 0/2] io_uring: minor sqpoll code refactoring
  2024-07-30 21:19 [PATCH 0/2] io_uring: minor sqpoll code refactoring Olivier Langlois
  2024-07-30 20:56 ` [PATCH 1/2] io_uring: micro optimization of __io_sq_thread() condition Olivier Langlois
  2024-07-30 21:10 ` [PATCH 2/2] io_uring: do the sqpoll napi busy poll outside the submission block Olivier Langlois
@ 2024-08-02 13:11 ` Jens Axboe
  2 siblings, 0 replies; 8+ messages in thread
From: Jens Axboe @ 2024-08-02 13:11 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring, Olivier Langlois


On Tue, 30 Jul 2024 17:19:30 -0400, Olivier Langlois wrote:
> the first patch is a minor micro-optimization that attempts to avoid a
> memory access by first testing a variable that is very likely already in
> a register
> 
> the second patch is also small, but it is much more serious. Without it,
> a ring that is configured to enable NAPI busy polling may NOT perform
> busy polling in specific conditions.
> 
> [...]

Applied, thanks!

[1/2] io_uring: micro optimization of __io_sq_thread() condition
      commit: 5fa7a249d5bc847876e04b91133d6b18d5c17140

Best regards,
-- 
Jens Axboe





* Re: [PATCH 2/2] io_uring: do the sqpoll napi busy poll outside the submission block
  2024-08-02 11:14   ` Pavel Begunkov
@ 2024-08-02 14:22     ` Olivier Langlois
  2024-08-02 15:30       ` Pavel Begunkov
  0 siblings, 1 reply; 8+ messages in thread
From: Olivier Langlois @ 2024-08-02 14:22 UTC (permalink / raw)
  To: Pavel Begunkov, Jens Axboe, io-uring

On Fri, 2024-08-02 at 12:14 +0100, Pavel Begunkov wrote:
> 
> io_do_sqpoll_napi() returns 1 as long as there are napis in the list,
> iow even if there is no activity it'll spin almost forever (60s is
> forever) bypassing sq_thread_idle.
> 
> Let's not update sqt_spin here, if the user wants it to poll for
> longer it can pass a larger SQPOLL idle timeout value.
> 
> 
> 
fair enough...

in that case, maybe the man page description of the SQPOLL idle timeout
should mention that, if NAPI busy loop is used, the idle timeout
should be at least as large as gro_flush_timeout to meet the NAPI
requirement to not generate interrupts, as described in

Documentation/networking/napi.rst
section "Software IRQ coalescing"

I discovered this fact the hard way, having spent days figuring
out how to do busy poll the right way.

this simple mention could save many new users of the feature a lot of
trouble.
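
Editorial sketch of the sizing rule suggested above. It assumes, to the
best of the editor's knowledge, that gro_flush_timeout is expressed in
nanoseconds and that sq_thread_idle in struct io_uring_params is
expressed in milliseconds; both units should be checked against the
kernel documentation before relying on this. The helper name
demo_min_sq_thread_idle_ms() is hypothetical.

```c
#include <stdint.h>

/*
 * Smallest sq_thread_idle value (assumed milliseconds) that is not
 * shorter than the given gro_flush_timeout (assumed nanoseconds).
 * Rounds up so that the idle timeout never undershoots.
 */
static uint32_t demo_min_sq_thread_idle_ms(uint64_t gro_flush_timeout_ns)
{
	const uint64_t ns_per_ms = 1000000ULL;
	uint64_t ms = gro_flush_timeout_ns / ns_per_ms;

	if (gro_flush_timeout_ns % ns_per_ms != 0)
		ms++;

	/* Clamp to the 32-bit range of the sq_thread_idle field. */
	return ms > UINT32_MAX ? UINT32_MAX : (uint32_t)ms;
}
```

A caller would read the interface's gro_flush_timeout, pass it through
this helper, and use the result as a lower bound for sq_thread_idle when
setting up a ring with IORING_SETUP_SQPOLL.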

I'll rework the patch and send a new version in the next few days.

Greetings,



* Re: [PATCH 2/2] io_uring: do the sqpoll napi busy poll outside the submission block
  2024-08-02 14:22     ` Olivier Langlois
@ 2024-08-02 15:30       ` Pavel Begunkov
  0 siblings, 0 replies; 8+ messages in thread
From: Pavel Begunkov @ 2024-08-02 15:30 UTC (permalink / raw)
  To: Olivier Langlois, Jens Axboe, io-uring

On 8/2/24 15:22, Olivier Langlois wrote:
> On Fri, 2024-08-02 at 12:14 +0100, Pavel Begunkov wrote:
>>
>> io_do_sqpoll_napi() returns 1 as long as there are napis in the list,
>> iow even if there is no activity it'll spin almost forever (60s is
>> forever) bypassing sq_thread_idle.
>>
>> Let's not update sqt_spin here, if the user wants it to poll for
>> longer it can pass a larger SQPOLL idle timeout value.
>>
>>
>>
> fair enough...
> 
> in that case, maybe the man page description of the SQPOLL idle timeout
> should mention that, if NAPI busy loop is used, the idle timeout
> should be at least as large as gro_flush_timeout to meet the NAPI
> requirement to not generate interrupts, as described in

Would be great to have, I agree. We might also need to start
a tips and tricks document, not like many people are looking at
documentation.


> Documentation/networking/napi.rst
> section "Software IRQ coalescing"
> 
> I discovered this fact the hard way, having spent days figuring
> out how to do busy poll the right way.
> 
> this simple mention could save many new users of the feature a lot of
> trouble.
> 
> I'll rework the patch and send a new version in the next few days.

Awesome, thanks

-- 
Pavel Begunkov


end of thread, other threads:[~2024-08-02 15:29 UTC | newest]
