public inbox for [email protected]
* [PATCH] io_uring: fix bug in slow unregistering of nodes
@ 2022-01-21 12:38 Dylan Yudaken
  2022-01-21 20:23 ` Jens Axboe
  2022-01-23 16:36 ` Jens Axboe
  0 siblings, 2 replies; 3+ messages in thread
From: Dylan Yudaken @ 2022-01-21 12:38 UTC (permalink / raw)
  To: io-uring; +Cc: axboe, asml.silence, Dylan Yudaken

In some cases io_rsrc_ref_quiesce will call io_rsrc_node_switch_start,
and then immediately flush the delayed work queue &ctx->rsrc_put_work.

However, percpu_ref_put does not destroy the node immediately; the
release callback is invoked asynchronously via RCU. As a result,
io_rsrc_node_ref_zero is only called after rsrc_put_work has been
flushed, and so the process ends up sleeping for 1 second unnecessarily.

This patch executes the put code immediately if we are busy
quiescing.

Signed-off-by: Dylan Yudaken <[email protected]>
---
 fs/io_uring.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index e54c4127422e..dd4c801c7afd 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -7822,10 +7822,15 @@ static __cold void io_rsrc_node_ref_zero(struct percpu_ref *ref)
 	struct io_ring_ctx *ctx = node->rsrc_data->ctx;
 	unsigned long flags;
 	bool first_add = false;
+	unsigned long delay = HZ;
 
 	spin_lock_irqsave(&ctx->rsrc_ref_lock, flags);
 	node->done = true;
 
+	/* if we are mid-quiesce then do not delay */
+	if (node->rsrc_data->quiesce)
+		delay = 0;
+
 	while (!list_empty(&ctx->rsrc_ref_list)) {
 		node = list_first_entry(&ctx->rsrc_ref_list,
 					    struct io_rsrc_node, node);
@@ -7838,7 +7843,7 @@ static __cold void io_rsrc_node_ref_zero(struct percpu_ref *ref)
 	spin_unlock_irqrestore(&ctx->rsrc_ref_lock, flags);
 
 	if (first_add)
-		mod_delayed_work(system_wq, &ctx->rsrc_put_work, HZ);
+		mod_delayed_work(system_wq, &ctx->rsrc_put_work, delay);
 }
 
 static struct io_rsrc_node *io_rsrc_node_alloc(struct io_ring_ctx *ctx)
-- 
2.30.2
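
For readers following along outside the kernel tree, the delay selection
in the patch above can be approximated in plain userspace C. This is only
a rough sketch, not kernel code: MODEL_HZ, struct model_rsrc_data,
struct model_delayed_work and model_ref_zero() are hypothetical stand-ins
invented for illustration, and the "work" is just a record of the delay
that would have been passed to mod_delayed_work().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's HZ (ticks per second). */
#define MODEL_HZ 250UL

/* Hypothetical, trimmed-down rsrc data: only the flag the patch reads. */
struct model_rsrc_data {
	bool quiesce;
};

/* Records what a modelled mod_delayed_work() call would have received. */
struct model_delayed_work {
	bool queued;
	unsigned long delay;
};

/*
 * Models the tail of io_rsrc_node_ref_zero() after the patch:
 * default to a one-second delay, but use no delay when the
 * resource data is in the middle of a quiesce.
 */
static void model_ref_zero(const struct model_rsrc_data *data, bool first_add,
			   struct model_delayed_work *work)
{
	unsigned long delay = MODEL_HZ;

	/* if we are mid-quiesce then do not delay */
	if (data->quiesce)
		delay = 0;

	/* Work is only (re)queued when this call added the first entry. */
	if (first_add) {
		work->queued = true;
		work->delay = delay;
	}
}
```

Calling model_ref_zero() with quiesce set records a delay of 0, so a
waiter on the put work would not sit out the full MODEL_HZ ticks; with
quiesce clear, the previous one-second batching delay is kept.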



* Re: [PATCH] io_uring: fix bug in slow unregistering of nodes
  2022-01-21 12:38 [PATCH] io_uring: fix bug in slow unregistering of nodes Dylan Yudaken
@ 2022-01-21 20:23 ` Jens Axboe
  2022-01-23 16:36 ` Jens Axboe
  1 sibling, 0 replies; 3+ messages in thread
From: Jens Axboe @ 2022-01-21 20:23 UTC (permalink / raw)
  To: Dylan Yudaken, io-uring; +Cc: asml.silence

On 1/21/22 5:38 AM, Dylan Yudaken wrote:
> In some cases io_rsrc_ref_quiesce will call io_rsrc_node_switch_start,
> and then immediately flush the delayed work queue &ctx->rsrc_put_work.
> 
> However the percpu_ref_put does not immediately destroy the node, it
> will be called asynchronously via RCU. That ends up with
> io_rsrc_node_ref_zero only being called after rsrc_put_work has been
> flushed, and so the process ends up sleeping for 1 second unnecessarily.
> 
> This patch executes the put code immediately if we are busy
> quiescing.

Looks good to me, and as far as I can tell, this bug was introduced by:

commit 4a38aed2a0a729ccecd84dca5b76d827b9e1294d
Author: Jens Axboe <[email protected]>
Date:   Thu May 14 17:21:15 2020 -0600

    io_uring: batch reap of dead file registrations

so I'll add a fixes line to that effect to the commit. Thanks!

-- 
Jens Axboe



* Re: [PATCH] io_uring: fix bug in slow unregistering of nodes
  2022-01-21 12:38 [PATCH] io_uring: fix bug in slow unregistering of nodes Dylan Yudaken
  2022-01-21 20:23 ` Jens Axboe
@ 2022-01-23 16:36 ` Jens Axboe
  1 sibling, 0 replies; 3+ messages in thread
From: Jens Axboe @ 2022-01-23 16:36 UTC (permalink / raw)
  To: Dylan Yudaken, io-uring; +Cc: asml.silence

On Fri, 21 Jan 2022 04:38:56 -0800, Dylan Yudaken wrote:
> In some cases io_rsrc_ref_quiesce will call io_rsrc_node_switch_start,
> and then immediately flush the delayed work queue &ctx->rsrc_put_work.
> 
> However the percpu_ref_put does not immediately destroy the node, it
> will be called asynchronously via RCU. That ends up with
> io_rsrc_node_ref_zero only being called after rsrc_put_work has been
> flushed, and so the process ends up sleeping for 1 second unnecessarily.
> 
> [...]

Applied, thanks!

[1/1] io_uring: fix bug in slow unregistering of nodes
      (no commit info)

Best regards,
-- 
Jens Axboe


