* [RFC] io_uring: print COMM on ctx_exit hang
@ 2021-10-06 15:15 Pavel Begunkov
  2021-10-06 16:00 ` Jens Axboe
  0 siblings, 1 reply; 2+ messages in thread
From: Pavel Begunkov @ 2021-10-06 15:15 UTC
  To: io-uring; +Cc: Jens Axboe, asml.silence

io_ring_exit_work() hangs are hard to debug, partly because by the time
the ctx is exiting there is little information about who created it, and
the function runs in a wq context, so the task name tells us nothing.
Save the creator's task comm and print it when the exit work hangs.

Signed-off-by: Pavel Begunkov <[email protected]>
---

Just for discussion, I hope there are better ways of doing it.

It leaves out the second wait_for_completion() in io_ring_exit_work(),
which is also of interest, so it would be great to cover that case as well.

 fs/io_uring.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 73135c5c6168..db0065637549 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -447,6 +447,8 @@ struct io_ring_ctx {
 		struct work_struct		exit_work;
 		struct list_head		tctx_list;
 		struct completion		ref_comp;
+		/* save owner thread's comm for debugging purposes */
+		char				owner_comm[TASK_COMM_LEN];
 	};
 };
 
@@ -9344,7 +9346,8 @@ static __cold void io_ring_exit_work(struct work_struct *work)
 
 		io_req_caches_free(ctx);
 
-		if (WARN_ON_ONCE(time_after(jiffies, timeout))) {
+		if (WARN_ONCE(time_after(jiffies, timeout), "comm %s\n",
+			      ctx->owner_comm)) {
 			/* there is little hope left, don't run it too often */
 			interval = HZ * 60;
 		}
@@ -10266,6 +10269,8 @@ static __cold int io_uring_create(unsigned entries, struct io_uring_params *p,
 	if (!capable(CAP_IPC_LOCK))
 		ctx->user = get_uid(current_user());
 
+	get_task_comm(ctx->owner_comm, current);
+
 	/*
 	 * This is just grabbed for accounting purposes. When a process exits,
 	 * the mm is exited and dropped before the files, hence we need to hang
-- 
2.33.0



* Re: [RFC] io_uring: print COMM on ctx_exit hang
  2021-10-06 15:15 [RFC] io_uring: print COMM on ctx_exit hang Pavel Begunkov
@ 2021-10-06 16:00 ` Jens Axboe
  0 siblings, 0 replies; 2+ messages in thread
From: Jens Axboe @ 2021-10-06 16:00 UTC
  To: Pavel Begunkov, io-uring

On 10/6/21 9:15 AM, Pavel Begunkov wrote:
> io_ring_exit_work() hangs are hard to debug, partly because by the time
> the ctx is exiting there is little information about who created it, and
> the function runs in a wq context, so the task name tells us nothing.
> Save the creator's task comm and print it when the exit work hangs.
> 
> Signed-off-by: Pavel Begunkov <[email protected]>
> ---
> 
> Just for discussion, I hope there are better ways of doing it.
> 
> It leaves out the second wait_for_completion() in io_ring_exit_work(),
> which is also of interest, so it would be great to cover that case as well.

I've done identical patches in the past myself, and I don't think
there's a better way to do this. The task may be long gone, so we have
to capture it upfront.

-- 
Jens Axboe

