public inbox for [email protected]
* [PATCH 5.12] io_uring: fix io_sq_offload_create error handling
@ 2021-03-08 17:30 Pavel Begunkov
  2021-03-08 18:51 ` Jens Axboe
From: Pavel Begunkov @ 2021-03-08 17:30 UTC
  To: Jens Axboe, io-uring

Don't set IO_SQ_THREAD_SHOULD_STOP when io_sq_offload_create() fails in
io_uring_alloc_task_context(); leave all the cleanup to
io_sq_thread_finish() instead, because currently io_sq_thread_finish()
hangs trying to park the thread. It is actually fortunate that it stalls
there, because otherwise the following io_sq_thread_stop() would be
skipped on the IO_SQ_THREAD_SHOULD_STOP check and the sqo would race for
sqd with the freeing of ctx.

A simple error injection gives something like this:

[  245.463955] INFO: task sqpoll-test-hang:523 blocked for more than 122 seconds.
[  245.463983] Call Trace:
[  245.463990]  __schedule+0x36b/0x950
[  245.464005]  schedule+0x68/0xe0
[  245.464013]  schedule_timeout+0x209/0x2a0
[  245.464032]  wait_for_completion+0x8b/0xf0
[  245.464043]  io_sq_thread_finish+0x44/0x1a0
[  245.464049]  io_uring_setup+0x9ea/0xc80
[  245.464058]  __x64_sys_io_uring_setup+0x16/0x20
[  245.464064]  do_syscall_64+0x38/0x50
[  245.464073]  entry_SYSCALL_64_after_hwframe+0x44/0xae

Signed-off-by: Pavel Begunkov <[email protected]>
---
 fs/io_uring.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 5505e19f1391..5ca3c70e6640 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -7860,10 +7860,9 @@ static int io_sq_offload_create(struct io_ring_ctx *ctx,
 			ret = PTR_ERR(tsk);
 			goto err;
 		}
-		ret = io_uring_alloc_task_context(tsk, ctx);
-		if (ret)
-			set_bit(IO_SQ_THREAD_SHOULD_STOP, &sqd->state);
+
 		sqd->thread = tsk;
+		ret = io_uring_alloc_task_context(tsk, ctx);
 		wake_up_new_task(tsk);
 		if (ret)
 			goto err;
-- 
2.24.0



* Re: [PATCH 5.12] io_uring: fix io_sq_offload_create error handling
  2021-03-08 17:30 [PATCH 5.12] io_uring: fix io_sq_offload_create error handling Pavel Begunkov
@ 2021-03-08 18:51 ` Jens Axboe
From: Jens Axboe @ 2021-03-08 18:51 UTC
  To: Pavel Begunkov, io-uring

On 3/8/21 10:30 AM, Pavel Begunkov wrote:
> Don't set IO_SQ_THREAD_SHOULD_STOP when io_sq_offload_create() fails in
> io_uring_alloc_task_context(); leave all the cleanup to
> io_sq_thread_finish() instead, because currently io_sq_thread_finish()
> hangs trying to park the thread. It is actually fortunate that it stalls
> there, because otherwise the following io_sq_thread_stop() would be
> skipped on the IO_SQ_THREAD_SHOULD_STOP check and the sqo would race for
> sqd with the freeing of ctx.
> 
> A simple error injection gives something like this:
> 
> [  245.463955] INFO: task sqpoll-test-hang:523 blocked for more than 122 seconds.
> [  245.463983] Call Trace:
> [  245.463990]  __schedule+0x36b/0x950
> [  245.464005]  schedule+0x68/0xe0
> [  245.464013]  schedule_timeout+0x209/0x2a0
> [  245.464032]  wait_for_completion+0x8b/0xf0
> [  245.464043]  io_sq_thread_finish+0x44/0x1a0
> [  245.464049]  io_uring_setup+0x9ea/0xc80
> [  245.464058]  __x64_sys_io_uring_setup+0x16/0x20
> [  245.464064]  do_syscall_64+0x38/0x50
> [  245.464073]  entry_SYSCALL_64_after_hwframe+0x44/0xae

Applied, thanks.

-- 
Jens Axboe

