From: Pavel Begunkov <[email protected]>
To: Jens Axboe <[email protected]>, [email protected]
Cc: [email protected], Joakim Hassila <[email protected]>
Subject: Re: [PATCH] io_uring: fix early sqd_list removal sqpoll hangs
Date: Wed, 14 Apr 2021 11:46:12 +0100
Message-ID: <[email protected]>
In-Reply-To: <1592cc2b0418a0512c83898dbef0b1c9722e8645.1618310545.git.asml.silence@gmail.com>
On 13/04/2021 11:43, Pavel Begunkov wrote:
> [ 245.463317] INFO: task iou-sqp-1374:1377 blocked for more than 122 seconds.
> [ 245.463334] task:iou-sqp-1374 state:D flags:0x00004000
> [ 245.463345] Call Trace:
> [ 245.463352] __schedule+0x36b/0x950
> [ 245.463376] schedule+0x68/0xe0
> [ 245.463385] __io_uring_cancel+0xfb/0x1a0
> [ 245.463407] do_exit+0xc0/0xb40
> [ 245.463423] io_sq_thread+0x49b/0x710
> [ 245.463445] ret_from_fork+0x22/0x30
>
> It happens when the sqpoll thread fails to run park_task_work before
> going to exit: an exiting user may then remove the ctx from sqd_list,
> so the corresponding io_sq_thread() -> io_uring_cancel_sqpoll() is never
> executed. Luckily, it just gets stuck in do_exit() in this case.
fwiw, it's actually a 5.12 problem and I have a reliable enough
way to reproduce it.
> Cc: [email protected]
> Reported-by: Joakim Hassila <[email protected]>
> Signed-off-by: Pavel Begunkov <[email protected]>
> ---
> fs/io_uring.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index cadd7a65a7f4..f390914666b1 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -6817,6 +6817,9 @@ static int io_sq_thread(void *data)
>  	current->flags |= PF_NO_SETAFFINITY;
>
>  	mutex_lock(&sqd->lock);
> +	/* a user may have exited before the thread started */
> +	io_run_task_work_head(&sqd->park_task_work);
> +
>  	while (!test_bit(IO_SQ_THREAD_SHOULD_STOP, &sqd->state)) {
>  		int ret;
>  		bool cap_entries, sqt_spin, needs_sched;
> @@ -6833,10 +6836,10 @@ static int io_sq_thread(void *data)
>  			}
>  			cond_resched();
>  			mutex_lock(&sqd->lock);
> -			if (did_sig)
> -				break;
>  			io_run_task_work();
>  			io_run_task_work_head(&sqd->park_task_work);
> +			if (did_sig)
> +				break;
>  			timeout = jiffies + sqd->sq_thread_idle;
>  			continue;
>  		}
>
--
Pavel Begunkov
Thread overview: 3+ messages
2021-04-13 10:43 [PATCH] io_uring: fix early sqd_list removal sqpoll hangs Pavel Begunkov
2021-04-14 10:46 ` Pavel Begunkov [this message]
2021-04-14 16:19 ` Jens Axboe