From: Jens Axboe <[email protected]>
To: Dave Chinner <[email protected]>
Cc: io-uring <[email protected]>,
	linux-fsdevel <[email protected]>
Subject: Re: [5.15-rc1 regression] io_uring: fsstress hangs in do_coredump() on exit
Date: Tue, 21 Sep 2021 08:19:53 -0600
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>

On 9/21/21 7:25 AM, Jens Axboe wrote:
> On 9/21/21 12:40 AM, Dave Chinner wrote:
>> Hi Jens,
>>
>> I updated all my trees from 5.14 to 5.15-rc2 this morning and
>> immediately had problems running the recoveryloop fstest group on
>> them. These tests have a typical pattern of "run load in the
>> background, shutdown the filesystem, kill load, unmount and test
>> recovery".
>>
>> When the load includes fsstress, and it gets killed after shutdown,
>> it hangs on exit like so:
>>
>> # echo w > /proc/sysrq-trigger 
>> [  370.669482] sysrq: Show Blocked State
>> [  370.671732] task:fsstress        state:D stack:11088 pid: 9619 ppid:  9615 flags:0x00000000
>> [  370.675870] Call Trace:
>> [  370.677067]  __schedule+0x310/0x9f0
>> [  370.678564]  schedule+0x67/0xe0
>> [  370.679545]  schedule_timeout+0x114/0x160
>> [  370.682002]  __wait_for_common+0xc0/0x160
>> [  370.684274]  wait_for_completion+0x24/0x30
>> [  370.685471]  do_coredump+0x202/0x1150
>> [  370.690270]  get_signal+0x4c2/0x900
>> [  370.691305]  arch_do_signal_or_restart+0x106/0x7a0
>> [  370.693888]  exit_to_user_mode_prepare+0xfb/0x1d0
>> [  370.695241]  syscall_exit_to_user_mode+0x17/0x40
>> [  370.696572]  do_syscall_64+0x42/0x80
>> [  370.697620]  entry_SYSCALL_64_after_hwframe+0x44/0xae
>>
>> It's 100% reproducible on one of my test machines, but only one of
>> them. That one machine is running fstests on pmem, so it has
>> synchronous storage. Every other test machine is using normal async
>> storage (nvme, iscsi, etc.) and none of them are hanging.
>>
>> A quick troll of the commit history between 5.14 and 5.15-rc2
>> indicates a couple of potential candidates. The 5th kernel build
>> (instead of ~16 for a bisect) told me that commit 15e20db2e0ce
>> ("io-wq: only exit on fatal signals") is the cause of the
>> regression. I've confirmed that this is the first commit where the
>> problem shows up.
> 
> Thanks for the report Dave, I'll take a look. Can you elaborate on
> exactly what is being run? And when killed, it's a non-fatal signal?

Can you try with this patch?

diff --git a/fs/io-wq.c b/fs/io-wq.c
index b5fd015268d7..1e55a0a2a217 100644
--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -586,7 +586,8 @@ static int io_wqe_worker(void *data)
 
 			if (!get_signal(&ksig))
 				continue;
-			if (fatal_signal_pending(current))
+			if (fatal_signal_pending(current) ||
+			    signal_group_exit(current->signal))
 				break;
 			continue;
 		}

-- 
Jens Axboe
