From: Dave Chinner <[email protected]>
To: [email protected]
Cc: [email protected], Jens Axboe <[email protected]>,
[email protected]
Subject: Re: [regression, v6.0-rc0, io-uring?] filesystem freeze hangs on sb_wait_write()
Date: Tue, 11 Oct 2022 11:40:25 +1100 [thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
On Mon, Oct 10, 2022 at 04:03:19PM +1100, Dave Chinner wrote:
> Hi Al,
>
> Just tried to run fstests on XFS on a current Linus kernel at commit
> 493ffd6605b2 ("Merge tag 'ucount-rlimits-cleanups-for-v5.19' of
> git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace")
> and generic/068 hangs trying to freeze the filesystem like so:
>
> [ 163.957724] task:xfs_io state:D stack:14680 pid: 4831 ppid: 4825 flags:0x00004000
> [ 163.961425] Call Trace:
> [ 163.962553] <TASK>
> [ 163.963497] __schedule+0x2f9/0xa30
> [ 163.965125] ? percpu_down_write+0x60/0x190
> [ 163.966888] schedule+0x5a/0xc0
> [ 163.968206] percpu_down_write+0xe8/0x190
> [ 163.969865] freeze_super+0x78/0x170
> [ 163.971247] __x64_sys_ioctl+0x61/0xb0
> [ 163.973947] do_syscall_64+0x35/0x80
> [ 163.975119] entry_SYSCALL_64_after_hwframe+0x63/0xcd
> [ 163.976781] RIP: 0033:0x7ff78910bb07
> [ 163.978028] RSP: 002b:00007ffefe7279b8 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
> [ 163.980734] RAX: ffffffffffffffda RBX: 000055a88c183ad0 RCX: 00007ff78910bb07
> [ 163.983346] RDX: 00007ffefe7279cc RSI: ffffffffc0045877 RDI: 0000000000000003
> [ 163.985864] RBP: 0000000000000008 R08: 00007ff7891d5ba0 R09: 00007ff7891d5c00
> [ 163.988371] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000001
> [ 163.990704] R13: 000055a88c184270 R14: 000055a88c184fc0 R15: 000055a88c184fe0
> [ 163.992964] </TASK>
> [ 163.993612] task:fsstress state:D stack:12464 pid: 4832 ppid: 4822 flags:0x00000000
> [ 163.996390] Call Trace:
> [ 163.997310] <TASK>
> [ 163.998076] __schedule+0x2f9/0xa30
> [ 163.999323] ? __smp_call_single_queue+0x23/0x40
> [ 164.000685] schedule+0x5a/0xc0
> [ 164.001719] ? percpu_rwsem_wait+0x123/0x150
> [ 164.003139] percpu_rwsem_wait+0x123/0x150
> [ 164.004535] ? __percpu_rwsem_trylock.part.0+0x50/0x50
> [ 164.006202] __percpu_down_read+0x5b/0x110
> [ 164.007560] mnt_want_write+0xa0/0xd0
> [ 164.008816] do_renameat2+0x17b/0x530
> [ 164.010006] ? xfs_can_free_eofblocks+0x39/0x1e0
> [ 164.011558] ? __might_fault+0x1e/0x20
> [ 164.012870] ? strncpy_from_user+0x1e/0x160
> [ 164.014236] __x64_sys_renameat2+0x4b/0x60
> [ 164.015451] do_syscall_64+0x35/0x80
> [ 164.016576] entry_SYSCALL_64_after_hwframe+0x63/0xcd
> [ 164.018117] RIP: 0033:0x7f3c7fe97c0f
> [ 164.019124] RSP: 002b:00007fff67cf26b8 EFLAGS: 00000202 ORIG_RAX: 000000000000013c
> [ 164.021202] RAX: ffffffffffffffda RBX: 00007fff67cf2980 RCX: 00007f3c7fe97c0f
> [ 164.023309] RDX: 00000000ffffff9c RSI: 000055fd446825a0 RDI: 00000000ffffff9c
> [ 164.025479] RBP: 00007fff67cf2990 R08: 0000000000000002 R09: 00007fff67cf2960
> [ 164.027655] R10: 000055fd44682f10 R11: 0000000000000202 R12: 0000000000000079
> [ 164.029721] R13: 0000000000000002 R14: 000055fd446703a0 R15: 00000000ffffffff
> [ 164.031937] </TASK>
> [ 164.032675] task:fsstress state:D stack:14224 pid: 4833 ppid: 4822 flags:0x00000000
> [ 164.035086] Call Trace:
> [ 164.035803] <TASK>
> [ 164.036512] __schedule+0x2f9/0xa30
> [ 164.037661] ? __smp_call_single_queue+0x23/0x40
> [ 164.039048] schedule+0x5a/0xc0
> [ 164.040076] ? percpu_rwsem_wait+0x123/0x150
> [ 164.041315] percpu_rwsem_wait+0x123/0x150
> [ 164.042458] ? __percpu_rwsem_trylock.part.0+0x50/0x50
> [ 164.043888] __percpu_down_read+0x5b/0x110
> [ 164.045097] mnt_want_write+0xa0/0xd0
> [ 164.046117] do_fchownat+0x78/0xe0
> [ 164.047060] __x64_sys_lchown+0x1f/0x30
> [ 164.048210] do_syscall_64+0x35/0x80
> [ 164.049295] entry_SYSCALL_64_after_hwframe+0x63/0xcd
> [ 164.050775] RIP: 0033:0x7f3c7ff2df97
> [ 164.051727] RSP: 002b:00007fff67cf2838 EFLAGS: 00000202 ORIG_RAX: 000000000000005e
> [ 164.053803] RAX: ffffffffffffffda RBX: 00007fff67cf29a0 RCX: 00007f3c7ff2df97
> [ 164.055773] RDX: 0000000000028dd3 RSI: 0000000000022124 RDI: 000055fd44672450
> [ 164.057745] RBP: 0000000000022124 R08: 0000000000000064 R09: 00007fff67cf299c
> [ 164.059668] R10: fffffffffffffb8b R11: 0000000000000202 R12: 0000000000000008
> [ 164.061609] R13: 0000000000028dd3 R14: 0000000000022124 R15: 000055fd44660b50
> [ 164.063578] </TASK>
>
> Eventually the hung task timer kicks in and reports all these
> blocked threads over and over again. The read locks look like
> sb_start_write() calls, and the freeze_super() call is waiting
> here:
>
>
> sb->s_writers.frozen = SB_FREEZE_WRITE;
> /* Release s_umount to preserve sb_start_write -> s_umount ordering */
> up_write(&sb->s_umount);
> >>>>>> sb_wait_write(sb, SB_FREEZE_WRITE);
> down_write(&sb->s_umount);
>
>
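For reference, a minimal sketch of the pairing that sb_wait_write()
depends on (my illustration, not code from the tree - "do_one_write"
is just a hypothetical stand-in for any write path): sb_wait_write()
takes the SB_FREEZE_WRITE percpu rwsem for write, and that can only
succeed once every writer that entered via sb_start_write() has
dropped the read side again with sb_end_write():

	/* hypothetical write path - sketch only */
	static int do_one_write(struct super_block *sb)
	{
		int ret = 0;

		sb_start_write(sb);	/* read side of s_writers rwsem */
		/* ... perform the write ... */
		sb_end_write(sb);	/* must balance, or freeze blocks forever */
		return ret;
	}
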
> Every test VM I ran this on failed in an identical manner. They all
> failed trying to freeze on the second iteration of the {freeze;thaw}
> loop, so whatever is going on should be easily reproducible:
>
> # cat /tmp/9061.out
> # QA output created by 068
> *** init FS
>
> *** iteration: 0
> *** freezing $SCRATCH_MNT
> *** thawing $SCRATCH_MNT
>
> *** iteration: 1
> *** freezing $SCRATCH_MNT
> #
>
> I just tested ext4 and it hangs in an identical fashion, so whatever
> is going on is not specific to XFS - it smells like a
> mnt_want_write() or sb_start_write() imbalance somewhere in the
> generic code. I haven't had time to bisect it, so this heads-up is
> all I'll have until I can get back to this tomorrow....
Bisect points to the io-uring merge commit:
$ git bisect log
git bisect start '--' 'fs/'
# status: waiting for both good and bad commits
# bad: [cdc94798820e5cdc2fc314540ee3d28e1f2fea0e] Merge remote-tracking branch 'linux-xfs/for-next' into working
git bisect bad cdc94798820e5cdc2fc314540ee3d28e1f2fea0e
# status: waiting for good commit(s), bad commit known
# good: [4fe89d07dcc2804c8b562f6c7896a45643d34b2f] Linux 6.0
git bisect good 4fe89d07dcc2804c8b562f6c7896a45643d34b2f
# good: [cbddcc4fa3443fe8cfb2ff8e210deb1f6a0eea38] btrfs: set generation before calling btrfs_clean_tree_block in btrfs_init_new_buffer
git bisect good cbddcc4fa3443fe8cfb2ff8e210deb1f6a0eea38
# good: [7f198ba7ae9874c64ffe8cd3aa60cf5dab78ce3a] Merge tag 'affs-for-6.1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
git bisect good 7f198ba7ae9874c64ffe8cd3aa60cf5dab78ce3a
# good: [9f4b9beeb9cf46c4b172fca06de5bd6831108641] Merge tag '6.1-rc-ksmbd-fixes' of git://git.samba.org/ksmbd
git bisect good 9f4b9beeb9cf46c4b172fca06de5bd6831108641
# bad: [e8bc52cb8df80c31c73c726ab58ea9746e9ff734] Merge tag 'driver-core-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
git bisect bad e8bc52cb8df80c31c73c726ab58ea9746e9ff734
# bad: [513389809e138ae903b6ef43c1d5d2ffaf4dca17] Merge tag 'for-6.1/block-2022-10-03' of git://git.kernel.dk/linux
git bisect bad 513389809e138ae903b6ef43c1d5d2ffaf4dca17
# good: [191249f708897fc34c78f4494f7156896aaaeca9] quota: Add more checking after reading from quota file
git bisect good 191249f708897fc34c78f4494f7156896aaaeca9
# good: [188943a15638ceb91f960e072ed7609b2d7f2a55] Merge tag 'fs-for_v6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
git bisect good 188943a15638ceb91f960e072ed7609b2d7f2a55
# good: [118f3663fbc658e9ad6165e129076981c7b685c5] block: remove PSI accounting from the bio layer
git bisect good 118f3663fbc658e9ad6165e129076981c7b685c5
# bad: [0a78a376ef3c2f3d397df48909f00cd75f92137a] Merge tag 'for-6.1/io_uring-2022-10-03' of git://git.kernel.dk/linux
git bisect bad 0a78a376ef3c2f3d397df48909f00cd75f92137a
# good: [9f0deaa12d832f488500a5afe9b912e9b3cfc432] eventfd: guard wake_up in eventfd fs calls as well
git bisect good 9f0deaa12d832f488500a5afe9b912e9b3cfc432
# first bad commit: [0a78a376ef3c2f3d397df48909f00cd75f92137a] Merge tag 'for-6.1/io_uring-2022-10-03' of git://git.kernel.dk/linux
$
I note that there are changes to the io_uring IO path and write IO
end accounting in the io_uring stack that was merged, and there
was no doubt about the success/failure of the reproducer at each
step. Hence I think the bisect is good, and the problem is somewhere
in the io_uring changes.
Jens, over to you.
The reproducer - generic/068 - is 100% reliable here: io_uring is
being exercised by fsstress in the background whilst the filesystem
is being frozen and thawed repeatedly. Some path in the io_uring
code has an unbalanced sb_start_write()/sb_end_write() pair by the
look of it....
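To illustrate the kind of imbalance I mean (a hedged sketch only, not
the actual io_uring code - the completion handler below is made up):
the SB_FREEZE_WRITE protection taken when the async write is issued
has to be dropped on every completion path, including error and retry
paths. Miss one and the writer count never drains, so freeze_super()
waits forever in sb_wait_write():

	/* hypothetical async write completion - sketch only */
	static void async_write_complete(struct kiocb *iocb, long ret)
	{
		struct super_block *sb = file_inode(iocb->ki_filp)->i_sb;

		/*
		 * Must pair with the sb_start_write() taken at submission
		 * time, on success *and* on failure/retry.
		 */
		sb_end_write(sb);
	}
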
-Dave.
--
Dave Chinner
[email protected]