From: Jens Axboe <[email protected]>
To: Andrew Marshall <[email protected]>
Cc: [email protected]
Subject: Re: PROBLEM: io_uring hang causing uninterruptible sleep state on 6.6.59
Date: Sun, 3 Nov 2024 16:53:27 -0700
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>
On 11/3/24 4:47 PM, Andrew Marshall wrote:
> Hi,
>
> I, and others (see downstream report below), are encountering io_uring
> at times hanging on 6.6.59 LTS. If the process is killed, the process
> remains stuck in sleep uninterruptible ("D"). This failure can be
> fairly reliably reproduced via Node.js with `npm ci` in at least some
> projects; disabling that tool's use of io_uring via its
> configuration causes it to succeed. I have identified what seems to be
> the problematic commit on linux-6.6.y (f4ce3b5).
>
> Summary of Kernel version triaging:
>
> - 6.6.56: succeeds
> - 6.6.57: fails
> - 6.6.58: fails
> - 6.6.59: fails
> - 6.6.59 (with f4ce3b5 reverted): succeeds
> - 6.11.6: succeeds
>
> System logs upon failure indicate hung task:
>
> kernel: INFO: task npm ci:47920 blocked for more than 245 seconds.
> kernel: Tainted: P O 6.6.58 #1-NixOS
> kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> kernel: task:npm ci state:D stack:0 pid:47920 ppid:47710 flags:0x00004006
> kernel: Call Trace:
> kernel: <TASK>
> kernel: __schedule+0x3fc/0x1430
> kernel: ? sysvec_apic_timer_interrupt+0xe/0x90
> kernel: schedule+0x5e/0xe0
> kernel: schedule_preempt_disabled+0x15/0x30
> kernel: __mutex_lock.constprop.0+0x3a2/0x6b0
> kernel: io_uring_del_tctx_node+0x61/0xf0
> kernel: io_uring_clean_tctx+0x5c/0xc0
> kernel: io_uring_cancel_generic+0x198/0x350
> kernel: ? srso_return_thunk+0x5/0x5f
> kernel: ? timerqueue_del+0x2e/0x50
> kernel: ? __pfx_autoremove_wake_function+0x10/0x10
> kernel: do_exit+0x167/0xad0
> kernel: ? __pfx_hrtimer_wakeup+0x10/0x10
> kernel: do_group_exit+0x31/0x80
> kernel: get_signal+0xa60/0xa60
> kernel: arch_do_signal_or_restart+0x3e/0x280
> kernel: exit_to_user_mode_prepare+0x1d4/0x230
> kernel: syscall_exit_to_user_mode+0x1b/0x50
> kernel: do_syscall_64+0x45/0x90
> kernel: entry_SYSCALL_64_after_hwframe+0x78/0xe2
>
> For more details, see the downstream bug report in Node.js: https://github.com/nodejs/node/issues/55587
>
> I identified f4ce3b5d26ce149e77e6b8e8f2058aa80e5b034e as the likely
> problematic commit simply by browsing git log. As indicated above,
> reverting that atop 6.6.59 results in success. Since it is passing on
> 6.11.6, I suspect there is some missing backport to 6.6.x, or some
> other semantic merge conflict. Unfortunately I do not have a compact,
> minimal reproducer, but can provide my large one (it is testing a
> larger build process in a VM) if needed. There are some additional
> details in the above-linked downstream bug report, though. I hope that
> having identified the problematic commit is enough for someone with
> more context to go off of. Happy to provide more information if
> needed.
Don't worry about not having a reproducer; having the backport commit
pinpointed will do just fine. I'll take a look at this.
--
Jens Axboe
Thread overview: 11+ messages
2024-11-03 23:47 PROBLEM: io_uring hang causing uninterruptible sleep state on 6.6.59 Andrew Marshall
2024-11-03 23:53 ` Jens Axboe [this message]
2024-11-03 23:58 ` Jens Axboe
2024-11-04 0:01 ` Keith Busch
2024-11-04 0:06 ` Jens Axboe
2024-11-04 2:38 ` Stable backport (was "Re: PROBLEM: io_uring hang causing uninterruptible sleep state on 6.6.59") Jens Axboe
2024-11-04 4:25 ` Andrew Marshall
2024-11-04 13:17 ` Andrew Marshall
2024-11-04 15:58 ` Jens Axboe
2024-11-06 6:05 ` Greg Kroah-Hartman
2024-11-06 14:11 ` Jens Axboe