* [PATCHSET 0/2] Avoid spurious syzbot induced hung task panics
@ 2026-01-21 23:22 Jens Axboe
2026-01-21 23:22 ` [PATCH 1/2] io_uring: add IO_URING_EXIT_WAIT_MAX definition Jens Axboe
2026-01-21 23:22 ` [PATCH 2/2] io_uring/io-wq: don't trigger hung task for syzbot craziness Jens Axboe
From: Jens Axboe @ 2026-01-21 23:22 UTC
To: io-uring
Hi,
For details, see this saga:
https://lore.kernel.org/io-uring/68a2decc.050a0220.e29e5.0099.GAE@google.com/
where the TL;DR is that there's no real bug here; it's just syzbot doing
hundreds of 2GB /dev/msr* reads in a tiny VM with a bunch of
debugging enabled. That leads to triggering the hung task detector when
we wait on io-wq workers to exit. I did queue a patch for 6.19 that
makes this less likely to occur, as it'll only run the very first of
the items before noticing the cancelation:
https://lore.kernel.org/io-uring/937c3e38-368e-43eb-9d7e-2dcc0697799f@kernel.dk/
but even that isn't quite enough due to how much syzbot overloads the
system.
This will still throw a WARN_ON_ONCE(), which perhaps should just be a
printk() of some sort as the trace isn't THAT interesting. But it will
avoid hitting the hung task timeout detector, which for syzbot leads
to a panic + reboot.
io_uring/io-wq.c | 22 +++++++++++++++++++++-
io_uring/io_uring.c | 2 +-
io_uring/io_uring.h | 6 ++++++
3 files changed, 28 insertions(+), 2 deletions(-)
--
Jens Axboe
* [PATCH 1/2] io_uring: add IO_URING_EXIT_WAIT_MAX definition
From: Jens Axboe @ 2026-01-21 23:22 UTC
To: io-uring; +Cc: Jens Axboe
Add a define for the time we normally wait before complaining that
cancelations appear to be stuck, and use it in io_ring_exit_work().
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
io_uring/io_uring.c | 2 +-
io_uring/io_uring.h | 6 ++++++
2 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index b7a077c11c21..8f01e8503a64 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2984,7 +2984,7 @@ static __cold void io_tctx_exit_cb(struct callback_head *cb)
static __cold void io_ring_exit_work(struct work_struct *work)
{
struct io_ring_ctx *ctx = container_of(work, struct io_ring_ctx, exit_work);
- unsigned long timeout = jiffies + HZ * 60 * 5;
+ unsigned long timeout = jiffies + IO_URING_EXIT_WAIT_MAX;
unsigned long interval = HZ / 20;
struct io_tctx_exit exit;
struct io_tctx_node *node;
diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index a790c16854d3..db5350d3ca3f 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -89,6 +89,12 @@ struct io_ctx_config {
IOSQE_BUFFER_SELECT |\
IOSQE_CQE_SKIP_SUCCESS)
+/*
+ * Complaint timeout for io_uring cancelation exits, and for io-wq exit
+ * worker waiting.
+ */
+#define IO_URING_EXIT_WAIT_MAX (HZ * 60 * 5)
+
enum {
IOU_COMPLETE = 0,
--
2.51.0
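As a rough userspace illustration of what the define amounts to (a sketch, not kernel code): the deadline is the current tick count plus five minutes' worth of ticks, and the check against it is wrap-safe in the same way the kernel's time_after() is. The helper names exit_wait_max() and tick_after() are made up for this example, and the tick rate is passed in as a parameter because in the kernel HZ is a build-time configuration value.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: ticks in five minutes for a given tick rate,
 * mirroring the (HZ * 60 * 5) expression from the patch.
 */
static unsigned long exit_wait_max(unsigned long hz)
{
	return hz * 60 * 5;
}

/*
 * Hypothetical helper: true if tick count a is later than tick count b.
 * The unsigned subtraction wraps, and the result is then interpreted as
 * signed, so the comparison stays correct across a counter wraparound as
 * long as the two values are less than half the counter range apart.
 * This assumes the usual two's complement conversion to long.
 */
static bool tick_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}
```

With a tick rate of 1000, exit_wait_max(1000) is 300000 ticks, i.e. five minutes. A deadline computed as "now + exit_wait_max(hz)" can be tested with tick_after(now, deadline) even if the counter wraps in between.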
* [PATCH 2/2] io_uring/io-wq: don't trigger hung task for syzbot craziness
From: Jens Axboe @ 2026-01-21 23:22 UTC
To: io-uring; +Cc: Jens Axboe
Use the same trick that blk_io_schedule() does to avoid triggering the
hung task warning (and potential reboot/panic, depending on system
settings), and only wait for half the hung task timeout at a time.
If we exceed the default IO_URING_EXIT_WAIT_MAX period where we expect
things to certainly have finished unless there's a bug, then throw a
WARN_ON_ONCE() for that case.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
io_uring/io-wq.c | 22 +++++++++++++++++++++-
1 file changed, 21 insertions(+), 1 deletion(-)
diff --git a/io_uring/io-wq.c b/io_uring/io-wq.c
index 2fa7d3601edb..aa670909fece 100644
--- a/io_uring/io-wq.c
+++ b/io_uring/io-wq.c
@@ -17,6 +17,7 @@
#include <linux/task_work.h>
#include <linux/audit.h>
#include <linux/mmu_context.h>
+#include <linux/sched/sysctl.h>
#include <uapi/linux/io_uring.h>
#include "io-wq.h"
@@ -1313,6 +1314,8 @@ static void io_wq_cancel_tw_create(struct io_wq *wq)
static void io_wq_exit_workers(struct io_wq *wq)
{
+ unsigned long timeout, warn_timeout;
+
if (!wq->task)
return;
@@ -1322,7 +1325,24 @@ static void io_wq_exit_workers(struct io_wq *wq)
io_wq_for_each_worker(wq, io_wq_worker_wake, NULL);
rcu_read_unlock();
io_worker_ref_put(wq);
- wait_for_completion(&wq->worker_done);
+
+ /*
+ * Shut up hung task complaint, see for example
+ *
+ * https://lore.kernel.org/all/696fc9e7.a70a0220.111c58.0006.GAE@google.com/
+ *
+ * where completely overloading the system with tons of long running
+ * io-wq items can easily trigger the hung task timeout. Only sleep
+ * uninterruptibly for half that time, and warn if we end up
+ * waiting more than IO_URING_EXIT_WAIT_MAX.
+ */
+ timeout = sysctl_hung_task_timeout_secs * HZ / 2;
+ warn_timeout = jiffies + IO_URING_EXIT_WAIT_MAX;
+ do {
+ if (wait_for_completion_timeout(&wq->worker_done, timeout))
+ break;
+ WARN_ON_ONCE(time_after(jiffies, warn_timeout));
+ } while (1);
spin_lock_irq(&wq->hash->wait.lock);
list_del_init(&wq->wait.entry);
--
2.51.0
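To make the waiting pattern in this patch easier to follow, here is a small userspace simulation of it. It is only a sketch under stated assumptions: time is modeled as a plain tick counter starting at 0, and chunked_wait() and struct wait_outcome are hypothetical names for this example, not kernel interfaces. It also assumes that a chunk length of 0 (for example when the hung task timeout is 0) should fall back to one unbounded wait; the patch above does not spell out that case.

```c
#include <stdbool.h>

/* Outcome of the simulated wait; a hypothetical type for this sketch only. */
struct wait_outcome {
	unsigned long waits;	/* bounded waits issued, including the final one */
	bool warned;		/* true if the overall deadline passed while waiting */
};

/*
 * Simulate waiting for an event that fires at tick done_at, starting at
 * tick 0. Each wait is capped at half of hung_timeout ticks, so no single
 * sleep is long enough to trip a detector set to hung_timeout. After every
 * wait that times out, record (once) whether more than warn_after ticks
 * have elapsed in total.
 *
 * Assumption of this sketch: a chunk of 0 ticks means the detector is off,
 * so a single unbounded wait is used and the warning check never runs.
 */
static struct wait_outcome chunked_wait(unsigned long done_at,
					unsigned long hung_timeout,
					unsigned long warn_after)
{
	struct wait_outcome out = { 0, false };
	unsigned long chunk = hung_timeout / 2;
	unsigned long now = 0;

	if (!chunk) {
		out.waits = 1;
		return out;
	}

	for (;;) {
		out.waits++;
		/* Event fires within this chunk: the bounded wait succeeds. */
		if (done_at - now <= chunk)
			break;
		/* The wait timed out; a full chunk of time has passed. */
		now += chunk;
		if (now > warn_after)
			out.warned = true;
	}
	return out;
}
```

For example, chunked_wait(100, 120, 300) finishes on the second 60-tick wait without warning, while chunked_wait(1000, 120, 300) takes 17 waits and sets the warned flag once 360 ticks have elapsed. The flag is sticky, which loosely corresponds to the once-only behavior of WARN_ON_ONCE().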