From: Jens Axboe <axboe@kernel.dk>
To: io-uring <io-uring@vger.kernel.org>,
Caleb Sander Mateos <csander@purestorage.com>
Subject: [PATCH] io_uring: get rid of tw_pending for !DEFER task work
Date: Tue, 16 Jun 2026 06:20:51 -0600
Message-ID: <0600ea2a-9a60-49e6-aeba-3bbab4b9d3d2@kernel.dk>
The normal task_work path used a tw_pending bit to ensure the callback
was only added once: the mpscq drains incrementally, so a single
tctx_task_work() run can take the queue through empty -> non-empty
several times, and each transition would otherwise re-add the already
pending callback_head. Adding it twice corrupts the task_work list, and
that is what tw_pending protects against.
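The bit can go away if tctx_task_work_run() stops as soon as a pop
empties the queue. The current run then sees at most one empty ->
non-empty transition, so the callback is re-added at most once, and only
after it has already been taken off the task_work list.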
Suggested-by: Caleb Sander Mateos <csander@purestorage.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
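Not part of the commit: a toy, single-threaded userspace model of the
reasoning above, in case it helps review. Everything named toy_* is
hypothetical and is not the kernel mpscq or task_work code. It assumes
that a push reports whether the queue was empty beforehand, and that the
callback is off the pending list by the time it runs. The counter tracks
how often the callback would be added while it is already pending.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy queue: only the length matters for this model. */
struct toy_queue { int len; };

struct toy_tctx {
	struct toy_queue q;
	int cb_pending;   /* times the callback sits on the pending list */
	int double_adds;  /* adds done while the callback was already pending */
};

/* Returns true if the queue was empty before this push. */
static bool toy_push(struct toy_queue *q)
{
	return q->len++ == 0;
}

/* Pops one entry; *emptied says whether this pop drained the queue. */
static bool toy_pop(struct toy_queue *q, bool *emptied)
{
	if (!q->len)
		return false;
	*emptied = (--q->len == 0);
	return true;
}

/* Mirrors the shape of the add path with no tw_pending guard. */
static void toy_work_add(struct toy_tctx *t)
{
	if (!toy_push(&t->q))
		return;		/* a run is already pending */
	if (++t->cb_pending > 1)
		t->double_adds++;
}

/* One callback run. Each handler may queue one more item. */
static void toy_run(struct toy_tctx *t, int *requeue, bool break_on_empty)
{
	bool emptied;

	t->cb_pending--;	/* callback leaves the list before running */
	while (toy_pop(&t->q, &emptied)) {
		if (*requeue > 0) {
			(*requeue)--;
			toy_work_add(t);
		}
		if (break_on_empty && emptied)
			break;
	}
}

/* Returns how many double adds happened under the chosen policy. */
static int toy_simulate(bool break_on_empty)
{
	struct toy_tctx t = { 0 };
	int requeue = 3;

	toy_work_add(&t);
	while (t.cb_pending > 0)
		toy_run(&t, &requeue, break_on_empty);
	return t.double_adds;
}
```

In this model toy_simulate(false) counts double adds because the run
keeps draining across repeated empty -> non-empty transitions, while
toy_simulate(true) reports none. It ignores concurrency entirely, so it
only illustrates the ordering argument and does not prove the real code
correct.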
diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
index 6415a3353ee0..87151a5b62c1 100644
--- a/include/linux/io_uring_types.h
+++ b/include/linux/io_uring_types.h
@@ -149,8 +149,6 @@ struct io_uring_task {
struct { /* task_work */
struct mpscq task_list;
- /* BIT(0) guards adding tw only once */
- unsigned long tw_pending;
struct callback_head task_work;
} ____cacheline_aligned_in_smp;
};
diff --git a/io_uring/mpscq.h b/io_uring/mpscq.h
index c801384c6a0a..f910526766fd 100644
--- a/io_uring/mpscq.h
+++ b/io_uring/mpscq.h
@@ -122,4 +122,13 @@ static inline struct llist_node *mpscq_pop(struct mpscq *q,
return NULL;
}
+/*
+ * Returns true if the most recent mpscq_pop() that returned a node also
+ * emptied the queue. Consumer must be serialized.
+ */
+static inline bool mpscq_pop_emptied(struct mpscq *q, struct llist_node *head)
+{
+ return head == &q->stub;
+}
+
#endif /* IOU_MPSCQ_H */
diff --git a/io_uring/tw.c b/io_uring/tw.c
index e74372233f40..f2ce806b01a1 100644
--- a/io_uring/tw.c
+++ b/io_uring/tw.c
@@ -34,10 +34,6 @@ void io_tctx_fallback_work(struct work_struct *work)
fallback_work);
unsigned int count = 0;
- /* see tctx_task_work() - a set bit must always have a run coming */
- clear_bit(0, &tctx->tw_pending);
- smp_mb__after_atomic();
-
/*
* Run the entries directly. We're in PF_KTHRED context, hence
* io_should_terminate_tw() is true and they will be marked as
@@ -101,6 +97,13 @@ void tctx_task_work_run(struct io_uring_task *tctx, unsigned int max_entries,
io_poll_task_func, io_req_rw_complete,
(struct io_tw_req){req}, ts);
(*count)++;
+ /*
+ * Break if most recent pop emptied the queue. This helps
+ * bound task_work run, and also protects the regular
+ * task_work addition.
+ */
+ if (mpscq_pop_emptied(&tctx->task_list, tctx->task_head))
+ break;
if (unlikely(need_resched())) {
ctx_flush_and_put(ctx, ts);
ctx = NULL;
@@ -127,8 +130,6 @@ void tctx_task_work(struct callback_head *cb)
unsigned int count = 0;
tctx = container_of(cb, struct io_uring_task, task_work);
- clear_bit(0, &tctx->tw_pending);
- smp_mb__after_atomic();
tctx_task_work_run(tctx, UINT_MAX, &count);
}
@@ -206,7 +207,7 @@ void io_req_normal_work_add(struct io_kiocb *req)
struct io_uring_task *tctx = req->tctx;
struct io_ring_ctx *ctx = req->ctx;
- /* task_work already pending, we're done */
+ /* tw run already pending, nothing else to do */
if (!mpscq_push(&tctx->task_list, &req->io_task_work.node))
return;
@@ -223,10 +224,6 @@ void io_req_normal_work_add(struct io_kiocb *req)
return;
}
- /* task_work must only be added once */
- if (test_and_set_bit(0, &tctx->tw_pending))
- return;
-
if (likely(!task_work_add(tctx->task, &tctx->task_work, ctx->notify_method)))
return;
--
Jens Axboe