* [PATCH] io_uring: fix IOPOLL with passthrough I/O
@ 2026-01-14 15:12 Jens Axboe
2026-01-14 15:32 ` Ming Lei
0 siblings, 1 reply; 4+ messages in thread
From: Jens Axboe @ 2026-01-14 15:12 UTC (permalink / raw)
To: io-uring; +Cc: Yi Zhang
A previous commit improving IOPOLL made an incorrect assumption that
task_work isn't used with IOPOLL. This can cause crashes when doing
passthrough I/O on nvme, where queueing the completion task_work will
trample on the same memory that holds the completed list of requests.
Fix it up by shuffling the members around, so we're not sharing any
parts that end up getting used in this path.
Fixes: 3c7d76d6128a ("io_uring: IOPOLL polling improvements")
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Link: https://lore.kernel.org/linux-block/CAHj4cs_SLPj9v9w5MgfzHKy+983enPx3ZQY2kMuMJ1202DBefw@mail.gmail.com/
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
index e4c804f99c30..211686ad89fd 100644
--- a/include/linux/io_uring_types.h
+++ b/include/linux/io_uring_types.h
@@ -713,13 +713,10 @@ struct io_kiocb {
 	atomic_t refs;
 	bool cancel_seq_set;
 
-	/*
-	 * IOPOLL doesn't use task_work, so use the ->iopoll_node list
-	 * entry to manage pending iopoll requests.
-	 */
 	union {
 		struct io_task_work io_task_work;
-		struct list_head iopoll_node;
+		/* For IOPOLL setup queues, with hybrid polling */
+		u64 iopoll_start;
 	};
 
 	union {
@@ -728,8 +725,8 @@ struct io_kiocb {
 		 * poll
 		 */
 		struct hlist_node hash_node;
-		/* For IOPOLL setup queues, with hybrid polling */
-		u64 iopoll_start;
+		/* IOPOLL completion handling */
+		struct list_head iopoll_node;
 		/* for private io_kiocb freeing */
 		struct rcu_head rcu_head;
--
Jens Axboe
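[Editor's note: a minimal userspace sketch of the union-aliasing hazard the patch fixes. The types below are hypothetical, simplified stand-ins, not the real kernel definitions; the point is only that queueing "task_work" overwrites the same bytes that hold the iopoll list linkage.]

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types involved. */
struct list_head { struct list_head *next, *prev; };
struct io_task_work { void (*func)(void *); void *node; };

/* Before the fix: both members shared one union slot. */
struct buggy_req {
	union {
		struct io_task_work io_task_work;
		struct list_head iopoll_node;
	};
};

/*
 * Put the request on an iopoll list, then "queue task_work" by writing
 * the other union member. Returns 1 if the list linkage survived.
 */
int list_survives_task_work(void)
{
	struct buggy_req req;
	struct list_head sentinel = { &sentinel, &sentinel };

	req.iopoll_node.next = &sentinel;   /* request sits on the iopoll list */
	req.iopoll_node.prev = &sentinel;

	req.io_task_work.func = NULL;       /* queueing completion task_work */
	req.io_task_work.node = NULL;       /* ...tramples the same bytes */

	return req.iopoll_node.next == &sentinel;
}
```

The function returns 0: the list pointers are gone as soon as the task_work member is written, which is exactly why the patch moves iopoll_node out of the union shared with io_task_work.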
* Re: [PATCH] io_uring: fix IOPOLL with passthrough I/O
2026-01-14 15:12 [PATCH] io_uring: fix IOPOLL with passthrough I/O Jens Axboe
@ 2026-01-14 15:32 ` Ming Lei
2026-01-14 15:33 ` Jens Axboe
From: Ming Lei @ 2026-01-14 15:32 UTC (permalink / raw)
To: Jens Axboe; +Cc: io-uring, Yi Zhang
On Wed, Jan 14, 2026 at 08:12:15AM -0700, Jens Axboe wrote:
> A previous commit improving IOPOLL made an incorrect assumption that
> task_work isn't used with IOPOLL. This can cause crashes when doing
> passthrough I/O on nvme, where queueing the completion task_work will
> trample on the same memory that holds the completed list of requests.
>
> Fix it up by shuffling the members around, so we're not sharing any
> parts that end up getting used in this path.
>
> Fixes: 3c7d76d6128a ("io_uring: IOPOLL polling improvements")
> Reported-by: Yi Zhang <yi.zhang@redhat.com>
> Link: https://lore.kernel.org/linux-block/CAHj4cs_SLPj9v9w5MgfzHKy+983enPx3ZQY2kMuMJ1202DBefw@mail.gmail.com/
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>
> ---
>
> diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
> index e4c804f99c30..211686ad89fd 100644
> --- a/include/linux/io_uring_types.h
> +++ b/include/linux/io_uring_types.h
> @@ -713,13 +713,10 @@ struct io_kiocb {
>  	atomic_t refs;
>  	bool cancel_seq_set;
> 
> -	/*
> -	 * IOPOLL doesn't use task_work, so use the ->iopoll_node list
> -	 * entry to manage pending iopoll requests.
> -	 */
>  	union {
>  		struct io_task_work io_task_work;
> -		struct list_head iopoll_node;
> +		/* For IOPOLL setup queues, with hybrid polling */
> +		u64 iopoll_start;
>  	};
> 
>  	union {
> @@ -728,8 +725,8 @@ struct io_kiocb {
>  		 * poll
>  		 */
>  		struct hlist_node hash_node;
> -		/* For IOPOLL setup queues, with hybrid polling */
> -		u64 iopoll_start;
> +		/* IOPOLL completion handling */
> +		struct list_head iopoll_node;
>  		/* for private io_kiocb freeing */
>  		struct rcu_head rcu_head;
->hash_node is used by uring_cmd in io_uring_cmd_mark_cancelable()/io_uring_cmd_del_cancelable(),
so this approach may break uring_cmd if it ever needs to support iopoll and cancelable together.
Thanks,
Ming
* Re: [PATCH] io_uring: fix IOPOLL with passthrough I/O
2026-01-14 15:32 ` Ming Lei
@ 2026-01-14 15:33 ` Jens Axboe
2026-01-14 15:50 ` Jens Axboe
From: Jens Axboe @ 2026-01-14 15:33 UTC (permalink / raw)
To: Ming Lei; +Cc: io-uring, Yi Zhang
On 1/14/26 8:32 AM, Ming Lei wrote:
> On Wed, Jan 14, 2026 at 08:12:15AM -0700, Jens Axboe wrote:
>> A previous commit improving IOPOLL made an incorrect assumption that
>> task_work isn't used with IOPOLL. This can cause crashes when doing
>> passthrough I/O on nvme, where queueing the completion task_work will
>> trample on the same memory that holds the completed list of requests.
>>
>> Fix it up by shuffling the members around, so we're not sharing any
>> parts that end up getting used in this path.
>>
>> Fixes: 3c7d76d6128a ("io_uring: IOPOLL polling improvements")
>> Reported-by: Yi Zhang <yi.zhang@redhat.com>
>> Link: https://lore.kernel.org/linux-block/CAHj4cs_SLPj9v9w5MgfzHKy+983enPx3ZQY2kMuMJ1202DBefw@mail.gmail.com/
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>>
>> ---
>>
>> diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
>> index e4c804f99c30..211686ad89fd 100644
>> --- a/include/linux/io_uring_types.h
>> +++ b/include/linux/io_uring_types.h
>> @@ -713,13 +713,10 @@ struct io_kiocb {
>>  	atomic_t refs;
>>  	bool cancel_seq_set;
>> 
>> -	/*
>> -	 * IOPOLL doesn't use task_work, so use the ->iopoll_node list
>> -	 * entry to manage pending iopoll requests.
>> -	 */
>>  	union {
>>  		struct io_task_work io_task_work;
>> -		struct list_head iopoll_node;
>> +		/* For IOPOLL setup queues, with hybrid polling */
>> +		u64 iopoll_start;
>>  	};
>> 
>>  	union {
>> @@ -728,8 +725,8 @@ struct io_kiocb {
>>  		 * poll
>>  		 */
>>  		struct hlist_node hash_node;
>> -		/* For IOPOLL setup queues, with hybrid polling */
>> -		u64 iopoll_start;
>> +		/* IOPOLL completion handling */
>> +		struct list_head iopoll_node;
>>  		/* for private io_kiocb freeing */
>>  		struct rcu_head rcu_head;
>
> ->hash_node is used by uring_cmd in
> io_uring_cmd_mark_cancelable()/io_uring_cmd_del_cancelable(), so this
> approach may break uring_cmd if it ever needs to support iopoll and
> cancelable together.
We don't support cancelation on requests that go via the block stack,
never have and probably never will. But I should make a comment about
that, just in case...
--
Jens Axboe
* Re: [PATCH] io_uring: fix IOPOLL with passthrough I/O
2026-01-14 15:33 ` Jens Axboe
@ 2026-01-14 15:50 ` Jens Axboe
From: Jens Axboe @ 2026-01-14 15:50 UTC (permalink / raw)
To: Ming Lei; +Cc: io-uring, Yi Zhang
On 1/14/26 8:33 AM, Jens Axboe wrote:
> On 1/14/26 8:32 AM, Ming Lei wrote:
>> On Wed, Jan 14, 2026 at 08:12:15AM -0700, Jens Axboe wrote:
>>> A previous commit improving IOPOLL made an incorrect assumption that
>>> task_work isn't used with IOPOLL. This can cause crashes when doing
>>> passthrough I/O on nvme, where queueing the completion task_work will
>>> trample on the same memory that holds the completed list of requests.
>>>
>>> Fix it up by shuffling the members around, so we're not sharing any
>>> parts that end up getting used in this path.
>>>
>>> Fixes: 3c7d76d6128a ("io_uring: IOPOLL polling improvements")
>>> Reported-by: Yi Zhang <yi.zhang@redhat.com>
>>> Link: https://lore.kernel.org/linux-block/CAHj4cs_SLPj9v9w5MgfzHKy+983enPx3ZQY2kMuMJ1202DBefw@mail.gmail.com/
>>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>>>
>>> ---
>>>
>>> diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
>>> index e4c804f99c30..211686ad89fd 100644
>>> --- a/include/linux/io_uring_types.h
>>> +++ b/include/linux/io_uring_types.h
>>> @@ -713,13 +713,10 @@ struct io_kiocb {
>>>  	atomic_t refs;
>>>  	bool cancel_seq_set;
>>> 
>>> -	/*
>>> -	 * IOPOLL doesn't use task_work, so use the ->iopoll_node list
>>> -	 * entry to manage pending iopoll requests.
>>> -	 */
>>>  	union {
>>>  		struct io_task_work io_task_work;
>>> -		struct list_head iopoll_node;
>>> +		/* For IOPOLL setup queues, with hybrid polling */
>>> +		u64 iopoll_start;
>>>  	};
>>> 
>>>  	union {
>>> @@ -728,8 +725,8 @@ struct io_kiocb {
>>>  		 * poll
>>>  		 */
>>>  		struct hlist_node hash_node;
>>> -		/* For IOPOLL setup queues, with hybrid polling */
>>> -		u64 iopoll_start;
>>> +		/* IOPOLL completion handling */
>>> +		struct list_head iopoll_node;
>>>  		/* for private io_kiocb freeing */
>>>  		struct rcu_head rcu_head;
>>
>> ->hash_node is used by uring_cmd in
>> io_uring_cmd_mark_cancelable()/io_uring_cmd_del_cancelable(), so this
>> approach may break uring_cmd if it ever needs to support iopoll and
>> cancelable together.
>
> We don't support cancelation on requests that go via the block stack,
> never have and probably never will. But I should make a comment about
> that, just in case...
Should be trivial enough to just explicitly disallow it.
diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
index 197474911f04..ee7b49f47cb5 100644
--- a/io_uring/uring_cmd.c
+++ b/io_uring/uring_cmd.c
@@ -104,6 +104,15 @@ void io_uring_cmd_mark_cancelable(struct io_uring_cmd *cmd,
 	struct io_kiocb *req = cmd_to_io_kiocb(cmd);
 	struct io_ring_ctx *ctx = req->ctx;
 
+	/*
+	 * Doing cancelations on IOPOLL requests is not supported, both
+	 * because they can't get canceled in the block stack and because
+	 * the iopoll completion data overlaps with the hash_node used
+	 * for tracking.
+	 */
+	if (ctx->flags & IORING_SETUP_IOPOLL)
+		return;
+
 	if (!(cmd->flags & IORING_URING_CMD_CANCELABLE)) {
 		cmd->flags |= IORING_URING_CMD_CANCELABLE;
 		io_ring_submit_lock(ctx, issue_flags);
--
Jens Axboe
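[Editor's note: a tiny userspace sketch of the guard pattern the follow-up patch applies. All names below are hypothetical stand-ins for the kernel flags and structs; the point is only that an IOPOLL context is refused the cancelable flag up front, so the aliased union member is never touched.]

```c
#include <assert.h>

#define SETUP_IOPOLL   (1u << 0)  /* stand-in for IORING_SETUP_IOPOLL */
#define CMD_CANCELABLE (1u << 1)  /* stand-in for IORING_URING_CMD_CANCELABLE */

struct ring_ctx { unsigned flags; };
struct uring_cmd { unsigned flags; struct ring_ctx *ctx; };

/*
 * Mirror of the patch's early return: requests on an IOPOLL context
 * never get marked cancelable, so the hash_node that now shares a
 * union with iopoll_node is never written in this path.
 */
static void mark_cancelable(struct uring_cmd *cmd)
{
	if (cmd->ctx->flags & SETUP_IOPOLL)
		return;
	cmd->flags |= CMD_CANCELABLE;
}

/* Returns the cancelable state after attempting to mark a request. */
int try_mark(unsigned ctx_flags)
{
	struct ring_ctx ctx = { ctx_flags };
	struct uring_cmd cmd = { 0, &ctx };

	mark_cancelable(&cmd);
	return (cmd.flags & CMD_CANCELABLE) != 0;
}
```

With an IOPOLL context the call is a no-op; on a plain context the flag is set as before.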