* [PATCH v2] io_uring: fix IOPOLL with passthrough I/O
From: Jens Axboe @ 2026-01-14 15:28 UTC
To: io-uring; +Cc: Yi Zhang, Ming Lei
A previous commit improving IOPOLL made an incorrect assumption that
task_work isn't used with IOPOLL. This can cause crashes when doing
passthrough I/O on nvme, where queueing the completion task_work will
trample on the same memory that holds the completed list of requests.
Fix it up by shuffling the union members around, so that io_task_work
no longer shares storage with anything used in this path.
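To illustrate (abridged from the io_uring_types.h hunk below), the
pre-patch layout aliased the task_work hook with the IOPOLL completion
list entry:

  /*
   * Pre-patch layout (abridged): both members share the same bytes,
   * so queueing completion task_work for a polled passthrough request
   * overwrites the list_head linking it on the completed list.
   */
  union {
          struct io_task_work     io_task_work;
          struct list_head        iopoll_node;
  };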
Fixes: 3c7d76d6128a ("io_uring: IOPOLL polling improvements")
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Link: https://lore.kernel.org/linux-block/CAHj4cs_SLPj9v9w5MgfzHKy+983enPx3ZQY2kMuMJ1202DBefw@mail.gmail.com/
Cc: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
v2: ensure ->iopoll_start is read before doing actual polling
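The ordering matters because ->iopoll_start now shares a union with
->io_task_work: once io_uring_classic_poll() completes the request and
task_work is queued, that storage can be rewritten. Annotated sketch of
the required sequence, using the same names as the rw.c hunk below:

  u64 iopoll_start = READ_ONCE(req->iopoll_start); /* sample first */

  /*
   * Polling may complete the request and queue completion task_work,
   * which can overwrite the union that held ->iopoll_start.
   */
  ret = io_uring_classic_poll(req, iob, poll_flags);

  /* Safe: computed from the local copy, not the (possibly
   * clobbered) request field.
   */
  runtime = ktime_get_ns() - iopoll_start - sleep_time;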
diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
index e4c804f99c30..211686ad89fd 100644
--- a/include/linux/io_uring_types.h
+++ b/include/linux/io_uring_types.h
@@ -713,13 +713,10 @@ struct io_kiocb {
atomic_t refs;
bool cancel_seq_set;
- /*
- * IOPOLL doesn't use task_work, so use the ->iopoll_node list
- * entry to manage pending iopoll requests.
- */
union {
struct io_task_work io_task_work;
- struct list_head iopoll_node;
+ /* For IOPOLL setup queues, with hybrid polling */
+ u64 iopoll_start;
};
union {
@@ -728,8 +725,8 @@ struct io_kiocb {
* poll
*/
struct hlist_node hash_node;
- /* For IOPOLL setup queues, with hybrid polling */
- u64 iopoll_start;
+ /* IOPOLL completion handling */
+ struct list_head iopoll_node;
/* for private io_kiocb freeing */
struct rcu_head rcu_head;
};
diff --git a/io_uring/rw.c b/io_uring/rw.c
index 307f1f39d9f3..c33c533a267e 100644
--- a/io_uring/rw.c
+++ b/io_uring/rw.c
@@ -1296,12 +1296,13 @@ static int io_uring_hybrid_poll(struct io_kiocb *req,
struct io_comp_batch *iob, unsigned int poll_flags)
{
struct io_ring_ctx *ctx = req->ctx;
- u64 runtime, sleep_time;
+ u64 runtime, sleep_time, iopoll_start;
int ret;
+ iopoll_start = READ_ONCE(req->iopoll_start);
sleep_time = io_hybrid_iopoll_delay(ctx, req);
ret = io_uring_classic_poll(req, iob, poll_flags);
- runtime = ktime_get_ns() - req->iopoll_start - sleep_time;
+ runtime = ktime_get_ns() - iopoll_start - sleep_time;
/*
* Use minimum sleep time if we're polling devices with different
--
Jens Axboe
* Re: [PATCH v2] io_uring: fix IOPOLL with passthrough I/O
From: Yi Zhang @ 2026-01-14 22:08 UTC
To: Jens Axboe; +Cc: io-uring, Ming Lei
On Wed, Jan 14, 2026 at 11:29 PM Jens Axboe <axboe@kernel.dk> wrote:
>
> A previous commit improving IOPOLL made an incorrect assumption that
> task_work isn't used with IOPOLL. This can cause crashes when doing
> passthrough I/O on nvme, where queueing the completion task_work will
> trample on the same memory that holds the completed list of requests.
>
> Fix it up by shuffling the union members around, so that io_task_work
> no longer shares storage with anything used in this path.
I tried the v2 and confirmed the issue was fixed:
Tested-by: Yi Zhang <yi.zhang@redhat.com>
# ./check nvme/049
nvme/049 => nvme0n1 (basic test for uring-passthrough I/O on /dev/ngX) [passed]
runtime ... 7.991s
nvme/049 => nvme1n1 (basic test for uring-passthrough I/O on /dev/ngX) [passed]
runtime ... 7.970s
nvme/049 => nvme2n1 (basic test for uring-passthrough I/O on /dev/ngX) [passed]
runtime ... 7.965s
nvme/049 => nvme3n1 (basic test for uring-passthrough I/O on /dev/ngX) [passed]
runtime ... 7.975s
nvme/049 => nvme4n1 (basic test for uring-passthrough I/O on /dev/ngX) [passed]
runtime ... 8.003s
nvme/049 => nvme5n1 (basic test for uring-passthrough I/O on /dev/ngX) [passed]
runtime ... 7.999s
> [...]
--
Best Regards,
Yi Zhang
* Re: [PATCH v2] io_uring: fix IOPOLL with passthrough I/O
From: Ming Lei @ 2026-01-15 1:42 UTC
To: Jens Axboe; +Cc: io-uring, Yi Zhang
On Wed, Jan 14, 2026 at 08:28:49AM -0700, Jens Axboe wrote:
> A previous commit improving IOPOLL made an incorrect assumption that
> task_work isn't used with IOPOLL. This can cause crashes when doing
> passthrough I/O on nvme, where queueing the completion task_work will
> trample on the same memory that holds the completed list of requests.
>
> Fix it up by shuffling the union members around, so that io_task_work
> no longer shares storage with anything used in this path.
>
> Fixes: 3c7d76d6128a ("io_uring: IOPOLL polling improvements")
> Reported-by: Yi Zhang <yi.zhang@redhat.com>
> Link: https://lore.kernel.org/linux-block/CAHj4cs_SLPj9v9w5MgfzHKy+983enPx3ZQY2kMuMJ1202DBefw@mail.gmail.com/
> Cc: Ming Lei <ming.lei@redhat.com>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>
> ---
>
> v2: ensure ->iopoll_start is read before doing actual polling
Looks fine, and I don't see any regression in the ublk selftests:
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Thanks,
Ming