* [PATCH 0/2] timeout immediate arg
@ 2026-02-24 16:12 Pavel Begunkov
2026-02-24 16:12 ` [PATCH 1/2] io_uring/timeout: READ_ONCE sqe->addr Pavel Begunkov
2026-02-24 16:12 ` [PATCH 2/2] io_uring/timeout: immediate timeout arg Pavel Begunkov
0 siblings, 2 replies; 6+ messages in thread
From: Pavel Begunkov @ 2026-02-24 16:12 UTC (permalink / raw)
To: io-uring; +Cc: asml.silence, axboe
Allow the user to pass the timeout value inside the SQE instead of
pointing to a timespec. People asked for it, as it makes user space
simpler. A more detailed description is in patch 2.
Pavel Begunkov (2):
io_uring/timeout: READ_ONCE sqe->addr
io_uring/timeout: immediate timeout arg
include/uapi/linux/io_uring.h | 5 +++++
io_uring/timeout.c | 11 +++++++++--
2 files changed, 14 insertions(+), 2 deletions(-)
--
2.53.0
* [PATCH 1/2] io_uring/timeout: READ_ONCE sqe->addr
2026-02-24 16:12 [PATCH 0/2] timeout immediate arg Pavel Begunkov
@ 2026-02-24 16:12 ` Pavel Begunkov
2026-02-24 16:47 ` Keith Busch
2026-02-24 16:12 ` [PATCH 2/2] io_uring/timeout: immediate timeout arg Pavel Begunkov
1 sibling, 1 reply; 6+ messages in thread
From: Pavel Begunkov @ 2026-02-24 16:12 UTC (permalink / raw)
To: io-uring; +Cc: asml.silence, axboe
We should use READ_ONCE when reading from an SQE; make sure the timeout
gets a stable timespec address.
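
A minimal user-space sketch of the concern, not kernel code. The SQE
sits in memory shared with user space, so the value should be loaded
once and only that snapshot checked and used. The READ_ONCE macro below
is a simplified stand-in for the kernel one, and struct fake_sqe and
prep_from_sqe() are hypothetical names used only for illustration.

```c
#include <stdint.h>

/* Simplified stand-in for the kernel macro: force a single load. */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical, cut-down SQE holding only the field discussed here. */
struct fake_sqe {
	uint64_t addr;
};

/*
 * Load sqe->addr exactly once, then check and use the local copy, so
 * a concurrent writer cannot change the value between check and use.
 * Returns 0 on success, -1 if the snapshot is a null address.
 */
static int prep_from_sqe(const struct fake_sqe *sqe, uint64_t *out)
{
	uint64_t addr = READ_ONCE(sqe->addr);	/* the only load */

	if (addr == 0)				/* check the snapshot */
		return -1;
	*out = addr;				/* use the same snapshot */
	return 0;
}
```

Without the single load, the compiler would be free to read sqe->addr
again after the check.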
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
io_uring/timeout.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/io_uring/timeout.c b/io_uring/timeout.c
index 84dda24f3eb2..d97f67d85ea3 100644
--- a/io_uring/timeout.c
+++ b/io_uring/timeout.c
@@ -557,7 +557,7 @@ static int __io_timeout_prep(struct io_kiocb *req,
data->req = req;
data->flags = flags;
- if (get_timespec64(&data->ts, u64_to_user_ptr(sqe->addr)))
+ if (get_timespec64(&data->ts, u64_to_user_ptr(READ_ONCE(sqe->addr))))
return -EFAULT;
if (data->ts.tv_sec < 0 || data->ts.tv_nsec < 0)
--
2.53.0
* [PATCH 2/2] io_uring/timeout: immediate timeout arg
2026-02-24 16:12 [PATCH 0/2] timeout immediate arg Pavel Begunkov
2026-02-24 16:12 ` [PATCH 1/2] io_uring/timeout: READ_ONCE sqe->addr Pavel Begunkov
@ 2026-02-24 16:12 ` Pavel Begunkov
2026-02-24 18:31 ` Jens Axboe
1 sibling, 1 reply; 6+ messages in thread
From: Pavel Begunkov @ 2026-02-24 16:12 UTC (permalink / raw)
To: io-uring; +Cc: asml.silence, axboe
One of the things the user always has to keep in mind is that any user
pointers they put into an SQE are not going to be read by the kernel
until submission happens, and the user has to ensure the pointee
stays alive until then. For example, this snippet:
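void prep_timeout(struct io_uring_sqe *sqe) {
	struct __kernel_timespec ts = {...};
	/* stores &ts in the SQE; ts dies when this function returns */
	io_uring_prep_timeout(sqe, &ts, 0, 0);
}
void submit(struct io_uring *ring) {
	struct io_uring_sqe *sqe = io_uring_get_sqe(ring);
	prep_timeout(sqe);
	/* the kernel dereferences the stale pointer only here */
	io_uring_submit(ring);
}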
Would lead to a use-after-free of the on-stack variable ts. Instead of
passing the timeout value as a pointer, allow storing it directly in
the SQE. The user has to set a new flag called
IORING_TIMEOUT_IMMEDIATE_ARG, in which case sqe->addr is interpreted as
the timeout value in ns. It only works with relative timeouts and is
rejected if set together with IORING_TIMEOUT_ABS, out of concern that a
u64 does not have enough range to make a good long-term API.
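
A rough sketch of what the user side could look like with the new flag,
under stated assumptions. struct demo_sqe is a hypothetical cut-down
SQE, demo_prep_timeout_ns() is not an existing liburing helper, and the
DEMO_* values only mirror the flag bits from this patch.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Mirror the bit values from the uapi header in this patch. */
#define DEMO_TIMEOUT_ABS		(1U << 0)
#define DEMO_TIMEOUT_IMMEDIATE_ARG	(1U << 7)

/* Hypothetical cut-down SQE with only the two fields used here. */
struct demo_sqe {
	uint64_t addr;
	uint32_t timeout_flags;
};

/*
 * Store a relative timeout in nanoseconds directly in the SQE.
 * No pointer is stored, so nothing has to outlive this call.
 * Absolute timeouts are refused up front, as the kernel would do.
 */
static int demo_prep_timeout_ns(struct demo_sqe *sqe, uint64_t ns,
				uint32_t flags)
{
	if (flags & DEMO_TIMEOUT_ABS)
		return -EINVAL;

	memset(sqe, 0, sizeof(*sqe));
	sqe->addr = ns;
	sqe->timeout_flags = flags | DEMO_TIMEOUT_IMMEDIATE_ARG;
	return 0;
}
```

Since the value travels in the SQE itself, the caller can return before
submission without keeping a timespec alive.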
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
include/uapi/linux/io_uring.h | 5 +++++
io_uring/timeout.c | 11 +++++++++--
2 files changed, 14 insertions(+), 2 deletions(-)
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 6750c383a2ab..8f4de786e6e9 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -340,6 +340,10 @@ enum io_uring_op {
/*
* sqe->timeout_flags
+ *
+ * IORING_TIMEOUT_IMMEDIATE_ARG: If set, sqe->addr stores the timeout
+ * value in nanoseconds instead of
+ * pointing to a timespec.
*/
#define IORING_TIMEOUT_ABS (1U << 0)
#define IORING_TIMEOUT_UPDATE (1U << 1)
@@ -348,6 +352,7 @@ enum io_uring_op {
#define IORING_LINK_TIMEOUT_UPDATE (1U << 4)
#define IORING_TIMEOUT_ETIME_SUCCESS (1U << 5)
#define IORING_TIMEOUT_MULTISHOT (1U << 6)
+#define IORING_TIMEOUT_IMMEDIATE_ARG (1U << 7)
#define IORING_TIMEOUT_CLOCK_MASK (IORING_TIMEOUT_BOOTTIME | IORING_TIMEOUT_REALTIME)
#define IORING_TIMEOUT_UPDATE_MASK (IORING_TIMEOUT_UPDATE | IORING_LINK_TIMEOUT_UPDATE)
/*
diff --git a/io_uring/timeout.c b/io_uring/timeout.c
index d97f67d85ea3..e051c8374c1a 100644
--- a/io_uring/timeout.c
+++ b/io_uring/timeout.c
@@ -528,7 +528,8 @@ static int __io_timeout_prep(struct io_kiocb *req,
flags = READ_ONCE(sqe->timeout_flags);
if (flags & ~(IORING_TIMEOUT_ABS | IORING_TIMEOUT_CLOCK_MASK |
IORING_TIMEOUT_ETIME_SUCCESS |
- IORING_TIMEOUT_MULTISHOT))
+ IORING_TIMEOUT_MULTISHOT |
+ IORING_TIMEOUT_IMMEDIATE_ARG))
return -EINVAL;
/* more than one clock specified is invalid, obviously */
if (hweight32(flags & IORING_TIMEOUT_CLOCK_MASK) > 1)
@@ -557,8 +558,14 @@ static int __io_timeout_prep(struct io_kiocb *req,
data->req = req;
data->flags = flags;
- if (get_timespec64(&data->ts, u64_to_user_ptr(READ_ONCE(sqe->addr))))
+ if (flags & IORING_TIMEOUT_IMMEDIATE_ARG) {
+ if (flags & IORING_TIMEOUT_ABS)
+ return -EINVAL;
+ data->ts = ns_to_timespec64(READ_ONCE(sqe->addr));
+ } else if (get_timespec64(&data->ts,
+ u64_to_user_ptr(READ_ONCE(sqe->addr)))) {
return -EFAULT;
+ }
if (data->ts.tv_sec < 0 || data->ts.tv_nsec < 0)
return -EINVAL;
--
2.53.0
* Re: [PATCH 1/2] io_uring/timeout: READ_ONCE sqe->addr
2026-02-24 16:12 ` [PATCH 1/2] io_uring/timeout: READ_ONCE sqe->addr Pavel Begunkov
@ 2026-02-24 16:47 ` Keith Busch
2026-02-24 18:28 ` Jens Axboe
0 siblings, 1 reply; 6+ messages in thread
From: Keith Busch @ 2026-02-24 16:47 UTC (permalink / raw)
To: Pavel Begunkov; +Cc: io-uring, axboe
On Tue, Feb 24, 2026 at 04:12:10PM +0000, Pavel Begunkov wrote:
> @@ -557,7 +557,7 @@ static int __io_timeout_prep(struct io_kiocb *req,
> data->req = req;
> data->flags = flags;
>
> - if (get_timespec64(&data->ts, u64_to_user_ptr(sqe->addr)))
> + if (get_timespec64(&data->ts, u64_to_user_ptr(READ_ONCE(sqe->addr))))
> return -EFAULT;
Should io_timeout_remove_prep() get the same update? Otherwise looks
good.
* Re: [PATCH 1/2] io_uring/timeout: READ_ONCE sqe->addr
2026-02-24 16:47 ` Keith Busch
@ 2026-02-24 18:28 ` Jens Axboe
0 siblings, 0 replies; 6+ messages in thread
From: Jens Axboe @ 2026-02-24 18:28 UTC (permalink / raw)
To: Keith Busch, Pavel Begunkov; +Cc: io-uring
On 2/24/26 9:47 AM, Keith Busch wrote:
> On Tue, Feb 24, 2026 at 04:12:10PM +0000, Pavel Begunkov wrote:
>> @@ -557,7 +557,7 @@ static int __io_timeout_prep(struct io_kiocb *req,
>> data->req = req;
>> data->flags = flags;
>>
>> - if (get_timespec64(&data->ts, u64_to_user_ptr(sqe->addr)))
>> + if (get_timespec64(&data->ts, u64_to_user_ptr(READ_ONCE(sqe->addr))))
>> return -EFAULT;
>
> Should io_timeout_remove_prep() get the same update? Otherwise looks
> good.
Yep, looks like it. I just went hunting to see if we missed this in
other spots, and the only other one I can find is in
io_uring_cmd_getsockname().
--
Jens Axboe
* Re: [PATCH 2/2] io_uring/timeout: immediate timeout arg
2026-02-24 16:12 ` [PATCH 2/2] io_uring/timeout: immediate timeout arg Pavel Begunkov
@ 2026-02-24 18:31 ` Jens Axboe
0 siblings, 0 replies; 6+ messages in thread
From: Jens Axboe @ 2026-02-24 18:31 UTC (permalink / raw)
To: Pavel Begunkov, io-uring
Looks good to me on the feature side, makes sense. But like the 1/2
patch, this one needs to update the remove side as well to support the
immediate arg.
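
To picture the point about the remove side: both paths have to decode
sqe->addr the same way. Below is a hypothetical user-space model of
that decode step, not the actual follow-up patch. demo_decode_timeout()
and struct demo_timespec are made-up names, and rejecting values above
INT64_MAX is an assumption standing in for the signed argument of
ns_to_timespec64() plus the negative check that follows it.

```c
#include <errno.h>
#include <stdint.h>

#define DEMO_NSEC_PER_SEC		1000000000ULL
#define DEMO_TIMEOUT_ABS		(1U << 0)
#define DEMO_TIMEOUT_IMMEDIATE_ARG	(1U << 7)

/* Hypothetical stand-in for struct timespec64. */
struct demo_timespec {
	int64_t tv_sec;
	long	tv_nsec;
};

/* Decode addr as immediate nanoseconds or as a pointer to a timespec. */
static int demo_decode_timeout(uint64_t addr, uint32_t flags,
			       struct demo_timespec *ts)
{
	const struct demo_timespec *src;

	if (flags & DEMO_TIMEOUT_IMMEDIATE_ARG) {
		if (flags & DEMO_TIMEOUT_ABS)
			return -EINVAL;
		if (addr > (uint64_t)INT64_MAX)	/* assumed rejection */
			return -EINVAL;
		ts->tv_sec = (int64_t)(addr / DEMO_NSEC_PER_SEC);
		ts->tv_nsec = (long)(addr % DEMO_NSEC_PER_SEC);
		return 0;
	}

	/* Pointer path: models the user copy with a plain dereference. */
	src = (const struct demo_timespec *)(uintptr_t)addr;
	if (!src)
		return -EFAULT;
	*ts = *src;
	if (ts->tv_sec < 0 || ts->tv_nsec < 0)
		return -EINVAL;
	return 0;
}
```

With a single helper like this, prep and remove could not drift apart
in how they treat the flag.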
--
Jens Axboe