public inbox for io-uring@vger.kernel.org
From: Jens Axboe <axboe@kernel.dk>
To: Pavel Begunkov <asml.silence@gmail.com>, io-uring@vger.kernel.org
Cc: naup96721@gmail.com, stable@vger.kernel.org
Subject: Re: [PATCH 1/2] io_uring: ensure ctx->rings is stable for task work flags manipulation
Date: Wed, 11 Mar 2026 07:05:52 -0600
Message-ID: <b50163ce-bb1c-48e4-9a32-037b26c50161@kernel.dk>
In-Reply-To: <39d1678f-7a7e-43d5-a92d-0b26b9bfd44e@gmail.com>

On 3/11/26 5:13 AM, Pavel Begunkov wrote:
> On 3/10/26 14:45, Jens Axboe wrote:
>> If DEFER_TASKRUN | SETUP_TASKRUN is used and task work is added while
>> the ring is being resized, it's possible for the OR'ing of
>> IORING_SQ_TASKRUN to happen in the small window of swapping into the
>> new rings and the old rings being freed.
>>
>> Prevent this by adding a 2nd ->rings pointer, ->rings_rcu, which is
>> protected by RCU. The task work flags manipulation is inside RCU
>> already, and if the resize ring freeing is done post an RCU synchronize,
>> then there's no need to add locking to the fast path of task work
>> additions.
>>
>> Note: this is only done for DEFER_TASKRUN, as that's the only setup mode
>> that supports ring resizing. If this ever changes, then they too need to
>> use the io_ctx_mark_taskrun() helper.
>>
>> Link: https://lore.kernel.org/io-uring/20260309062759.482210-1-naup96721@gmail.com/
>> Cc: stable@vger.kernel.org
>> Fixes: 79cfe9e59c2a ("io_uring/register: add IORING_REGISTER_RESIZE_RINGS")
>> Reported-by: Hao-Yu Yang <naup96721@gmail.com>
>> Suggested-by: Pavel Begunkov <asml.silence@gmail.com>
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>> ---
>>   include/linux/io_uring_types.h |  1 +
>>   io_uring/io_uring.c            |  2 ++
>>   io_uring/register.c            | 20 ++++++++++++++++++--
>>   io_uring/tw.c                  | 24 ++++++++++++++++++++++--
>>   4 files changed, 43 insertions(+), 4 deletions(-)
>>
>> diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
>> index 3e4a82a6f817..dd1420bfcb73 100644
>> --- a/include/linux/io_uring_types.h
>> +++ b/include/linux/io_uring_types.h
>> @@ -388,6 +388,7 @@ struct io_ring_ctx {
>>        * regularly bounce b/w CPUs.
>>        */
>>       struct {
>> +        struct io_rings    __rcu    *rings_rcu;
>>           struct llist_head    work_llist;
>>           struct llist_head    retry_llist;
>>           unsigned long        check_cq;
>> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
>> index ccab8562d273..20fdc442e014 100644
>> --- a/io_uring/io_uring.c
>> +++ b/io_uring/io_uring.c
>> @@ -2066,6 +2066,7 @@ static void io_rings_free(struct io_ring_ctx *ctx)
>>       io_free_region(ctx->user, &ctx->sq_region);
>>       io_free_region(ctx->user, &ctx->ring_region);
>>       ctx->rings = NULL;
>> +    RCU_INIT_POINTER(ctx->rings_rcu, NULL);
>>       ctx->sq_sqes = NULL;
>>   }
>>
>> @@ -2703,6 +2704,7 @@ static __cold int io_allocate_scq_urings(struct io_ring_ctx *ctx,
>>       if (ret)
>>           return ret;
>>       ctx->rings = rings = io_region_get_ptr(&ctx->ring_region);
>> +    rcu_assign_pointer(ctx->rings_rcu, rings);
>>       if (!(ctx->flags & IORING_SETUP_NO_SQARRAY))
>>           ctx->sq_array = (u32 *)((char *)rings + rl->sq_array_offset);
>>
>> diff --git a/io_uring/register.c b/io_uring/register.c
>> index a839b22fd392..5f2985ba0879 100644
>> --- a/io_uring/register.c
>> +++ b/io_uring/register.c
>> @@ -487,6 +487,18 @@ static void io_register_free_rings(struct io_ring_ctx *ctx,
>>                IORING_SETUP_CQE32 | IORING_SETUP_NO_MMAP | \
>>                IORING_SETUP_CQE_MIXED | IORING_SETUP_SQE_MIXED)
>>
>> +static void io_resize_assign_rings(struct io_ring_ctx *ctx, struct io_rings *rings)
>> +{
>> +    /*
>> +     * Just mark any flag we may have missed and that the application
>> +     * should act on unconditionally. Worst case it'll be an extra
>> +     * syscall.
>> +     */
>> +    atomic_or(IORING_SQ_TASKRUN | IORING_SQ_NEED_WAKEUP, &rings->sq_flags);
>> +    ctx->rings = rings;
>> +    rcu_assign_pointer(ctx->rings_rcu, rings);
>> +}
>> +
>>   static int io_register_resize_rings(struct io_ring_ctx *ctx, void __user *arg)
>>   {
>>       struct io_ctx_config config;
>> @@ -579,6 +591,7 @@ static int io_register_resize_rings(struct io_ring_ctx *ctx, void __user *arg)
>>       spin_lock(&ctx->completion_lock);
>>       o.rings = ctx->rings;
>>       ctx->rings = NULL;
>> +    RCU_INIT_POINTER(ctx->rings_rcu, NULL);
>>       o.sq_sqes = ctx->sq_sqes;
>>       ctx->sq_sqes = NULL;
> 
> Should be better to not have a transient null, and then there
> is no need to check for that in task_work. I.e. don't zero it
> and only assign the new value if you successfully created a
> new set of rings.

That's a good idea, I like that. I'll make the change.
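
For readers following along, here is a rough userspace sketch of the
suggestion above. It is not the kernel code: C11 atomics stand in for
rcu_assign_pointer()/rcu_dereference(), a plain free() stands in for
"free after an RCU grace period", and every name in it (sketch_rings,
sketch_ctx, SKETCH_SQ_TASKRUN, and the functions) is made up for
illustration. The point it tries to show is that if the published pointer
is never cleared and is only replaced once the new rings exist, the flag
setter needs no NULL check, and a failed resize leaves the old rings in
place.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a "task work pending" flag bit. */
#define SKETCH_SQ_TASKRUN 0x4u

/* Hypothetical, heavily reduced stand-in for the shared rings. */
struct sketch_rings {
	atomic_uint sq_flags;
	unsigned int entries;
};

/* Hypothetical context; rings_pub plays the role of ->rings_rcu. */
struct sketch_ctx {
	_Atomic(struct sketch_rings *) rings_pub;
};

/* entries == 0 is used here to simulate an allocation failure. */
static struct sketch_rings *sketch_rings_alloc(unsigned int entries)
{
	struct sketch_rings *r;

	if (entries == 0)
		return NULL;
	r = calloc(1, sizeof(*r));
	if (!r)
		return NULL;
	atomic_init(&r->sq_flags, 0);
	r->entries = entries;
	return r;
}

static bool sketch_ctx_init(struct sketch_ctx *ctx, unsigned int entries)
{
	struct sketch_rings *r = sketch_rings_alloc(entries);

	if (!r)
		return false;
	atomic_init(&ctx->rings_pub, r);
	return true;
}

/*
 * Flag setter (the task work side). The pointer is never NULL after
 * setup, so the assert only documents the invariant.
 */
static void sketch_mark_taskrun(struct sketch_ctx *ctx)
{
	struct sketch_rings *r;

	r = atomic_load_explicit(&ctx->rings_pub, memory_order_acquire);
	assert(r != NULL);
	atomic_fetch_or(&r->sq_flags, SKETCH_SQ_TASKRUN);
}

/*
 * Resize: build the new rings first and publish them only on success.
 * On failure the old rings stay published and untouched.
 */
static bool sketch_resize(struct sketch_ctx *ctx, unsigned int new_entries)
{
	struct sketch_rings *new_r = sketch_rings_alloc(new_entries);
	struct sketch_rings *old_r;

	if (!new_r)
		return false;

	/* Conservatively set the flag in case one was missed mid-swap. */
	atomic_fetch_or(&new_r->sq_flags, SKETCH_SQ_TASKRUN);
	old_r = atomic_exchange_explicit(&ctx->rings_pub, new_r,
					 memory_order_acq_rel);
	/* Real code would wait for an RCU grace period before this. */
	free(old_r);
	return true;
}
```

In this sketch, calling sketch_resize() with 0 entries fails and leaves
the original rings published, so a following sketch_mark_taskrun() still
lands on valid memory. The single-threaded free() is only safe here
because nothing runs concurrently; it does not model the grace period.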

-- 
Jens Axboe



Thread overview: 8+ messages
2026-03-10 14:45 [PATCHSET 0/2] Fix DEFER_TASKRUN ring resize flag manipulation Jens Axboe
2026-03-10 14:45 ` [PATCH 1/2] io_uring: ensure ctx->rings is stable for task work flags manipulation Jens Axboe
2026-03-11 11:13   ` Pavel Begunkov
2026-03-11 13:05     ` Jens Axboe [this message]
2026-03-10 14:45 ` [PATCH 2/2] io_uring/eventfd: use ctx->rings_rcu for flags checking Jens Axboe
  -- strict thread matches above, loose matches on Subject: below --
2026-03-11 13:11 [PATCHSET v2] Fix DEFER_TASKRUN ring resize flag manipulation Jens Axboe
2026-03-11 13:11 ` [PATCH 1/2] io_uring: ensure ctx->rings is stable for task work flags manipulation Jens Axboe
2026-03-11 15:06   ` Keith Busch
2026-03-11 15:12     ` Jens Axboe
