Re: [PATCH v1] io_uring/register.c: fix NULL pointer dereference in io_register_resize_rings

public inbox for io-uring@vger.kernel.org
 help / color / mirror / Atom feed

From: Jens Axboe <axboe@kernel.dk>
To: Pavel Begunkov <asml.silence@gmail.com>,
	Linus Torvalds <torvalds@linuxfoundation.org>
Cc: Hao-Yu Yang <naup96721@gmail.com>,
	security@kernel.org, io-uring@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v1] io_uring/register.c: fix NULL pointer dereference in io_register_resize_rings
Date: Mon, 9 Mar 2026 13:16:39 -0600	[thread overview]
Message-ID: <031212ee-bd05-46dd-ab2e-2be5859f3880@kernel.dk> (raw)
In-Reply-To: <b2dec323-e512-4570-a273-724c7c94a12a@gmail.com>

On 3/9/26 12:57 PM, Pavel Begunkov wrote:
> On 3/9/26 18:34, Jens Axboe wrote:
>> On 3/9/26 10:29 AM, Jens Axboe wrote:
>>> On Mar 9, 2026, at 10:05?AM, Linus Torvalds <torvalds@linuxfoundation.org> wrote:
>>>>
>>>> ?On Mon, 9 Mar 2026 at 06:11, Jens Axboe <axboe@kernel.dk> wrote:
>>>>>
>>>>> You probably want something ala:
>>>>>
>>>>> mutex_lock(&ctx->mmap_lock);
>>>>> spin_lock(&ctx->completion_lock();
>>>>> + local_irq_disable();
>>>>
>>>> How could that ever work?
>>>>
>>>> Irqs will happily continue on other CPUs, so disabling interrupts is
>>>> complete nonsense as far as I can tell - whether done with
>>>> spin_lock_irq() *or* with local_irq_disable()/.
>>>>
>>>> So basically, touching ctx->rings from irq context in this section is
>>>> simply not an option - or the rings pointer just needs to be updated
>>>> atomically so that it doesn't matter.
>>>>
>>>> I assume this was tested in qemu on a single-core setup, so that
>>>> fundamental mistake was hidden by an irrelevant configuration.
>>>>
>>>> Where is the actual oops - for some inexplicable reason that had been
>>>> edited out, and it only had the call trace leading up toit? Based on
>>>> the incomplete information and the faulting address of 0x24, I'm
>>>> *guessing* that it is
>>>>
>>>>         if (ctx->flags & IORING_SETUP_TASKRUN_FLAG)
>>>>                 atomic_or(IORING_SQ_TASKRUN, &ctx->rings->sq_flags);
>>>>
>>>> in io_req_normal_work_add(), but that may be complete garbage.
>>>>
>>>> So the actual fix may be to just make damn sure that
>>>> IORING_SETUP_TASKRUN_FLAG is *not* set when the rings are resized.
>>>>
>>>> But for all I know, (a) I may be looking at entirely the wrong place
>>>> and (b) there might be millions of other places that want to access
>>>> ctx->rings, so the above may be the rantings of a crazy old man.
>>>
>>> Nah you?re totally right. I?m operating in few hours of sleep and on a
>>> plane. I?ll take a closer look later. The flag mask protecting it is a
>>> good idea, another one could be just a specific irq safe resize lock
>>> would be better here.
>>
>> How about something like this? I don't particularly like using ->flags
>> for this, as these are otherwise static after the ring has been set up.
>> Hence it'd be better to to just use a separate value for this,
>> ->in_resize, and use smp_load_acquire/release. The write side can be as
>> expensive as we want it to be, as it's not a hot path at all. And the
>> acquire read should light weight enough here.
>>
>>
>> diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
>> index 3e4a82a6f817..428eb5b2c624 100644
>> --- a/include/linux/io_uring_types.h
>> +++ b/include/linux/io_uring_types.h
>> @@ -394,6 +394,7 @@ struct io_ring_ctx {
>>           atomic_t        cq_wait_nr;
>>           atomic_t        cq_timeouts;
>>           struct wait_queue_head    cq_wait;
>> +        int            in_resize;
>>       } ____cacheline_aligned_in_smp;
>>         /* timeouts */
>> diff --git a/io_uring/register.c b/io_uring/register.c
>> index 3378014e51fb..048a1dcd9df1 100644
>> --- a/io_uring/register.c
>> +++ b/io_uring/register.c
>> @@ -575,6 +575,7 @@ static int io_register_resize_rings(struct io_ring_ctx *ctx, void __user *arg)
>>        * ctx->mmap_lock as well. Likewise, hold the completion lock over the
>>        * duration of the actual swap.
>>        */
>> +    smp_store_release(&ctx->in_resize, 1);
>>       mutex_lock(&ctx->mmap_lock);
>>       spin_lock(&ctx->completion_lock);
>>       o.rings = ctx->rings;
>> @@ -647,6 +648,7 @@ static int io_register_resize_rings(struct io_ring_ctx *ctx, void __user *arg)
>>       if (ctx->sq_data)
>>           io_sq_thread_unpark(ctx->sq_data);
>>   +    smp_store_release(&ctx->in_resize, 0);
>>       return ret;
>>   }
>>   diff --git a/io_uring/tw.c b/io_uring/tw.c
>> index 1ee2b8ab07c8..c66ffa787ec7 100644
>> --- a/io_uring/tw.c
>> +++ b/io_uring/tw.c
>> @@ -152,6 +152,13 @@ void tctx_task_work(struct callback_head *cb)
>>       WARN_ON_ONCE(ret);
>>   }
>>   +static void io_mark_taskrun(struct io_ring_ctx *ctx)
>> +{
>> +    if (ctx->flags & IORING_SETUP_TASKRUN_FLAG &&
>> +        !smp_load_acquire(&ctx->in_resize))
>> +        atomic_or(IORING_SQ_TASKRUN, &ctx->rings->sq_flags);
>> +}
> 
> That's not going to work, same raciness, but you can protect the
> pointer with rcu + rcu sync on resize. Tips:
> 
> 1) sq_flags might get out of sync at the end. Either say that
> users should never try to resize with inflight reqs, or just
> hand set all flags, e.g. SQ_WAKE can be set unconditionally
> 
> 2) For a fix, it'll likely be cleaner to keep ->rings as is
> and introduce a second pointer (rcu protected).

Yeah you are right, it's not enough, it just shrinks the gap but it's
still there. rcu sync on resize is probably the best way. I'll take
another look tomorrow after some sleep.

-- 
Jens Axboe

     prev parent reply	other threads:[~2026-03-09 19:16 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-09  6:27 [PATCH v1] io_uring/register.c: fix NULL pointer dereference in io_register_resize_rings Hao-Yu Yang
2026-03-09 13:11 ` Jens Axboe
2026-03-09 16:04   ` Linus Torvalds
2026-03-09 16:29     ` Jens Axboe
2026-03-09 18:34       ` Jens Axboe
2026-03-09 18:35         ` Jens Axboe
2026-03-09 19:03           ` Linus Torvalds
2026-03-09 19:22             ` Jens Axboe
2026-03-09 18:57         ` Pavel Begunkov
2026-03-09 19:16           ` Jens Axboe [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=031212ee-bd05-46dd-ab2e-2be5859f3880@kernel.dk \
    --to=axboe@kernel.dk \
    --cc=asml.silence@gmail.com \
    --cc=io-uring@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=naup96721@gmail.com \
    --cc=security@kernel.org \
    --cc=torvalds@linuxfoundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox