public inbox for io-uring@vger.kernel.org
From: Dan Carpenter <error27@gmail.com>
To: Jens Axboe <axboe@kernel.dk>
Cc: io-uring@vger.kernel.org
Subject: [bug report] io_uring/register: fix ring resizing with mixed/large SQEs/CQEs
Date: Fri, 24 Apr 2026 08:34:51 +0300	[thread overview]
Message-ID: <aesA-_AEKgF_oucO@stanley.mountain> (raw)

Hello Jens Axboe,

Commit 45cd95763e19 ("io_uring/register: fix ring resizing with
mixed/large SQEs/CQEs") from Apr 20, 2026 (linux-next), leads to the
following Smatch static checker warning:

    io_uring/register.c:613 io_register_resize_rings()
    warn: potential integer overflow from user (local copy) 'p->sq_entries << 1'

    io_uring/register.c:643 io_register_resize_rings()
    warn: potential integer overflow from user (local copy) 'p->cq_entries << 1'

io_uring/register.c
    498 static int io_register_resize_rings(struct io_ring_ctx *ctx, void __user *arg)
    499 {
    500         struct io_ctx_config config;
    501         struct io_uring_region_desc rd;
    502         struct io_ring_ctx_rings o = { }, n = { }, *to_free = NULL;
    503         unsigned i, tail, old_head;
    504         struct io_uring_params *p = &config.p;
    505         struct io_rings_layout *rl = &config.layout;
    506         int ret;
    507 
    508         memset(&config, 0, sizeof(config));
    509 
    510         /* limited to DEFER_TASKRUN for now */
    511         if (!(ctx->flags & IORING_SETUP_DEFER_TASKRUN))
    512                 return -EINVAL;
    513         if (copy_from_user(p, arg, sizeof(*p)))

p comes from the user.  p->sq_entries and p->cq_entries are u32.

    514                 return -EFAULT;
    515         if (p->flags & ~RESIZE_FLAGS)
    516                 return -EINVAL;
    517 
    518         /* properties that are always inherited */
    519         p->flags |= (ctx->flags & COPY_FLAGS);
    520 
    521         ret = io_prepare_config(&config);
    522         if (unlikely(ret))
    523                 return ret;
    524 
    525         memset(&rd, 0, sizeof(rd));
    526         rd.size = PAGE_ALIGN(rl->rings_size);
    527         if (p->flags & IORING_SETUP_NO_MMAP) {
    528                 rd.user_addr = p->cq_off.user_addr;
    529                 rd.flags |= IORING_MEM_REGION_TYPE_USER;
    530         }
    531         ret = io_create_region(ctx, &n.ring_region, &rd, IORING_OFF_CQ_RING);
    532         if (ret)
    533                 return ret;
    534 
    535         n.rings = io_region_get_ptr(&n.ring_region);
    536 
    537         /*
    538          * At this point n.rings is shared with userspace, just like o.rings
    539          * is as well. While we don't expect userspace to modify it while
    540          * a resize is in progress, and it's most likely that userspace will
    541          * shoot itself in the foot if it does, we can't always assume good
    542          * intent... Use read/write once helpers from here on to indicate the
    543          * shared nature of it.
    544          */
    545         WRITE_ONCE(n.rings->sq_ring_mask, p->sq_entries - 1);
    546         WRITE_ONCE(n.rings->cq_ring_mask, p->cq_entries - 1);

If p->sq_entries is 0 then sq_ring_mask wraps to U32_MAX (and likewise
for cq_entries and cq_ring_mask).

    547         WRITE_ONCE(n.rings->sq_ring_entries, p->sq_entries);
    548         WRITE_ONCE(n.rings->cq_ring_entries, p->cq_entries);
    549 
    550         if (copy_to_user(arg, p, sizeof(*p))) {
    551                 io_register_free_rings(ctx, &n);
    552                 return -EFAULT;
    553         }
    554 
    555         memset(&rd, 0, sizeof(rd));
    556         rd.size = PAGE_ALIGN(rl->sq_size);
    557         if (p->flags & IORING_SETUP_NO_MMAP) {
    558                 rd.user_addr = p->sq_off.user_addr;
    559                 rd.flags |= IORING_MEM_REGION_TYPE_USER;
    560         }
    561         ret = io_create_region(ctx, &n.sq_region, &rd, IORING_OFF_SQES);
    562         if (ret) {
    563                 io_register_free_rings(ctx, &n);
    564                 return ret;
    565         }
    566         n.sq_sqes = io_region_get_ptr(&n.sq_region);
    567 
    568         /*
    569          * If using SQPOLL, park the thread
    570          */
    571         if (ctx->sq_data) {
    572                 mutex_unlock(&ctx->uring_lock);
    573                 io_sq_thread_park(ctx->sq_data);
    574                 mutex_lock(&ctx->uring_lock);
    575         }
    576 
    577         /*
    578          * We'll do the swap. Grab the ctx->mmap_lock, which will exclude
    579          * any new mmap's on the ring fd. Clear out existing mappings to prevent
    580          * mmap from seeing them, as we'll unmap them. Any attempt to mmap
    581          * existing rings beyond this point will fail. Not that it could proceed
    582          * at this point anyway, as the io_uring mmap side needs go grab the
    583          * ctx->mmap_lock as well. Likewise, hold the completion lock over the
    584          * duration of the actual swap.
    585          */
    586         mutex_lock(&ctx->mmap_lock);
    587         spin_lock(&ctx->completion_lock);
    588         o.rings = ctx->rings;
    589         ctx->rings = NULL;
    590         o.sq_sqes = ctx->sq_sqes;
    591         ctx->sq_sqes = NULL;
    592 
    593         /*
    594          * Now copy SQ and CQ entries, if any. If either of the destination
    595          * rings can't hold what is already there, then fail the operation.
    596          */
    597         tail = READ_ONCE(o.rings->sq.tail);
    598         old_head = READ_ONCE(o.rings->sq.head);
    599         if (tail - old_head > p->sq_entries)
    600                 goto overflow;

If p->sq_entries were zero then any pending entries would make
tail - old_head > 0 and we would hit this goto; only a completely empty
SQ would slip past the check.

    601         for (i = old_head; i < tail; i++) {
    602                 unsigned index, dst_mask, src_mask;
    603                 size_t sq_size;
    604 
    605                 index = i;
    606                 sq_size = sizeof(struct io_uring_sqe);
    607                 src_mask = ctx->sq_entries - 1;
    608                 dst_mask = p->sq_entries - 1;
    609                 if (ctx->flags & IORING_SETUP_SQE128) {
    610                         index <<= 1;
    611                         sq_size <<= 1;
    612                         src_mask = (ctx->sq_entries << 1) - 1;
--> 613                         dst_mask = (p->sq_entries << 1) - 1;

These shifts can overflow.  If you picked a p->sq_entries of (1U << 31)
then (p->sq_entries << 1) would wrap to zero and the mask would be
0xffffffff.  That might even be intentional, since wrapping to zero and
subtracting one is a common masking idiom...

    614                 }
    615                 memcpy(&n.sq_sqes[index & dst_mask], &o.sq_sqes[index & src_mask], sq_size);
    616         }
    617         WRITE_ONCE(n.rings->sq.head, old_head);
    618         WRITE_ONCE(n.rings->sq.tail, tail);
    619 
    620         tail = READ_ONCE(o.rings->cq.tail);
    621         old_head = READ_ONCE(o.rings->cq.head);
    622         if (tail - old_head > p->cq_entries) {
    623 overflow:
    624                 /* restore old rings, and return -EOVERFLOW via cleanup path */
    625                 ctx->rings = o.rings;
    626                 ctx->sq_sqes = o.sq_sqes;
    627                 to_free = &n;
    628                 ret = -EOVERFLOW;
    629                 goto out;
    630         }
    631         for (i = old_head; i < tail; i++) {
    632                 unsigned index, dst_mask, src_mask;
    633                 size_t cq_size;
    634 
    635                 index = i;
    636                 cq_size = sizeof(struct io_uring_cqe);
    637                 src_mask = ctx->cq_entries - 1;
    638                 dst_mask = p->cq_entries - 1;
    639                 if (ctx->flags & IORING_SETUP_CQE32) {
    640                         index <<= 1;
    641                         cq_size <<= 1;
    642                         src_mask = (ctx->cq_entries << 1) - 1;
    643                         dst_mask = (p->cq_entries << 1) - 1;
    644                 }
    645                 memcpy(&n.rings->cqes[index & dst_mask], &o.rings->cqes[index & src_mask], cq_size);
    646         }
    647         WRITE_ONCE(n.rings->cq.head, old_head);
    648         WRITE_ONCE(n.rings->cq.tail, tail);
    649         /* invalidate cached cqe refill */
    650         ctx->cqe_cached = ctx->cqe_sentinel = NULL;
    651 
    652         WRITE_ONCE(n.rings->sq_dropped, READ_ONCE(o.rings->sq_dropped));
    653         atomic_set(&n.rings->sq_flags, atomic_read(&o.rings->sq_flags));
    654         WRITE_ONCE(n.rings->cq_flags, READ_ONCE(o.rings->cq_flags));
    655         WRITE_ONCE(n.rings->cq_overflow, READ_ONCE(o.rings->cq_overflow));
    656 
    657         /* all done, store old pointers and assign new ones */
    658         if (!(ctx->flags & IORING_SETUP_NO_SQARRAY))
    659                 ctx->sq_array = (u32 *)((char *)n.rings + rl->sq_array_offset);
    660 
    661         ctx->sq_entries = p->sq_entries;
    662         ctx->cq_entries = p->cq_entries;
    663 
    664         /*
    665          * Just mark any flag we may have missed and that the application
    666          * should act on unconditionally. Worst case it'll be an extra
    667          * syscall.
    668          */
    669         atomic_or(IORING_SQ_TASKRUN | IORING_SQ_NEED_WAKEUP, &n.rings->sq_flags);
    670         ctx->rings = n.rings;
    671         rcu_assign_pointer(ctx->rings_rcu, n.rings);
    672 
    673         ctx->sq_sqes = n.sq_sqes;
    674         swap_old(ctx, o, n, ring_region);
    675         swap_old(ctx, o, n, sq_region);
    676         to_free = &o;
    677         ret = 0;
    678 out:
    679         spin_unlock(&ctx->completion_lock);
    680         mutex_unlock(&ctx->mmap_lock);
    681         /* Wait for concurrent io_ctx_mark_taskrun() */
    682         if (to_free == &o)
    683                 synchronize_rcu_expedited();
    684         io_register_free_rings(ctx, to_free);
    685 
    686         if (ctx->sq_data)
    687                 io_sq_thread_unpark(ctx->sq_data);
    688 
    689         return ret;
    690 }

This email is a free service from the Smatch-CI project [smatch.sf.net].

regards,
dan carpenter

Thread overview: 2+ messages
2026-04-24  5:34 Dan Carpenter [this message]
2026-04-24 15:01 ` [bug report] io_uring/register: fix ring resizing with mixed/large SQEs/CQEs Jens Axboe
