public inbox for io-uring@vger.kernel.org
* [bug report] io_uring/register: fix ring resizing with mixed/large SQEs/CQEs
@ 2026-04-24  5:34 Dan Carpenter
  2026-04-24 15:01 ` Jens Axboe
  0 siblings, 1 reply; 2+ messages in thread
From: Dan Carpenter @ 2026-04-24  5:34 UTC (permalink / raw)
  To: Jens Axboe; +Cc: io-uring

Hello Jens Axboe,

Commit 45cd95763e19 ("io_uring/register: fix ring resizing with
mixed/large SQEs/CQEs") from Apr 20, 2026 (linux-next), leads to the
following Smatch static checker warnings:

    io_uring/register.c:613 io_register_resize_rings()
    warn: potential integer overflow from user (local copy) 'p->sq_entries << 1'

    io_uring/register.c:643 io_register_resize_rings()
    warn: potential integer overflow from user (local copy) 'p->cq_entries << 1'

io_uring/register.c
    498 static int io_register_resize_rings(struct io_ring_ctx *ctx, void __user *arg)
    499 {
    500         struct io_ctx_config config;
    501         struct io_uring_region_desc rd;
    502         struct io_ring_ctx_rings o = { }, n = { }, *to_free = NULL;
    503         unsigned i, tail, old_head;
    504         struct io_uring_params *p = &config.p;
    505         struct io_rings_layout *rl = &config.layout;
    506         int ret;
    507 
    508         memset(&config, 0, sizeof(config));
    509 
    510         /* limited to DEFER_TASKRUN for now */
    511         if (!(ctx->flags & IORING_SETUP_DEFER_TASKRUN))
    512                 return -EINVAL;
    513         if (copy_from_user(p, arg, sizeof(*p)))

p comes from the user.  p->sq_entries and p->cq_entries are u32.
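
For reference, the relevant fields of struct io_uring_params from
include/uapi/linux/io_uring.h (abridged):

    struct io_uring_params {
            __u32 sq_entries;
            __u32 cq_entries;
            __u32 flags;
            ...
    };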

    514                 return -EFAULT;
    515         if (p->flags & ~RESIZE_FLAGS)
    516                 return -EINVAL;
    517 
    518         /* properties that are always inherited */
    519         p->flags |= (ctx->flags & COPY_FLAGS);
    520 
    521         ret = io_prepare_config(&config);
    522         if (unlikely(ret))
    523                 return ret;
    524 
    525         memset(&rd, 0, sizeof(rd));
    526         rd.size = PAGE_ALIGN(rl->rings_size);
    527         if (p->flags & IORING_SETUP_NO_MMAP) {
    528                 rd.user_addr = p->cq_off.user_addr;
    529                 rd.flags |= IORING_MEM_REGION_TYPE_USER;
    530         }
    531         ret = io_create_region(ctx, &n.ring_region, &rd, IORING_OFF_CQ_RING);
    532         if (ret)
    533                 return ret;
    534 
    535         n.rings = io_region_get_ptr(&n.ring_region);
    536 
    537         /*
    538          * At this point n.rings is shared with userspace, just like o.rings
    539          * is as well. While we don't expect userspace to modify it while
    540          * a resize is in progress, and it's most likely that userspace will
    541          * shoot itself in the foot if it does, we can't always assume good
    542          * intent... Use read/write once helpers from here on to indicate the
    543          * shared nature of it.
    544          */
    545         WRITE_ONCE(n.rings->sq_ring_mask, p->sq_entries - 1);
    546         WRITE_ONCE(n.rings->cq_ring_mask, p->cq_entries - 1);

If p->sq_entries is 0, then p->sq_entries - 1 wraps to U32_MAX (and the
cq mask on the next line has the same problem).
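
A minimal standalone sketch of that wrap (plain C, u32 semantics):

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
            uint32_t entries = 0;
            uint32_t mask = entries - 1;    /* unsigned wrap */

            printf("%#x\n", mask);          /* prints 0xffffffff */
            return 0;
    }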

    547         WRITE_ONCE(n.rings->sq_ring_entries, p->sq_entries);
    548         WRITE_ONCE(n.rings->cq_ring_entries, p->cq_entries);
    549 
    550         if (copy_to_user(arg, p, sizeof(*p))) {
    551                 io_register_free_rings(ctx, &n);
    552                 return -EFAULT;
    553         }
    554 
    555         memset(&rd, 0, sizeof(rd));
    556         rd.size = PAGE_ALIGN(rl->sq_size);
    557         if (p->flags & IORING_SETUP_NO_MMAP) {
    558                 rd.user_addr = p->sq_off.user_addr;
    559                 rd.flags |= IORING_MEM_REGION_TYPE_USER;
    560         }
    561         ret = io_create_region(ctx, &n.sq_region, &rd, IORING_OFF_SQES);
    562         if (ret) {
    563                 io_register_free_rings(ctx, &n);
    564                 return ret;
    565         }
    566         n.sq_sqes = io_region_get_ptr(&n.sq_region);
    567 
    568         /*
    569          * If using SQPOLL, park the thread
    570          */
    571         if (ctx->sq_data) {
    572                 mutex_unlock(&ctx->uring_lock);
    573                 io_sq_thread_park(ctx->sq_data);
    574                 mutex_lock(&ctx->uring_lock);
    575         }
    576 
    577         /*
    578          * We'll do the swap. Grab the ctx->mmap_lock, which will exclude
    579          * any new mmap's on the ring fd. Clear out existing mappings to prevent
    580          * mmap from seeing them, as we'll unmap them. Any attempt to mmap
    581          * existing rings beyond this point will fail. Not that it could proceed
    582          * at this point anyway, as the io_uring mmap side needs to grab the
    583          * ctx->mmap_lock as well. Likewise, hold the completion lock over the
    584          * duration of the actual swap.
    585          */
    586         mutex_lock(&ctx->mmap_lock);
    587         spin_lock(&ctx->completion_lock);
    588         o.rings = ctx->rings;
    589         ctx->rings = NULL;
    590         o.sq_sqes = ctx->sq_sqes;
    591         ctx->sq_sqes = NULL;
    592 
    593         /*
    594          * Now copy SQ and CQ entries, if any. If either of the destination
    595          * rings can't hold what is already there, then fail the operation.
    596          */
    597         tail = READ_ONCE(o.rings->sq.tail);
    598         old_head = READ_ONCE(o.rings->sq.head);
    599         if (tail - old_head > p->sq_entries)
    600                 goto overflow;

I guess if p->sq_entries were zero then we would hit this goto for any
non-empty ring
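
For example, with hypothetical values and u32 arithmetic:

    #include <stdint.h>

    int main(void)
    {
            uint32_t tail = 2, old_head = 0, sq_entries = 0;

            /* 2 - 0 > 0: any non-empty ring trips the check; an empty
             * ring (tail == old_head) sails past, but the copy loop
             * is then a no-op anyway. */
            return tail - old_head > sq_entries;    /* returns 1 */
    }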

    601         for (i = old_head; i < tail; i++) {
    602                 unsigned index, dst_mask, src_mask;
    603                 size_t sq_size;
    604 
    605                 index = i;
    606                 sq_size = sizeof(struct io_uring_sqe);
    607                 src_mask = ctx->sq_entries - 1;
    608                 dst_mask = p->sq_entries - 1;
    609                 if (ctx->flags & IORING_SETUP_SQE128) {
    610                         index <<= 1;
    611                         sq_size <<= 1;
    612                         src_mask = (ctx->sq_entries << 1) - 1;
--> 613                         dst_mask = (p->sq_entries << 1) - 1;

These shifts can integer overflow.  If you picked a p->sq_entries of
(1U << 31), then (p->sq_entries << 1) would wrap to zero and the mask
would be 0xffffffff.  That might even be intentional, since wrapping
to zero and subtracting one is a common idiom...
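
Spelled out, assuming u32 semantics:

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
            uint32_t entries = 1U << 31;
            uint32_t mask = (entries << 1) - 1; /* shift wraps to 0,
                                                 * minus 1 gives U32_MAX */

            printf("%#x\n", mask);              /* prints 0xffffffff */
            return 0;
    }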

    614                 }
    615                 memcpy(&n.sq_sqes[index & dst_mask], &o.sq_sqes[index & src_mask], sq_size);
    616         }
    617         WRITE_ONCE(n.rings->sq.head, old_head);
    618         WRITE_ONCE(n.rings->sq.tail, tail);
    619 
    620         tail = READ_ONCE(o.rings->cq.tail);
    621         old_head = READ_ONCE(o.rings->cq.head);
    622         if (tail - old_head > p->cq_entries) {
    623 overflow:
    624                 /* restore old rings, and return -EOVERFLOW via cleanup path */
    625                 ctx->rings = o.rings;
    626                 ctx->sq_sqes = o.sq_sqes;
    627                 to_free = &n;
    628                 ret = -EOVERFLOW;
    629                 goto out;
    630         }
    631         for (i = old_head; i < tail; i++) {
    632                 unsigned index, dst_mask, src_mask;
    633                 size_t cq_size;
    634 
    635                 index = i;
    636                 cq_size = sizeof(struct io_uring_cqe);
    637                 src_mask = ctx->cq_entries - 1;
    638                 dst_mask = p->cq_entries - 1;
    639                 if (ctx->flags & IORING_SETUP_CQE32) {
    640                         index <<= 1;
    641                         cq_size <<= 1;
    642                         src_mask = (ctx->cq_entries << 1) - 1;
    643                         dst_mask = (p->cq_entries << 1) - 1;
    644                 }
    645                 memcpy(&n.rings->cqes[index & dst_mask], &o.rings->cqes[index & src_mask], cq_size);
    646         }
    647         WRITE_ONCE(n.rings->cq.head, old_head);
    648         WRITE_ONCE(n.rings->cq.tail, tail);
    649         /* invalidate cached cqe refill */
    650         ctx->cqe_cached = ctx->cqe_sentinel = NULL;
    651 
    652         WRITE_ONCE(n.rings->sq_dropped, READ_ONCE(o.rings->sq_dropped));
    653         atomic_set(&n.rings->sq_flags, atomic_read(&o.rings->sq_flags));
    654         WRITE_ONCE(n.rings->cq_flags, READ_ONCE(o.rings->cq_flags));
    655         WRITE_ONCE(n.rings->cq_overflow, READ_ONCE(o.rings->cq_overflow));
    656 
    657         /* all done, store old pointers and assign new ones */
    658         if (!(ctx->flags & IORING_SETUP_NO_SQARRAY))
    659                 ctx->sq_array = (u32 *)((char *)n.rings + rl->sq_array_offset);
    660 
    661         ctx->sq_entries = p->sq_entries;
    662         ctx->cq_entries = p->cq_entries;
    663 
    664         /*
    665          * Just mark any flag we may have missed and that the application
    666          * should act on unconditionally. Worst case it'll be an extra
    667          * syscall.
    668          */
    669         atomic_or(IORING_SQ_TASKRUN | IORING_SQ_NEED_WAKEUP, &n.rings->sq_flags);
    670         ctx->rings = n.rings;
    671         rcu_assign_pointer(ctx->rings_rcu, n.rings);
    672 
    673         ctx->sq_sqes = n.sq_sqes;
    674         swap_old(ctx, o, n, ring_region);
    675         swap_old(ctx, o, n, sq_region);
    676         to_free = &o;
    677         ret = 0;
    678 out:
    679         spin_unlock(&ctx->completion_lock);
    680         mutex_unlock(&ctx->mmap_lock);
    681         /* Wait for concurrent io_ctx_mark_taskrun() */
    682         if (to_free == &o)
    683                 synchronize_rcu_expedited();
    684         io_register_free_rings(ctx, to_free);
    685 
    686         if (ctx->sq_data)
    687                 io_sq_thread_unpark(ctx->sq_data);
    688 
    689         return ret;
    690 }

This email is a free service from the Smatch-CI project [smatch.sf.net].

regards,
dan carpenter


* Re: [bug report] io_uring/register: fix ring resizing with mixed/large SQEs/CQEs
  2026-04-24  5:34 [bug report] io_uring/register: fix ring resizing with mixed/large SQEs/CQEs Dan Carpenter
@ 2026-04-24 15:01 ` Jens Axboe
  0 siblings, 0 replies; 2+ messages in thread
From: Jens Axboe @ 2026-04-24 15:01 UTC (permalink / raw)
  To: Dan Carpenter; +Cc: io-uring

On 4/23/26 11:34 PM, Dan Carpenter wrote:
> Hello Jens Axboe,
> 
> Commit 45cd95763e19 ("io_uring/register: fix ring resizing with
> mixed/large SQEs/CQEs") from Apr 20, 2026 (linux-next), leads to the
> following Smatch static checker warnings:
> 
>     io_uring/register.c:613 io_register_resize_rings()
>     warn: potential integer overflow from user (local copy) 'p->sq_entries << 1'
> 
>     io_uring/register.c:643 io_register_resize_rings()
>     warn: potential integer overflow from user (local copy) 'p->cq_entries << 1'
> 
> [ ... ]
> 
>     609                 if (ctx->flags & IORING_SETUP_SQE128) {
>     610                         index <<= 1;
>     611                         sq_size <<= 1;
>     612                         src_mask = (ctx->sq_entries << 1) - 1;
> --> 613                         dst_mask = (p->sq_entries << 1) - 1;
> 
> These shifts can integer overflow.  If you picked a p->sq_entries of
> (1U << 31), then (p->sq_entries << 1) would wrap to zero and the mask
> would be 0xffffffff.  That might even be intentional, since wrapping
> to zero and subtracting one is a common idiom...

These are both fine, as SQ entries are capped at IORING_MAX_ENTRIES
(32k) and CQ entries are capped at IORING_MAX_CQ_ENTRIES (64k). Anything
larger is rejected earlier, as is 0.
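
Roughly, the setup path checks look like this (a sketch of the gist,
not the exact kernel code; IORING_MAX_ENTRIES is 32768):

    if (!entries)
            return -EINVAL;
    if (entries > IORING_MAX_ENTRIES) {
            if (!(p->flags & IORING_SETUP_CLAMP))
                    return -EINVAL;
            entries = IORING_MAX_ENTRIES;
    }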

IOW, false positives.

-- 
Jens Axboe
