From: Ming Lei <ming.lei@redhat.com>
To: Caleb Sander Mateos <csander@purestorage.com>
Cc: Keith Busch <kbusch@kernel.org>,
Chaitanya Kulkarni <kch@nvidia.com>, Jens Axboe <axboe@kernel.dk>,
linux-block@vger.kernel.org, io-uring@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH] io_uring/rsrc: don't use blk_rq_nr_phys_segments() as number of bvecs
Date: Wed, 12 Nov 2025 09:59:31 +0800
Message-ID: <aRPqA1XGWnY4YpIm@fedora>
In-Reply-To: <CADUfDZovn5fPh_E6GGvGkPYbW12L2z6BS4jPkpQjuEjNd=bRGA@mail.gmail.com>
On Tue, Nov 11, 2025 at 05:44:18PM -0800, Caleb Sander Mateos wrote:
> On Tue, Nov 11, 2025 at 5:01 PM Ming Lei <ming.lei@redhat.com> wrote:
> >
> > On Tue, Nov 11, 2025 at 12:15:29PM -0700, Caleb Sander Mateos wrote:
> > > io_buffer_register_bvec() currently uses blk_rq_nr_phys_segments() as
> > > the number of bvecs in the request. However, bvecs may be split into
> > > multiple segments depending on the queue limits. Thus, the number of
> > > segments may overestimate the number of bvecs. For ublk devices
> > > (currently the only users of io_buffer_register_bvec()),
> > > virt_boundary_mask, seg_boundary_mask, max_segments, and
> > > max_segment_size can all be set arbitrarily by the ublk server process.
> > > Set imu->nr_bvecs based on the number of bvecs the rq_for_each_bvec()
> > > loop actually yields. However, continue using blk_rq_nr_phys_segments()
> > > as an upper bound on the number of bvecs when allocating imu to avoid
> > > needing to iterate the bvecs a second time.
> > >
> > > Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
> > > Fixes: 27cb27b6d5ea ("io_uring: add support for kernel registered bvecs")
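A simplified, hypothetical sketch of the counting approach the description
outlines (not the literal patch; the allocation is sized by the segment
count, while ->nr_bvecs is set from the loop):

	struct req_iterator rq_iter;
	struct io_mapped_ubuf *imu;
	struct bio_vec bv;
	unsigned int nr_bvecs = 0;
	/* segments can only split bvecs, so this is an upper bound */
	unsigned int nr_segs = blk_rq_nr_phys_segments(rq);

	imu = kvmalloc(struct_size(imu, bvec, nr_segs), GFP_KERNEL);
	if (!imu)
		return -ENOMEM;

	rq_for_each_bvec(bv, rq, rq_iter)
		imu->bvec[nr_bvecs++] = bv;

	/* record what the loop actually yielded, not the segment count */
	imu->nr_bvecs = nr_bvecs;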
> >
> > Reviewed-by: Ming Lei <ming.lei@redhat.com>
> >
> > BTW, this issue may not be a problem in practice, because ->nr_bvecs is
> > only used in iov_iter_bvec(), where 'offset' and 'len' control how far
> > the iterator can reach, so the uninitialized bvecs basically won't be
> > touched.
>
> I see your point, but what about iov_iter_extract_bvec_pages()? That
> looks like it only uses i->nr_segs to bound the iteration, not
> i->count. Hopefully there aren't any other helpers relying on nr_segs.
iov_iter_extract_bvec_pages() is only called from iov_iter_extract_pages(),
in which 'maxsize' is capped by i->count.
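Roughly (paraphrasing lib/iov_iter.c; unrelated branches elided):

	ssize_t iov_iter_extract_pages(struct iov_iter *i,
				       struct page ***pages, size_t maxsize,
				       unsigned int maxpages,
				       iov_iter_extraction_t extraction_flags,
				       size_t *offset0)
	{
		/* never extract past what the iterator actually covers */
		maxsize = min_t(size_t, min_t(size_t, maxsize, i->count),
				MAX_RW_COUNT);
		if (!maxsize)
			return 0;

		if (iov_iter_is_bvec(i))
			return iov_iter_extract_bvec_pages(i, pages, maxsize,
							   maxpages,
							   extraction_flags,
							   offset0);
		/* ... kvec/xarray/user-backed cases ... */
	}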
> If you really don't think it's a problem, I'm fine deferring the patch
> to 6.19. We haven't encountered any problems caused by this bug, but
> we haven't tested with any non-default virt_boundary_mask,
> seg_boundary_mask, max_segments, or max_segment_size on the ublk
> device.
IMO it should go into v6.18: your fix not only makes the code more robust,
it is also the correct thing to do.
I am just wondering why the issue wasn't triggered, given that we have lots
of test cases (rw verify, mkfs & mount, ...).
>
> >
> > Otherwise, the issue should have been triggered somewhere.
> >
> > Also, the bvec allocation may be avoided in the case of a single-bio
> > request, which could be a future optimization.
>
> I'm not sure what you're suggesting. The bio_vec array is a flexible
> array member of io_mapped_ubuf, so unless we add another pointer
> indirection, I don't see how to reuse the bio's bi_io_vec array.
> io_mapped_ubuf is also used for user registered buffers, where this
> optimization isn't possible, so it may not be a clear win.
io_mapped_ubuf->acct_pages is one field that could be reused for the
indirect pointer; please see lo_rw_aio() for how to reuse the bvec array.
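The pattern there is roughly this (simplified from drivers/block/loop.c;
the copy loop and error handling are elided):

	struct bio *bio = rq->bio;
	struct req_iterator rq_iter;
	struct bio_vec *bvec, tmp;
	struct iov_iter iter;
	int nr_bvec = 0;

	rq_for_each_bvec(tmp, rq, rq_iter)
		nr_bvec++;

	if (rq->bio != rq->biotail) {
		/* multi-bio request: allocate an array and copy the bvecs */
		bvec = kmalloc_array(nr_bvec, sizeof(struct bio_vec),
				     GFP_NOIO);
		if (!bvec)
			return -ENOMEM;
		/* ... rq_for_each_bvec() copy into 'bvec' ... */
	} else {
		/*
		 * Single-bio request: point directly at the bio's own bvec
		 * array, offset by the current iterator position, and skip
		 * the allocation entirely.
		 */
		bvec = __bvec_iter_bvec(bio->bi_io_vec, bio->bi_iter);
	}
	/* 'rw' is ITER_SOURCE or ITER_DEST depending on data direction */
	iov_iter_bvec(&iter, rw, bvec, nr_bvec, blk_rq_bytes(rq));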
Thanks,
Ming