* [PATCH V2 0/3] block: avoid to use bi_vcnt in bio_may_need_split()
From: Ming Lei @ 2025-12-31 3:00 UTC
To: Jens Axboe, linux-block
Cc: io-uring, Caleb Sander Mateos, Nitesh Shetty, Ming Lei
This series cleans up bio handling to use bi_iter consistently for both
cloned and non-cloned bios, removing the reliance on bi_vcnt which is
only meaningful for non-cloned bios.
Currently, bio_may_need_split() uses bi_vcnt to check if a bio has a
single segment. While this works, it's inconsistent with how cloned bios
operate - they use bi_iter for iteration, not bi_vcnt. This inconsistency
led to io_uring needing to recalculate iov_iter.nr_segs to ensure bi_vcnt
gets a correct value when copied.
This series unifies the approach:
1. Make bio_may_need_split() use bi_iter instead of bi_vcnt. This handles
both cloned and non-cloned bios in a consistent way. Also move bi_io_vec
adjacent to bi_iter in struct bio since they're commonly accessed
together.
2. Stop copying iov_iter.nr_segs to bi_vcnt in bio_iov_bvec_set(), since
cloned bios should rely on bi_iter, not bi_vcnt.
3. Remove the nr_segs recalculation in io_uring, which was only needed
to provide an accurate bi_vcnt value.
Nitesh verified no performance regression on NVMe 512-byte fio/t/io_uring
workloads.
V2:
- improve bio layout by putting bi_iter and bi_io_vec together
- improve commit log
Ming Lei (3):
block: use bvec iterator helper for bio_may_need_split()
block: don't initialize bi_vcnt for cloned bio in bio_iov_bvec_set()
io_uring: remove nr_segs recalculation in io_import_kbuf()
block/bio.c | 5 ++++-
block/blk.h | 12 +++++++++---
include/linux/blk_types.h | 4 ++--
io_uring/rsrc.c | 11 -----------
4 files changed, 15 insertions(+), 17 deletions(-)
--
2.47.0
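As a rough illustration of item 1 above, the following user-space sketch
contrasts a bi_vcnt-based check with a bi_iter-based one. The *_model
structs and both helper names are hypothetical stand-ins, not the kernel
definitions; only the field names and the comparison logic are taken from
patch 1/3.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for struct bio_vec / bvec_iter / bio. */
struct bvec_model {
	unsigned int bv_len;
	unsigned int bv_offset;
};

struct iter_model {
	unsigned int bi_size;      /* bytes left to transfer */
	unsigned int bi_idx;       /* index of the current bvec */
	unsigned int bi_bvec_done; /* bytes already consumed in that bvec */
};

struct bio_model {
	struct bvec_model *bi_io_vec;
	struct iter_model bi_iter;
	unsigned short bi_vcnt;
};

struct limits_model {
	unsigned int chunk_sectors;
	unsigned int max_fast_segment_size;
};

/* Old style: trusts bi_vcnt and always inspects the first bvec. */
static bool may_need_split_vcnt(const struct bio_model *bio,
				const struct limits_model *lim)
{
	if (lim->chunk_sectors)
		return true;
	if (bio->bi_vcnt != 1)
		return true;
	return bio->bi_io_vec->bv_len + bio->bi_io_vec->bv_offset >
		lim->max_fast_segment_size;
}

/* New style: looks only at the bvec the iterator currently points to. */
static bool may_need_split_iter(const struct bio_model *bio,
				const struct limits_model *lim)
{
	const struct bvec_model *bv;

	if (lim->chunk_sectors)
		return true;
	if (!bio->bi_io_vec)
		return true;

	bv = &bio->bi_io_vec[bio->bi_iter.bi_idx];
	if (bio->bi_iter.bi_size > bv->bv_len)
		return true;
	return bv->bv_len + bv->bv_offset > lim->max_fast_segment_size;
}
```

In this model, a clone that covers only the middle 512-byte bvec of a
three-entry array is treated as "may need split" by the bi_vcnt variant
if the count was inherited from the parent, while the bi_iter variant
sees a single small segment.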
* [PATCH V2 1/3] block: use bvec iterator helper for bio_may_need_split()
From: Ming Lei @ 2025-12-31 3:00 UTC
To: Jens Axboe, linux-block
Cc: io-uring, Caleb Sander Mateos, Nitesh Shetty, Ming Lei
bio_may_need_split() uses bi_vcnt to determine if a bio has a single
segment, but bi_vcnt is unreliable for cloned bios. Cloned bios share
the parent's bi_io_vec array but iterate over a subset via bi_iter,
so bi_vcnt may not reflect the actual segment count being iterated.
Replace the bi_vcnt check with bvec iterator access via
__bvec_iter_bvec(), comparing bi_iter.bi_size against the current
bvec's length. This correctly handles both cloned and non-cloned bios.
Move bi_io_vec into the first cache line adjacent to bi_iter. This is
a sensible layout since bi_io_vec and bi_iter are commonly accessed
together throughout the block layer - every bvec iteration requires
both fields. This displaces bi_end_io to the second cache line, which
is acceptable since bi_end_io and bi_private are always fetched
together in bio_endio() anyway.
The struct layout change requires bio_reset() to preserve and restore
bi_io_vec across the memset, since it now falls within BIO_RESET_BYTES.
Nitesh verified that this patch doesn't regress NVMe 512-byte IO perf [1].
Link: https://lore.kernel.org/linux-block/20251220081607.tvnrltcngl3cc2fh@green245.gost/ [1]
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
block/bio.c | 3 +++
block/blk.h | 12 +++++++++---
include/linux/blk_types.h | 4 ++--
3 files changed, 14 insertions(+), 5 deletions(-)
diff --git a/block/bio.c b/block/bio.c
index e726c0e280a8..0e936288034e 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -301,9 +301,12 @@ EXPORT_SYMBOL(bio_init);
  */
 void bio_reset(struct bio *bio, struct block_device *bdev, blk_opf_t opf)
 {
+	struct bio_vec *bv = bio->bi_io_vec;
+
 	bio_uninit(bio);
 	memset(bio, 0, BIO_RESET_BYTES);
 	atomic_set(&bio->__bi_remaining, 1);
+	bio->bi_io_vec = bv;
 	bio->bi_bdev = bdev;
 	if (bio->bi_bdev)
 		bio_associate_blkg(bio);
diff --git a/block/blk.h b/block/blk.h
index e4c433f62dfc..98f4dfd4ec75 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -371,12 +371,18 @@ struct bio *bio_split_zone_append(struct bio *bio,
 static inline bool bio_may_need_split(struct bio *bio,
 		const struct queue_limits *lim)
 {
+	const struct bio_vec *bv;
+
 	if (lim->chunk_sectors)
 		return true;
-	if (bio->bi_vcnt != 1)
+
+	if (!bio->bi_io_vec)
+		return true;
+
+	bv = __bvec_iter_bvec(bio->bi_io_vec, bio->bi_iter);
+	if (bio->bi_iter.bi_size > bv->bv_len)
 		return true;
-	return bio->bi_io_vec->bv_len + bio->bi_io_vec->bv_offset >
-		lim->max_fast_segment_size;
+	return bv->bv_len + bv->bv_offset > lim->max_fast_segment_size;
 }
 /**
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 5dc061d318a4..19a888a2f104 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -232,6 +232,8 @@ struct bio {
 	atomic_t __bi_remaining;
+	/* The actual vec list, preserved by bio_reset() */
+	struct bio_vec *bi_io_vec;
 	struct bvec_iter bi_iter;
 	union {
@@ -275,8 +277,6 @@ struct bio {
 	atomic_t __bi_cnt; /* pin count */
-	struct bio_vec *bi_io_vec; /* the actual vec list */
-
 	struct bio_set *bi_pool;
 };
--
2.47.0
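The save/restore around the memset in bio_reset() can be sketched in
user space as follows. This is only a model under stated assumptions:
struct reset_bio_model, RESET_BYTES_MODEL and reset_model() are
hypothetical names, and the field layout is invented to show a pointer
that sits inside the cleared prefix of a struct.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct vec_model {
	unsigned int len;
};

/* Hypothetical layout: everything before max_vecs is wiped on reset. */
struct reset_bio_model {
	unsigned int flags;
	struct vec_model *io_vec; /* lives inside the wiped prefix */
	unsigned int size;
	unsigned short max_vecs;  /* first field that survives a reset */
	void *pool;
};

/* Plays the role that BIO_RESET_BYTES plays in the patch. */
#define RESET_BYTES_MODEL offsetof(struct reset_bio_model, max_vecs)

static void reset_model(struct reset_bio_model *bio)
{
	/* Save the pointer, clear the prefix, then put the pointer back. */
	struct vec_model *saved = bio->io_vec;

	memset(bio, 0, RESET_BYTES_MODEL);
	bio->io_vec = saved;
}
```

Without the saved copy, the pointer would be zeroed together with the
other per-I/O state, which is why the layout change and the bio_reset()
change have to land in the same patch.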
* [PATCH V2 2/3] block: don't initialize bi_vcnt for cloned bio in bio_iov_bvec_set()
From: Ming Lei @ 2025-12-31 3:00 UTC
To: Jens Axboe, linux-block
Cc: io-uring, Caleb Sander Mateos, Nitesh Shetty, Ming Lei
bio_iov_bvec_set() creates a cloned bio that borrows a bvec array from
an iov_iter. For cloned bios, bi_vcnt is meaningless because iteration
is controlled entirely by bi_iter (bi_idx, bi_size, bi_bvec_done), not
by bi_vcnt. Remove the incorrect bi_vcnt assignment.
Explicitly initialize bi_iter.bi_idx to 0 to ensure iteration starts
at the first bvec. While bi_idx is typically already zero from bio
initialization, making this explicit improves clarity and correctness.
This change also avoids accessing iter->nr_segs, which is an iov_iter
implementation detail that block code should not depend on.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
block/bio.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/block/bio.c b/block/bio.c
index 0e936288034e..2359c0723b88 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1165,8 +1165,8 @@ void bio_iov_bvec_set(struct bio *bio, const struct iov_iter *iter)
 {
 	WARN_ON_ONCE(bio->bi_max_vecs);
-	bio->bi_vcnt = iter->nr_segs;
 	bio->bi_io_vec = (struct bio_vec *)iter->bvec;
+	bio->bi_iter.bi_idx = 0;
 	bio->bi_iter.bi_bvec_done = iter->iov_offset;
 	bio->bi_iter.bi_size = iov_iter_count(iter);
 	bio_set_flag(bio, BIO_CLONED);
--
2.47.0
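To make the "iteration is controlled entirely by bi_iter" point
concrete, here is a small user-space sketch. All type and function names
(seg_model, iov_iter_model, cloned_bio_model, iov_bvec_set_model,
segments_covered) are hypothetical; the three assignments mirror the
ones in the hunk above, and the walk never reads nr_segs.

```c
#include <assert.h>
#include <stddef.h>

struct seg_model {
	unsigned int bv_len;
};

/* Hypothetical stand-in for the bvec flavour of struct iov_iter. */
struct iov_iter_model {
	const struct seg_model *bvec;
	size_t nr_segs;
	size_t iov_offset; /* bytes to skip in the first segment */
	size_t count;      /* total bytes to transfer */
};

struct bi_iter_model {
	unsigned int bi_size;
	unsigned int bi_idx;
	unsigned int bi_bvec_done;
};

struct cloned_bio_model {
	const struct seg_model *bi_io_vec;
	struct bi_iter_model bi_iter;
};

/* Mirrors the assignments in the patched function; no segment count is copied. */
static void iov_bvec_set_model(struct cloned_bio_model *bio,
			       const struct iov_iter_model *iter)
{
	bio->bi_io_vec = iter->bvec;
	bio->bi_iter.bi_idx = 0;
	bio->bi_iter.bi_bvec_done = (unsigned int)iter->iov_offset;
	bio->bi_iter.bi_size = (unsigned int)iter->count;
}

/* Count the segments the iterator covers, driven only by bi_iter. */
static unsigned int segments_covered(const struct cloned_bio_model *bio)
{
	unsigned int idx = bio->bi_iter.bi_idx;
	unsigned int done = bio->bi_iter.bi_bvec_done;
	unsigned int left = bio->bi_iter.bi_size;
	unsigned int n = 0;

	while (left) {
		unsigned int avail = bio->bi_io_vec[idx].bv_len - done;
		unsigned int step = left < avail ? left : avail;

		left -= step;
		done = 0;
		idx++;
		n++;
	}
	return n;
}
```

In this model the result of segments_covered() depends only on the
offset and byte count, so the value of nr_segs in the source iterator
has no influence on how far the walk goes.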
* [PATCH V2 3/3] io_uring: remove nr_segs recalculation in io_import_kbuf()
From: Ming Lei @ 2025-12-31 3:00 UTC
To: Jens Axboe, linux-block
Cc: io-uring, Caleb Sander Mateos, Nitesh Shetty, Ming Lei
io_import_kbuf() recalculates iter->nr_segs to reflect only the bvecs
needed for the requested byte range. This was added to provide an
accurate segment count to bio_iov_bvec_set(), which copied nr_segs to
bio->bi_vcnt for use as a bio split hint.
The previous two patches eliminated this dependency:
- bio_may_need_split() now uses bi_iter instead of bi_vcnt for split
decisions
- bio_iov_bvec_set() no longer copies nr_segs to bi_vcnt
Since nr_segs is no longer used for bio split decisions, the
recalculation loop is unnecessary. The iov_iter already carries the correct
byte count to cap iteration, so an oversized nr_segs is harmless.
Link: https://lkml.org/lkml/2025/4/16/351
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
io_uring/rsrc.c | 11 -----------
1 file changed, 11 deletions(-)
diff --git a/io_uring/rsrc.c b/io_uring/rsrc.c
index 41c89f5c616d..ee6283676ba7 100644
--- a/io_uring/rsrc.c
+++ b/io_uring/rsrc.c
@@ -1055,17 +1055,6 @@ static int io_import_kbuf(int ddir, struct iov_iter *iter,
 	iov_iter_bvec(iter, ddir, imu->bvec, imu->nr_bvecs, count);
 	iov_iter_advance(iter, offset);
-
-	if (count < imu->len) {
-		const struct bio_vec *bvec = iter->bvec;
-
-		len += iter->iov_offset;
-		while (len > bvec->bv_len) {
-			len -= bvec->bv_len;
-			bvec++;
-		}
-		iter->nr_segs = 1 + bvec - iter->bvec;
-	}
 	return 0;
 }
--
2.47.0
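For reference, the arithmetic of the removed loop and the reason an
oversized segment count is harmless can be modelled in user space as
below. The names kbvec_model, trimmed_nr_segs and bytes_walked are
hypothetical; the first helper follows the deleted lines, the second is
an assumed, simplified walk that stops when the byte count is used up.

```c
#include <assert.h>
#include <stddef.h>

struct kbvec_model {
	unsigned int bv_len;
};

/* What the removed loop computed: segments needed for len bytes. */
static size_t trimmed_nr_segs(const struct kbvec_model *bvec,
			      size_t iov_offset, size_t len)
{
	const struct kbvec_model *cur = bvec;

	len += iov_offset;
	while (len > cur->bv_len) {
		len -= cur->bv_len;
		cur++;
	}
	return 1 + (size_t)(cur - bvec);
}

/* A walk capped by the byte count; nr_segs is only an upper bound. */
static size_t bytes_walked(const struct kbvec_model *bvec, size_t nr_segs,
			   size_t iov_offset, size_t count)
{
	size_t total = 0;
	size_t i;

	for (i = 0; i < nr_segs && count; i++) {
		size_t avail = bvec[i].bv_len - (i == 0 ? iov_offset : 0);
		size_t step = count < avail ? count : avail;

		total += step;
		count -= step;
	}
	return total;
}
```

With four 4096-byte entries, an offset of 1024 and a length of 4096,
the trimmed count is 2, and the capped walk transfers the same 4096
bytes whether it is handed 2 or 4 as the segment count.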
* Re: [PATCH V2 0/3] block: avoid to use bi_vcnt in bio_may_need_split()
From: Ming Lei @ 2026-01-07 4:11 UTC
To: Jens Axboe, linux-block; +Cc: io-uring, Caleb Sander Mateos, Nitesh Shetty
On Wed, Dec 31, 2025 at 11:00:54AM +0800, Ming Lei wrote:
> This series cleans up bio handling to use bi_iter consistently for both
> cloned and non-cloned bios, removing the reliance on bi_vcnt which is
> only meaningful for non-cloned bios.
>
> Currently, bio_may_need_split() uses bi_vcnt to check if a bio has a
> single segment. While this works, it's inconsistent with how cloned bios
> operate - they use bi_iter for iteration, not bi_vcnt. This inconsistency
> led to io_uring needing to recalculate iov_iter.nr_segs to ensure bi_vcnt
> gets a correct value when copied.
>
> This series unifies the approach:
>
> 1. Make bio_may_need_split() use bi_iter instead of bi_vcnt. This handles
> both cloned and non-cloned bios in a consistent way. Also move bi_io_vec
> adjacent to bi_iter in struct bio since they're commonly accessed
> together.
>
> 2. Stop copying iov_iter.nr_segs to bi_vcnt in bio_iov_bvec_set(), since
> cloned bios should rely on bi_iter, not bi_vcnt.
>
> 3. Remove the nr_segs recalculation in io_uring, which was only needed
> to provide an accurate bi_vcnt value.
>
> Nitesh verified no performance regression on NVMe 512-byte fio/t/io_uring
> workloads.
>
> V2:
> - improve bio layout by putting bi_iter and bi_io_vec together
> - improve commit log
Hello Guys,
Ping...
Thanks,
Ming
* Re: [PATCH V2 1/3] block: use bvec iterator helper for bio_may_need_split()
From: Nitesh Shetty @ 2026-01-07 10:38 UTC
To: Ming Lei; +Cc: Jens Axboe, linux-block, io-uring, Caleb Sander Mateos
On 31/12/25 11:00AM, Ming Lei wrote:
>bio_may_need_split() uses bi_vcnt to determine if a bio has a single
>segment, but bi_vcnt is unreliable for cloned bios. Cloned bios share
>the parent's bi_io_vec array but iterate over a subset via bi_iter,
>so bi_vcnt may not reflect the actual segment count being iterated.
>
>Replace the bi_vcnt check with bvec iterator access via
>__bvec_iter_bvec(), comparing bi_iter.bi_size against the current
>bvec's length. This correctly handles both cloned and non-cloned bios.
>
>Move bi_io_vec into the first cache line adjacent to bi_iter. This is
>a sensible layout since bi_io_vec and bi_iter are commonly accessed
>together throughout the block layer - every bvec iteration requires
>both fields. This displaces bi_end_io to the second cache line, which
>is acceptable since bi_end_io and bi_private are always fetched
>together in bio_endio() anyway.
>
>The struct layout change requires bio_reset() to preserve and restore
>bi_io_vec across the memset, since it now falls within BIO_RESET_BYTES.
>
>Nitesh verified that this patch doesn't regress NVMe 512-byte IO perf [1].
>
>Link: https://lore.kernel.org/linux-block/20251220081607.tvnrltcngl3cc2fh@green245.gost/ [1]
>Signed-off-by: Ming Lei <ming.lei@redhat.com>
>---
Reviewed-by: Nitesh Shetty <nj.shetty@samsung.com>
* Re: [PATCH V2 2/3] block: don't initialize bi_vcnt for cloned bio in bio_iov_bvec_set()
From: Nitesh Shetty @ 2026-01-07 10:39 UTC
To: Ming Lei; +Cc: Jens Axboe, linux-block, io-uring, Caleb Sander Mateos
On 31/12/25 11:00AM, Ming Lei wrote:
>bio_iov_bvec_set() creates a cloned bio that borrows a bvec array from
>an iov_iter. For cloned bios, bi_vcnt is meaningless because iteration
>is controlled entirely by bi_iter (bi_idx, bi_size, bi_bvec_done), not
>by bi_vcnt. Remove the incorrect bi_vcnt assignment.
>
>Explicitly initialize bi_iter.bi_idx to 0 to ensure iteration starts
>at the first bvec. While bi_idx is typically already zero from bio
>initialization, making this explicit improves clarity and correctness.
>
>This change also avoids accessing iter->nr_segs, which is an iov_iter
>implementation detail that block code should not depend on.
>
>Signed-off-by: Ming Lei <ming.lei@redhat.com>
>---
Reviewed-by: Nitesh Shetty <nj.shetty@samsung.com>
* Re: [PATCH V2 3/3] io_uring: remove nr_segs recalculation in io_import_kbuf()
From: Nitesh Shetty @ 2026-01-07 10:40 UTC
To: Ming Lei; +Cc: Jens Axboe, linux-block, io-uring, Caleb Sander Mateos
On 31/12/25 11:00AM, Ming Lei wrote:
>io_import_kbuf() recalculates iter->nr_segs to reflect only the bvecs
>needed for the requested byte range. This was added to provide an
>accurate segment count to bio_iov_bvec_set(), which copied nr_segs to
>bio->bi_vcnt for use as a bio split hint.
>
>The previous two patches eliminated this dependency:
> - bio_may_need_split() now uses bi_iter instead of bi_vcnt for split
> decisions
> - bio_iov_bvec_set() no longer copies nr_segs to bi_vcnt
>
>Since nr_segs is no longer used for bio split decisions, the
>recalculation loop is unnecessary. The iov_iter already carries the correct
>byte count to cap iteration, so an oversized nr_segs is harmless.
>
>Link: https://lkml.org/lkml/2025/4/16/351
>Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Nitesh Shetty <nj.shetty@samsung.com>
* Re: [PATCH V2 0/3] block: avoid to use bi_vcnt in bio_may_need_split()
From: Jens Axboe @ 2026-01-07 15:08 UTC
To: linux-block, Ming Lei; +Cc: io-uring, Caleb Sander Mateos, Nitesh Shetty
On Wed, 31 Dec 2025 11:00:54 +0800, Ming Lei wrote:
> This series cleans up bio handling to use bi_iter consistently for both
> cloned and non-cloned bios, removing the reliance on bi_vcnt which is
> only meaningful for non-cloned bios.
>
> Currently, bio_may_need_split() uses bi_vcnt to check if a bio has a
> single segment. While this works, it's inconsistent with how cloned bios
> operate - they use bi_iter for iteration, not bi_vcnt. This inconsistency
> led to io_uring needing to recalculate iov_iter.nr_segs to ensure bi_vcnt
> gets a correct value when copied.
>
> [...]
Applied, thanks!
[1/3] block: use bvec iterator helper for bio_may_need_split()
commit: ee623c892aa59003fca173de0041abc2ccc2c72d
[2/3] block: don't initialize bi_vcnt for cloned bio in bio_iov_bvec_set()
commit: 641864314866dff382f64cd8b52fd6bf4c4d84f6
[3/3] io_uring: remove nr_segs recalculation in io_import_kbuf()
commit: 15f506a77ad61ac3273ade9b7ef87af9bdba22ad
Best regards,
--
Jens Axboe