From: Jens Axboe <axboe@kernel.dk>
To: Pavel Begunkov <asml.silence@gmail.com>,
Nitesh Shetty <nj.shetty@samsung.com>
Cc: io-uring@vger.kernel.org
Subject: Re: [PATCH 4/4] io_uring/rsrc: send exact nr_segs for fixed buffer
Date: Thu, 17 Apr 2025 07:41:36 -0600
Message-ID: <603628d3-78ec-47a3-804a-ee6dc93639fd@kernel.dk>
In-Reply-To: <ca357dbb-cc51-487c-919e-c71d3856f915@gmail.com>
On 4/17/25 6:56 AM, Pavel Begunkov wrote:
> On 4/17/25 12:50, Nitesh Shetty wrote:
>> On 17/04/25 03:53PM, Nitesh Shetty wrote:
>>> On 17/04/25 10:34AM, Pavel Begunkov wrote:
>>>> On 4/17/25 10:32, Pavel Begunkov wrote:
>>>>> From: Nitesh Shetty <nj.shetty@samsung.com>
>>>> ...
>>>>> diff --git a/io_uring/rsrc.c b/io_uring/rsrc.c
>>>>> index 5cf854318b1d..4099b8225670 100644
>>>>> --- a/io_uring/rsrc.c
>>>>> +++ b/io_uring/rsrc.c
>>>>> @@ -1037,6 +1037,7 @@ static int io_import_fixed(int ddir, struct iov_iter *iter,
>>>>>  			   u64 buf_addr, size_t len)
>>>>>  {
>>>>>  	const struct bio_vec *bvec;
>>>>> +	size_t folio_mask;
>>>>>  	unsigned nr_segs;
>>>>>  	size_t offset;
>>>>>  	int ret;
>>>>> @@ -1067,6 +1068,7 @@ static int io_import_fixed(int ddir, struct iov_iter *iter,
>>>>>  	 * 2) all bvecs are the same in size, except potentially the
>>>>>  	 *    first and last bvec
>>>>>  	 */
>>>>> +	folio_mask = (1UL << imu->folio_shift) - 1;
>>>>>  	bvec = imu->bvec;
>>>>>  	if (offset >= bvec->bv_len) {
>>>>>  		unsigned long seg_skip;
>>>>> @@ -1075,10 +1077,10 @@ static int io_import_fixed(int ddir, struct iov_iter *iter,
>>>>>  		offset -= bvec->bv_len;
>>>>>  		seg_skip = 1 + (offset >> imu->folio_shift);
>>>>>  		bvec += seg_skip;
>>>>> -		offset &= (1UL << imu->folio_shift) - 1;
>>>>> +		offset &= folio_mask;
>>>>>  	}
>>>>> -	nr_segs = imu->nr_bvecs - (bvec - imu->bvec);
>>>>> +	nr_segs = (offset + len + folio_mask) >> imu->folio_shift;
>>>>
>>>> Nitesh, let me know if you're happy with this version.
>>>>
>>> This looks great to me, I tested this series and see the
>>> improvement in IOPS from 7.15 to 7.65M here.
>>>
>>
>> There is a corner case where this might not work.
>> It happens when the first bvec has a non-zero offset.
>> Let's say bv_offset = 256, len = 512, iov_offset = 3584 (512*7, 8th IO):
>> here we expect the IO to have 2 segments with the present codebase, but
>> this patch set produces 1 segment.
>>
>> A fix like this solves the issue:
>> + nr_segs = (offset + len + bvec->bv_offset + folio_mask) >> imu->folio_shift;
>
> Ah yes, looks like the right fix up. We can make it nicer, but
> that's for later. It'd also be great to have a test for it.
>
>
>> Note:
>> I am investigating whether this is a valid case or not, because having a
>> 512 byte IO with 256 byte alignment feels odd. So have sent one patch for
>
> Block might filter it out, but for example net/ doesn't care,
> fs as well. IIUC what you mean, either way we definitely should
> correct that.

I just tested it, and yes, it certainly blows up... I can also confirm
that the corrected nr_segs calculation does the right thing: it no longer
underestimates the segment count by 1 in that case.

I'll turn the test case into something we can add to liburing, and fold
in that change.
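
For anyone following along, the arithmetic in Nitesh's example can be
checked in userspace. The following is only a rough sketch, not the
kernel code: struct demo_bvec and the three helper names are made up
for illustration, and it assumes 4K folios (folio_shift = 12) with a
first bvec that runs from bv_offset to the end of its folio (so
bv_len = 4096 - 256 = 3840), neither of which is spelled out in the
thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct bio_vec; only the fields used here. */
struct demo_bvec {
	size_t bv_len;
	size_t bv_offset;
};

/* nr_segs as in the patch as posted: no bv_offset term. */
static unsigned segs_posted(size_t offset, size_t len, unsigned folio_shift)
{
	size_t folio_mask = ((size_t)1 << folio_shift) - 1;

	return (unsigned)((offset + len + folio_mask) >> folio_shift);
}

/* nr_segs with the suggested fix-up: bv_offset of the starting bvec added. */
static unsigned segs_fixed(size_t offset, size_t len, size_t bv_offset,
			   unsigned folio_shift)
{
	size_t folio_mask = ((size_t)1 << folio_shift) - 1;

	return (unsigned)((offset + len + bv_offset + folio_mask) >> folio_shift);
}

/*
 * Reference count: walk the bvec array and count every bvec that overlaps
 * the byte range [offset, offset + len). Assumes len > 0 and that the
 * array covers the whole range.
 */
static unsigned segs_walk(const struct demo_bvec *bvec, size_t offset,
			  size_t len)
{
	size_t end = offset + len;
	size_t pos = 0;
	unsigned n = 0;

	assert(len > 0);
	while (pos < end) {
		if (pos + bvec->bv_len > offset)
			n++;
		pos += bvec->bv_len;
		bvec++;
	}
	return n;
}
```

With a two-entry array { {3840, 256}, {4096, 0} }, offset = 3584 and
len = 512, the walk touches both bvecs, and segs_fixed() agrees with it
while segs_posted() comes out one short, which matches the report above.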
--
Jens Axboe
Thread overview: 15+ messages
2025-04-17 9:32 [PATCH 0/4] io_import_fixed cleanups and optimisation Pavel Begunkov
2025-04-17 9:32 ` [PATCH 1/4] io_uring/rsrc: don't skip offset calculation Pavel Begunkov
2025-04-17 9:32 ` [PATCH 2/4] io_uring/rsrc: separate kbuf offset adjustments Pavel Begunkov
2025-04-17 9:32 ` [PATCH 3/4] io_uring/rsrc: refactor io_import_fixed Pavel Begunkov
2025-04-17 9:32 ` [PATCH 4/4] io_uring/rsrc: send exact nr_segs for fixed buffer Pavel Begunkov
2025-04-17 9:34 ` Pavel Begunkov
[not found] ` <CGME20250417103133epcas5p32c1e004e7f8a5135c4c7e3662b087470@epcas5p3.samsung.com>
2025-04-17 10:23 ` Nitesh Shetty
2025-04-17 11:50 ` Nitesh Shetty
2025-04-17 12:56 ` Pavel Begunkov
2025-04-17 13:41 ` Jens Axboe [this message]
2025-04-17 13:57 ` Jens Axboe
2025-04-17 14:02 ` Pavel Begunkov
2025-04-17 14:05 ` Jens Axboe
2025-04-17 12:31 ` [PATCH 0/4] io_import_fixed cleanups and optimisation Jens Axboe
[not found] ` <CGME20250417142334epcas5p36df55874d21c896115d92e505f9793fd@epcas5p3.samsung.com>
2025-04-17 14:15 ` Nitesh Shetty