From: Jens Axboe <axboe@kernel.dk>
To: Pavel Begunkov <asml.silence@gmail.com>,
	Nitesh Shetty <nitheshshetty@gmail.com>
Cc: Nitesh Shetty <nj.shetty@samsung.com>,
	gost.dev@samsung.com, io-uring@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] io_uring/rsrc: send exact nr_segs for fixed buffer
Date: Wed, 16 Apr 2025 16:23:39 -0600
Message-ID: <fe9043f2-6f80-4dab-aba1-e51577ef2645@kernel.dk>
In-Reply-To: <951a5f20-2ec4-40c3-8014-69cd6f4b9f0f@gmail.com>

>>> Should we just make it saner first? Sth like these 3 completely
>>> untested commits
>>>
>>> https://github.com/isilence/linux/commits/rsrc-import-cleanup/
>>>
>>> And then it'll become
>>>
>>> nr_segs = ALIGN(offset + len, 1UL << folio_shift);
>>
>> Let's please do that, certainly an improvement. Care to send this out? I
>> can toss them at the testing. And we'd still need that last patch to
> 
> I need to test it first, perhaps tomorrow

Sounds good, I'll run it through testing here too. Would be nice to
stuff in for -rc3, it's pretty minimal and honestly makes the code much
easier to read and reason about.

>> ensure the segment count is correct. Honestly somewhat surprised that
> 
> Right, I can pick up Nitesh's patch to that.

Sounds good.

>> the only odd fallout of that is (needlessly) hitting the bio split path.
> 
> It's perfectly correct from the iter standpoint, AFAIK, length
> and nr of segments don't have to match. Though I am surprised
> it causes perf issues in the split path.

Theoretically it is, but it always makes me a bit nervous as there are
some _really_ odd iov_iter use cases out there. And passing down known
wrong segment counts is pretty wonky.

> Btw, where exactly does it stumble in there? I'd assume we don't

Because segments != 1, and then that hits the slower path.

> need to do the segment correction for kbuf as the bio splitting
> can do it (and probably does) in exactly the same way?

It doesn't strictly need to, but we should handle that case too. That'd
basically just be the loop addition I already did, something along the
lines of the below, on top, for both of them:

diff --git a/io_uring/rsrc.c b/io_uring/rsrc.c
index d8fa7158e598..767ac89c8426 100644
--- a/io_uring/rsrc.c
+++ b/io_uring/rsrc.c
@@ -1032,6 +1032,25 @@ static int validate_fixed_range(u64 buf_addr, size_t len,
 	return 0;
 }
 
+static int io_import_kbuf(int ddir, struct iov_iter *iter,
+			  struct io_mapped_ubuf *imu, size_t len, size_t offset)
+{
+	iov_iter_bvec(iter, ddir, imu->bvec, imu->nr_bvecs, len + offset);
+	iov_iter_advance(iter, offset);
+
+	if (len + offset < imu->len) {
+		const struct bio_vec *bvec = iter->bvec;
+
+		while (len > bvec->bv_len) {
+			len -= bvec->bv_len;
+			bvec++;
+		}
+		iter->nr_segs = 1 + bvec - iter->bvec;
+	}
+
+	return 0;
+}
+
 static int io_import_fixed(int ddir, struct iov_iter *iter,
 			   struct io_mapped_ubuf *imu,
 			   u64 buf_addr, size_t len)
@@ -1054,13 +1073,9 @@ static int io_import_fixed(int ddir, struct iov_iter *iter,
 	 * and advance us to the beginning.
 	 */
 	offset = buf_addr - imu->ubuf;
-	bvec = imu->bvec;
 
-	if (imu->is_kbuf) {
-		iov_iter_bvec(iter, ddir, bvec, imu->nr_bvecs, offset + len);
-		iov_iter_advance(iter, offset);
-		return 0;
-	}
+	if (imu->is_kbuf)
+		return io_import_kbuf(ddir, iter, imu, len, offset);
 
 	/*
 	 * Don't use iov_iter_advance() here, as it's really slow for
@@ -1083,7 +1098,7 @@ static int io_import_fixed(int ddir, struct iov_iter *iter,
 	 * have the size property of user registered ones, so we have
 	 * to use the slow iter advance.
 	 */
-
+	bvec = imu->bvec;
 	if (offset >= bvec->bv_len) {
 		unsigned long seg_skip;
 
@@ -1094,7 +1109,7 @@ static int io_import_fixed(int ddir, struct iov_iter *iter,
 		offset &= (1UL << imu->folio_shift) - 1;
 	}
 
-	nr_segs = imu->nr_bvecs - (bvec - imu->bvec);
+	nr_segs = ALIGN(offset + len, 1UL << imu->folio_shift) >> imu->folio_shift;
 	iov_iter_bvec(iter, ddir, bvec, nr_segs, len);
 	iter->iov_offset = offset;
 	return 0;
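
To make the two segment counts above easier to check by hand, here is a
minimal userspace sketch of the arithmetic. It is not kernel code: ALIGN_UP,
nr_segs_fixed(), nr_segs_kbuf() and the seg_len array are hypothetical
stand-ins for ALIGN(), the user-registered path, the kbuf loop and
bvec[i].bv_len. The kbuf walk also assumes the range starts at the beginning
of the first segment.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's ALIGN(); 'a' must be a power of two. */
#define ALIGN_UP(x, a) (((x) + ((a) - 1)) & ~((size_t)(a) - 1))

/*
 * User-registered buffer: every segment is (1 << folio_shift) bytes, so the
 * segment count follows directly from the in-folio offset and the length.
 */
static size_t nr_segs_fixed(size_t offset, size_t len, unsigned int folio_shift)
{
	size_t folio_size = (size_t)1 << folio_shift;

	return ALIGN_UP(offset + len, folio_size) >> folio_shift;
}

/*
 * Kernel buffer: segment sizes can differ, so walk them until the remaining
 * length fits in the current segment. 'seg_len' stands in for the bv_len
 * fields; 'nseg' bounds the walk.
 */
static size_t nr_segs_kbuf(const size_t *seg_len, size_t nseg, size_t len)
{
	size_t i = 0;

	assert(nseg > 0);
	while (len > seg_len[i]) {
		len -= seg_len[i];
		i++;
		assert(i < nseg);	/* len must fit inside the buffer */
	}
	return i + 1;	/* index of the last segment touched, plus one */
}
```

With 4K folios (folio_shift = 12), nr_segs_fixed(512, 4096, 12) is 2, since
the range straddles a folio boundary. For segments of 4096, 512 and 8192
bytes, nr_segs_kbuf() with len = 4608 is also 2: the walk stops in the second
segment, and the count is that index plus one.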

-- 
Jens Axboe
