public inbox for io-uring@vger.kernel.org
From: Pavel Begunkov <asml.silence@gmail.com>
To: io-uring@vger.kernel.org
Cc: asml.silence@gmail.com, David Wei <dw@davidwei.uk>
Subject: [PATCH 2/4] io_uring/zcrx: add initial infra for large pages
Date: Tue, 22 Apr 2025 15:44:42 +0100
Message-ID: <3f5949a63571e9eb3d2e4c7450d805b0ed23ff8e.1745328503.git.asml.silence@gmail.com>
In-Reply-To: <cover.1745328503.git.asml.silence@gmail.com>

Currently, the page array and the net_iovs are both 4K sized and have
the same number of elements. Allow the page array to take a different
shape, which will be needed to support huge pages. The total size must
still match, but the array can now hold fewer, larger pages / folios.
The only restriction is that the folio size must be equal to or larger
than the niov size.
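
As a worked example (not part of the patch; the 2MB folio numbers are
hypothetical, since this patch itself still sets folio_shift to
PAGE_SHIFT): with 4K niovs and 2MB folios, folio_shift is 21, so each
pages[] entry covers 1 << (21 - 12) = 512 niovs. A standalone userspace
sketch of the translation that io_zcrx_iov_page() implements below:

  #include <stdio.h>

  #define PAGE_SHIFT 12	/* 4K pages / niovs */

  int main(void)
  {
  	unsigned folio_shift = 21;	/* hypothetical 2MB folios */
  	unsigned chunk_id_offset = 0;	/* area starts on a folio boundary */
  	unsigned shift = folio_shift - PAGE_SHIFT;	/* 9 */
  	unsigned niov_idx = 515;

  	unsigned chunk_gid = niov_idx + chunk_id_offset;
  	unsigned folio_idx = chunk_gid >> shift;	/* 515 >> 9 = 1 */
  	unsigned base_chunk_gid = folio_idx << shift;	/* 512 */

  	/* the kernel helper returns pages[folio_idx] + (chunk_gid - base) */
  	printf("niov %u -> pages[%u] + %u\n",
  	       niov_idx, folio_idx, chunk_gid - base_chunk_gid);
  	return 0;
  }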

Note that there is no way yet to actually shrink the page array; that
will be added in following patches.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 io_uring/zcrx.c | 24 +++++++++++++++++++-----
 io_uring/zcrx.h |  3 +++
 2 files changed, 22 insertions(+), 5 deletions(-)

diff --git a/io_uring/zcrx.c b/io_uring/zcrx.c
index 0f9375e889c3..784c4ed6c780 100644
--- a/io_uring/zcrx.c
+++ b/io_uring/zcrx.c
@@ -38,11 +38,21 @@ static inline struct io_zcrx_area *io_zcrx_iov_to_area(const struct net_iov *nio
 	return container_of(owner, struct io_zcrx_area, nia);
 }
 
-static inline struct page *io_zcrx_iov_page(const struct net_iov *niov)
+/* shift from chunk / niov to folio size */
+static inline unsigned io_chunk_folio_shift(struct io_zcrx_area *area)
 {
-	struct io_zcrx_area *area = io_zcrx_iov_to_area(niov);
+	return area->folio_shift - PAGE_SHIFT;
+}
+
+static struct page *io_zcrx_iov_page(struct io_zcrx_area *area,
+				     const struct net_iov *niov)
+{
+	unsigned chunk_gid = net_iov_idx(niov) + area->chunk_id_offset;
+	unsigned folio_idx, base_chunk_gid;
 
-	return area->pages[net_iov_idx(niov)];
+	folio_idx = chunk_gid >> io_chunk_folio_shift(area);
+	base_chunk_gid = folio_idx << io_chunk_folio_shift(area);
+	return area->pages[folio_idx] + (chunk_gid - base_chunk_gid);
 }
 
 #define IO_DMA_ATTR (DMA_ATTR_SKIP_CPU_SYNC | DMA_ATTR_WEAK_ORDERING)
@@ -82,9 +92,11 @@ static int io_zcrx_map_area(struct io_zcrx_ifq *ifq, struct io_zcrx_area *area)
 
 	for (i = 0; i < area->nia.num_niovs; i++) {
 		struct net_iov *niov = &area->nia.niovs[i];
+		struct page *page;
 		dma_addr_t dma;
 
-		dma = dma_map_page_attrs(ifq->dev, area->pages[i], 0, PAGE_SIZE,
+		page = io_zcrx_iov_page(area, niov);
+		dma = dma_map_page_attrs(ifq->dev, page, 0, PAGE_SIZE,
 					 DMA_FROM_DEVICE, IO_DMA_ATTR);
 		if (dma_mapping_error(ifq->dev, dma))
 			break;
@@ -225,6 +237,8 @@ static int io_import_area_memory(struct io_zcrx_ifq *ifq,
 		return ret;
 	}
 	area->nr_folios = nr_pages;
+	area->folio_shift = PAGE_SHIFT;
+	area->chunk_id_offset = 0;
 	return 0;
 }
 
@@ -807,7 +821,7 @@ static ssize_t io_zcrx_copy_chunk(struct io_kiocb *req, struct io_zcrx_ifq *ifq,
 			break;
 		}
 
-		dst_page = io_zcrx_iov_page(niov);
+		dst_page = io_zcrx_iov_page(area, niov);
 		dst_addr = kmap_local_page(dst_page);
 		if (src_page)
 			src_base = kmap_local_page(src_page);
diff --git a/io_uring/zcrx.h b/io_uring/zcrx.h
index e3c7c4e647f1..dd29cfef637f 100644
--- a/io_uring/zcrx.h
+++ b/io_uring/zcrx.h
@@ -15,7 +15,10 @@ struct io_zcrx_area {
 	bool			is_mapped;
 	u16			area_id;
 	struct page		**pages;
+	/* offset into the first folio in allocation chunks */
+	unsigned long		chunk_id_offset;
 	unsigned long		nr_folios;
+	unsigned		folio_shift;
 
 	/* freelist */
 	spinlock_t		freelist_lock ____cacheline_aligned_in_smp;
-- 
2.48.1


Thread overview: 8+ messages
2025-04-22 14:44 [PATCH 0/4] preparation for zcrx with huge pages Pavel Begunkov
2025-04-22 14:44 ` [PATCH 1/4] io_uring/zcrx: add helper for importing user memory Pavel Begunkov
2025-04-22 14:44 ` Pavel Begunkov [this message]
2025-04-22 14:44 ` [PATCH 3/4] io_uring: export io_coalesce_buffer() Pavel Begunkov
2025-04-22 14:44 ` [PATCH 4/4] io_uring/zcrx: coalesce areas with huge pages Pavel Begunkov
2025-04-25 14:01 ` [PATCH 0/4] preparation for zcrx " Pavel Begunkov
2025-04-25 15:42   ` David Wei
2025-04-26  0:01     ` Pavel Begunkov
