public inbox for [email protected]
* [PATCH 00/11] support kernel allocated regions
@ 2024-11-20 23:33 Pavel Begunkov
  2024-11-20 23:33 ` [PATCH 01/11] io_uring: rename ->resize_lock Pavel Begunkov
                   ` (11 more replies)
  0 siblings, 12 replies; 13+ messages in thread
From: Pavel Begunkov @ 2024-11-20 23:33 UTC
  To: io-uring; +Cc: asml.silence

The classical way SQ/CQ rings work is the kernel doing the allocation
and the user mmap'ing it into userspace. Regions need to support this
mode as well.

The patchset should be straightforward: simple preparation patches
and cleanups up front. The main part is Patch 10, which internally
implements kernel allocations, and Patch 11, which implements the
mmap side and exposes it to reg-wait / parameter region users.

I'll be sending liburing tests in a separate set. I also tested
converting SQ/CQ to the internal region API, but that change is
left for later.
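For illustration, a sketch of the intended userspace flow, written
against this series' uapi (struct io_uring_region_desc /
io_uring_mem_region_reg, IORING_REGISTER_MEM_REGION,
IORING_MEM_REGION_REG_WAIT_ARG); io_uring_register() here stands for
the raw register syscall and error handling is trimmed:

  /* Leaving user_addr/flags zeroed (no IORING_MEM_REGION_TYPE_USER)
   * asks the kernel to allocate the region; note a wait-arg region
   * must be registered while the ring is still
   * IORING_SETUP_R_DISABLED.
   */
  struct io_uring_region_desc rd = { .size = 4096 };
  struct io_uring_mem_region_reg mr = {
          .region_uptr = (__u64)(uintptr_t)&rd,
          .flags = IORING_MEM_REGION_REG_WAIT_ARG,
  };
  void *p;

  if (io_uring_register(ring_fd, IORING_REGISTER_MEM_REGION, &mr, 1))
          return -1;
  /* the kernel fills rd.mmap_offset; map the region via the ring fd */
  p = mmap(NULL, rd.size, PROT_READ | PROT_WRITE, MAP_SHARED,
           ring_fd, rd.mmap_offset);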

Pavel Begunkov (11):
  io_uring: rename ->resize_lock
  io_uring/rsrc: export io_check_coalesce_buffer
  io_uring/memmap: add internal region flags
  io_uring/memmap: flag regions with user pages
  io_uring/memmap: account memory before pinning
  io_uring/memmap: reuse io_free_region for failure path
  io_uring/memmap: optimise single folio regions
  io_uring/memmap: helper for pinning region pages
  io_uring/memmap: add IO_REGION_F_SINGLE_REF
  io_uring/memmap: implement kernel allocated regions
  io_uring/memmap: implement mmap for regions

 include/linux/io_uring_types.h |   7 +-
 io_uring/io_uring.c            |   2 +-
 io_uring/memmap.c              | 190 ++++++++++++++++++++++++++++-----
 io_uring/memmap.h              |  12 ++-
 io_uring/register.c            |  12 +--
 io_uring/rsrc.c                |  22 ++--
 io_uring/rsrc.h                |   4 +
 7 files changed, 198 insertions(+), 51 deletions(-)

-- 
2.46.0



* [PATCH 01/11] io_uring: rename ->resize_lock
  2024-11-20 23:33 [PATCH 00/11] support kernel allocated regions Pavel Begunkov
@ 2024-11-20 23:33 ` Pavel Begunkov
  2024-11-20 23:33 ` [PATCH 02/11] io_uring/rsrc: export io_check_coalesce_buffer Pavel Begunkov
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Pavel Begunkov @ 2024-11-20 23:33 UTC
  To: io-uring; +Cc: asml.silence

->resize_lock is used for resizing rings, but it's a good idea to
reuse it in other cases as well. Rename it to mmap_lock, as it
protects against races with mmap.
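For readers unfamiliar with it, the guard(mutex)() construct used in
the mmap paths below comes from <linux/cleanup.h>: it takes the mutex
and drops it automatically when the enclosing scope ends. A minimal
sketch (hypothetical function, not part of the patch):

  static int region_ptr_check(struct io_ring_ctx *ctx)
  {
          guard(mutex)(&ctx->mmap_lock);

          if (!ctx->rings)
                  return -EFAULT;  /* mmap_lock released here */
          return 0;                /* ...and here, no unlock needed */
  }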

Signed-off-by: Pavel Begunkov <[email protected]>
---
 include/linux/io_uring_types.h | 2 +-
 io_uring/io_uring.c            | 2 +-
 io_uring/memmap.c              | 6 +++---
 io_uring/register.c            | 8 ++++----
 4 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
index aa5f5ea98076..ac7b2b6484a9 100644
--- a/include/linux/io_uring_types.h
+++ b/include/linux/io_uring_types.h
@@ -422,7 +422,7 @@ struct io_ring_ctx {
 	 * side will need to grab this lock, to prevent either side from
 	 * being run concurrently with the other.
 	 */
-	struct mutex			resize_lock;
+	struct mutex			mmap_lock;
 
 	/*
 	 * If IORING_SETUP_NO_MMAP is used, then the below holds
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index da8fd460977b..d565b1589951 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -350,7 +350,7 @@ static __cold struct io_ring_ctx *io_ring_ctx_alloc(struct io_uring_params *p)
 	INIT_WQ_LIST(&ctx->submit_state.compl_reqs);
 	INIT_HLIST_HEAD(&ctx->cancelable_uring_cmd);
 	io_napi_init(ctx);
-	mutex_init(&ctx->resize_lock);
+	mutex_init(&ctx->mmap_lock);
 
 	return ctx;
 
diff --git a/io_uring/memmap.c b/io_uring/memmap.c
index 3d71756bc598..771a57a4a16b 100644
--- a/io_uring/memmap.c
+++ b/io_uring/memmap.c
@@ -322,7 +322,7 @@ __cold int io_uring_mmap(struct file *file, struct vm_area_struct *vma)
 	unsigned int npages;
 	void *ptr;
 
-	guard(mutex)(&ctx->resize_lock);
+	guard(mutex)(&ctx->mmap_lock);
 
 	ptr = io_uring_validate_mmap_request(file, vma->vm_pgoff, sz);
 	if (IS_ERR(ptr))
@@ -358,7 +358,7 @@ unsigned long io_uring_get_unmapped_area(struct file *filp, unsigned long addr,
 	if (addr)
 		return -EINVAL;
 
-	guard(mutex)(&ctx->resize_lock);
+	guard(mutex)(&ctx->mmap_lock);
 
 	ptr = io_uring_validate_mmap_request(filp, pgoff, len);
 	if (IS_ERR(ptr))
@@ -408,7 +408,7 @@ unsigned long io_uring_get_unmapped_area(struct file *file, unsigned long addr,
 	struct io_ring_ctx *ctx = file->private_data;
 	void *ptr;
 
-	guard(mutex)(&ctx->resize_lock);
+	guard(mutex)(&ctx->mmap_lock);
 
 	ptr = io_uring_validate_mmap_request(file, pgoff, len);
 	if (IS_ERR(ptr))
diff --git a/io_uring/register.c b/io_uring/register.c
index 1e99c783abdf..ba61697d7a53 100644
--- a/io_uring/register.c
+++ b/io_uring/register.c
@@ -486,15 +486,15 @@ static int io_register_resize_rings(struct io_ring_ctx *ctx, void __user *arg)
 	}
 
 	/*
-	 * We'll do the swap. Grab the ctx->resize_lock, which will exclude
+	 * We'll do the swap. Grab the ctx->mmap_lock, which will exclude
 	 * any new mmap's on the ring fd. Clear out existing mappings to prevent
 	 * mmap from seeing them, as we'll unmap them. Any attempt to mmap
 	 * existing rings beyond this point will fail. Not that it could proceed
 	 * at this point anyway, as the io_uring mmap side needs go grab the
-	 * ctx->resize_lock as well. Likewise, hold the completion lock over the
+	 * ctx->mmap_lock as well. Likewise, hold the completion lock over the
 	 * duration of the actual swap.
 	 */
-	mutex_lock(&ctx->resize_lock);
+	mutex_lock(&ctx->mmap_lock);
 	spin_lock(&ctx->completion_lock);
 	o.rings = ctx->rings;
 	ctx->rings = NULL;
@@ -561,7 +561,7 @@ static int io_register_resize_rings(struct io_ring_ctx *ctx, void __user *arg)
 	ret = 0;
 out:
 	spin_unlock(&ctx->completion_lock);
-	mutex_unlock(&ctx->resize_lock);
+	mutex_unlock(&ctx->mmap_lock);
 	io_register_free_rings(&p, to_free);
 
 	if (ctx->sq_data)
-- 
2.46.0



* [PATCH 02/11] io_uring/rsrc: export io_check_coalesce_buffer
  2024-11-20 23:33 [PATCH 00/11] support kernel allocated regions Pavel Begunkov
  2024-11-20 23:33 ` [PATCH 01/11] io_uring: rename ->resize_lock Pavel Begunkov
@ 2024-11-20 23:33 ` Pavel Begunkov
  2024-11-20 23:33 ` [PATCH 03/11] io_uring/memmap: add internal region flags Pavel Begunkov
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Pavel Begunkov @ 2024-11-20 23:33 UTC
  To: io-uring; +Cc: asml.silence

io_try_coalesce_buffer() is a useful helper for collecting information
about a set of pages, and I want to reuse it for analysing ring and
other mappings. I don't need the entire thing and am only interested
in whether the range can be coalesced into a single folio, but that's
still better than duplicating the parsing.
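As a sketch of the intended use (this is what a later patch in the
series does when deciding whether a region needs vmap'ing), the
exported helper lets the caller ask just the "is this a single folio?"
question:

  struct io_imu_folio_data ifd;

  if (io_check_coalesce_buffer(pages, nr_pages, &ifd) &&
      ifd.nr_folios == 1) {
          /* the whole range is physically contiguous */
  }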

Signed-off-by: Pavel Begunkov <[email protected]>
---
 io_uring/rsrc.c | 22 ++++++++++++----------
 io_uring/rsrc.h |  4 ++++
 2 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/io_uring/rsrc.c b/io_uring/rsrc.c
index adaae8630932..e51e5ddae728 100644
--- a/io_uring/rsrc.c
+++ b/io_uring/rsrc.c
@@ -626,11 +626,12 @@ static int io_buffer_account_pin(struct io_ring_ctx *ctx, struct page **pages,
 	return ret;
 }
 
-static bool io_do_coalesce_buffer(struct page ***pages, int *nr_pages,
-				struct io_imu_folio_data *data, int nr_folios)
+static bool io_coalesce_buffer(struct page ***pages, int *nr_pages,
+				struct io_imu_folio_data *data)
 {
 	struct page **page_array = *pages, **new_array = NULL;
 	int nr_pages_left = *nr_pages, i, j;
+	int nr_folios = data->nr_folios;
 
 	/* Store head pages only*/
 	new_array = kvmalloc_array(nr_folios, sizeof(struct page *),
@@ -667,15 +668,14 @@ static bool io_do_coalesce_buffer(struct page ***pages, int *nr_pages,
 	return true;
 }
 
-static bool io_try_coalesce_buffer(struct page ***pages, int *nr_pages,
-					 struct io_imu_folio_data *data)
+bool io_check_coalesce_buffer(struct page **page_array, int nr_pages,
+			      struct io_imu_folio_data *data)
 {
-	struct page **page_array = *pages;
 	struct folio *folio = page_folio(page_array[0]);
 	unsigned int count = 1, nr_folios = 1;
 	int i;
 
-	if (*nr_pages <= 1)
+	if (nr_pages <= 1)
 		return false;
 
 	data->nr_pages_mid = folio_nr_pages(folio);
@@ -687,7 +687,7 @@ static bool io_try_coalesce_buffer(struct page ***pages, int *nr_pages,
 	 * Check if pages are contiguous inside a folio, and all folios have
 	 * the same page count except for the head and tail.
 	 */
-	for (i = 1; i < *nr_pages; i++) {
+	for (i = 1; i < nr_pages; i++) {
 		if (page_folio(page_array[i]) == folio &&
 			page_array[i] == page_array[i-1] + 1) {
 			count++;
@@ -715,7 +715,8 @@ static bool io_try_coalesce_buffer(struct page ***pages, int *nr_pages,
 	if (nr_folios == 1)
 		data->nr_pages_head = count;
 
-	return io_do_coalesce_buffer(pages, nr_pages, data, nr_folios);
+	data->nr_folios = nr_folios;
+	return true;
 }
 
 static struct io_rsrc_node *io_sqe_buffer_register(struct io_ring_ctx *ctx,
@@ -729,7 +730,7 @@ static struct io_rsrc_node *io_sqe_buffer_register(struct io_ring_ctx *ctx,
 	size_t size;
 	int ret, nr_pages, i;
 	struct io_imu_folio_data data;
-	bool coalesced;
+	bool coalesced = false;
 
 	if (!iov->iov_base)
 		return NULL;
@@ -749,7 +750,8 @@ static struct io_rsrc_node *io_sqe_buffer_register(struct io_ring_ctx *ctx,
 	}
 
 	/* If it's huge page(s), try to coalesce them into fewer bvec entries */
-	coalesced = io_try_coalesce_buffer(&pages, &nr_pages, &data);
+	if (io_check_coalesce_buffer(pages, nr_pages, &data))
+		coalesced = io_coalesce_buffer(&pages, &nr_pages, &data);
 
 	imu = kvmalloc(struct_size(imu, bvec, nr_pages), GFP_KERNEL);
 	if (!imu)
diff --git a/io_uring/rsrc.h b/io_uring/rsrc.h
index 7a4668deaa1a..c8b093584461 100644
--- a/io_uring/rsrc.h
+++ b/io_uring/rsrc.h
@@ -40,6 +40,7 @@ struct io_imu_folio_data {
 	/* For non-head/tail folios, has to be fully included */
 	unsigned int	nr_pages_mid;
 	unsigned int	folio_shift;
+	unsigned int	nr_folios;
 };
 
 struct io_rsrc_node *io_rsrc_node_alloc(struct io_ring_ctx *ctx, int type);
@@ -66,6 +67,9 @@ int io_register_rsrc_update(struct io_ring_ctx *ctx, void __user *arg,
 int io_register_rsrc(struct io_ring_ctx *ctx, void __user *arg,
 			unsigned int size, unsigned int type);
 
+bool io_check_coalesce_buffer(struct page **page_array, int nr_pages,
+			      struct io_imu_folio_data *data);
+
 static inline struct io_rsrc_node *io_rsrc_node_lookup(struct io_rsrc_data *data,
 						       int index)
 {
-- 
2.46.0



* [PATCH 03/11] io_uring/memmap: add internal region flags
  2024-11-20 23:33 [PATCH 00/11] support kernel allocated regions Pavel Begunkov
  2024-11-20 23:33 ` [PATCH 01/11] io_uring: rename ->resize_lock Pavel Begunkov
  2024-11-20 23:33 ` [PATCH 02/11] io_uring/rsrc: export io_check_coalesce_buffer Pavel Begunkov
@ 2024-11-20 23:33 ` Pavel Begunkov
  2024-11-20 23:33 ` [PATCH 04/11] io_uring/memmap: flag regions with user pages Pavel Begunkov
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Pavel Begunkov @ 2024-11-20 23:33 UTC
  To: io-uring; +Cc: asml.silence

Add internal flags to struct io_mapped_region; they will help to add
more functionality without bloating the structure. Use the first flag
to mark whether the pointer needs to be vunmap'ed.

Signed-off-by: Pavel Begunkov <[email protected]>
---
 include/linux/io_uring_types.h |  5 +++--
 io_uring/memmap.c              | 13 +++++++++----
 io_uring/memmap.h              |  2 +-
 3 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
index ac7b2b6484a9..31b420b8ecd9 100644
--- a/include/linux/io_uring_types.h
+++ b/include/linux/io_uring_types.h
@@ -77,8 +77,9 @@ struct io_hash_table {
 
 struct io_mapped_region {
 	struct page		**pages;
-	void			*vmap_ptr;
-	size_t			nr_pages;
+	void			*ptr;
+	unsigned		nr_pages;
+	unsigned		flags;
 };
 
 /*
diff --git a/io_uring/memmap.c b/io_uring/memmap.c
index 771a57a4a16b..21353ea09b39 100644
--- a/io_uring/memmap.c
+++ b/io_uring/memmap.c
@@ -195,14 +195,18 @@ void *__io_uaddr_map(struct page ***pages, unsigned short *npages,
 	return ERR_PTR(-ENOMEM);
 }
 
+enum {
+	IO_REGION_F_VMAP			= 1,
+};
+
 void io_free_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr)
 {
 	if (mr->pages) {
 		unpin_user_pages(mr->pages, mr->nr_pages);
 		kvfree(mr->pages);
 	}
-	if (mr->vmap_ptr)
-		vunmap(mr->vmap_ptr);
+	if ((mr->flags & IO_REGION_F_VMAP) && mr->ptr)
+		vunmap(mr->ptr);
 	if (mr->nr_pages && ctx->user)
 		__io_unaccount_mem(ctx->user, mr->nr_pages);
 
@@ -218,7 +222,7 @@ int io_create_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr,
 	void *vptr;
 	u64 end;
 
-	if (WARN_ON_ONCE(mr->pages || mr->vmap_ptr || mr->nr_pages))
+	if (WARN_ON_ONCE(mr->pages || mr->ptr || mr->nr_pages))
 		return -EFAULT;
 	if (memchr_inv(&reg->__resv, 0, sizeof(reg->__resv)))
 		return -EINVAL;
@@ -253,8 +257,9 @@ int io_create_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr,
 	}
 
 	mr->pages = pages;
-	mr->vmap_ptr = vptr;
+	mr->ptr = vptr;
 	mr->nr_pages = nr_pages;
+	mr->flags |= IO_REGION_F_VMAP;
 	return 0;
 out_free:
 	if (pages_accounted)
diff --git a/io_uring/memmap.h b/io_uring/memmap.h
index f361a635b6c7..2096a8427277 100644
--- a/io_uring/memmap.h
+++ b/io_uring/memmap.h
@@ -28,7 +28,7 @@ int io_create_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr,
 
 static inline void *io_region_get_ptr(struct io_mapped_region *mr)
 {
-	return mr->vmap_ptr;
+	return mr->ptr;
 }
 
 static inline bool io_region_is_set(struct io_mapped_region *mr)
-- 
2.46.0



* [PATCH 04/11] io_uring/memmap: flag regions with user pages
  2024-11-20 23:33 [PATCH 00/11] support kernel allocated regions Pavel Begunkov
                   ` (2 preceding siblings ...)
  2024-11-20 23:33 ` [PATCH 03/11] io_uring/memmap: add internal region flags Pavel Begunkov
@ 2024-11-20 23:33 ` Pavel Begunkov
  2024-11-20 23:33 ` [PATCH 05/11] io_uring/memmap: account memory before pinning Pavel Begunkov
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Pavel Begunkov @ 2024-11-20 23:33 UTC
  To: io-uring; +Cc: asml.silence

In preparation for kernel allocated regions, add a flag telling
whether the region contains user pinned pages or not.
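The distinction matters because the two reference types aren't
interchangeable: FOLL_PIN references taken by pin_user_pages() must be
dropped with unpin_user_pages(), while ordinary page references, which
kernel allocated pages will carry, are dropped with release_pages() /
put_page(). The freeing side thus becomes:

  if (mr->flags & IO_REGION_F_USER_PINNED)
          unpin_user_pages(mr->pages, mr->nr_pages); /* pin pairing */
  else
          release_pages(mr->pages, mr->nr_pages);    /* plain refs */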

Signed-off-by: Pavel Begunkov <[email protected]>
---
 io_uring/memmap.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/io_uring/memmap.c b/io_uring/memmap.c
index 21353ea09b39..f76bee5a861a 100644
--- a/io_uring/memmap.c
+++ b/io_uring/memmap.c
@@ -197,12 +197,16 @@ void *__io_uaddr_map(struct page ***pages, unsigned short *npages,
 
 enum {
 	IO_REGION_F_VMAP			= 1,
+	IO_REGION_F_USER_PINNED			= 2,
 };
 
 void io_free_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr)
 {
 	if (mr->pages) {
-		unpin_user_pages(mr->pages, mr->nr_pages);
+		if (mr->flags & IO_REGION_F_USER_PINNED)
+			unpin_user_pages(mr->pages, mr->nr_pages);
+		else
+			release_pages(mr->pages, mr->nr_pages);
 		kvfree(mr->pages);
 	}
 	if ((mr->flags & IO_REGION_F_VMAP) && mr->ptr)
@@ -259,7 +263,7 @@ int io_create_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr,
 	mr->pages = pages;
 	mr->ptr = vptr;
 	mr->nr_pages = nr_pages;
-	mr->flags |= IO_REGION_F_VMAP;
+	mr->flags |= IO_REGION_F_VMAP | IO_REGION_F_USER_PINNED;
 	return 0;
 out_free:
 	if (pages_accounted)
-- 
2.46.0



* [PATCH 05/11] io_uring/memmap: account memory before pinning
  2024-11-20 23:33 [PATCH 00/11] support kernel allocated regions Pavel Begunkov
                   ` (3 preceding siblings ...)
  2024-11-20 23:33 ` [PATCH 04/11] io_uring/memmap: flag regions with user pages Pavel Begunkov
@ 2024-11-20 23:33 ` Pavel Begunkov
  2024-11-20 23:33 ` [PATCH 06/11] io_uring/memmap: reuse io_free_region for failure path Pavel Begunkov
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Pavel Begunkov @ 2024-11-20 23:33 UTC
  To: io-uring; +Cc: asml.silence

Move memory accounting before page pinning. We shouldn't even try to
pin pages if it's not allowed, and accounting is also relatively
inexpensive. It also gives a better code structure, as we do the
generic accounting first and can then branch for different mapping
types.
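For context, __io_account_mem() charges the pages against the user's
RLIMIT_MEMLOCK and fails if the limit would be exceeded, so doing it
first rejects an over-limit request before any get_user_pages() walk.
The pairing, roughly:

  ret = __io_account_mem(ctx->user, nr_pages); /* may fail -ENOMEM */
  if (ret)
          return ret;
  /* ... pin and map ... */
  __io_unaccount_mem(ctx->user, nr_pages);     /* on error/teardown */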

Signed-off-by: Pavel Begunkov <[email protected]>
---
 io_uring/memmap.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/io_uring/memmap.c b/io_uring/memmap.c
index f76bee5a861a..cc5f6f69ee6c 100644
--- a/io_uring/memmap.c
+++ b/io_uring/memmap.c
@@ -243,17 +243,21 @@ int io_create_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr,
 	if (check_add_overflow(reg->user_addr, reg->size, &end))
 		return -EOVERFLOW;
 
-	pages = io_pin_pages(reg->user_addr, reg->size, &nr_pages);
-	if (IS_ERR(pages))
-		return PTR_ERR(pages);
-
+	nr_pages = reg->size >> PAGE_SHIFT;
 	if (ctx->user) {
 		ret = __io_account_mem(ctx->user, nr_pages);
 		if (ret)
-			goto out_free;
+			return ret;
 		pages_accounted = nr_pages;
 	}
 
+	pages = io_pin_pages(reg->user_addr, reg->size, &nr_pages);
+	if (IS_ERR(pages)) {
+		ret = PTR_ERR(pages);
+		pages = NULL;
+		goto out_free;
+	}
+
 	vptr = vmap(pages, nr_pages, VM_MAP, PAGE_KERNEL);
 	if (!vptr) {
 		ret = -ENOMEM;
@@ -268,7 +272,8 @@ int io_create_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr,
 out_free:
 	if (pages_accounted)
 		__io_unaccount_mem(ctx->user, pages_accounted);
-	io_pages_free(&pages, nr_pages);
+	if (pages)
+		io_pages_free(&pages, nr_pages);
 	return ret;
 }
 
-- 
2.46.0



* [PATCH 06/11] io_uring/memmap: reuse io_free_region for failure path
  2024-11-20 23:33 [PATCH 00/11] support kernel allocated regions Pavel Begunkov
                   ` (4 preceding siblings ...)
  2024-11-20 23:33 ` [PATCH 05/11] io_uring/memmap: account memory before pinning Pavel Begunkov
@ 2024-11-20 23:33 ` Pavel Begunkov
  2024-11-20 23:33 ` [PATCH 07/11] io_uring/memmap: optimise single folio regions Pavel Begunkov
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Pavel Begunkov @ 2024-11-20 23:33 UTC
  To: io-uring; +Cc: asml.silence

Regions are going to become more complex, with allocation options and
optimisations. I want to split initialisation into steps, and for
that it needs a sane failure path. Reuse io_free_region(): it's smart
enough to undo only what's needed and leave the structure in a
consistent state.
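The resulting pattern, in a hypothetical sketch (names invented, not
io_uring code): every step records its progress in the object itself,
so a single idempotent teardown routine can unwind whatever prefix of
the steps completed:

  static int obj_init(struct ctx *c, struct obj *o)
  {
          int ret;

          ret = step_account(c, o);   /* records what it did in *o */
          if (ret)
                  goto fail;
          ret = step_pin(c, o);       /* ditto */
          if (ret)
                  goto fail;
          return 0;
  fail:
          obj_free(c, o);  /* checks *o, undoes only completed steps */
          return ret;
  }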

Signed-off-by: Pavel Begunkov <[email protected]>
---
 io_uring/memmap.c | 16 +++++-----------
 1 file changed, 5 insertions(+), 11 deletions(-)

diff --git a/io_uring/memmap.c b/io_uring/memmap.c
index cc5f6f69ee6c..2b3cb3fd3fdf 100644
--- a/io_uring/memmap.c
+++ b/io_uring/memmap.c
@@ -220,7 +220,6 @@ void io_free_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr)
 int io_create_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr,
 		     struct io_uring_region_desc *reg)
 {
-	int pages_accounted = 0;
 	struct page **pages;
 	int nr_pages, ret;
 	void *vptr;
@@ -248,32 +247,27 @@ int io_create_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr,
 		ret = __io_account_mem(ctx->user, nr_pages);
 		if (ret)
 			return ret;
-		pages_accounted = nr_pages;
 	}
+	mr->nr_pages = nr_pages;
 
 	pages = io_pin_pages(reg->user_addr, reg->size, &nr_pages);
 	if (IS_ERR(pages)) {
 		ret = PTR_ERR(pages);
-		pages = NULL;
 		goto out_free;
 	}
+	mr->pages = pages;
+	mr->flags |= IO_REGION_F_USER_PINNED;
 
 	vptr = vmap(pages, nr_pages, VM_MAP, PAGE_KERNEL);
 	if (!vptr) {
 		ret = -ENOMEM;
 		goto out_free;
 	}
-
-	mr->pages = pages;
 	mr->ptr = vptr;
-	mr->nr_pages = nr_pages;
-	mr->flags |= IO_REGION_F_VMAP | IO_REGION_F_USER_PINNED;
+	mr->flags |= IO_REGION_F_VMAP;
 	return 0;
 out_free:
-	if (pages_accounted)
-		__io_unaccount_mem(ctx->user, pages_accounted);
-	if (pages)
-		io_pages_free(&pages, nr_pages);
+	io_free_region(ctx, mr);
 	return ret;
 }
 
-- 
2.46.0



* [PATCH 07/11] io_uring/memmap: optimise single folio regions
  2024-11-20 23:33 [PATCH 00/11] support kernel allocated regions Pavel Begunkov
                   ` (5 preceding siblings ...)
  2024-11-20 23:33 ` [PATCH 06/11] io_uring/memmap: reuse io_free_region for failure path Pavel Begunkov
@ 2024-11-20 23:33 ` Pavel Begunkov
  2024-11-20 23:33 ` [PATCH 08/11] io_uring/memmap: helper for pinning region pages Pavel Begunkov
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Pavel Begunkov @ 2024-11-20 23:33 UTC
  To: io-uring; +Cc: asml.silence

We don't need to vmap if the memory is already physically contiguous.
There are two important cases this covers: PAGE_SIZE regions and huge
pages. Use io_check_coalesce_buffer() to get the number of contiguous
folios.
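The reason the vmap can be skipped: the pages of a single folio are
physically contiguous, so for lowmem pages the kernel's linear mapping
already covers the whole range and page_address() of the head page is
enough. Sketch of the fast path (e.g. a 2MB region backed by one huge
page gives nr_pages == 512 but ifd.nr_folios == 1):

  if (io_check_coalesce_buffer(mr->pages, mr->nr_pages, &ifd) &&
      ifd.nr_folios == 1) {
          mr->ptr = page_address(mr->pages[0]); /* direct map */
          return 0;                             /* no vmap/vunmap */
  }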

Signed-off-by: Pavel Begunkov <[email protected]>
---
 io_uring/memmap.c | 29 ++++++++++++++++++++++-------
 1 file changed, 22 insertions(+), 7 deletions(-)

diff --git a/io_uring/memmap.c b/io_uring/memmap.c
index 2b3cb3fd3fdf..32d2a39aff02 100644
--- a/io_uring/memmap.c
+++ b/io_uring/memmap.c
@@ -217,12 +217,31 @@ void io_free_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr)
 	memset(mr, 0, sizeof(*mr));
 }
 
+static int io_region_init_ptr(struct io_mapped_region *mr)
+{
+	struct io_imu_folio_data ifd;
+	void *ptr;
+
+	if (io_check_coalesce_buffer(mr->pages, mr->nr_pages, &ifd)) {
+		if (ifd.nr_folios == 1) {
+			mr->ptr = page_address(mr->pages[0]);
+			return 0;
+		}
+	}
+	ptr = vmap(mr->pages, mr->nr_pages, VM_MAP, PAGE_KERNEL);
+	if (!ptr)
+		return -ENOMEM;
+
+	mr->ptr = ptr;
+	mr->flags |= IO_REGION_F_VMAP;
+	return 0;
+}
+
 int io_create_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr,
 		     struct io_uring_region_desc *reg)
 {
 	struct page **pages;
 	int nr_pages, ret;
-	void *vptr;
 	u64 end;
 
 	if (WARN_ON_ONCE(mr->pages || mr->ptr || mr->nr_pages))
@@ -258,13 +277,9 @@ int io_create_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr,
 	mr->pages = pages;
 	mr->flags |= IO_REGION_F_USER_PINNED;
 
-	vptr = vmap(pages, nr_pages, VM_MAP, PAGE_KERNEL);
-	if (!vptr) {
-		ret = -ENOMEM;
+	ret = io_region_init_ptr(mr);
+	if (ret)
 		goto out_free;
-	}
-	mr->ptr = vptr;
-	mr->flags |= IO_REGION_F_VMAP;
 	return 0;
 out_free:
 	io_free_region(ctx, mr);
-- 
2.46.0



* [PATCH 08/11] io_uring/memmap: helper for pinning region pages
  2024-11-20 23:33 [PATCH 00/11] support kernel allocated regions Pavel Begunkov
                   ` (6 preceding siblings ...)
  2024-11-20 23:33 ` [PATCH 07/11] io_uring/memmap: optimise single folio regions Pavel Begunkov
@ 2024-11-20 23:33 ` Pavel Begunkov
  2024-11-20 23:33 ` [PATCH 09/11] io_uring/memmap: add IO_REGION_F_SINGLE_REF Pavel Begunkov
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Pavel Begunkov @ 2024-11-20 23:33 UTC
  To: io-uring; +Cc: asml.silence

In preparation for adding kernel allocated regions, extract a new
helper that pins user pages.

Signed-off-by: Pavel Begunkov <[email protected]>
---
 io_uring/memmap.c | 29 +++++++++++++++++++++--------
 1 file changed, 21 insertions(+), 8 deletions(-)

diff --git a/io_uring/memmap.c b/io_uring/memmap.c
index 32d2a39aff02..15fefbed77ec 100644
--- a/io_uring/memmap.c
+++ b/io_uring/memmap.c
@@ -237,10 +237,28 @@ static int io_region_init_ptr(struct io_mapped_region *mr)
 	return 0;
 }
 
+static int io_region_pin_pages(struct io_ring_ctx *ctx,
+				struct io_mapped_region *mr,
+				struct io_uring_region_desc *reg)
+{
+	unsigned long size = mr->nr_pages << PAGE_SHIFT;
+	struct page **pages;
+	int nr_pages;
+
+	pages = io_pin_pages(reg->user_addr, size, &nr_pages);
+	if (IS_ERR(pages))
+		return PTR_ERR(pages);
+	if (WARN_ON_ONCE(nr_pages != mr->nr_pages))
+		return -EFAULT;
+
+	mr->pages = pages;
+	mr->flags |= IO_REGION_F_USER_PINNED;
+	return 0;
+}
+
 int io_create_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr,
 		     struct io_uring_region_desc *reg)
 {
-	struct page **pages;
 	int nr_pages, ret;
 	u64 end;
 
@@ -269,14 +287,9 @@ int io_create_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr,
 	}
 	mr->nr_pages = nr_pages;
 
-	pages = io_pin_pages(reg->user_addr, reg->size, &nr_pages);
-	if (IS_ERR(pages)) {
-		ret = PTR_ERR(pages);
+	ret = io_region_pin_pages(ctx, mr, reg);
+	if (ret)
 		goto out_free;
-	}
-	mr->pages = pages;
-	mr->flags |= IO_REGION_F_USER_PINNED;
-
 	ret = io_region_init_ptr(mr);
 	if (ret)
 		goto out_free;
-- 
2.46.0



* [PATCH 09/11] io_uring/memmap: add IO_REGION_F_SINGLE_REF
  2024-11-20 23:33 [PATCH 00/11] support kernel allocated regions Pavel Begunkov
                   ` (7 preceding siblings ...)
  2024-11-20 23:33 ` [PATCH 08/11] io_uring/memmap: helper for pinning region pages Pavel Begunkov
@ 2024-11-20 23:33 ` Pavel Begunkov
  2024-11-20 23:33 ` [PATCH 10/11] io_uring/memmap: implement kernel allocated regions Pavel Begunkov
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Pavel Begunkov @ 2024-11-20 23:33 UTC
  To: io-uring; +Cc: asml.silence

Kernel allocated compound pages will have just one reference for the
entire page array; add a flag telling io_free_region() about that.
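Background for the flag: a compound (high-order, __GFP_COMP)
allocation is refcounted on its head page only, so a page array whose
entries all point into one compound page effectively holds a single
reference. A hypothetical sketch of how such an array comes about and
goes away:

  struct page *page = alloc_pages(gfp | __GFP_COMP, get_order(size));

  for (i = 0; i < nr_pages; i++)
          pages[i] = page + i;  /* entries alias one compound page */
  /* ... */
  release_pages(pages, 1);      /* one ref drops the whole allocation */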

Signed-off-by: Pavel Begunkov <[email protected]>
---
 io_uring/memmap.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/io_uring/memmap.c b/io_uring/memmap.c
index 15fefbed77ec..cdd620bdd3ee 100644
--- a/io_uring/memmap.c
+++ b/io_uring/memmap.c
@@ -198,15 +198,22 @@ void *__io_uaddr_map(struct page ***pages, unsigned short *npages,
 enum {
 	IO_REGION_F_VMAP			= 1,
 	IO_REGION_F_USER_PINNED			= 2,
+	IO_REGION_F_SINGLE_REF			= 4,
 };
 
 void io_free_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr)
 {
 	if (mr->pages) {
+		long nr_pages = mr->nr_pages;
+
+		if (mr->flags & IO_REGION_F_SINGLE_REF)
+			nr_pages = 1;
+
 		if (mr->flags & IO_REGION_F_USER_PINNED)
-			unpin_user_pages(mr->pages, mr->nr_pages);
+			unpin_user_pages(mr->pages, nr_pages);
 		else
-			release_pages(mr->pages, mr->nr_pages);
+			release_pages(mr->pages, nr_pages);
+
 		kvfree(mr->pages);
 	}
 	if ((mr->flags & IO_REGION_F_VMAP) && mr->ptr)
-- 
2.46.0



* [PATCH 10/11] io_uring/memmap: implement kernel allocated regions
  2024-11-20 23:33 [PATCH 00/11] support kernel allocated regions Pavel Begunkov
                   ` (8 preceding siblings ...)
  2024-11-20 23:33 ` [PATCH 09/11] io_uring/memmap: add IO_REGION_F_SINGLE_REF Pavel Begunkov
@ 2024-11-20 23:33 ` Pavel Begunkov
  2024-11-20 23:33 ` [PATCH 11/11] io_uring/memmap: implement mmap for regions Pavel Begunkov
  2024-11-21  1:28 ` [PATCH 00/11] support kernel allocated regions Jens Axboe
  11 siblings, 0 replies; 13+ messages in thread
From: Pavel Begunkov @ 2024-11-20 23:33 UTC
  To: io-uring; +Cc: asml.silence

Allow the kernel to allocate memory for a region. That's the classical
way SQ/CQ are allocated. It's not yet useful to user space, as there
is no way to mmap it, which is why it's explicitly disabled in
io_register_mem_region().
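On the gfp mask used: GFP_KERNEL_ACCOUNT charges the allocation to the
caller's memory cgroup, __GFP_ZERO ensures no stale kernel data can
leak into a region user space may eventually mmap, and __GFP_NOWARN
suppresses failure splats for the contiguous attempt, which has a
graceful fallback. The strategy in outline:

  /* try one physically contiguous compound allocation first... */
  p = io_mem_alloc_compound(pages, mr->nr_pages, size, gfp);
  if (!IS_ERR(p))
          return 0;  /* single folio, IO_REGION_F_SINGLE_REF */
  /* ...otherwise fall back to nr_pages independent order-0 pages */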

Signed-off-by: Pavel Begunkov <[email protected]>
---
 io_uring/memmap.c   | 44 +++++++++++++++++++++++++++++++++++++++++---
 io_uring/register.c |  2 ++
 2 files changed, 43 insertions(+), 3 deletions(-)

diff --git a/io_uring/memmap.c b/io_uring/memmap.c
index cdd620bdd3ee..8598770bc385 100644
--- a/io_uring/memmap.c
+++ b/io_uring/memmap.c
@@ -263,6 +263,39 @@ static int io_region_pin_pages(struct io_ring_ctx *ctx,
 	return 0;
 }
 
+static int io_region_allocate_pages(struct io_ring_ctx *ctx,
+				    struct io_mapped_region *mr,
+				    struct io_uring_region_desc *reg)
+{
+	gfp_t gfp = GFP_KERNEL_ACCOUNT | __GFP_ZERO | __GFP_NOWARN;
+	unsigned long size = mr->nr_pages << PAGE_SHIFT;
+	unsigned long nr_allocated;
+	struct page **pages;
+	void *p;
+
+	pages = kvmalloc_array(mr->nr_pages, sizeof(*pages), gfp);
+	if (!pages)
+		return -ENOMEM;
+
+	p = io_mem_alloc_compound(pages, mr->nr_pages, size, gfp);
+	if (!IS_ERR(p)) {
+		mr->flags |= IO_REGION_F_SINGLE_REF;
+		mr->pages = pages;
+		return 0;
+	}
+
+	nr_allocated = alloc_pages_bulk_noprof(gfp, numa_node_id(), NULL,
+					       mr->nr_pages, NULL, pages);
+	if (nr_allocated != mr->nr_pages) {
+		if (nr_allocated)
+			release_pages(pages, nr_allocated);
+		kvfree(pages);
+		return -ENOMEM;
+	}
+	mr->pages = pages;
+	return 0;
+}
+
 int io_create_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr,
 		     struct io_uring_region_desc *reg)
 {
@@ -273,9 +306,10 @@ int io_create_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr,
 		return -EFAULT;
 	if (memchr_inv(&reg->__resv, 0, sizeof(reg->__resv)))
 		return -EINVAL;
-	if (reg->flags != IORING_MEM_REGION_TYPE_USER)
+	if (reg->flags & ~IORING_MEM_REGION_TYPE_USER)
 		return -EINVAL;
-	if (!reg->user_addr)
+	/* user_addr should be set IFF it's a user memory backed region */
+	if ((reg->flags & IORING_MEM_REGION_TYPE_USER) != !!reg->user_addr)
 		return -EFAULT;
 	if (!reg->size || reg->mmap_offset || reg->id)
 		return -EINVAL;
@@ -294,9 +328,13 @@ int io_create_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr,
 	}
 	mr->nr_pages = nr_pages;
 
-	ret = io_region_pin_pages(ctx, mr, reg);
+	if (reg->flags & IORING_MEM_REGION_TYPE_USER)
+		ret = io_region_pin_pages(ctx, mr, reg);
+	else
+		ret = io_region_allocate_pages(ctx, mr, reg);
 	if (ret)
 		goto out_free;
+
 	ret = io_region_init_ptr(mr);
 	if (ret)
 		goto out_free;
diff --git a/io_uring/register.c b/io_uring/register.c
index ba61697d7a53..f043d3f6b026 100644
--- a/io_uring/register.c
+++ b/io_uring/register.c
@@ -586,6 +586,8 @@ static int io_register_mem_region(struct io_ring_ctx *ctx, void __user *uarg)
 	if (copy_from_user(&rd, rd_uptr, sizeof(rd)))
 		return -EFAULT;
 
+	if (!(rd.flags & IORING_MEM_REGION_TYPE_USER))
+		return -EINVAL;
 	if (memchr_inv(&reg.__resv, 0, sizeof(reg.__resv)))
 		return -EINVAL;
 	if (reg.flags & ~IORING_MEM_REGION_REG_WAIT_ARG)
-- 
2.46.0



* [PATCH 11/11] io_uring/memmap: implement mmap for regions
  2024-11-20 23:33 [PATCH 00/11] support kernel allocated regions Pavel Begunkov
                   ` (9 preceding siblings ...)
  2024-11-20 23:33 ` [PATCH 10/11] io_uring/memmap: implement kernel allocated regions Pavel Begunkov
@ 2024-11-20 23:33 ` Pavel Begunkov
  2024-11-21  1:28 ` [PATCH 00/11] support kernel allocated regions Jens Axboe
  11 siblings, 0 replies; 13+ messages in thread
From: Pavel Begunkov @ 2024-11-20 23:33 UTC
  To: io-uring; +Cc: asml.silence

This patch implements mmap for the param region and enables the
kernel allocation mode. Internally it uses a fixed mmap offset;
however, the user has to use the offset returned in
struct io_uring_region_desc::mmap_offset.
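For reference, the ring fd's mmap offset namespace that the new
constant slots into (the IORING_OFF_* values are from the io_uring
uapi; IORING_MAP_OFF_PARAM_REGION stays internal):

  IORING_OFF_SQ_RING              0ULL
  IORING_OFF_CQ_RING              0x8000000ULL
  IORING_OFF_SQES                 0x10000000ULL
  IORING_MAP_OFF_PARAM_REGION     0x20000000ULL   /* this patch */
  IORING_OFF_PBUF_RING            0x80000000ULL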

Note that mmap doesn't and can't take ->uring_lock; the region / ring
lookup is protected by ->mmap_lock instead, and the mmap path directly
peeks at ctx->param_region. We can't protect io_create_region() with
the mmap_lock, as that would deadlock, which is why
io_create_region_mmap_safe() initialises the region into a temporary
variable for us and then publishes it with the lock taken. It's
intentionally decoupled from the main region helpers, and in the
future we might want to keep a list of active regions, which could
then be protected by the ->mmap_lock.
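The deadlock being avoided, sketched as lock ordering (my reading of
the above, consistent with the commit message):

  /* mmap() path:        mm->mmap_lock  ->  ctx->mmap_lock
   * create-under-lock:  ctx->mmap_lock ->  GUP pins user pages
   *                                    ->  mm->mmap_lock      (ABBA)
   * Hence: create the region unlocked, publish under ctx->mmap_lock.
   */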

Signed-off-by: Pavel Begunkov <[email protected]>
---
 io_uring/memmap.c   | 61 +++++++++++++++++++++++++++++++++++++++++----
 io_uring/memmap.h   | 10 +++++++-
 io_uring/register.c |  6 ++---
 3 files changed, 67 insertions(+), 10 deletions(-)

diff --git a/io_uring/memmap.c b/io_uring/memmap.c
index 8598770bc385..5d971ba33d5a 100644
--- a/io_uring/memmap.c
+++ b/io_uring/memmap.c
@@ -265,7 +265,8 @@ static int io_region_pin_pages(struct io_ring_ctx *ctx,
 
 static int io_region_allocate_pages(struct io_ring_ctx *ctx,
 				    struct io_mapped_region *mr,
-				    struct io_uring_region_desc *reg)
+				    struct io_uring_region_desc *reg,
+				    unsigned long mmap_offset)
 {
 	gfp_t gfp = GFP_KERNEL_ACCOUNT | __GFP_ZERO | __GFP_NOWARN;
 	unsigned long size = mr->nr_pages << PAGE_SHIFT;
@@ -280,8 +281,7 @@ static int io_region_allocate_pages(struct io_ring_ctx *ctx,
 	p = io_mem_alloc_compound(pages, mr->nr_pages, size, gfp);
 	if (!IS_ERR(p)) {
 		mr->flags |= IO_REGION_F_SINGLE_REF;
-		mr->pages = pages;
-		return 0;
+		goto done;
 	}
 
 	nr_allocated = alloc_pages_bulk_noprof(gfp, numa_node_id(), NULL,
@@ -292,12 +292,15 @@ static int io_region_allocate_pages(struct io_ring_ctx *ctx,
 		kvfree(pages);
 		return -ENOMEM;
 	}
+done:
+	reg->mmap_offset = mmap_offset;
 	mr->pages = pages;
 	return 0;
 }
 
 int io_create_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr,
-		     struct io_uring_region_desc *reg)
+		     struct io_uring_region_desc *reg,
+		     unsigned long mmap_offset)
 {
 	int nr_pages, ret;
 	u64 end;
@@ -331,7 +334,7 @@ int io_create_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr,
 	if (reg->flags & IORING_MEM_REGION_TYPE_USER)
 		ret = io_region_pin_pages(ctx, mr, reg);
 	else
-		ret = io_region_allocate_pages(ctx, mr, reg);
+		ret = io_region_allocate_pages(ctx, mr, reg, mmap_offset);
 	if (ret)
 		goto out_free;
 
@@ -344,6 +347,50 @@ int io_create_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr,
 	return ret;
 }
 
+int io_create_region_mmap_safe(struct io_ring_ctx *ctx, struct io_mapped_region *mr,
+				struct io_uring_region_desc *reg,
+				unsigned long mmap_offset)
+{
+	struct io_mapped_region tmp_mr;
+	int ret;
+
+	memcpy(&tmp_mr, mr, sizeof(tmp_mr));
+	ret = io_create_region(ctx, &tmp_mr, reg, mmap_offset);
+	if (ret)
+		return ret;
+
+	/*
+	 * Once published, mmap can find it while holding only ->mmap_lock
+	 * and not ->uring_lock.
+	 */
+	guard(mutex)(&ctx->mmap_lock);
+	memcpy(mr, &tmp_mr, sizeof(tmp_mr));
+	return 0;
+}
+
+static void *io_region_validate_mmap(struct io_ring_ctx *ctx,
+				     struct io_mapped_region *mr)
+{
+	lockdep_assert_held(&ctx->mmap_lock);
+
+	if (!io_region_is_set(mr))
+		return ERR_PTR(-EINVAL);
+	if (mr->flags & IO_REGION_F_USER_PINNED)
+		return ERR_PTR(-EINVAL);
+
+	return io_region_get_ptr(mr);
+}
+
+static int io_region_mmap(struct io_ring_ctx *ctx,
+			  struct io_mapped_region *mr,
+			  struct vm_area_struct *vma)
+{
+	unsigned long nr_pages = mr->nr_pages;
+
+	vm_flags_set(vma, VM_DONTEXPAND);
+	return vm_insert_pages(vma, vma->vm_start, mr->pages, &nr_pages);
+}
+
 static void *io_uring_validate_mmap_request(struct file *file, loff_t pgoff,
 					    size_t sz)
 {
@@ -379,6 +426,8 @@ static void *io_uring_validate_mmap_request(struct file *file, loff_t pgoff,
 		io_put_bl(ctx, bl);
 		return ptr;
 		}
+	case IORING_MAP_OFF_PARAM_REGION:
+		return io_region_validate_mmap(ctx, &ctx->param_region);
 	}
 
 	return ERR_PTR(-EINVAL);
@@ -419,6 +468,8 @@ __cold int io_uring_mmap(struct file *file, struct vm_area_struct *vma)
 						ctx->n_sqe_pages);
 	case IORING_OFF_PBUF_RING:
 		return io_pbuf_mmap(file, vma);
+	case IORING_MAP_OFF_PARAM_REGION:
+		return io_region_mmap(ctx, &ctx->param_region, vma);
 	}
 
 	return -EINVAL;
diff --git a/io_uring/memmap.h b/io_uring/memmap.h
index 2096a8427277..2402bca3d700 100644
--- a/io_uring/memmap.h
+++ b/io_uring/memmap.h
@@ -1,6 +1,8 @@
 #ifndef IO_URING_MEMMAP_H
 #define IO_URING_MEMMAP_H
 
+#define IORING_MAP_OFF_PARAM_REGION		0x20000000ULL
+
 struct page **io_pin_pages(unsigned long ubuf, unsigned long len, int *npages);
 void io_pages_free(struct page ***pages, int npages);
 int io_uring_mmap_pages(struct io_ring_ctx *ctx, struct vm_area_struct *vma,
@@ -24,7 +26,13 @@ int io_uring_mmap(struct file *file, struct vm_area_struct *vma);
 
 void io_free_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr);
 int io_create_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr,
-		     struct io_uring_region_desc *reg);
+		     struct io_uring_region_desc *reg,
+		     unsigned long mmap_offset);
+
+int io_create_region_mmap_safe(struct io_ring_ctx *ctx,
+				struct io_mapped_region *mr,
+				struct io_uring_region_desc *reg,
+				unsigned long mmap_offset);
 
 static inline void *io_region_get_ptr(struct io_mapped_region *mr)
 {
diff --git a/io_uring/register.c b/io_uring/register.c
index f043d3f6b026..5b099ec36d00 100644
--- a/io_uring/register.c
+++ b/io_uring/register.c
@@ -585,9 +585,6 @@ static int io_register_mem_region(struct io_ring_ctx *ctx, void __user *uarg)
 	rd_uptr = u64_to_user_ptr(reg.region_uptr);
 	if (copy_from_user(&rd, rd_uptr, sizeof(rd)))
 		return -EFAULT;
-
-	if (!(rd.flags & IORING_MEM_REGION_TYPE_USER))
-		return -EINVAL;
 	if (memchr_inv(&reg.__resv, 0, sizeof(reg.__resv)))
 		return -EINVAL;
 	if (reg.flags & ~IORING_MEM_REGION_REG_WAIT_ARG)
@@ -602,7 +599,8 @@ static int io_register_mem_region(struct io_ring_ctx *ctx, void __user *uarg)
 	    !(ctx->flags & IORING_SETUP_R_DISABLED))
 		return -EINVAL;
 
-	ret = io_create_region(ctx, &ctx->param_region, &rd);
+	ret = io_create_region_mmap_safe(ctx, &ctx->param_region, &rd,
+					 IORING_MAP_OFF_PARAM_REGION);
 	if (ret)
 		return ret;
 	if (copy_to_user(rd_uptr, &rd, sizeof(rd))) {
-- 
2.46.0



* Re: [PATCH 00/11] support kernel allocated regions
  2024-11-20 23:33 [PATCH 00/11] support kernel allocated regions Pavel Begunkov
                   ` (10 preceding siblings ...)
  2024-11-20 23:33 ` [PATCH 11/11] io_uring/memmap: implement mmap for regions Pavel Begunkov
@ 2024-11-21  1:28 ` Jens Axboe
  11 siblings, 0 replies; 13+ messages in thread
From: Jens Axboe @ 2024-11-21  1:28 UTC
  To: Pavel Begunkov, io-uring

On 11/20/24 4:33 PM, Pavel Begunkov wrote:
> The classical way SQ/CQ work is kernel doing the allocation
> and the user mmap'ing it into the userspace. Regions need to
> support it as well.
> 
> The patchset should be straightforward with simple preparations
> patches and cleanups. The main part is Patch 10, which internally
> implements kernel allocations, and Patch 11 that implementing the
> mmap part and exposes it to reg-wait / parameter region users.
> 
> I'll be sending liburing tests in a separate set. Additionally
> tested converting CQ/SQ to internal region api, but this change
> is left for later.

Took a quick look and I like it; agreed that regions should be
broadly usable rather than tied to pinning. I'll give this a
more thorough look in the coming days.

-- 
Jens Axboe


