public inbox for io-uring@vger.kernel.org
* [PATCH v2 0/9] io_uring: add kernel-managed buffer rings
From: Joanne Koong @ 2026-02-18  2:51 UTC
  To: axboe, io-uring; +Cc: csander, bernd, hch, asml.silence

Currently, io_uring buffer rings require the application to allocate and
manage the backing buffers. This series introduces kernel-managed buffer
rings, where the kernel allocates and manages the buffers on behalf of
the application.

This is split out from the fuse over io_uring series in [1], which needs the
kernel to own and manage buffers shared between the fuse server and the
kernel.

This series is on top of commit 73cf88d775b1f55 in the for-next branch in
Jens' io-uring tree. The corresponding liburing changes are in [2] and will
be submitted after the changes in this patchset are accepted.

There was a discussion on v1 about having kernel-managed buffer rings go
through a user-provided registered memory region. The changes proposed in
this patchset add a simple, straightforward interface where the kernel
allocates the buffers and ties them to the lifecycle of the ring, which
suffices for the majority of use cases. If it later becomes useful for the
buffers to be backed by a previously registered user memory region (e.g.
for PMD optimization gains), the changes in this patchset do not preclude
adding support for that.
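
A rough sketch of the intended userspace flow (hypothetical caller; the
register call is the raw io_uring_register() syscall, which the liburing
changes in [2] will wrap):

	struct io_uring_buf_reg reg = {
		.buf_size	= 4096,	/* must be page-aligned */
		.ring_entries	= 64,	/* must be a power of 2 */
		.bgid		= 0,
		.flags		= IOU_PBUF_RING_MMAP |
				  IOU_PBUF_RING_KERNEL_MANAGED,
	};
	void *bufs;
	int ret;

	ret = io_uring_register(ring_fd, IORING_REGISTER_PBUF_RING, &reg, 1);
	if (ret)
		return ret;

	/* the mapping is a contiguous view of the kernel-allocated
	 * buffers; buffer i starts at bufs + (size_t)i * reg.buf_size */
	bufs = mmap(NULL, (size_t)reg.buf_size * reg.ring_entries,
		    PROT_READ | PROT_WRITE, MAP_SHARED, ring_fd,
		    IORING_OFF_PBUF_RING |
		    ((__u64)reg.bgid << IORING_OFF_PBUF_SHIFT));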

The link to the fuse commits that use the changes in this series is in [3].

Thanks,
Joanne

[1] https://lore.kernel.org/linux-fsdevel/20260116233044.1532965-1-joannelkoong@gmail.com/
[2] https://github.com/joannekoong/liburing/commits/pbuf_kernel_managed/
[3] https://github.com/joannekoong/linux/commits/fuse_zero_copy/

Changelog
---------
v1: https://lore.kernel.org/linux-fsdevel/20260210002852.1394504-1-joannelkoong@gmail.com/T/#t
* Incorporate Jens' feedback, including fixing a wraparound int promotion bug
* memmap: drop the per-buffer allocation and have everything go through
  io_create_region() (Pavel); add a 2MB chunking workaround for large
  allocations
* uapi: merge kmbuf into the pbuf interface/APIs as an
  IOU_PBUF_RING_KERNEL_MANAGED flag (Pavel)

Changes since the version posted in the fuse series [1]:
* add an "if (bl)" check to the recycling API (Bernd)
* check for multiplication overflow, use GFP_USER, use a pointer return
  type (Christoph)
* fix a bl->ring leak (me)

Joanne Koong (9):
  io_uring/memmap: chunk allocations in io_region_allocate_pages()
  io_uring/kbuf: add support for kernel-managed buffer rings
  io_uring/kbuf: support kernel-managed buffer rings in buffer selection
  io_uring/kbuf: add buffer ring pinning/unpinning
  io_uring/kbuf: return buffer id in buffer selection
  io_uring/kbuf: add recycling for kernel managed buffer rings
  io_uring/kbuf: add io_uring_is_kmbuf_ring()
  io_uring/kbuf: export io_ring_buffer_select()
  io_uring/cmd: set selected buffer index in __io_uring_cmd_done()

 include/linux/io_uring/cmd.h   |  53 ++++++-
 include/linux/io_uring_types.h |  10 +-
 include/uapi/linux/io_uring.h  |  14 +-
 io_uring/kbuf.c                | 248 +++++++++++++++++++++++++++++----
 io_uring/kbuf.h                |  11 +-
 io_uring/memmap.c              |  87 +++++++++---
 io_uring/uring_cmd.c           |   6 +-
 7 files changed, 373 insertions(+), 56 deletions(-)

-- 
2.47.3



* [PATCH v2 1/9] io_uring/memmap: chunk allocations in io_region_allocate_pages()
From: Joanne Koong @ 2026-02-18  2:51 UTC
  To: axboe, io-uring; +Cc: csander, bernd, hch, asml.silence

Currently, io_region_allocate_pages() tries a single compound allocation
for the entire region, and falls back to alloc_pages_bulk_node() if that
fails.

When allocating a large region, a single compound allocation may be
unrealistic, while allocating page by page is inefficient and hurts TLB
performance.

Rework io_region_allocate_pages() to allocate memory in 2MB chunks,
attempting a compound allocation for each chunk and falling back to a
bulk page allocation for that chunk if the compound allocation fails.
For example, a 5MB region is allocated as two 2MB chunks plus a final
1MB chunk.

Replace IO_REGION_F_SINGLE_REF with IO_REGION_F_COMPOUND_PAGES to
reflect that the page array may contain tail pages from multiple
compound allocations.

Currently, alloc_pages_bulk_node() fails when the GFP_KERNEL_ACCOUNT gfp
flag is set. That makes this commit a prerequisite for kernel-managed
buffer rings (which allocate large regions), at least until that issue
is fixed.

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
---
 io_uring/memmap.c | 87 ++++++++++++++++++++++++++++++++++-------------
 1 file changed, 64 insertions(+), 23 deletions(-)

diff --git a/io_uring/memmap.c b/io_uring/memmap.c
index 89f56609e50a..6e91960aa8fc 100644
--- a/io_uring/memmap.c
+++ b/io_uring/memmap.c
@@ -15,6 +15,28 @@
 #include "rsrc.h"
 #include "zcrx.h"
 
+static void release_compound_pages(struct page **pages, unsigned long nr_pages)
+{
+	struct page *page;
+	unsigned int nr, i = 0;
+
+	while (nr_pages) {
+		page = pages[i];
+
+		if (!page || WARN_ON_ONCE(page != compound_head(page)))
+			return;
+
+		nr = compound_nr(page);
+		put_page(page);
+
+		if (nr >= nr_pages)
+			return;
+
+		i += nr;
+		nr_pages -= nr;
+	}
+}
+
 static bool io_mem_alloc_compound(struct page **pages, int nr_pages,
 				  size_t size, gfp_t gfp)
 {
@@ -84,22 +106,19 @@ enum {
 	IO_REGION_F_VMAP			= 1,
 	/* memory is provided by user and pinned by the kernel */
 	IO_REGION_F_USER_PROVIDED		= 2,
-	/* only the first page in the array is ref'ed */
-	IO_REGION_F_SINGLE_REF			= 4,
+	/* memory may contain tail pages from compound allocations */
+	IO_REGION_F_COMPOUND_PAGES		= 4,
 };
 
 void io_free_region(struct user_struct *user, struct io_mapped_region *mr)
 {
 	if (mr->pages) {
-		long nr_refs = mr->nr_pages;
-
-		if (mr->flags & IO_REGION_F_SINGLE_REF)
-			nr_refs = 1;
-
 		if (mr->flags & IO_REGION_F_USER_PROVIDED)
-			unpin_user_pages(mr->pages, nr_refs);
+			unpin_user_pages(mr->pages, mr->nr_pages);
+		else if (mr->flags & IO_REGION_F_COMPOUND_PAGES)
+			release_compound_pages(mr->pages, mr->nr_pages);
 		else
-			release_pages(mr->pages, nr_refs);
+			release_pages(mr->pages, mr->nr_pages);
 
 		kvfree(mr->pages);
 	}
@@ -154,28 +173,50 @@ static int io_region_allocate_pages(struct io_mapped_region *mr,
 				    unsigned long mmap_offset)
 {
 	gfp_t gfp = GFP_KERNEL_ACCOUNT | __GFP_ZERO | __GFP_NOWARN;
-	size_t size = io_region_size(mr);
 	unsigned long nr_allocated;
-	struct page **pages;
+	struct page **pages, **cur_pages;
+	unsigned chunk_size, chunk_nr_pages;
+	unsigned int pages_left;
 
 	pages = kvmalloc_array(mr->nr_pages, sizeof(*pages), gfp);
 	if (!pages)
 		return -ENOMEM;
 
-	if (io_mem_alloc_compound(pages, mr->nr_pages, size, gfp)) {
-		mr->flags |= IO_REGION_F_SINGLE_REF;
-		goto done;
-	}
+	chunk_size = SZ_2M;
+	chunk_nr_pages = chunk_size >> PAGE_SHIFT;
+	pages_left = mr->nr_pages;
+	cur_pages = pages;
+
+	while (pages_left) {
+		unsigned int nr_pages = min(pages_left,
+					    chunk_nr_pages);
+
+		if (io_mem_alloc_compound(cur_pages, nr_pages,
+					  nr_pages << PAGE_SHIFT, gfp)) {
+			mr->flags |= IO_REGION_F_COMPOUND_PAGES;
+			cur_pages += nr_pages;
+			pages_left -= nr_pages;
+			continue;
+		}
 
-	nr_allocated = alloc_pages_bulk_node(gfp, NUMA_NO_NODE,
-					     mr->nr_pages, pages);
-	if (nr_allocated != mr->nr_pages) {
-		if (nr_allocated)
-			release_pages(pages, nr_allocated);
-		kvfree(pages);
-		return -ENOMEM;
+		nr_allocated = alloc_pages_bulk_node(gfp, NUMA_NO_NODE,
+						     nr_pages, cur_pages);
+		if (nr_allocated != nr_pages) {
+			unsigned int total =
+				(cur_pages - pages) + nr_allocated;
+
+			if (mr->flags & IO_REGION_F_COMPOUND_PAGES)
+				release_compound_pages(pages, total);
+			else
+				release_pages(pages, total);
+			kvfree(pages);
+			return -ENOMEM;
+		}
+
+		cur_pages += nr_pages;
+		pages_left -= nr_pages;
 	}
-done:
+
 	reg->mmap_offset = mmap_offset;
 	mr->pages = pages;
 	return 0;
-- 
2.47.3



* [PATCH v2 2/9] io_uring/kbuf: add support for kernel-managed buffer rings
From: Joanne Koong @ 2026-02-18  2:52 UTC
  To: axboe, io-uring; +Cc: csander, bernd, hch, asml.silence

Add support for kernel-managed buffer rings, which allow the kernel to
allocate and manage the backing buffers for a buffer ring, rather than
requiring the application to provide and manage them.

Internally, the IOBL_KERNEL_MANAGED flag marks buffer lists as
kernel-managed for appropriate handling in the I/O path.

At the uapi level, kernel-managed buffer rings are created through the
pbuf interface with the IOU_PBUF_RING_KERNEL_MANAGED flag set. The
io_uring_buf_reg struct is modified to allow taking in a buf_size
instead of a ring_addr. To create a kernel-managed buffer ring, the
caller must set the IOU_PBUF_RING_MMAP flag as well to indicate that the
kernel will allocate the memory for the ring. When the caller mmaps the
ring, the returned virtual mapping is a contiguous mapping of the buffer
memory.
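
As a sketch (hypothetical caller), registration then looks like the
following, with buf_size occupying the union slot that ring_addr fills
for application-managed rings:

	struct io_uring_buf_reg reg = {
		.buf_size	= 8192,	/* page-aligned, replaces ring_addr */
		.ring_entries	= 32,	/* power of 2, < 65536 */
		.bgid		= 1,
		.flags		= IOU_PBUF_RING_MMAP |
				  IOU_PBUF_RING_KERNEL_MANAGED,
	};

	ret = io_uring_register(ring_fd, IORING_REGISTER_PBUF_RING, &reg, 1);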

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
---
 include/uapi/linux/io_uring.h | 14 +++++-
 io_uring/kbuf.c               | 95 +++++++++++++++++++++++++++++------
 io_uring/kbuf.h               |  6 ++-
 3 files changed, 97 insertions(+), 18 deletions(-)

diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 6750c383a2ab..278b56a87745 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -885,15 +885,27 @@ struct io_uring_buf_ring {
  *			use of it will consume only as much as it needs. This
  *			requires that both the kernel and application keep
  *			track of where the current read/recv index is at.
+ * IOU_PBUF_RING_KERNEL_MANAGED: If set, the kernel allocates the memory for
+ *			the ring and its buffers. The application must set the
+ *			buffer size through reg->buf_size. The buffers are
+ *			recycled by the kernel. IOU_PBUF_RING_MMAP must be set
+ *			as well. When the caller makes a subsequent mmap call,
+ *			the virtual mapping returned is a contiguous mapping of
+ *			the buffers. IOU_PBUF_RING_INC is not yet supported.
  */
 enum io_uring_register_pbuf_ring_flags {
 	IOU_PBUF_RING_MMAP	= 1,
 	IOU_PBUF_RING_INC	= 2,
+	IOU_PBUF_RING_KERNEL_MANAGED = 4,
 };
 
 /* argument for IORING_(UN)REGISTER_PBUF_RING */
 struct io_uring_buf_reg {
-	__u64	ring_addr;
+	union {
+		__u64	ring_addr;
+		/* used if reg->flags & IOU_PBUF_RING_KERNEL_MANAGED */
+		__u32   buf_size;
+	};
 	__u32	ring_entries;
 	__u16	bgid;
 	__u16	flags;
diff --git a/io_uring/kbuf.c b/io_uring/kbuf.c
index 67d4fe576473..816200e91b1f 100644
--- a/io_uring/kbuf.c
+++ b/io_uring/kbuf.c
@@ -427,10 +427,13 @@ static int io_remove_buffers_legacy(struct io_ring_ctx *ctx,
 
 static void io_put_bl(struct io_ring_ctx *ctx, struct io_buffer_list *bl)
 {
-	if (bl->flags & IOBL_BUF_RING)
+	if (bl->flags & IOBL_BUF_RING) {
 		io_free_region(ctx->user, &bl->region);
-	else
+		if (bl->flags & IOBL_KERNEL_MANAGED)
+			kfree(bl->buf_ring);
+	} else {
 		io_remove_buffers_legacy(ctx, bl, -1U);
+	}
 
 	kfree(bl);
 }
@@ -596,6 +599,51 @@ int io_manage_buffers_legacy(struct io_kiocb *req, unsigned int issue_flags)
 	return IOU_COMPLETE;
 }
 
+static int io_setup_kmbuf_ring(struct io_ring_ctx *ctx,
+			       struct io_buffer_list *bl,
+			       const struct io_uring_buf_reg *reg)
+{
+	struct io_uring_region_desc rd;
+	struct io_uring_buf_ring *ring;
+	unsigned long ring_size;
+	void *buf_region;
+	unsigned int i;
+	int ret;
+
+	/* allocate the ring structure itself */
+	ring_size = flex_array_size(ring, bufs, reg->ring_entries);
+	ring = kzalloc(ring_size, GFP_KERNEL_ACCOUNT);
+	if (!ring)
+		return -ENOMEM;
+
+	memset(&rd, 0, sizeof(rd));
+	rd.size = (u64)reg->buf_size * reg->ring_entries;
+
+	ret = io_create_region(ctx, &bl->region, &rd, 0);
+	if (ret) {
+		kfree(ring);
+		return ret;
+	}
+
+	/* initialize ring buf entries to point to the buffers */
+	buf_region = io_region_get_ptr(&bl->region);
+	for (i = 0; i < reg->ring_entries; i++) {
+		struct io_uring_buf *buf = &ring->bufs[i];
+
+		buf->addr = (u64)(uintptr_t)buf_region;
+		buf->len = reg->buf_size;
+		buf->bid = i;
+
+		buf_region += reg->buf_size;
+	}
+	ring->tail = reg->ring_entries;
+
+	bl->buf_ring = ring;
+	bl->flags |= IOBL_KERNEL_MANAGED;
+
+	return 0;
+}
+
 int io_register_pbuf_ring(struct io_ring_ctx *ctx, void __user *arg)
 {
 	struct io_uring_buf_reg reg;
@@ -612,7 +660,8 @@ int io_register_pbuf_ring(struct io_ring_ctx *ctx, void __user *arg)
 		return -EFAULT;
 	if (!mem_is_zero(reg.resv, sizeof(reg.resv)))
 		return -EINVAL;
-	if (reg.flags & ~(IOU_PBUF_RING_MMAP | IOU_PBUF_RING_INC))
+	if (reg.flags & ~(IOU_PBUF_RING_MMAP | IOU_PBUF_RING_INC |
+			  IOU_PBUF_RING_KERNEL_MANAGED))
 		return -EINVAL;
 	if (!is_power_of_2(reg.ring_entries))
 		return -EINVAL;
@@ -620,6 +669,15 @@ int io_register_pbuf_ring(struct io_ring_ctx *ctx, void __user *arg)
 	if (reg.ring_entries >= 65536)
 		return -EINVAL;
 
+	if (reg.flags & IOU_PBUF_RING_KERNEL_MANAGED) {
+		if (!(reg.flags & IOU_PBUF_RING_MMAP))
+			return -EINVAL;
+		if (reg.flags & IOU_PBUF_RING_INC)
+			return -EINVAL;
+		if (!reg.buf_size || !PAGE_ALIGNED(reg.buf_size))
+			return -EINVAL;
+	}
+
 	bl = io_buffer_get_list(ctx, reg.bgid);
 	if (bl) {
 		/* if mapped buffer ring OR classic exists, don't allow */
@@ -634,17 +692,26 @@ int io_register_pbuf_ring(struct io_ring_ctx *ctx, void __user *arg)
 
 	mmap_offset = (unsigned long)reg.bgid << IORING_OFF_PBUF_SHIFT;
 	ring_size = flex_array_size(br, bufs, reg.ring_entries);
-
 	memset(&rd, 0, sizeof(rd));
-	rd.size = PAGE_ALIGN(ring_size);
-	if (!(reg.flags & IOU_PBUF_RING_MMAP)) {
-		rd.user_addr = reg.ring_addr;
-		rd.flags |= IORING_MEM_REGION_TYPE_USER;
+
+	if (reg.flags & IOU_PBUF_RING_KERNEL_MANAGED) {
+		ret = io_setup_kmbuf_ring(ctx, bl, &reg);
+		if (ret) {
+			kfree(bl);
+			return ret;
+		}
+	} else {
+		rd.size = PAGE_ALIGN(ring_size);
+		if (!(reg.flags & IOU_PBUF_RING_MMAP)) {
+			rd.user_addr = reg.ring_addr;
+			rd.flags |= IORING_MEM_REGION_TYPE_USER;
+		}
+		ret = io_create_region(ctx, &bl->region, &rd, mmap_offset);
+		if (ret)
+			goto fail;
+		bl->buf_ring = io_region_get_ptr(&bl->region);
 	}
-	ret = io_create_region(ctx, &bl->region, &rd, mmap_offset);
-	if (ret)
-		goto fail;
-	br = io_region_get_ptr(&bl->region);
+	br = bl->buf_ring;
 
 #ifdef SHM_COLOUR
 	/*
@@ -666,15 +733,13 @@ int io_register_pbuf_ring(struct io_ring_ctx *ctx, void __user *arg)
 	bl->nr_entries = reg.ring_entries;
 	bl->mask = reg.ring_entries - 1;
 	bl->flags |= IOBL_BUF_RING;
-	bl->buf_ring = br;
 	if (reg.flags & IOU_PBUF_RING_INC)
 		bl->flags |= IOBL_INC;
 	ret = io_buffer_add_list(ctx, bl, reg.bgid);
 	if (!ret)
 		return 0;
 fail:
-	io_free_region(ctx->user, &bl->region);
-	kfree(bl);
+	io_put_bl(ctx, bl);
 	return ret;
 }
 
diff --git a/io_uring/kbuf.h b/io_uring/kbuf.h
index bf15e26520d3..38dd5fe6716e 100644
--- a/io_uring/kbuf.h
+++ b/io_uring/kbuf.h
@@ -7,9 +7,11 @@
 
 enum {
 	/* ring mapped provided buffers */
-	IOBL_BUF_RING	= 1,
+	IOBL_BUF_RING		= 1,
 	/* buffers are consumed incrementally rather than always fully */
-	IOBL_INC	= 2,
+	IOBL_INC		= 2,
+	/* buffers are kernel managed */
+	IOBL_KERNEL_MANAGED	= 4,
 };
 
 struct io_buffer_list {
-- 
2.47.3



* [PATCH v2 3/9] io_uring/kbuf: support kernel-managed buffer rings in buffer selection
From: Joanne Koong @ 2026-02-18  2:52 UTC
  To: axboe, io-uring; +Cc: csander, bernd, hch, asml.silence

Allow kernel-managed buffers to be selected. This requires modifying the
io_br_sel struct to separate the address and val fields, since a kernel
address (which may have its top bit set) cannot be distinguished from a
negative error val.

Auto-commit any selected kernel-managed buffer.
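
As an illustration, a hypothetical in-kernel consumer of a kernel-managed
ring (assuming errors are carried in sel.val, as in the stubbed variants)
would now do something like:

	sel = io_uring_cmd_buffer_select(ioucmd, buf_group, &len,
					 issue_flags);
	if (sel.val < 0)	/* no longer ambiguous with a kaddr */
		return sel.val;
	/* kernel-managed ring: sel.kaddr is directly addressable */
	memcpy(sel.kaddr, src, len);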

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
---
 include/linux/io_uring_types.h |  8 ++++----
 io_uring/kbuf.c                | 16 ++++++++++++----
 2 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
index 3e4a82a6f817..36cc2e0346d9 100644
--- a/include/linux/io_uring_types.h
+++ b/include/linux/io_uring_types.h
@@ -93,13 +93,13 @@ struct io_mapped_region {
  */
 struct io_br_sel {
 	struct io_buffer_list *buf_list;
-	/*
-	 * Some selection parts return the user address, others return an error.
-	 */
 	union {
+		/* for classic/ring provided buffers */
 		void __user *addr;
-		ssize_t val;
+		/* for kernel-managed buffers */
+		void *kaddr;
 	};
+	ssize_t val;
 };
 
 
diff --git a/io_uring/kbuf.c b/io_uring/kbuf.c
index 816200e91b1f..efcc6540f948 100644
--- a/io_uring/kbuf.c
+++ b/io_uring/kbuf.c
@@ -155,7 +155,8 @@ static int io_provided_buffers_select(struct io_kiocb *req, size_t *len,
 	return 1;
 }
 
-static bool io_should_commit(struct io_kiocb *req, unsigned int issue_flags)
+static bool io_should_commit(struct io_kiocb *req, struct io_buffer_list *bl,
+			     unsigned int issue_flags)
 {
 	/*
 	* If we came in unlocked, we have no choice but to consume the
@@ -170,7 +171,11 @@ static bool io_should_commit(struct io_kiocb *req, unsigned int issue_flags)
 	if (issue_flags & IO_URING_F_UNLOCKED)
 		return true;
 
-	/* uring_cmd commits kbuf upfront, no need to auto-commit */
+	/* kernel-managed buffers are auto-committed */
+	if (bl->flags & IOBL_KERNEL_MANAGED)
+		return true;
+
+	/* multishot uring_cmd commits kbuf upfront, no need to auto-commit */
 	if (!io_file_can_poll(req) && req->opcode != IORING_OP_URING_CMD)
 		return true;
 	return false;
@@ -200,9 +205,12 @@ static struct io_br_sel io_ring_buffer_select(struct io_kiocb *req, size_t *len,
 	req->flags |= REQ_F_BUFFER_RING | REQ_F_BUFFERS_COMMIT;
 	req->buf_index = READ_ONCE(buf->bid);
 	sel.buf_list = bl;
-	sel.addr = u64_to_user_ptr(READ_ONCE(buf->addr));
+	if (bl->flags & IOBL_KERNEL_MANAGED)
+		sel.kaddr = (void *)(uintptr_t)READ_ONCE(buf->addr);
+	else
+		sel.addr = u64_to_user_ptr(READ_ONCE(buf->addr));
 
-	if (io_should_commit(req, issue_flags)) {
+	if (io_should_commit(req, bl, issue_flags)) {
 		io_kbuf_commit(req, sel.buf_list, *len, 1);
 		sel.buf_list = NULL;
 	}
-- 
2.47.3



* [PATCH v2 4/9] io_uring/kbuf: add buffer ring pinning/unpinning
From: Joanne Koong @ 2026-02-18  2:52 UTC
  To: axboe, io-uring; +Cc: csander, bernd, hch, asml.silence

Add kernel APIs to pin and unpin buffer rings, preventing userspace from
unregistering a buffer ring while it is pinned by the kernel.

This provides a mechanism for kernel subsystems to safely access buffer
ring contents while ensuring the buffer ring remains valid. A pinned
buffer ring cannot be unregistered until it has been explicitly unpinned;
a userspace attempt to unregister a pinned buffer ring returns -EBUSY.

This is a preparatory change for upcoming fuse usage of kernel-managed
buffer rings. fuse needs to pin the buffer ring because it may have to
select a buffer in atomic context, which it can only do by using the
underlying buffer list pointer directly.
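
A sketch of the expected lifecycle in such a subsystem (hypothetical
setup/teardown paths):

	struct io_buffer_list *bl;
	int ret;

	/* setup: pin so userspace cannot unregister the ring underneath us */
	ret = io_uring_buf_ring_pin(cmd, buf_group, issue_flags, &bl);
	if (ret)
		return ret;

	/* ... bl may now be used for buffer selection, including from
	 * atomic context ... */

	/* teardown: drop the pin so the ring can be unregistered again */
	io_uring_buf_ring_unpin(cmd, buf_group, issue_flags);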

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
---
 include/linux/io_uring/cmd.h | 17 +++++++++++
 io_uring/kbuf.c              | 55 ++++++++++++++++++++++++++++++++++++
 io_uring/kbuf.h              |  5 ++++
 3 files changed, 77 insertions(+)

diff --git a/include/linux/io_uring/cmd.h b/include/linux/io_uring/cmd.h
index 375fd048c4cb..bd681d8ab1d4 100644
--- a/include/linux/io_uring/cmd.h
+++ b/include/linux/io_uring/cmd.h
@@ -84,6 +84,10 @@ struct io_br_sel io_uring_cmd_buffer_select(struct io_uring_cmd *ioucmd,
 bool io_uring_mshot_cmd_post_cqe(struct io_uring_cmd *ioucmd,
 				 struct io_br_sel *sel, unsigned int issue_flags);
 
+int io_uring_buf_ring_pin(struct io_uring_cmd *cmd, unsigned buf_group,
+			  unsigned issue_flags, struct io_buffer_list **out_bl);
+int io_uring_buf_ring_unpin(struct io_uring_cmd *cmd, unsigned buf_group,
+			    unsigned issue_flags);
 #else
 static inline int
 io_uring_cmd_import_fixed(u64 ubuf, unsigned long len, int rw,
@@ -126,6 +130,19 @@ static inline bool io_uring_mshot_cmd_post_cqe(struct io_uring_cmd *ioucmd,
 {
 	return true;
 }
+static inline int io_uring_buf_ring_pin(struct io_uring_cmd *cmd,
+					unsigned buf_group,
+					unsigned issue_flags,
+					struct io_buffer_list **bl)
+{
+	return -EOPNOTSUPP;
+}
+static inline int io_uring_buf_ring_unpin(struct io_uring_cmd *cmd,
+					  unsigned buf_group,
+					  unsigned issue_flags)
+{
+	return -EOPNOTSUPP;
+}
 #endif
 
 static inline struct io_uring_cmd *io_uring_cmd_from_tw(struct io_tw_req tw_req)
diff --git a/io_uring/kbuf.c b/io_uring/kbuf.c
index efcc6540f948..1d86ad7803fd 100644
--- a/io_uring/kbuf.c
+++ b/io_uring/kbuf.c
@@ -9,6 +9,7 @@
 #include <linux/poll.h>
 #include <linux/vmalloc.h>
 #include <linux/io_uring.h>
+#include <linux/io_uring/cmd.h>
 
 #include <uapi/linux/io_uring.h>
 
@@ -237,6 +238,58 @@ struct io_br_sel io_buffer_select(struct io_kiocb *req, size_t *len,
 	return sel;
 }
 
+int io_uring_buf_ring_pin(struct io_uring_cmd *cmd, unsigned buf_group,
+			  unsigned issue_flags, struct io_buffer_list **out_bl)
+{
+	struct io_ring_ctx *ctx = cmd_to_io_kiocb(cmd)->ctx;
+	struct io_buffer_list *bl;
+	int ret = -EINVAL;
+
+	io_ring_submit_lock(ctx, issue_flags);
+
+	bl = io_buffer_get_list(ctx, buf_group);
+	if (!bl || !(bl->flags & IOBL_BUF_RING))
+		goto err;
+
+	if (unlikely(bl->flags & IOBL_PINNED)) {
+		ret = -EALREADY;
+		goto err;
+	}
+
+	bl->flags |= IOBL_PINNED;
+	ret = 0;
+	*out_bl = bl;
+err:
+	io_ring_submit_unlock(ctx, issue_flags);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(io_uring_buf_ring_pin);
+
+int io_uring_buf_ring_unpin(struct io_uring_cmd *cmd, unsigned buf_group,
+			    unsigned issue_flags)
+{
+	struct io_ring_ctx *ctx = cmd_to_io_kiocb(cmd)->ctx;
+	struct io_buffer_list *bl;
+	unsigned int required_flags;
+	int ret = -EINVAL;
+
+	io_ring_submit_lock(ctx, issue_flags);
+
+	bl = io_buffer_get_list(ctx, buf_group);
+	if (!bl)
+		goto err;
+
+	required_flags = IOBL_BUF_RING | IOBL_PINNED;
+	if ((bl->flags & required_flags) == required_flags) {
+		bl->flags &= ~IOBL_PINNED;
+		ret = 0;
+	}
+err:
+	io_ring_submit_unlock(ctx, issue_flags);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(io_uring_buf_ring_unpin);
+
 /* cap it at a reasonable 256, will be one page even for 4K */
 #define PEEK_MAX_IMPORT		256
 
@@ -768,6 +821,8 @@ int io_unregister_pbuf_ring(struct io_ring_ctx *ctx, void __user *arg)
 		return -ENOENT;
 	if (!(bl->flags & IOBL_BUF_RING))
 		return -EINVAL;
+	if (bl->flags & IOBL_PINNED)
+		return -EBUSY;
 
 	scoped_guard(mutex, &ctx->mmap_lock)
 		xa_erase(&ctx->io_bl_xa, bl->bgid);
diff --git a/io_uring/kbuf.h b/io_uring/kbuf.h
index 38dd5fe6716e..006e8a73a117 100644
--- a/io_uring/kbuf.h
+++ b/io_uring/kbuf.h
@@ -12,6 +12,11 @@ enum {
 	IOBL_INC		= 2,
 	/* buffers are kernel managed */
 	IOBL_KERNEL_MANAGED	= 4,
+	/*
+	 * buffer ring is pinned and cannot be unregistered by userspace until
+	 * it has been unpinned
+	 */
+	IOBL_PINNED		= 8,
 };
 
 struct io_buffer_list {
-- 
2.47.3



* [PATCH v2 5/9] io_uring/kbuf: return buffer id in buffer selection
From: Joanne Koong @ 2026-02-18  2:52 UTC
  To: axboe, io-uring; +Cc: csander, bernd, hch, asml.silence

Return the id of the selected buffer in io_buffer_select(). This is
needed for kernel-managed buffer rings to later recycle the selected
buffer.
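
For example (hypothetical caller), the id can be stashed with the
in-flight request so the buffer can be handed back later via the recycle
API added in the next patch:

	sel = io_uring_cmd_buffer_select(ioucmd, buf_group, &len,
					 issue_flags);
	...
	/* remember which buffer the data landed in */
	pending->bid = sel.buf_id;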

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
---
 include/linux/io_uring/cmd.h   | 2 +-
 include/linux/io_uring_types.h | 2 ++
 io_uring/kbuf.c                | 7 +++++--
 3 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/include/linux/io_uring/cmd.h b/include/linux/io_uring/cmd.h
index bd681d8ab1d4..31f47cce99f5 100644
--- a/include/linux/io_uring/cmd.h
+++ b/include/linux/io_uring/cmd.h
@@ -71,7 +71,7 @@ void io_uring_cmd_issue_blocking(struct io_uring_cmd *ioucmd);
 
 /*
  * Select a buffer from the provided buffer group for multishot uring_cmd.
- * Returns the selected buffer address and size.
+ * Returns the selected buffer address, size, and id.
  */
 struct io_br_sel io_uring_cmd_buffer_select(struct io_uring_cmd *ioucmd,
 					    unsigned buf_group, size_t *len,
diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
index 36cc2e0346d9..5a56bb341337 100644
--- a/include/linux/io_uring_types.h
+++ b/include/linux/io_uring_types.h
@@ -100,6 +100,8 @@ struct io_br_sel {
 		void *kaddr;
 	};
 	ssize_t val;
+	/* id of the selected buffer */
+	unsigned buf_id;
 };
 
 
diff --git a/io_uring/kbuf.c b/io_uring/kbuf.c
index 1d86ad7803fd..d20221f1b9b2 100644
--- a/io_uring/kbuf.c
+++ b/io_uring/kbuf.c
@@ -206,6 +206,7 @@ static struct io_br_sel io_ring_buffer_select(struct io_kiocb *req, size_t *len,
 	req->flags |= REQ_F_BUFFER_RING | REQ_F_BUFFERS_COMMIT;
 	req->buf_index = READ_ONCE(buf->bid);
 	sel.buf_list = bl;
+	sel.buf_id = req->buf_index;
 	if (bl->flags & IOBL_KERNEL_MANAGED)
 		sel.kaddr = (void *)(uintptr_t)READ_ONCE(buf->addr);
 	else
@@ -229,10 +230,12 @@ struct io_br_sel io_buffer_select(struct io_kiocb *req, size_t *len,
 
 	bl = io_buffer_get_list(ctx, buf_group);
 	if (likely(bl)) {
-		if (bl->flags & IOBL_BUF_RING)
+		if (bl->flags & IOBL_BUF_RING) {
 			sel = io_ring_buffer_select(req, len, bl, issue_flags);
-		else
+		} else {
 			sel.addr = io_provided_buffer_select(req, len, bl);
+			sel.buf_id = req->buf_index;
+		}
 	}
 	io_ring_submit_unlock(req->ctx, issue_flags);
 	return sel;
-- 
2.47.3



* [PATCH v2 6/9] io_uring/kbuf: add recycling for kernel managed buffer rings
From: Joanne Koong @ 2026-02-18  2:52 UTC
  To: axboe, io-uring; +Cc: csander, bernd, hch, asml.silence

Add an interface for buffers to be recycled back into a kernel-managed
buffer ring.
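
A sketch of the intended use (hypothetical caller; the addr, len, and bid
arguments are the values obtained at selection time):

	ret = io_uring_kmbuf_recycle(cmd, buf_group,
				     (u64)(uintptr_t)sel.kaddr, buf_len,
				     sel.buf_id, issue_flags);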

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
---
 include/linux/io_uring/cmd.h | 11 +++++++++
 io_uring/kbuf.c              | 48 ++++++++++++++++++++++++++++++++++++
 2 files changed, 59 insertions(+)

diff --git a/include/linux/io_uring/cmd.h b/include/linux/io_uring/cmd.h
index 31f47cce99f5..5cebcd6d50e6 100644
--- a/include/linux/io_uring/cmd.h
+++ b/include/linux/io_uring/cmd.h
@@ -88,6 +88,10 @@ int io_uring_buf_ring_pin(struct io_uring_cmd *cmd, unsigned buf_group,
 			  unsigned issue_flags, struct io_buffer_list **out_bl);
 int io_uring_buf_ring_unpin(struct io_uring_cmd *cmd, unsigned buf_group,
 			    unsigned issue_flags);
+
+int io_uring_kmbuf_recycle(struct io_uring_cmd *cmd, unsigned int buf_group,
+			   u64 addr, unsigned int len, unsigned int bid,
+			   unsigned int issue_flags);
 #else
 static inline int
 io_uring_cmd_import_fixed(u64 ubuf, unsigned long len, int rw,
@@ -143,6 +147,13 @@ static inline int io_uring_buf_ring_unpin(struct io_uring_cmd *cmd,
 {
 	return -EOPNOTSUPP;
 }
+static inline int io_uring_kmbuf_recycle(struct io_uring_cmd *cmd,
+					 unsigned int buf_group, u64 addr,
+					 unsigned int len, unsigned int bid,
+					 unsigned int issue_flags)
+{
+	return -EOPNOTSUPP;
+}
 #endif
 
 static inline struct io_uring_cmd *io_uring_cmd_from_tw(struct io_tw_req tw_req)
diff --git a/io_uring/kbuf.c b/io_uring/kbuf.c
index d20221f1b9b2..6e4dd1e003f4 100644
--- a/io_uring/kbuf.c
+++ b/io_uring/kbuf.c
@@ -102,6 +102,54 @@ void io_kbuf_drop_legacy(struct io_kiocb *req)
 	req->kbuf = NULL;
 }
 
+int io_uring_kmbuf_recycle(struct io_uring_cmd *cmd, unsigned int buf_group,
+			   u64 addr, unsigned int len, unsigned int bid,
+			   unsigned int issue_flags)
+{
+	struct io_kiocb *req = cmd_to_io_kiocb(cmd);
+	struct io_ring_ctx *ctx = req->ctx;
+	struct io_uring_buf_ring *br;
+	struct io_uring_buf *buf;
+	struct io_buffer_list *bl;
+	unsigned int required_flags;
+	int ret = -EINVAL;
+
+	if (WARN_ON_ONCE(req->flags & REQ_F_BUFFERS_COMMIT))
+		return ret;
+
+	io_ring_submit_lock(ctx, issue_flags);
+
+	bl = io_buffer_get_list(ctx, buf_group);
+
+	if (!bl)
+		goto err;
+
+	required_flags = IOBL_BUF_RING | IOBL_KERNEL_MANAGED;
+	if (WARN_ON_ONCE((bl->flags & required_flags) != required_flags))
+		goto err;
+
+	br = bl->buf_ring;
+
+	if (WARN_ON_ONCE((__u16)(br->tail - bl->head) >= bl->nr_entries))
+		goto err;
+
+	buf = &br->bufs[br->tail & bl->mask];
+
+	buf->addr = addr;
+	buf->len = len;
+	buf->bid = bid;
+
+	req->flags &= ~REQ_F_BUFFER_RING;
+
+	br->tail++;
+	ret = 0;
+
+err:
+	io_ring_submit_unlock(ctx, issue_flags);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(io_uring_kmbuf_recycle);
+
 bool io_kbuf_recycle_legacy(struct io_kiocb *req, unsigned issue_flags)
 {
 	struct io_ring_ctx *ctx = req->ctx;
-- 
2.47.3



* [PATCH v2 7/9] io_uring/kbuf: add io_uring_is_kmbuf_ring()
From: Joanne Koong @ 2026-02-18  2:52 UTC
  To: axboe, io-uring; +Cc: csander, bernd, hch, asml.silence

io_uring_is_kmbuf_ring() returns true if there is a kernel-managed
buffer ring at the specified buffer group.

This is a preparatory patch for upcoming fuse kernel-managed buffer
support, which needs to ensure the buffer ring registered by the server
is a kernel-managed buffer ring.
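
A sketch of that check (hypothetical validation path in the server
registration code):

	if (!io_uring_is_kmbuf_ring(cmd, buf_group, issue_flags))
		return -EINVAL;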

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
---
 include/linux/io_uring/cmd.h |  9 +++++++++
 io_uring/kbuf.c              | 20 ++++++++++++++++++++
 2 files changed, 29 insertions(+)

diff --git a/include/linux/io_uring/cmd.h b/include/linux/io_uring/cmd.h
index 5cebcd6d50e6..dce6a0ce8538 100644
--- a/include/linux/io_uring/cmd.h
+++ b/include/linux/io_uring/cmd.h
@@ -92,6 +92,9 @@ int io_uring_buf_ring_unpin(struct io_uring_cmd *cmd, unsigned buf_group,
 int io_uring_kmbuf_recycle(struct io_uring_cmd *cmd, unsigned int buf_group,
 			   u64 addr, unsigned int len, unsigned int bid,
 			   unsigned int issue_flags);
+
+bool io_uring_is_kmbuf_ring(struct io_uring_cmd *cmd, unsigned int buf_group,
+			    unsigned int issue_flags);
 #else
 static inline int
 io_uring_cmd_import_fixed(u64 ubuf, unsigned long len, int rw,
@@ -154,6 +157,12 @@ static inline int io_uring_kmbuf_recycle(struct io_uring_cmd *cmd,
 {
 	return -EOPNOTSUPP;
 }
+static inline bool io_uring_is_kmbuf_ring(struct io_uring_cmd *cmd,
+					  unsigned int buf_group,
+					  unsigned int issue_flags)
+{
+	return false;
+}
 #endif
 
 static inline struct io_uring_cmd *io_uring_cmd_from_tw(struct io_tw_req tw_req)
diff --git a/io_uring/kbuf.c b/io_uring/kbuf.c
index 6e4dd1e003f4..bd10c830cd30 100644
--- a/io_uring/kbuf.c
+++ b/io_uring/kbuf.c
@@ -917,3 +917,23 @@ struct io_mapped_region *io_pbuf_get_region(struct io_ring_ctx *ctx,
 		return NULL;
 	return &bl->region;
 }
+
+bool io_uring_is_kmbuf_ring(struct io_uring_cmd *cmd, unsigned int buf_group,
+			    unsigned int issue_flags)
+{
+	struct io_ring_ctx *ctx = cmd_to_io_kiocb(cmd)->ctx;
+	struct io_buffer_list *bl;
+	bool is_kmbuf_ring = false;
+
+	io_ring_submit_lock(ctx, issue_flags);
+
+	bl = io_buffer_get_list(ctx, buf_group);
+	if (likely(bl) && (bl->flags & IOBL_KERNEL_MANAGED)) {
+		WARN_ON_ONCE(!(bl->flags & IOBL_BUF_RING));
+		is_kmbuf_ring = true;
+	}
+
+	io_ring_submit_unlock(ctx, issue_flags);
+	return is_kmbuf_ring;
+}
+EXPORT_SYMBOL_GPL(io_uring_is_kmbuf_ring);
-- 
2.47.3



* [PATCH v2 8/9] io_uring/kbuf: export io_ring_buffer_select()
From: Joanne Koong @ 2026-02-18  2:52 UTC
  To: axboe, io-uring; +Cc: csander, bernd, hch, asml.silence

Export io_ring_buffer_select() so that it may be used by callers who
pass in a pinned bufring without needing to grab the io_uring mutex.

This is a preparatory patch for fuse io-uring, which needs to select a
buffer from a kernel-managed bufring while the uring mutex may already be
held by in-progress commits, and which may need to do so in atomic
context.
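
A sketch of the intended call pattern (hypothetical caller; bl was
previously obtained and pinned via io_uring_buf_ring_pin(), so the ring
lock does not need to be taken here):

	sel = io_ring_buffer_select(req, &len, bl, issue_flags);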

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
---
 include/linux/io_uring/cmd.h | 14 ++++++++++++++
 io_uring/kbuf.c              |  7 ++++---
 2 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/include/linux/io_uring/cmd.h b/include/linux/io_uring/cmd.h
index dce6a0ce8538..ac8925fa81f6 100644
--- a/include/linux/io_uring/cmd.h
+++ b/include/linux/io_uring/cmd.h
@@ -95,6 +95,10 @@ int io_uring_kmbuf_recycle(struct io_uring_cmd *cmd, unsigned int buf_group,
 
 bool io_uring_is_kmbuf_ring(struct io_uring_cmd *cmd, unsigned int buf_group,
 			    unsigned int issue_flags);
+
+struct io_br_sel io_ring_buffer_select(struct io_kiocb *req, size_t *len,
+				       struct io_buffer_list *bl,
+				       unsigned int issue_flags);
 #else
 static inline int
 io_uring_cmd_import_fixed(u64 ubuf, unsigned long len, int rw,
@@ -163,6 +167,16 @@ static inline bool io_uring_is_kmbuf_ring(struct io_uring_cmd *cmd,
 {
 	return false;
 }
+static inline struct io_br_sel io_ring_buffer_select(struct io_kiocb *req,
+						     size_t *len,
+						     struct io_buffer_list *bl,
+						     unsigned int issue_flags)
+{
+	struct io_br_sel sel = {
+		.val = -EOPNOTSUPP,
+	};
+	return sel;
+}
 #endif
 
 static inline struct io_uring_cmd *io_uring_cmd_from_tw(struct io_tw_req tw_req)
diff --git a/io_uring/kbuf.c b/io_uring/kbuf.c
index bd10c830cd30..fcc64e4a6a29 100644
--- a/io_uring/kbuf.c
+++ b/io_uring/kbuf.c
@@ -230,9 +230,9 @@ static bool io_should_commit(struct io_kiocb *req, struct io_buffer_list *bl,
 	return false;
 }
 
-static struct io_br_sel io_ring_buffer_select(struct io_kiocb *req, size_t *len,
-					      struct io_buffer_list *bl,
-					      unsigned int issue_flags)
+struct io_br_sel io_ring_buffer_select(struct io_kiocb *req, size_t *len,
+				       struct io_buffer_list *bl,
+				       unsigned int issue_flags)
 {
 	struct io_uring_buf_ring *br = bl->buf_ring;
 	__u16 tail, head = bl->head;
@@ -266,6 +266,7 @@ static struct io_br_sel io_ring_buffer_select(struct io_kiocb *req, size_t *len,
 	}
 	return sel;
 }
+EXPORT_SYMBOL_GPL(io_ring_buffer_select);
 
 struct io_br_sel io_buffer_select(struct io_kiocb *req, size_t *len,
 				  unsigned buf_group, unsigned int issue_flags)
-- 
2.47.3



* [PATCH v2 9/9] io_uring/cmd: set selected buffer index in __io_uring_cmd_done()
From: Joanne Koong @ 2026-02-18  2:52 UTC
  To: axboe, io-uring; +Cc: csander, bernd, hch, asml.silence

When uring_cmd operations select a buffer, the completion queue entry
should indicate which buffer was selected.

Set IORING_CQE_F_BUFFER on the completed entry and encode the buffer
index if a buffer was selected.

This change is needed in order to relay to userspace which selected
buffer contains the data.
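
Userspace can then recover the buffer id from the completion in the
usual way:

	if (cqe->flags & IORING_CQE_F_BUFFER)
		bid = cqe->flags >> IORING_CQE_BUFFER_SHIFT;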

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
---
 io_uring/uring_cmd.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
index ee7b49f47cb5..6d38df1a812d 100644
--- a/io_uring/uring_cmd.c
+++ b/io_uring/uring_cmd.c
@@ -151,6 +151,7 @@ void __io_uring_cmd_done(struct io_uring_cmd *ioucmd, s32 ret, u64 res2,
 		       unsigned issue_flags, bool is_cqe32)
 {
 	struct io_kiocb *req = cmd_to_io_kiocb(ioucmd);
+	u32 cflags = 0;
 
 	if (WARN_ON_ONCE(req->flags & REQ_F_APOLL_MULTISHOT))
 		return;
@@ -160,7 +161,10 @@ void __io_uring_cmd_done(struct io_uring_cmd *ioucmd, s32 ret, u64 res2,
 	if (ret < 0)
 		req_set_fail(req);
 
-	io_req_set_res(req, ret, 0);
+	if (req->flags & (REQ_F_BUFFER_SELECTED | REQ_F_BUFFER_RING))
+		cflags |= IORING_CQE_F_BUFFER |
+			((u32)req->buf_index << IORING_CQE_BUFFER_SHIFT);
+	io_req_set_res(req, ret, cflags);
 	if (is_cqe32) {
 		if (req->ctx->flags & IORING_SETUP_CQE_MIXED)
 			req->cqe.flags |= IORING_CQE_F_32;
-- 
2.47.3

