From: Joanne Koong <joannelkoong@gmail.com>
To: miklos@szeredi.hu, axboe@kernel.dk
Cc: bschubert@ddn.com, asml.silence@gmail.com,
io-uring@vger.kernel.org, csander@purestorage.com,
xiaobing.li@samsung.com, linux-fsdevel@vger.kernel.org
Subject: [PATCH v2 25/25] docs: fuse: add io-uring bufring and zero-copy documentation
Date: Thu, 18 Dec 2025 00:33:19 -0800
Message-ID: <20251218083319.3485503-26-joannelkoong@gmail.com>
In-Reply-To: <20251218083319.3485503-1-joannelkoong@gmail.com>

Add documentation for fuse over io-uring usage of kernel-managed
bufrings and zero-copy.

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
---
.../filesystems/fuse/fuse-io-uring.rst | 55 ++++++++++++++++++-
1 file changed, 54 insertions(+), 1 deletion(-)
diff --git a/Documentation/filesystems/fuse/fuse-io-uring.rst b/Documentation/filesystems/fuse/fuse-io-uring.rst
index d73dd0dbd238..4c17169069e9 100644
--- a/Documentation/filesystems/fuse/fuse-io-uring.rst
+++ b/Documentation/filesystems/fuse/fuse-io-uring.rst
@@ -95,5 +95,58 @@ Sending requests with CQEs
| <fuse_unlink() |
| <sys_unlink() |
+Kernel-managed buffer rings
+===========================
-
+Kernel-managed buffer rings have two main advantages:
+* they eliminate the overhead of pinning/unpinning user pages and translating
+ virtual addresses for every server-kernel interaction
+* they reduce buffer memory allocation requirements
+
+In order to use buffer rings, the server must preregister the following:
+* a fixed buffer at index 0. This is where the headers will reside
+* a kernel-managed buffer ring. This is where the payload will reside
+
+At a high level, this is how fuse uses buffer rings:
+* The server registers a kernel-managed buffer ring. In the kernel this
+ allocates the pages needed for the buffers and vmaps them. The server
+ obtains the virtual address for the buffers through an mmap call on the ring
+ fd.
+* When there is a request from a client, fuse will select a buffer from the
+ ring if there is any payload that needs to be copied, copy over the payload
+ to the selected buffer, and copy over the headers to the fixed buffer at
+ index 0, at the buffer id that corresponds to the server (which the server
+ needs to specify through sqe->buf_index).
+* The server obtains a cqe representing the request. cqe->flags will have
+ IORING_CQE_F_BUFFER set if a selected buffer was used for the payload. The
+ buffer id is stored in the upper bits of cqe->flags (shifted by
+ IORING_CQE_BUFFER_SHIFT). The server can directly access the payload by
+ using that buffer id to calculate the offset into the mapped buffers.
+* The server processes the request and then sends a
+ FUSE_URING_CMD_COMMIT_AND_FETCH sqe with the reply.
+* When the kernel handles the sqe, it will process the reply and if there is a
+ next request, it will reuse the same selected buffer for the request. If
+ there is no next request, it will recycle the buffer back to the ring.
+
+Zero-copy
+=========
+
+Fuse io-uring zero-copy allows the server to directly read from / write to the
+client's pages and bypass any intermediary buffer copies. This is only allowed
+on privileged servers.
+
+In order to use zero-copy, the server must preregister the following:
+* a sparse buffer for every entry in the queue. This is where the client's
+ pages will reside
+* a fixed buffer at index queue_depth (directly after the sparse buffers).
+ This is where the headers will reside
+* a kernel-managed buffer ring. This is where any non-zero-copied payload
+ (e.g. out headers) will reside
+
+When the client issues a read/write, fuse stores the client's underlying pages
+in the sparse buffer entry corresponding to the entry in the queue. The server
+can then issue reads/writes on these pages through io_uring rw operations.
+Note that the server cannot access these pages directly; it must go through
+the io-uring interface to read from or write to them. The pages are
+unregistered once the server replies to the request. Non-zero-copyable
+payload (if needed) is placed in a buffer from the kernel-managed buffer ring.
--
2.47.3
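
For readers following along, here is a rough server-side sketch of the
cqe handling described in the buffer-ring section above. It is not taken
from the patch series. The two constants are redefined locally so the
snippet builds without kernel headers; their values mirror
IORING_CQE_F_BUFFER and IORING_CQE_BUFFER_SHIFT from linux/io_uring.h.
The helper names (cqe_buffer_id, payload_addr, zc_header_buf_index) are
hypothetical, and payload_addr assumes the mmap'ed buffers are equally
sized and laid out back to back, which the documentation implies but
does not spell out.

```c
#include <stddef.h>
#include <stdint.h>

/* Local copies of the uapi values, so no kernel headers are required. */
#define CQE_F_BUFFER     (1U << 0)  /* mirrors IORING_CQE_F_BUFFER */
#define CQE_BUFFER_SHIFT 16         /* mirrors IORING_CQE_BUFFER_SHIFT */

/*
 * Hypothetical helper: returns 1 and stores the selected buffer id if
 * the cqe carries one, otherwise returns 0 and leaves *buf_id alone.
 */
static inline int cqe_buffer_id(uint32_t cqe_flags, uint16_t *buf_id)
{
	if (!(cqe_flags & CQE_F_BUFFER))
		return 0;
	*buf_id = (uint16_t)(cqe_flags >> CQE_BUFFER_SHIFT);
	return 1;
}

/*
 * Hypothetical helper: address of the payload for a given buffer id.
 * ASSUMPTION: buffers are contiguous and all buf_size bytes long.
 */
static inline void *payload_addr(void *ring_base, uint16_t buf_id,
				 size_t buf_size)
{
	return (char *)ring_base + (size_t)buf_id * buf_size;
}

/*
 * Zero-copy layout as described in the text: one sparse buffer per
 * queue entry, followed by the fixed header buffer at index queue_depth.
 */
static inline unsigned int zc_header_buf_index(unsigned int queue_depth)
{
	return queue_depth;
}
```

A server would call cqe_buffer_id() on cqe->flags after reaping a
completion, and only touch the payload when it returns 1; requests
without payload leave the flag clear.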