From: Joanne Koong <joannelkoong@gmail.com>
To: Bernd Schubert <bernd@bsbernd.com>
Cc: miklos@szeredi.hu, axboe@kernel.dk,
linux-fsdevel@vger.kernel.org, bschubert@ddn.com,
asml.silence@gmail.com, io-uring@vger.kernel.org,
xiaobing.li@samsung.com, csander@purestorage.com,
kernel-team@meta.com
Subject: Re: [PATCH v2 8/8] fuse: support io-uring registered buffers
Date: Thu, 6 Nov 2025 15:09:48 -0800 [thread overview]
Message-ID: <CAJnrk1YPEDUbOu2N0EjfrkwK3Ge2XrNeaCY0YKL+E1t7Z8Xtvg@mail.gmail.com> (raw)
In-Reply-To: <a335fd2c-03ca-4201-abcf-74809b84c426@bsbernd.com>
On Thu, Nov 6, 2025 at 11:48 AM Bernd Schubert <bernd@bsbernd.com> wrote:
>
> On 10/27/25 23:28, Joanne Koong wrote:
> > Add support for io-uring registered buffers for fuse daemons
> > communicating through the io-uring interface. Daemons may register
> > buffers ahead of time, which will eliminate the overhead of
> > pinning/unpinning user pages and translating virtual addresses for every
> > server-kernel interaction.
> >
> > To support page-aligned payloads, the buffer is structured such that the
> > payload is at the front of the buffer and the fuse_uring_req_header is
> > offset from the end of the buffer.
> >
> > To be backwards compatible, fuse uring still needs to support non-registered
> > buffers as well.
> >
> > Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> > ---
> > fs/fuse/dev_uring.c | 200 +++++++++++++++++++++++++++++++++---------
> > fs/fuse/dev_uring_i.h | 27 +++++-
> > 2 files changed, 183 insertions(+), 44 deletions(-)
> >
> > diff --git a/fs/fuse/dev_uring.c b/fs/fuse/dev_uring.c
> > index c6b22b14b354..f501bc81f331 100644
> > --- a/fs/fuse/dev_uring.c
> > +++ b/fs/fuse/dev_uring.c
> >
> > +/*
> > + * Prepare fixed buffer for access. Sets up the payload iter and kmaps the
> > + * header.
> > + *
> > + * Callers must call fuse_uring_unmap_buffer() in the same scope to release the
> > + * header mapping.
> > + *
> > + * For non-fixed buffers, this is a no-op.
> > + */
> > +static int fuse_uring_map_buffer(struct fuse_ring_ent *ent)
> > +{
> > + size_t header_size = sizeof(struct fuse_uring_req_header);
> > + struct iov_iter iter;
> > + struct page *header_page;
> > + size_t count, start;
> > + ssize_t copied;
> > + int err;
> > +
> > + if (!ent->fixed_buffer)
> > + return 0;
> > +
> > + err = io_uring_cmd_import_fixed_full(ITER_DEST, &iter, ent->cmd, 0);
>
> This seems to be a rather expensive call, especially as it gets
> called twice (during submit and fetch).
> Wouldn't be there be a possibility to check if the user buffer changed
> and then keep the existing iter? I think Caleb had a similar idea
> in patch 1/8.
I think the best approach is to get rid of the call entirely by
returning -EBUSY to the server if it tries unregistering the buffers
while a connection is still alive. Then we would just have to set this
up once at registration time, and use that for the lifetime of the
connection. The discussion about this with Pavel is in [1] - I'm
planning to do this as a separate follow-up.
[1] https://lore.kernel.org/linux-fsdevel/9f0debb1-ce0e-4085-a3fe-0da7a8fd76a6@gmail.com/
>
> > + if (err)
> > + return err;
> > +
> > + count = iov_iter_count(&iter);
> > + if (count < header_size || count & (PAGE_SIZE - 1))
> > + return -EINVAL;
>
> || !PAGE_ALIGNED(count)) ?
Nice, I didn't realize this macro existed. Thanks.
>
> > +
> > + /* Adjust the payload iter to protect the header from any overwrites */
> > + ent->payload_iter = iter;
> > + iov_iter_truncate(&ent->payload_iter, count - header_size);
> > +
> > + /* Set up the headers */
> > + iov_iter_advance(&iter, count - header_size);
> > + copied = iov_iter_get_pages2(&iter, &header_page, header_size, 1, &start);
>
> The iter is later used for the payload, but I miss a reset? iov_iter_revert()?
This iter is separate from the payload iter and doesn't affect the
payload iter's values because the "ent->payload_iter = iter;"
assignment above shallow copies that out first.
>
> > + if (copied < header_size)
> > + return -EFAULT;
> > + ent->headers = kmap_local_page(header_page) + start;
>
> My plan for the alternative pinning patch (with io-uring) was to let the
> header be shared by multiple entries. Current libfuse master handles
> a fixed page size buffer for the payload (prepared page pinning - I
> didn't expect I was blocked for 9 months on other work), missing is to
> share it between ring entries.
> I think this wouldn't work with registered buffer approach - it
> always needs one full page?
I've been working on the patches for zero-copy and that has required
the design for registered buffers in this patch to change, namely that
the payload and the headers must be separated out. For v3, I have them
separate now.
>
> I would also like to discuss dynamic multiple payload sizes per queue.
> For example to have something like
>
> 256 x 4K
> 8 x 128K
> 4 x 1M
I think zero-copy might obviate the need for this. The way I have it
right now, it uses sparse buffers for payloads, which prevents the
server from needing to allocate the 1M buffer per ent. I'm hoping to
send out the patches for this as part of v3 at the end of next week or
next next week.
Thanks,
joanne
>
> I think there are currently two ways to do that
>
> 1) Sort entries into pools
> 2) Sort buffers into pools and let entries use these. Here the header
> would be fixed and payload would come from a pool.
>
> With the appraoch to have payload and header in one buffer we couldn't
> use 2). Using 1) should be fine, though.
>
> >
> > /*
> > @@ -1249,20 +1358,29 @@ static void fuse_uring_send_in_task(struct io_uring_cmd *cmd,
> > {
> > struct fuse_ring_ent *ent = uring_cmd_to_ring_ent(cmd);
> > struct fuse_ring_queue *queue = ent->queue;
> > + bool send_ent = true;
> > int err;
> >
> > - if (!(issue_flags & IO_URING_F_TASK_DEAD)) {
> > - err = fuse_uring_prepare_send(ent, ent->fuse_req);
> > - if (err) {
> > - if (!fuse_uring_get_next_fuse_req(ent, queue))
> > - return;
> > - err = 0;
> > - }
> > - } else {
> > - err = -ECANCELED;
> > + if (issue_flags & IO_URING_F_TASK_DEAD) {
> > + fuse_uring_send(ent, cmd, -ECANCELED, issue_flags);
> > + return;
> > + }
> > +
> > + err = fuse_uring_map_buffer(ent);
> > + if (err) {
> > + fuse_uring_req_end(ent, ent->fuse_req, err);
> > + return;
>
> I think this needs to abort the connection now. There could be multiple
> commands on the queue and they would be stuck now and there is no
> notification to fuse server either.
This approach makes sense to me and makes things a bit simpler. I'll
add this to v3.
Thanks,
Joanne
next prev parent reply other threads:[~2025-11-06 23:10 UTC|newest]
Thread overview: 40+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-10-27 22:27 [PATCH v2 0/8] fuse: support io-uring registered buffers Joanne Koong
2025-10-27 22:28 ` [PATCH v2 1/8] io_uring/uring_cmd: add io_uring_cmd_import_fixed_full() Joanne Koong
2025-10-28 1:28 ` Caleb Sander Mateos
2025-10-29 14:01 ` Pavel Begunkov
2025-10-29 18:37 ` Joanne Koong
2025-10-29 19:59 ` Bernd Schubert
2025-10-30 17:42 ` Pavel Begunkov
2025-10-30 18:06 ` Pavel Begunkov
2025-10-30 22:23 ` Bernd Schubert
2025-10-30 23:50 ` Joanne Koong
2025-10-31 10:27 ` Bernd Schubert
2025-10-31 21:19 ` Joanne Koong
2025-10-30 23:13 ` Joanne Koong
2025-10-27 22:28 ` [PATCH v2 2/8] fuse: refactor io-uring logic for getting next fuse request Joanne Koong
2025-10-30 23:07 ` Bernd Schubert
2025-10-27 22:28 ` [PATCH v2 3/8] fuse: refactor io-uring header copying to ring Joanne Koong
2025-10-30 23:15 ` Bernd Schubert
2025-10-30 23:52 ` Joanne Koong
2025-10-27 22:28 ` [PATCH v2 4/8] fuse: refactor io-uring header copying from ring Joanne Koong
2025-10-27 22:28 ` [PATCH v2 5/8] fuse: use enum types for header copying Joanne Koong
2025-11-05 23:01 ` Bernd Schubert
2025-11-06 21:59 ` Joanne Koong
2025-11-07 22:11 ` Bernd Schubert
2025-10-27 22:28 ` [PATCH v2 6/8] fuse: add user_ prefix to userspace headers and payload fields Joanne Koong
2025-10-28 1:32 ` Caleb Sander Mateos
2025-10-28 23:56 ` Joanne Koong
2025-11-06 13:35 ` Bernd Schubert
2025-10-27 22:28 ` [PATCH v2 7/8] fuse: refactor setting up copy state for payload copying Joanne Koong
2025-11-06 16:53 ` Bernd Schubert
2025-11-06 22:01 ` Joanne Koong
2025-10-27 22:28 ` [PATCH v2 8/8] fuse: support io-uring registered buffers Joanne Koong
2025-10-28 1:42 ` Caleb Sander Mateos
2025-10-28 23:56 ` Joanne Koong
2025-11-06 19:48 ` Bernd Schubert
2025-11-06 23:09 ` Joanne Koong [this message]
2025-11-07 22:16 ` Bernd Schubert
2025-11-07 22:23 ` Bernd Schubert
2025-11-23 20:12 ` Bernd Schubert
2025-11-25 1:13 ` Joanne Koong
2025-11-14 23:59 ` [PATCH v2 0/8] " Joanne Koong
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAJnrk1YPEDUbOu2N0EjfrkwK3Ge2XrNeaCY0YKL+E1t7Z8Xtvg@mail.gmail.com \
--to=joannelkoong@gmail.com \
--cc=asml.silence@gmail.com \
--cc=axboe@kernel.dk \
--cc=bernd@bsbernd.com \
--cc=bschubert@ddn.com \
--cc=csander@purestorage.com \
--cc=io-uring@vger.kernel.org \
--cc=kernel-team@meta.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=miklos@szeredi.hu \
--cc=xiaobing.li@samsung.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox