From: Bernd Schubert <[email protected]>
To: Joanne Koong <[email protected]>,
Bernd Schubert <[email protected]>
Cc: Miklos Szeredi <[email protected]>, Jens Axboe <[email protected]>,
Pavel Begunkov <[email protected]>,
[email protected], [email protected],
Josef Bacik <[email protected]>,
Amir Goldstein <[email protected]>,
Ming Lei <[email protected]>, David Wei <[email protected]>,
[email protected]
Subject: Re: [PATCH RFC v7 06/16] fuse: {uring} Handle SQEs - register commands
Date: Thu, 28 Nov 2024 19:20:21 +0100 [thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <CAJnrk1YnWFQYG9VTr_1iUwcJmQEg3LemGOGkiqwbaqa4EaMUWw@mail.gmail.com>
Hi Joanne,
thanks for your careful review!
On 11/28/24 03:23, Joanne Koong wrote:
> On Wed, Nov 27, 2024 at 5:41 AM Bernd Schubert <[email protected]> wrote:
>>
>> This adds basic support for ring SQEs (with opcode=IORING_OP_URING_CMD).
>> For now only FUSE_URING_REQ_FETCH is handled to register queue entries.
>>
>> Signed-off-by: Bernd Schubert <[email protected]>
>> ---
>> fs/fuse/Kconfig | 12 ++
>> fs/fuse/Makefile | 1 +
>> fs/fuse/dev.c | 4 +
>> fs/fuse/dev_uring.c | 318 ++++++++++++++++++++++++++++++++++++++++++++++
>> fs/fuse/dev_uring_i.h | 115 +++++++++++++++++
>> fs/fuse/fuse_i.h | 5 +
>> fs/fuse/inode.c | 10 ++
>> include/uapi/linux/fuse.h | 67 ++++++++++
>> 8 files changed, 532 insertions(+)
>>
>> diff --git a/fs/fuse/Kconfig b/fs/fuse/Kconfig
>> index 8674dbfbe59dbf79c304c587b08ebba3cfe405be..11f37cefc94b2af5a675c238801560c822b95f1a 100644
>> --- a/fs/fuse/Kconfig
>> +++ b/fs/fuse/Kconfig
>> @@ -63,3 +63,15 @@ config FUSE_PASSTHROUGH
>> to be performed directly on a backing file.
>>
>> If you want to allow passthrough operations, answer Y.
>> +
>> +config FUSE_IO_URING
>> + bool "FUSE communication over io-uring"
>> + default y
>> + depends on FUSE_FS
>> + depends on IO_URING
>> + help
>> + This allows sending FUSE requests over the IO uring interface and
>> + also adds request core affinity.
>> +
>> + If you want to allow fuse server/client communication through io-uring,
>> + answer Y
>> diff --git a/fs/fuse/Makefile b/fs/fuse/Makefile
>> index ce0ff7a9007b94b4ab246b5271f227d126c768e8..fcf16b1c391a9bf11ca9f3a25b137acdb203ac47 100644
>> --- a/fs/fuse/Makefile
>> +++ b/fs/fuse/Makefile
>> @@ -14,5 +14,6 @@ fuse-y := dev.o dir.o file.o inode.o control.o xattr.o acl.o readdir.o ioctl.o
>> fuse-y += iomode.o
>> fuse-$(CONFIG_FUSE_DAX) += dax.o
>> fuse-$(CONFIG_FUSE_PASSTHROUGH) += passthrough.o
>> +fuse-$(CONFIG_FUSE_IO_URING) += dev_uring.o
>>
>> virtiofs-y := virtio_fs.o
>> diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
>> index 63c3865aebb7811fdf4a5729b2181ee8321421dc..0770373492ae9ee83c4154fede9dcfd7be9fb33d 100644
>> --- a/fs/fuse/dev.c
>> +++ b/fs/fuse/dev.c
>> @@ -6,6 +6,7 @@
>> See the file COPYING.
>> */
>>
>> +#include "dev_uring_i.h"
>> #include "fuse_i.h"
>> #include "fuse_dev_i.h"
>>
>> @@ -2452,6 +2453,9 @@ const struct file_operations fuse_dev_operations = {
>> .fasync = fuse_dev_fasync,
>> .unlocked_ioctl = fuse_dev_ioctl,
>> .compat_ioctl = compat_ptr_ioctl,
>> +#ifdef CONFIG_FUSE_IO_URING
>> + .uring_cmd = fuse_uring_cmd,
>> +#endif
>> };
>> EXPORT_SYMBOL_GPL(fuse_dev_operations);
>>
>> diff --git a/fs/fuse/dev_uring.c b/fs/fuse/dev_uring.c
>> new file mode 100644
>> index 0000000000000000000000000000000000000000..af9c5f116ba1dcf6c01d0359d1a06491c92c32f9
>> --- /dev/null
>> +++ b/fs/fuse/dev_uring.c
>> @@ -0,0 +1,318 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * FUSE: Filesystem in Userspace
>> + * Copyright (c) 2023-2024 DataDirect Networks.
>> + */
>> +
>> +#include <linux/fs.h>
>
> nit: for consistency, should this line be placed directly above the
> other "#include <linux..." line?
>
>> +
>> +#include "fuse_i.h"
>> +#include "dev_uring_i.h"
>> +#include "fuse_dev_i.h"
>> +
>> +#include <linux/io_uring/cmd.h>
>> +
>> +#ifdef CONFIG_FUSE_IO_URING
>> +static bool __read_mostly enable_uring;
>> +module_param(enable_uring, bool, 0644);
>> +MODULE_PARM_DESC(enable_uring,
>> + "Enable uring userspace communication through uring.");
>
> nit: The double uring seems a bit repetitive to me. Maybe just "enable
> uring userspace communication" or "enable userspace communication
> through uring"? Also, super nit but I noticed the other
> MODULE_PARM_DESCs don't have trailing periods.
Fixed to "Enable userspace communication through io-uring"
>
>> +#endif
>> +
>> +bool fuse_uring_enabled(void)
>> +{
>> + return enable_uring;
>> +}
>> +
>> +static int fuse_ring_ent_unset_userspace(struct fuse_ring_ent *ent)
>> +{
>> + struct fuse_ring_queue *queue = ent->queue;
>> +
>> + lockdep_assert_held(&queue->lock);
>> +
>> + if (WARN_ON_ONCE(ent->state != FRRS_USERSPACE))
>> + return -EIO;
>> +
>> + ent->state = FRRS_COMMIT;
>> + list_move(&ent->list, &queue->ent_commit_queue);
>> +
>> + return 0;
>> +}
>> +
>> +void fuse_uring_destruct(struct fuse_conn *fc)
>> +{
>> + struct fuse_ring *ring = fc->ring;
>> + int qid;
>> +
>> + if (!ring)
>> + return;
>> +
>> + for (qid = 0; qid < ring->nr_queues; qid++) {
>> + struct fuse_ring_queue *queue = ring->queues[qid];
>> +
>> + if (!queue)
>> + continue;
>> +
>> + WARN_ON(!list_empty(&queue->ent_avail_queue));
>> + WARN_ON(!list_empty(&queue->ent_commit_queue));
>> +
>> + kfree(queue);
>> + ring->queues[qid] = NULL;
>> + }
>> +
>> + kfree(ring->queues);
>> + kfree(ring);
>> + fc->ring = NULL;
>> +}
>> +
>> +#define FUSE_URING_IOV_SEGS 2 /* header and payload */
>
> nit: to make the code flow easier, might be better to move #defines to
> the top of the file after the includes.
Fixed.
>
>> +
>> +/*
>> + * Basic ring setup for this connection based on the provided configuration
>> + */
>> +static struct fuse_ring *fuse_uring_create(struct fuse_conn *fc)
>> +{
>> + struct fuse_ring *ring = NULL;
>> + size_t nr_queues = num_possible_cpus();
>> + struct fuse_ring *res = NULL;
>> +
>> + ring = kzalloc(sizeof(*fc->ring) +
>> + nr_queues * sizeof(struct fuse_ring_queue),
>
> I think you just need kzalloc(sizeof(*fc->ring)); here since you're
> allocating ring->queues later below
Ah, now I see what Miklos probably meant "with left over from previous
patches". Sorry Miklos, I had misunderstood what you meant and thanks
Joanne for spotting it.
>
>> + GFP_KERNEL_ACCOUNT);
>> + if (!ring)
>> + return NULL;
>> +
>> + ring->queues = kcalloc(nr_queues, sizeof(struct fuse_ring_queue *),
>> + GFP_KERNEL_ACCOUNT);
>> + if (!ring->queues)
>> + goto out_err;
>> +
>> + spin_lock(&fc->lock);
>> + if (fc->ring) {
>> + /* race, another thread created the ring in the mean time */
>
> nit: s/mean time/meantime
Fixed.
>
>> + spin_unlock(&fc->lock);
>> + res = fc->ring;
>> + goto out_err;
>> + }
>> +
>> + fc->ring = ring;
>> + ring->nr_queues = nr_queues;
>> + ring->fc = fc;
>> +
>> + spin_unlock(&fc->lock);
>> + return ring;
>> +
>> +out_err:
>> + if (ring)
>
> I think you meant if (ring->queues)
I really meant ring, initially ring allocation error was via this goto. I just
removed the condition now.
>
>> + kfree(ring->queues);
>> + kfree(ring);
>> + return res;
>> +}
>> +
>> +static struct fuse_ring_queue *fuse_uring_create_queue(struct fuse_ring *ring,
>> + int qid)
>> +{
>> + struct fuse_conn *fc = ring->fc;
>> + struct fuse_ring_queue *queue;
>> +
>> + queue = kzalloc(sizeof(*queue), GFP_KERNEL_ACCOUNT);
>> + if (!queue)
>> + return ERR_PTR(-ENOMEM);
>> + spin_lock(&fc->lock);
>
> This probably doesn't make much of a difference but might be better to
> minimize logic inside the lock, eg do the queue initialization stuff
> outside the lock
Absolutely, fixed.
>
>> + if (ring->queues[qid]) {
>> + spin_unlock(&fc->lock);
>> + kfree(queue);
>> + return ring->queues[qid];
>> + }
>> +
>> + queue->qid = qid;
>> + queue->ring = ring;
>> + spin_lock_init(&queue->lock);
>> +
>> + INIT_LIST_HEAD(&queue->ent_avail_queue);
>> + INIT_LIST_HEAD(&queue->ent_commit_queue);
>> +
>> + WRITE_ONCE(ring->queues[qid], queue);
>
> Just curious, why do we need WRITE_ONCE here if it's already protected
> by the fc->lock?
fuse_uring_fetch() (going to rename it _register) obtains the the ring
without holding a lock. Only when that fails it calls into
fuse_uring_create(), which then obtains the lock.
I.e. this is optimized to not need the lock at all.
Btw, interesting part here is that I always thought that with a single
libfuse ring thread we should never run into races, until I noticed
with large queue sizes (>128) that there were races in fuse_uring_fetch,
because fuse_uring_create_queue had assigned "ring->queues[qid] = queue"
too early, before initialization of the spin lock (some lockdep
annotations about the queue spin lock not being initialized came up).
>
>> + spin_unlock(&fc->lock);
>> +
>> + return queue;
>> +}
>> +
>> +/*
>> + * Make a ring entry available for fuse_req assignment
>> + */
>> +static void fuse_uring_ent_avail(struct fuse_ring_ent *ring_ent,
>> + struct fuse_ring_queue *queue)
>> +{
>> + list_move(&ring_ent->list, &queue->ent_avail_queue);
>> + ring_ent->state = FRRS_WAIT;
>> +}
>> +
>> +/*
>> + * fuse_uring_req_fetch command handling
>> + */
>> +static void _fuse_uring_fetch(struct fuse_ring_ent *ring_ent,
>> + struct io_uring_cmd *cmd,
>> + unsigned int issue_flags)
>> +{
>> + struct fuse_ring_queue *queue = ring_ent->queue;
>> +
>> + spin_lock(&queue->lock);
>> + fuse_uring_ent_avail(ring_ent, queue);
>> + ring_ent->cmd = cmd;
>> + spin_unlock(&queue->lock);
>> +}
>> +
>> +/*
>> + * sqe->addr is a ptr to an iovec array, iov[0] has the headers, iov[1]
>> + * the payload
>> + */
>> +static int fuse_uring_get_iovec_from_sqe(const struct io_uring_sqe *sqe,
>> + struct iovec iov[FUSE_URING_IOV_SEGS])
>> +{
>> + struct iovec __user *uiov = u64_to_user_ptr(READ_ONCE(sqe->addr));
>> + struct iov_iter iter;
>> + ssize_t ret;
>> +
>> + if (sqe->len != FUSE_URING_IOV_SEGS)
>> + return -EINVAL;
>> +
>> + /*
>> + * Direction for buffer access will actually be READ and WRITE,
>> + * using write for the import should include READ access as well.
>> + */
>> + ret = import_iovec(WRITE, uiov, FUSE_URING_IOV_SEGS,
>> + FUSE_URING_IOV_SEGS, &iov, &iter);
>> + if (ret < 0)
>> + return ret;
>> +
>> + return 0;
>> +}
>> +
>> +static int fuse_uring_fetch(struct io_uring_cmd *cmd, unsigned int issue_flags,
>> + struct fuse_conn *fc)
>> +{
>> + const struct fuse_uring_cmd_req *cmd_req = io_uring_sqe_cmd(cmd->sqe);
>> + struct fuse_ring *ring = fc->ring;
>> + struct fuse_ring_queue *queue;
>> + struct fuse_ring_ent *ring_ent;
>> + int err;
>> + struct iovec iov[FUSE_URING_IOV_SEGS];
>> +
>> + err = fuse_uring_get_iovec_from_sqe(cmd->sqe, iov);
>> + if (err) {
>> + pr_info_ratelimited("Failed to get iovec from sqe, err=%d\n",
>> + err);
>> + return err;
>> + }
>> +
>> + err = -ENOMEM;
>> + if (!ring) {
>> + ring = fuse_uring_create(fc);
>> + if (!ring)
>> + return err;
>> + }
>> +
>> + queue = ring->queues[cmd_req->qid];
>> + if (!queue) {
>> + queue = fuse_uring_create_queue(ring, cmd_req->qid);
>> + if (!queue)
>> + return err;
>> + }
>> +
>> + /*
>> + * The created queue above does not need to be destructed in
>> + * case of entry errors below, will be done at ring destruction time.
>> + */
>> +
>> + ring_ent = kzalloc(sizeof(*ring_ent), GFP_KERNEL_ACCOUNT);
>> + if (ring_ent == NULL)
>
> nit: !ring_ent
Fixed.
>
>> + return err;
>> +
>> + INIT_LIST_HEAD(&ring_ent->list);
>> +
>> + ring_ent->queue = queue;
>> + ring_ent->cmd = cmd;
>> +
>> + err = -EINVAL;
>> + if (iov[0].iov_len < sizeof(struct fuse_uring_req_header)) {
>> + pr_info_ratelimited("Invalid header len %zu\n", iov[0].iov_len);
>> + goto err;
>> + }
>> +
>> + ring_ent->headers = iov[0].iov_base;
>> + ring_ent->payload = iov[1].iov_base;
>> + ring_ent->max_arg_len = iov[1].iov_len;
>> +
>> + if (ring_ent->max_arg_len <
>> + max_t(size_t, FUSE_MIN_READ_BUFFER, fc->max_write)) {
>
> If I'm understanding this correctly, iov[0] is the header and iov[1]
> is the payload. Is this right that the payload len must always be
> equal to fc->max_write?
I'm going to drop ring_ent->max_arg_len for the next version, but it will
eventually come back once I had support for multiple payload sizes.
Typically we will need many small requests, less mid size requests and few
large size requests. Just think of a queue size of 128 and a recent system
with 192 cores.
>
> Also, do we need to take into account fc->max_pages too?
Oh yes, thanks for spotting!
>
>> + pr_info_ratelimited("Invalid req payload len %zu\n",
>> + ring_ent->max_arg_len);
>> + goto err;
>> + }
>> +
>> + spin_lock(&queue->lock);
>> +
>> + /*
>> + * FUSE_URING_REQ_FETCH is an initialization exception, needs
>> + * state override
>> + */
>> + ring_ent->state = FRRS_USERSPACE;
>> + err = fuse_ring_ent_unset_userspace(ring_ent);
>> + spin_unlock(&queue->lock);
>> + if (WARN_ON_ONCE(err != 0))
>
> nit: WARN_ON_ONCE(err)
Fixed.
>
>> + goto err;
>> +
>> + _fuse_uring_fetch(ring_ent, cmd, issue_flags);
>> +
>> + return 0;
>> +err:
>> + list_del_init(&ring_ent->list);
>> + kfree(ring_ent);
>> + return err;
>> +}
>> +
>> +/*
>> + * Entry function from io_uring to handle the given passthrough command
>> + * (op cocde IORING_OP_URING_CMD)
>> + */
>> +int fuse_uring_cmd(struct io_uring_cmd *cmd, unsigned int issue_flags)
>> +{
>> + struct fuse_dev *fud;
>> + struct fuse_conn *fc;
>> + u32 cmd_op = cmd->cmd_op;
>> + int err;
>> +
>> + /* Disabled for now, especially as teardown is not implemented yet */
>> + pr_info_ratelimited("fuse-io-uring is not enabled yet\n");
>> + return -EOPNOTSUPP;
>> +
>> + if (!enable_uring) {
>> + pr_info_ratelimited("fuse-io-uring is disabled\n");
>> + return -EOPNOTSUPP;
>> + }
>> +
>> + fud = fuse_get_dev(cmd->file);
>> + if (!fud) {
>> + pr_info_ratelimited("No fuse device found\n");
>> + return -ENOTCONN;
>> + }
>> + fc = fud->fc;
>> +
>> + if (!fc->connected || fc->aborted)
>> + return fc->aborted ? -ECONNABORTED : -ENOTCONN;
>
> I find
> if (fc->aborted)
> return -ECONNABORTED;
> if (!fc->connected)
> return -ENOTCONN;
>
> easier to read
Fine with me.
>> +
>> + switch (cmd_op) {
>> + case FUSE_URING_REQ_FETCH:
>
> FUSE_URING_REQ_FETCH is only used for initialization, would it be
> clearer if this was named FUSE_URING_INIT or something like that?
FUSE_URING_CMD_REGISTER?
>
>> + err = fuse_uring_fetch(cmd, issue_flags, fc);
>> + if (err) {
>> + pr_info_once("fuse_uring_fetch failed err=%d\n", err);
>> + return err;
>> + }
>> + break;
>> + default:
>> + return -EINVAL;
>> + }
>> +
>> + return -EIOCBQUEUED;
>> +}
>> diff --git a/fs/fuse/dev_uring_i.h b/fs/fuse/dev_uring_i.h
>> new file mode 100644
>> index 0000000000000000000000000000000000000000..75c644cc0b2bb3721b08f8695964815d53f46e92
>> --- /dev/null
>> +++ b/fs/fuse/dev_uring_i.h
>> @@ -0,0 +1,115 @@
>> +/* SPDX-License-Identifier: GPL-2.0
>> + *
>> + * FUSE: Filesystem in Userspace
>> + * Copyright (c) 2023-2024 DataDirect Networks.
>> + */
>> +
>> +#ifndef _FS_FUSE_DEV_URING_I_H
>> +#define _FS_FUSE_DEV_URING_I_H
>> +
>> +#include "fuse_i.h"
>> +
>> +#ifdef CONFIG_FUSE_IO_URING
>> +
>> +enum fuse_ring_req_state {
>> + FRRS_INVALID = 0,
>> +
>> + /* The ring entry received from userspace and it being processed */
>
> nit: "it is being processed"
>
>> + FRRS_COMMIT,
>> +
>> + /* The ring entry is waiting for new fuse requests */
>> + FRRS_WAIT,
>> +
>> + /* The ring entry is in or on the way to user space */
>> + FRRS_USERSPACE,
>> +};
>> +
>> +/** A fuse ring entry, part of the ring queue */
>> +struct fuse_ring_ent {
>> + /* userspace buffer */
>> + struct fuse_uring_req_header __user *headers;
>> + void *__user *payload;
>> +
>> + /* the ring queue that owns the request */
>> + struct fuse_ring_queue *queue;
>> +
>> + struct io_uring_cmd *cmd;
>> +
>> + struct list_head list;
>> +
>> + /* size of payload buffer */
>> + size_t max_arg_len;
>> +
>> + /*
>> + * state the request is currently in
>> + * (enum fuse_ring_req_state)
>> + */
>> + unsigned int state;
>> +
>> + struct fuse_req *fuse_req;
>> +};
>> +
>> +struct fuse_ring_queue {
>> + /*
>> + * back pointer to the main fuse uring structure that holds this
>> + * queue
>> + */
>> + struct fuse_ring *ring;
>> +
>> + /* queue id, typically also corresponds to the cpu core */
>> + unsigned int qid;
>> +
>> + /*
>> + * queue lock, taken when any value in the queue changes _and_ also
>> + * a ring entry state changes.
>> + */
>> + spinlock_t lock;
>> +
>> + /* available ring entries (struct fuse_ring_ent) */
>> + struct list_head ent_avail_queue;
>> +
>> + /*
>> + * entries in the process of being committed or in the process
>> + * to be send to userspace
>> + */
>> + struct list_head ent_commit_queue;
>> +};
>> +
>> +/**
>> + * Describes if uring is for communication and holds alls the data needed
>> + * for uring communication
>> + */
>> +struct fuse_ring {
>> + /* back pointer */
>> + struct fuse_conn *fc;
>> +
>> + /* number of ring queues */
>> + size_t nr_queues;
>> +
>> + struct fuse_ring_queue **queues;
>> +};
>> +
>> +bool fuse_uring_enabled(void);
>> +void fuse_uring_destruct(struct fuse_conn *fc);
>> +int fuse_uring_cmd(struct io_uring_cmd *cmd, unsigned int issue_flags);
>> +
>> +#else /* CONFIG_FUSE_IO_URING */
>> +
>> +struct fuse_ring;
>> +
>> +static inline void fuse_uring_create(struct fuse_conn *fc)
>> +{
>> +}
>> +
>> +static inline void fuse_uring_destruct(struct fuse_conn *fc)
>> +{
>> +}
>> +
>> +static inline bool fuse_uring_enabled(void)
>> +{
>> + return false;
>> +}
>> +
>> +#endif /* CONFIG_FUSE_IO_URING */
>> +
>> +#endif /* _FS_FUSE_DEV_URING_I_H */
>> diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
>> index e3748751e231d0991c050b31bdd84db0b8016f9f..a21256ec4c3b4bd7c67eae2d03b68d87dcc8234b 100644
>> --- a/fs/fuse/fuse_i.h
>> +++ b/fs/fuse/fuse_i.h
>> @@ -914,6 +914,11 @@ struct fuse_conn {
>> /** IDR for backing files ids */
>> struct idr backing_files_map;
>> #endif
>> +
>> +#ifdef CONFIG_FUSE_IO_URING
>> + /** uring connection information*/
>> + struct fuse_ring *ring;
>> +#endif
>> };
>>
>> /*
>> diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
>> index fd3321e29a3e569bf06be22a5383cf34fd42c051..76267c79e920204175e5713853de8214c5555d46 100644
>> --- a/fs/fuse/inode.c
>> +++ b/fs/fuse/inode.c
>> @@ -7,6 +7,7 @@
>> */
>>
>> #include "fuse_i.h"
>> +#include "dev_uring_i.h"
>>
>> #include <linux/pagemap.h>
>> #include <linux/slab.h>
>> @@ -959,6 +960,8 @@ static void delayed_release(struct rcu_head *p)
>> {
>> struct fuse_conn *fc = container_of(p, struct fuse_conn, rcu);
>>
>> + fuse_uring_destruct(fc);
>> +
>> put_user_ns(fc->user_ns);
>> fc->release(fc);
>> }
>> @@ -1413,6 +1416,13 @@ void fuse_send_init(struct fuse_mount *fm)
>> if (IS_ENABLED(CONFIG_FUSE_PASSTHROUGH))
>> flags |= FUSE_PASSTHROUGH;
>>
>> + /*
>> + * This is just an information flag for fuse server. No need to check
>> + * the reply - server is either sending IORING_OP_URING_CMD or not.
>> + */
>> + if (fuse_uring_enabled())
>> + flags |= FUSE_OVER_IO_URING;
>> +
>> ia->in.flags = flags;
>> ia->in.flags2 = flags >> 32;
>>
>> diff --git a/include/uapi/linux/fuse.h b/include/uapi/linux/fuse.h
>> index f1e99458e29e4fdce5273bc3def242342f207ebd..6d39077edf8cde4fa77130efcec16323839a676c 100644
>> --- a/include/uapi/linux/fuse.h
>> +++ b/include/uapi/linux/fuse.h
>> @@ -220,6 +220,14 @@
>> *
>> * 7.41
>> * - add FUSE_ALLOW_IDMAP
>> + * 7.42
>> + * - Add FUSE_OVER_IO_URING and all other io-uring related flags and data
>> + * structures:
>> + * - fuse_uring_ent_in_out
>> + * - fuse_uring_req_header
>> + * - fuse_uring_cmd_req
>> + * - FUSE_URING_IN_OUT_HEADER_SZ
>> + * - FUSE_URING_OP_IN_OUT_SZ
>> */
>>
>> #ifndef _LINUX_FUSE_H
>> @@ -425,6 +433,7 @@ struct fuse_file_lock {
>> * FUSE_HAS_RESEND: kernel supports resending pending requests, and the high bit
>> * of the request ID indicates resend requests
>> * FUSE_ALLOW_IDMAP: allow creation of idmapped mounts
>> + * FUSE_OVER_IO_URING: Indicate that Client supports io-uring
>> */
>> #define FUSE_ASYNC_READ (1 << 0)
>> #define FUSE_POSIX_LOCKS (1 << 1)
>> @@ -471,6 +480,7 @@ struct fuse_file_lock {
>> /* Obsolete alias for FUSE_DIRECT_IO_ALLOW_MMAP */
>> #define FUSE_DIRECT_IO_RELAX FUSE_DIRECT_IO_ALLOW_MMAP
>> #define FUSE_ALLOW_IDMAP (1ULL << 40)
>> +#define FUSE_OVER_IO_URING (1ULL << 41)
>>
>> /**
>> * CUSE INIT request/reply flags
>> @@ -1206,4 +1216,61 @@ struct fuse_supp_groups {
>> uint32_t groups[];
>> };
>>
>> +/**
>> + * Size of the ring buffer header
>> + */
>> +#define FUSE_URING_IN_OUT_HEADER_SZ 128
>> +#define FUSE_URING_OP_IN_OUT_SZ 128
>> +
>> +struct fuse_uring_ent_in_out {
>> + uint64_t flags;
>> +
>> + /* size of use payload buffer */
>> + uint32_t payload_sz;
>> + uint32_t padding;
>> +
>> + uint8_t reserved[30];
>
> out of curiosity, how was 30 chosen here? I think this makes the
> struct 46 bytes?
Clearly a bug, that was supposed to be 16. I changed it to
uint64_t reserved[2];
>
>> +};
>> +
>> +/**
>> + * This structure mapped onto the
>> + */
>> +struct fuse_uring_req_header {
>> + /* struct fuse_in / struct fuse_out */
>> + char in_out[FUSE_URING_IN_OUT_HEADER_SZ];
>> +
>> + /* per op code structs */
>> + char op_in[FUSE_URING_OP_IN_OUT_SZ];
>> +
>> + /* struct fuse_ring_in_out */
>> + char ring_ent_in_out[sizeof(struct fuse_uring_ent_in_out)];
>> +};
>> +
>> +/**
>> + * sqe commands to the kernel
>> + */
>> +enum fuse_uring_cmd {
>> + FUSE_URING_REQ_INVALID = 0,
>> +
>> + /* submit sqe to kernel to get a request */
>> + FUSE_URING_REQ_FETCH = 1,
>> +
>> + /* commit result and fetch next request */
>> + FUSE_URING_REQ_COMMIT_AND_FETCH = 2,
>> +};
>> +
>> +/**
>> + * In the 80B command area of the SQE.
>> + */
>> +struct fuse_uring_cmd_req {
>> + uint64_t flags;
>> +
>> + /* entry identifier */
>> + uint64_t commit_id;
>> +
>> + /* queue the command is for (queue index) */
>> + uint16_t qid;
>> + uint8_t padding[6];
>> +};
>> +
>> #endif /* _LINUX_FUSE_H */
>>
>
> I'll be out the rest of this week for an American holiday
> (Thanksgiving) and next week for a work trip but will review the rest
> of this patchset after that when i get back.
Thanks so much for your review! Happy Thanksgiving and safe travels!
Thanks,
Bernd
next prev parent reply other threads:[~2024-11-28 18:20 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-11-27 13:40 [PATCH RFC v7 00/16] fuse: fuse-over-io-uring Bernd Schubert
2024-11-27 13:40 ` [PATCH RFC v7 01/16] fuse: rename to fuse_dev_end_requests and make non-static Bernd Schubert
2024-11-28 0:19 ` Joanne Koong
2024-11-27 13:40 ` [PATCH RFC v7 02/16] fuse: Move fuse_get_dev to header file Bernd Schubert
2024-11-28 0:20 ` Joanne Koong
2024-11-27 13:40 ` [PATCH RFC v7 03/16] fuse: Move request bits Bernd Schubert
2024-11-28 0:21 ` Joanne Koong
2024-11-27 13:40 ` [PATCH RFC v7 04/16] fuse: Add fuse-io-uring design documentation Bernd Schubert
2024-12-03 12:30 ` Pavel Begunkov
2024-11-27 13:40 ` [PATCH RFC v7 05/16] fuse: make args->in_args[0] to be always the header Bernd Schubert
2024-11-28 0:27 ` Joanne Koong
2024-11-27 13:40 ` [PATCH RFC v7 06/16] fuse: {uring} Handle SQEs - register commands Bernd Schubert
2024-11-28 2:23 ` Joanne Koong
2024-11-28 18:20 ` Bernd Schubert [this message]
2024-12-03 13:24 ` Pavel Begunkov
2024-12-03 13:49 ` Bernd Schubert
2024-12-03 14:16 ` Pavel Begunkov
2024-12-03 13:38 ` Pavel Begunkov
2024-11-27 13:40 ` [PATCH RFC v7 07/16] fuse: Make fuse_copy non static Bernd Schubert
2024-11-27 13:40 ` [PATCH RFC v7 08/16] fuse: Add fuse-io-uring handling into fuse_copy Bernd Schubert
2024-11-27 13:40 ` [PATCH RFC v7 09/16] fuse: {uring} Add uring sqe commit and fetch support Bernd Schubert
2024-12-03 13:47 ` Pavel Begunkov
2024-11-27 13:40 ` [PATCH RFC v7 10/16] fuse: {uring} Handle teardown of ring entries Bernd Schubert
2024-11-27 13:40 ` [PATCH RFC v7 11/16] fuse: {uring} Allow to queue fg requests through io-uring Bernd Schubert
2024-12-03 14:09 ` Pavel Begunkov
2024-12-03 22:46 ` Bernd Schubert
2024-11-27 13:40 ` [PATCH RFC v7 12/16] fuse: {uring} Allow to queue bg " Bernd Schubert
2024-11-27 13:40 ` [PATCH RFC v7 13/16] io_uring/cmd: let cmds to know about dying task Bernd Schubert
2024-12-03 12:15 ` Pavel Begunkov
2024-12-03 12:15 ` Bernd Schubert
2024-11-27 13:40 ` [PATCH RFC v7 14/16] fuse: {uring} Handle IO_URING_F_TASK_DEAD Bernd Schubert
2024-12-03 12:20 ` Pavel Begunkov
2024-11-27 13:40 ` [PATCH RFC v7 15/16] fuse: {io-uring} Prevent mount point hang on fuse-server termination Bernd Schubert
2024-11-27 13:40 ` [PATCH RFC v7 16/16] fuse: enable fuse-over-io-uring Bernd Schubert
2024-11-27 13:45 ` [PATCH RFC v7 00/16] fuse: fuse-over-io-uring Bernd Schubert
2024-12-03 14:24 ` Pavel Begunkov
2024-12-03 14:32 ` Bernd Schubert
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox