From: Hao Xu <[email protected]>
To: Jens Axboe <[email protected]>, [email protected]
Cc: [email protected]
Subject: Re: [PATCH 2/2] io_uring: add support for passing fixed file descriptors
Date: Sat, 18 Jun 2022 19:02:53 +0800
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>

On 6/17/22 21:45, Jens Axboe wrote:
> With IORING_OP_MSG_RING, one ring can send a message to another ring.
> Extend that support to also allow sending a fixed file descriptor to
> that ring, enabling one ring to pass a registered descriptor to another
> one.
>
> Arguments are extended to pass in:
>
> sqe->addr3 fixed file slot in source ring
> sqe->file_index fixed file slot in destination ring
>
> IORING_OP_MSG_RING is extended to take a command argument in sqe->addr.
> If set to zero (or IORING_MSG_DATA), it sends just a message like before.
> If set to IORING_MSG_SEND_FD, a fixed file descriptor is sent according
> to the above arguments.
>
> Undecided:
> - Should we post a cqe with the send, or require that the sender
> just link a separate IORING_OP_MSG_RING? This makes error
> handling easier, as we cannot easily retract the installed
> file descriptor if the target CQ ring is full. Right now we do
> fill a CQE. If the request completes with -EOVERFLOW, then the
> sender must re-send a CQE if the target must get notified.
Hi Jens,

Since we already have the open/accept direct feature, this may be
useful. But I can't think of a real case where people use two rings and
need to operate on the same fd.

Assuming there are real cases, filling a cqe is necessary, since users
need to make sure the desired fd is registered before doing anything
with it.

A downside is that users have to be careful about fd delivery,
especially when slots are in short supply in target_ctx:

    ctx                             target_ctx
    msg1(fd1 to target slot x)
    msg2(fd2 to target slot x)
                                    get cqe of msg1
                                    do something to fd1 by accessing slot x

Here msg2 is issued at the wrong time: it may land in slot x before
target_ctx has used fd1. In short, not only does ctx need to fill a cqe
in target_ctx to signal that the file has been registered, but
target_ctx also has to tell ctx "my slot x is free now for you to
deliver an fd". So I guess users will be inclined to allocate a big
fixed table and deliver fds to target_ctx in different slots, which is
OK but still a limitation.
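
To make the slot handshake concrete, here is a minimal user-space sketch
of the bookkeeping a sender could keep per target slot. It is only a
model of the idea: struct slot_table, send_fd() and consume_fd() are
hypothetical names, nothing here is io_uring API, and in a real setup
the "slot is free" signal would have to travel back from target_ctx to
ctx as a message of its own.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_SLOTS   4u
#define SLOT_EMPTY (-1)

/* Hypothetical model of the target ring's fixed file table. */
struct slot_table {
	int  fd[NR_SLOTS];   /* fd delivered to each slot, or SLOT_EMPTY */
	bool busy[NR_SLOTS]; /* set on delivery, cleared when the target is done */
};

static void slot_table_init(struct slot_table *t)
{
	for (unsigned int i = 0; i < NR_SLOTS; i++) {
		t->fd[i] = SLOT_EMPTY;
		t->busy[i] = false;
	}
}

/*
 * Sender side: deliver fd to a slot only if the target has released it.
 * Returns 0 on success, -1 if the slot is invalid or still in use.
 */
static int send_fd(struct slot_table *t, unsigned int slot, int fd)
{
	if (slot >= NR_SLOTS || t->busy[slot])
		return -1;
	t->fd[slot] = fd;
	t->busy[slot] = true;
	return 0;
}

/*
 * Target side: take the fd out of the slot and mark the slot free,
 * i.e. the "my slot x is free now" step. Returns SLOT_EMPTY if nothing
 * was delivered to that slot.
 */
static int consume_fd(struct slot_table *t, unsigned int slot)
{
	int fd;

	if (slot >= NR_SLOTS || !t->busy[slot])
		return SLOT_EMPTY;
	fd = t->fd[slot];
	t->fd[slot] = SLOT_EMPTY;
	t->busy[slot] = false;
	return fd;
}
```

With this model, a second send_fd() to slot x fails until consume_fd()
has run for that slot, which is exactly the extra round trip described
above.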
>
> - Add an IORING_MSG_MOVE_FD which moves the descriptor, removing
> it from the source ring when installed in the target? Again
> error handling is difficult.
>
> Signed-off-by: Jens Axboe <[email protected]>
> ---
> include/uapi/linux/io_uring.h | 8 +++
> io_uring/msg_ring.c | 122 ++++++++++++++++++++++++++++++++--
> 2 files changed, 123 insertions(+), 7 deletions(-)
>
> diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
> index 8715f0942ec2..dbdaeef3ea89 100644
> --- a/include/uapi/linux/io_uring.h
> +++ b/include/uapi/linux/io_uring.h
> @@ -264,6 +264,14 @@ enum io_uring_op {
> */
> #define IORING_ACCEPT_MULTISHOT (1U << 0)
>
> +/*
> + * IORING_OP_MSG_RING command types, stored in sqe->addr
> + */
> +enum {
> + IORING_MSG_DATA, /* pass sqe->len as 'res' and off as user_data */
> + IORING_MSG_SEND_FD, /* send a registered fd to another ring */
> +};
> +
> /*
> * IO completion data structure (Completion Queue Entry)
> */
> diff --git a/io_uring/msg_ring.c b/io_uring/msg_ring.c
> index b02be2349652..e9d6fb25d141 100644
> --- a/io_uring/msg_ring.c
> +++ b/io_uring/msg_ring.c
> @@ -3,46 +3,154 @@
> #include <linux/errno.h>
> #include <linux/file.h>
> #include <linux/slab.h>
> +#include <linux/nospec.h>
> #include <linux/io_uring.h>
>
> #include <uapi/linux/io_uring.h>
>
> #include "io_uring.h"
> +#include "rsrc.h"
> +#include "filetable.h"
> #include "msg_ring.h"
>
> struct io_msg {
> struct file *file;
> u64 user_data;
> u32 len;
> + u32 cmd;
> + u32 src_fd;
> + u32 dst_fd;
> };
>
> +static int io_msg_ring_data(struct io_kiocb *req)
> +{
> + struct io_ring_ctx *target_ctx = req->file->private_data;
> + struct io_msg *msg = io_kiocb_to_cmd(req);
> +
> + if (msg->src_fd || msg->dst_fd)
> + return -EINVAL;
> +
> + if (io_post_aux_cqe(target_ctx, msg->user_data, msg->len, 0))
> + return 0;
> +
> + return -EOVERFLOW;
> +}
> +
> +static void io_double_unlock_ctx(struct io_ring_ctx *ctx,
> + struct io_ring_ctx *octx,
> + unsigned int issue_flags)
> +{
> + if (issue_flags & IO_URING_F_UNLOCKED)
> + mutex_unlock(&ctx->uring_lock);
> + mutex_unlock(&octx->uring_lock);
> +}
> +
> +static int io_double_lock_ctx(struct io_ring_ctx *ctx,
> + struct io_ring_ctx *octx,
> + unsigned int issue_flags)
> +{
> + /*
> + * To ensure proper ordering between the two ctxs, we can only
> + * attempt a trylock on the target. If that fails and we already have
> + * the source ctx lock, punt to io-wq.
> + */
> + if (!(issue_flags & IO_URING_F_UNLOCKED)) {
> + if (!mutex_trylock(&octx->uring_lock))
> + return -EAGAIN;
> + return 0;
> + }
> +
> + /* Always grab smallest value ctx first. */
> + if (ctx < octx) {
> + mutex_lock(&ctx->uring_lock);
> + mutex_lock(&octx->uring_lock);
> + } else if (ctx > octx) {

Would a simple else work?

    if (a < b) {
        lock(a); lock(b);
    } else {
        lock(b); lock(a);
    }

since a doesn't equal b
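
For reference, the same address-ordering idea as a small user-space
sketch with pthread mutexes. This is an illustration only, not the
kernel code: double_lock() and double_unlock() are hypothetical helpers,
and the sketch assumes the caller has already rejected the a == b case.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/*
 * Lock two distinct mutexes in address order, so that two threads
 * locking the same pair always take them in the same sequence and
 * cannot deadlock against each other.
 */
static void double_lock(pthread_mutex_t *a, pthread_mutex_t *b)
{
	assert(a != b); /* assumption: equal pointers were rejected earlier */

	if ((uintptr_t)a < (uintptr_t)b) {
		pthread_mutex_lock(a);
		pthread_mutex_lock(b);
	} else {
		pthread_mutex_lock(b);
		pthread_mutex_lock(a);
	}
}

/* Unlock order does not matter for deadlock avoidance. */
static void double_unlock(pthread_mutex_t *a, pthread_mutex_t *b)
{
	pthread_mutex_unlock(a);
	pthread_mutex_unlock(b);
}
```

Since the two pointers are known to differ, the plain else branch covers
the a > b case and no third comparison is needed.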

Thread overview (5+ messages):
2022-06-17 13:45 [PATCHSET RFC for-next 0/2] Add direct descriptor ring passing Jens Axboe
2022-06-17 13:45 ` [PATCH 1/2] io_uring: split out fixed file installation and removal Jens Axboe
2022-06-17 13:45 ` [PATCH 2/2] io_uring: add support for passing fixed file descriptors Jens Axboe
2022-06-18 11:02 ` Hao Xu [this message]
2022-06-18 11:34 ` Jens Axboe