public inbox for [email protected]
 help / color / mirror / Atom feed
From: Hao Xu <[email protected]>
To: Jens Axboe <[email protected]>, [email protected]
Cc: [email protected]
Subject: Re: [PATCH 2/2] io_uring: add support for passing fixed file descriptors
Date: Sat, 18 Jun 2022 19:02:53 +0800	[thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>

On 6/17/22 21:45, Jens Axboe wrote:
> With IORING_OP_MSG_RING, one ring can send a message to another ring.
> Extend that support to also allow sending a fixed file descriptor to
> that ring, enabling one ring to pass a registered descriptor to another
> one.
> 
> Arguments are extended to pass in:
> 
> sqe->addr3	fixed file slot in source ring
> sqe->file_index	fixed file slot in destination ring
> 
> IORING_OP_MSG_RING is extended to take a command argument in sqe->addr.
> If set to zero (or IORING_MSG_DATA), it sends just a message like before.
> If set to IORING_MSG_SEND_FD, a fixed file descriptor is sent according
> to the above arguments.
> 
> Undecided:
> 	- Should we post a cqe with the send, or require that the sender
> 	  just link a separate IORING_OP_MSG_RING? This makes error
> 	  handling easier, as we cannot easily retract the installed
> 	  file descriptor if the target CQ ring is full. Right now we do
> 	  fill a CQE. If the request completes with -EOVERFLOW, then the
> 	  sender must re-send a CQE if the target must get notified.

Hi Jens,
Since we are have open/accept direct feature, this may be useful. But I
just can't think of a real case that people use two rings and need to do
operations to same fd.
Assume there are real cases, then filling a cqe is necessary since users
need to first make sure the desired fd is registered before doing
something to it.

A downside is users have to take care to do fd delivery especially
when slot resource is in short supply in target_ctx.

                 ctx                            target_ctx
     msg1(fd1 to target slot x)

     msg2(fd2 to target slot x)

                                              get cqe of msg1
                                   do something to fd1 by access slot x


the msg2 is issued not at the right time. In short not only ctx needs to
fill a cqe to target_ctx to inform that the file has been registered
but also the target_ctx has to tell ctx that "my slot x is free now
for you to deliver fd". So I guess users are inclined to allocate a
big fixed table and deliver fds to target_ctx in different slots,
Which is ok but anyway a limitation.

> 
> 	- Add an IORING_MSG_MOVE_FD which moves the descriptor, removing
> 	  it from the source ring when installed in the target? Again
> 	  error handling is difficult.
> 
> Signed-off-by: Jens Axboe <[email protected]>
> ---
>   include/uapi/linux/io_uring.h |   8 +++
>   io_uring/msg_ring.c           | 122 ++++++++++++++++++++++++++++++++--
>   2 files changed, 123 insertions(+), 7 deletions(-)
> 
> diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
> index 8715f0942ec2..dbdaeef3ea89 100644
> --- a/include/uapi/linux/io_uring.h
> +++ b/include/uapi/linux/io_uring.h
> @@ -264,6 +264,14 @@ enum io_uring_op {
>    */
>   #define IORING_ACCEPT_MULTISHOT	(1U << 0)
>   
> +/*
> + * IORING_OP_MSG_RING command types, stored in sqe->addr
> + */
> +enum {
> +	IORING_MSG_DATA,	/* pass sqe->len as 'res' and off as user_data */
> +	IORING_MSG_SEND_FD,	/* send a registered fd to another ring */
> +};
> +
>   /*
>    * IO completion data structure (Completion Queue Entry)
>    */
> diff --git a/io_uring/msg_ring.c b/io_uring/msg_ring.c
> index b02be2349652..e9d6fb25d141 100644
> --- a/io_uring/msg_ring.c
> +++ b/io_uring/msg_ring.c
> @@ -3,46 +3,154 @@
>   #include <linux/errno.h>
>   #include <linux/file.h>
>   #include <linux/slab.h>
> +#include <linux/nospec.h>
>   #include <linux/io_uring.h>
>   
>   #include <uapi/linux/io_uring.h>
>   
>   #include "io_uring.h"
> +#include "rsrc.h"
> +#include "filetable.h"
>   #include "msg_ring.h"
>   
>   struct io_msg {
>   	struct file			*file;
>   	u64 user_data;
>   	u32 len;
> +	u32 cmd;
> +	u32 src_fd;
> +	u32 dst_fd;
>   };
>   
> +static int io_msg_ring_data(struct io_kiocb *req)
> +{
> +	struct io_ring_ctx *target_ctx = req->file->private_data;
> +	struct io_msg *msg = io_kiocb_to_cmd(req);
> +
> +	if (msg->src_fd || msg->dst_fd)
> +		return -EINVAL;
> +
> +	if (io_post_aux_cqe(target_ctx, msg->user_data, msg->len, 0))
> +		return 0;
> +
> +	return -EOVERFLOW;
> +}
> +
> +static void io_double_unlock_ctx(struct io_ring_ctx *ctx,
> +				 struct io_ring_ctx *octx,
> +				 unsigned int issue_flags)
> +{
> +	if (issue_flags & IO_URING_F_UNLOCKED)
> +		mutex_unlock(&ctx->uring_lock);
> +	mutex_unlock(&octx->uring_lock);
> +}
> +
> +static int io_double_lock_ctx(struct io_ring_ctx *ctx,
> +			      struct io_ring_ctx *octx,
> +			      unsigned int issue_flags)
> +{
> +	/*
> +	 * To ensure proper ordering between the two ctxs, we can only
> +	 * attempt a trylock on the target. If that fails and we already have
> +	 * the source ctx lock, punt to io-wq.
> +	 */
> +	if (!(issue_flags & IO_URING_F_UNLOCKED)) {
> +		if (!mutex_trylock(&octx->uring_lock))
> +			return -EAGAIN;
> +		return 0;
> +	}
> +
> +	/* Always grab smallest value ctx first. */
> +	if (ctx < octx) {
> +		mutex_lock(&ctx->uring_lock);
> +		mutex_lock(&octx->uring_lock);
> +	} else if (ctx > octx) {


Would a simple else work?
if (a < b) {
   lock(a); lock(b);
} else {
   lock(b);lock(a);
}

since a doesn't equal b





  reply	other threads:[~2022-06-18 11:03 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-17 13:45 [PATCHSET RFC for-next 0/2] Add direct descriptor ring passing Jens Axboe
2022-06-17 13:45 ` [PATCH 1/2] io_uring: split out fixed file installation and removal Jens Axboe
2022-06-17 13:45 ` [PATCH 2/2] io_uring: add support for passing fixed file descriptors Jens Axboe
2022-06-18 11:02   ` Hao Xu [this message]
2022-06-18 11:34     ` Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox