public inbox for io-uring@vger.kernel.org
From: Daniele Di Proietto <daniele.di.proietto@gmail.com>
To: Jens Axboe <axboe@kernel.dk>
Cc: io-uring@vger.kernel.org
Subject: Re: [PATCH] io_uring: Add IORING_OP_DUP
Date: Tue, 10 Mar 2026 18:42:38 +0000
Message-ID: <CAExiqTKBFeyxE4nwSxd3muOuZkP5YDSoweYwns4wb64w8efPVQ@mail.gmail.com>
In-Reply-To: <c29a339d-67c5-4e8a-a1c9-2388aa9f28d5@kernel.dk>

On Tue, Mar 10, 2026 at 4:24 PM Jens Axboe <axboe@kernel.dk> wrote:
>
> On 3/10/26 9:49 AM, Daniele Di Proietto wrote:
> > The new operation is like dup3(). The source file can be a regular file
> > descriptor or a direct descriptor. The destination is a regular file
> > descriptor.
> >
> > The direct descriptor variant is useful to move a descriptor to an fd
> > and close the existing fd with a single acquisition of the `struct
> > files_struct` `file_lock`. Combined with IORING_OP_ACCEPT or
> > IORING_OP_OPENAT2 with direct descriptors, it can reduce lock contention
> > for multithreaded applications.
>
> Overall comment - how does this interact with direct descriptors? Feels
> like this should support both, rather than just normal file descriptors.

As implemented, the operation supports:
1. src: direct, dst: normal (this is the use case I mostly care about)
2. src: normal, dst: normal

I can extend it to also support:
3. src: direct, dst: direct
4. src: normal, dst: direct

I can use IOSQE_FIXED_FILE to pick the source and I guess I can use a
bit in dup_flags (something like IORING_DUP_DIRECT) to decide whether
the destination is a direct descriptor or normal.
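
A minimal userspace sketch of how that decoding could look, covering all
four combinations. IOSQE_FIXED_FILE is the existing sqe flag;
IORING_DUP_DIRECT is only the name suggested above and does not exist in
the posted patch, and both dup_flags bit positions, struct dup_mode and
dup_decode() are assumptions made for illustration:

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Existing uapi flag: bit 0 of sqe->flags. */
#define IOSQE_FIXED_FILE      (1U << 0)
/* From the patch under review; the bit position is assumed here. */
#define IORING_DUP_NO_CLOEXEC (1U << 0)
/* Hypothetical flag, not in the posted patch; bit position assumed. */
#define IORING_DUP_DIRECT     (1U << 1)

/* Hypothetical helper type: which side is a direct descriptor. */
struct dup_mode {
	bool src_direct;	/* source is a direct descriptor */
	bool dst_direct;	/* destination is a direct descriptor */
};

/* Returns 0 and fills *mode, or -EINVAL on unknown dup_flags bits. */
static int dup_decode(uint8_t sqe_flags, uint32_t dup_flags,
		      struct dup_mode *mode)
{
	if (dup_flags & ~(IORING_DUP_NO_CLOEXEC | IORING_DUP_DIRECT))
		return -EINVAL;

	/* The source side is selected by the generic sqe flag. */
	mode->src_direct = (sqe_flags & IOSQE_FIXED_FILE) != 0;
	/* The destination side is selected by the op-specific flag. */
	mode->dst_direct = (dup_flags & IORING_DUP_DIRECT) != 0;
	return 0;
}
```

With this split, case 1 is IOSQE_FIXED_FILE alone, case 2 is neither
flag, case 3 is both, and case 4 is IORING_DUP_DIRECT alone.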

Does that make sense?

>
> > @@ -446,3 +452,46 @@ int io_pipe(struct io_kiocb *req, unsigned int issue_flags)
> >               fput(files[1]);
> >       return ret;
> >  }
> > +
> > +int io_dup_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
> > +{
> > +     unsigned int flags;
> > +     struct io_dup *id;
> > +     int new_fd;
> > +
> > +     if (sqe->off || sqe->addr || sqe->len || sqe->buf_index || sqe->addr3)
> > +             return -EINVAL;
> > +
> > +     flags = READ_ONCE(sqe->dup_flags);
> > +     if (flags & ~IORING_DUP_NO_CLOEXEC)
> > +             return -EINVAL;
> > +
> > +     new_fd = READ_ONCE(sqe->dup_new_fd);
> > +     if (new_fd < 0)
> > +             return -EBADF;
>
> Is this necessary? Yes it'll help fail early, but do we care about that?

You're right, we don't really care about that. I'll remove it.

>
> > +     /* ensure the task's creds are used when installing/receiving fds */
> > +     if (req->flags & REQ_F_CREDS)
> > +             return -EPERM;
>
> Not sure that's sane. Let's say you mark this request as IOSQE_ASYNC,
> then it'd fail even if REQ_F_CREDS would then be set, and creds would
> match the original task.

I'm not sure either. I mostly added this because the same check is in
io_install_fixed_fd_prep(); I assume the same rationale applies here,
right?

>
>
> > +
> > +     id = io_kiocb_to_cmd(req, struct io_dup);
> > +     id->o_flags = O_CLOEXEC;
> > +     if (flags & IORING_DUP_NO_CLOEXEC)
> > +             id->o_flags = 0;
> > +     id->new_fd = new_fd;
> > +
> > +     return 0;
> > +}
> > +
> > +int io_dup(struct io_kiocb *req, unsigned int issue_flags)
> > +{
> > +     struct io_dup *id;
> > +     int ret;
> > +
> > +     id = io_kiocb_to_cmd(req, struct io_dup);
> > +     ret = replace_fd(id->new_fd, id->file, id->o_flags);
> > +     if (ret < 0)
> > +             req_set_fail(req);
> > +     io_req_set_res(req, ret, 0);
> > +     return IOU_COMPLETE;
>
> And like Keith said here, we might need to punt it to io-wq if the file
> has a ->flush() method.

Makes sense, thanks.
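
A rough userspace model of that check, not kernel code. It assumes the
file that matters is the one currently installed at new_fd (the one that
would be closed), and that returning -EAGAIN from a nonblocking issue is
what gets the request retried from io-wq. struct mock_file,
MOCK_F_NONBLOCK and dup_issue_model() are made-up names for illustration:

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for "issued inline, must not block" (hypothetical name). */
#define MOCK_F_NONBLOCK (1U << 0)

/* Stand-in for struct file; only models whether ->flush() exists. */
struct mock_file {
	bool has_flush;
};

/*
 * Decide whether the dup can complete inline.
 * 'displaced' is the file currently at the destination fd, or NULL if
 * the slot is empty. Closing a file with a flush method may block, so
 * in nonblocking context the model asks for a retry instead.
 */
static int dup_issue_model(const struct mock_file *displaced,
			   unsigned int issue_flags)
{
	if (displaced && displaced->has_flush &&
	    (issue_flags & MOCK_F_NONBLOCK))
		return -EAGAIN;	/* retry from a context that may block */

	return 0;		/* safe to replace the fd right away */
}
```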

Thanks for the review!

>
> --
> Jens Axboe

