From: Josh Triplett <[email protected]>
To: Miklos Szeredi <[email protected]>
Cc: Michael Kerrisk <[email protected]>,
	[email protected],
	"[email protected]" <[email protected]>,
	lkml <[email protected]>,
	Alexander Viro <[email protected]>,
	Arnd Bergmann <[email protected]>, Jens Axboe <[email protected]>,
	Aleksa Sarai <[email protected]>,
	linux-man <[email protected]>,
	Linux API <[email protected]>
Subject: Re: [PATCH v5 2/3] fs: openat2: Extend open_how to allow userspace-selected fds
Date: Thu, 23 Apr 2020 00:33:10 -0700
Message-ID: <20200423073310.GA169998@localhost>
In-Reply-To: <CAJfpeguaVYo-Lf-5Bi=EYJYWdmCfo3BqZA=kj9E5UmDb0mBc1w@mail.gmail.com>

On Thu, Apr 23, 2020 at 08:04:25AM +0200, Miklos Szeredi wrote:
> On Thu, Apr 23, 2020 at 6:42 AM Josh Triplett <[email protected]> wrote:
> >
> > On Thu, Apr 23, 2020 at 06:24:14AM +0200, Miklos Szeredi wrote:
> > > On Thu, Apr 23, 2020 at 2:48 AM Josh Triplett <[email protected]> wrote:
> > > > On Wed, Apr 22, 2020 at 09:55:56AM +0200, Miklos Szeredi wrote:
> > > > > On Wed, Apr 22, 2020 at 8:06 AM Michael Kerrisk (man-pages)
> > > > > <[email protected]> wrote:
> > > > > >
> > > > > > [CC += linux-api]
> > > > > >
> > > > > > On Wed, 22 Apr 2020 at 07:20, Josh Triplett <[email protected]> wrote:
> > > > > > >
> > > > > > > Inspired by the X protocol's handling of XIDs, allow userspace to select
> > > > > > > the file descriptor opened by openat2, so that it can use the resulting
> > > > > > > file descriptor in subsequent system calls without waiting for the
> > > > > > > response to openat2.
> > > > > > >
> > > > > > > In io_uring, this allows sequences like openat2/read/close without
> > > > > > > waiting for the openat2 to complete. Multiple such sequences can
> > > > > > > overlap, as long as each uses a distinct file descriptor.
> > > > >
> > > > > If this is primarily an io_uring feature, then why burden the normal
> > > > > openat2 API with this?
> > > >
> > > > This feature was inspired by io_uring; it isn't exclusively of value
> > > > with io_uring. (And io_uring doesn't normally change the semantics of
> > > > syscalls.)
> > >
> > > What's the use case of O_SPECIFIC_FD beyond io_uring?
> >
> > Avoiding a call to dup2 and close, if you need something as a specific
> > file descriptor, such as when setting up to exec something, or when
> > debugging a program.
> >
> > I don't expect it to be as widely used as with io_uring, but I also
> > don't want io_uring versions of syscalls to diverge from the underlying
> > syscalls, and this would be a heavy divergence.
> 
> What are the plans for those syscalls that don't easily lend
> themselves to this modification (such as accept(2))?

accept4 has a flags argument with more flags available, so it'd be
entirely possible to cleanly extend it further without introducing a new
version. The same goes for other fd-producing syscalls that still have
flag space available.

Userspace-selected fds may or may not provide sufficient motivation on
their own to introduce a new variant of a syscall that isn't currently
extensible.
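
For context, a minimal sketch of what userspace has to do today to get a
newly produced descriptor at a chosen number, which is the sequence a
flag on the fd-producing syscall would fold into one call. The helper
names move_fd_to and open_at_fd are hypothetical and not part of the
patch series; the sketch uses only open, dup2 and close.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch series): move an already-open
 * descriptor to a caller-chosen number using existing syscalls.
 * Returns target on success, -1 with errno set on failure.
 * Anything already open at target is silently replaced by dup2().
 */
static int move_fd_to(int fd, int target)
{
	if (fd == target)
		return target;
	if (dup2(fd, target) < 0)
		return -1;
	close(fd);	/* the extra close a specific-fd flag would avoid */
	return target;
}

/*
 * Open path read-only and place it at target: open + dup2 + close,
 * where a userspace-selected fd would need only the open.
 */
static int open_at_fd(const char *path, int target)
{
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;
	if (move_fd_to(fd, target) < 0) {
		int saved = errno;

		close(fd);
		errno = saved;
		return -1;
	}
	return target;
}
```

The same wrapper shape applies to any other fd-producing call: the
descriptor has to exist at some kernel-chosen number first, and only
then can it be moved.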

> Compared to that, having a common flag for file ops to enable the use
> of fixed and private file descriptors is a clean and well contained
> interface.

"private" is not a desirable property here. See below. Even if the
userspace-specified fd mechanism were to become something only
accessible via io_uring (which I'd prefer to avoid), that's not a reason
to avoid generating real file descriptors that work anywhere a file
descriptor works.

> > > > > This would also allow implementing a private fd table for io_uring.
> > > > > I.e. add a flag interpreted by file ops (IORING_PRIVATE_FD), including
> > > > > openat2 and freely use the private fd space without having to worry
> > > > > about interactions with other parts of the system.
> > > >
> > > > I definitely don't want to add a special kind of file descriptor that
> > > > doesn't work in normal syscalls taking file descriptors. A file
> > > > descriptor allocated via O_SPECIFIC_FD is an entirely normal file
> > > > descriptor, and works anywhere a file descriptor normally works.
> > >
> > > What's the use case of allocating a file descriptor within io_uring
> > > and using it outside of io_uring?
> >
> > Calling a syscall not provided via io_uring. Calling a library that
> > doesn't use io_uring. Passing the file descriptor via UNIX socket to
> > another program. Passing the file descriptor via exec to another
> > program. Userspace is modular, and file descriptors are widely used.
> 
> I mean, you could open the file descriptor outside of io_uring in such
> cases, no?

I would prefer to not introduce that limitation in the first place, and
instead open normal file descriptors.

> The point of O_SPECIFIC_FD is to be able to perform short
> sequences of open/dosomething/close without having to block and having
> to issue separate syscalls.

"close" is not a required component. It's entirely possible to use
io_uring to open a file descriptor, do various things with it, and then
leave it open for subsequent usage via either other io_uring chains or
standalone syscalls.

> If you're going to issue separate
> syscalls anyway, then I see no point in doing the open within
> io_uring.  Or?

io_uring is not an all-or-nothing proposition. There's value in using
io_uring for some operations without converting an entire program (and
every library it might potentially use on a file descriptor) entirely to
io_uring. Userspace is modular, and file descriptors are a common
element used by many different bits of userspace.
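
As one concrete illustration of a normal file descriptor crossing a
module boundary, here is a sketch of passing a descriptor over a UNIX
socket with SCM_RIGHTS. The helper names send_fd and recv_fd are
hypothetical; the calls themselves are the standard sendmsg/recvmsg
ancillary-data interface, and nothing here depends on how the
descriptor was originally opened.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: send fd over the connected UNIX socket sock. */
static int send_fd(int sock, int fd)
{
	char byte = 0;	/* one byte of ordinary data carries the message */
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;	/* forces correct alignment */
	} u;
	struct msghdr msg = { 0 };
	struct cmsghdr *cmsg;

	memset(&u, 0, sizeof(u));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/*
 * Hypothetical helper: receive one descriptor from sock.
 * Returns the new descriptor number in this process, or -1.
 */
static int recv_fd(int sock)
{
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;
	} u;
	struct msghdr msg = { 0 };
	struct cmsghdr *cmsg;
	int fd;

	memset(&u, 0, sizeof(u));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;

	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;

	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

The receiver gets its own descriptor number referring to the same open
file, so the sender is free to close its copy afterwards.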

