public inbox for [email protected]
From: Aleksa Sarai <[email protected]>
To: Josh Triplett <[email protected]>
Cc: [email protected], [email protected],
	[email protected], [email protected],
	Alexander Viro <[email protected]>,
	Arnd Bergmann <[email protected]>, Jens Axboe <[email protected]>
Subject: Re: [PATCH v3 0/3] Support userspace-selected fds
Date: Wed, 8 Apr 2020 22:26:01 +1000
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>

On 2020-04-07, Josh Triplett <[email protected]> wrote:
> (Note: numbering this updated version v3, to avoid confusion with Jens'
> v2 that built on my v1. Jens, if you like this approach, please feel
> free to stack your additional patches from the io_uring-fd-select branch
> atop this series. 5.8 material, not intended for the current merge window.)
> 
> Inspired by the X protocol's handling of XIDs, allow userspace to select
> the file descriptor opened by a call like openat2, so that it can use
> the resulting file descriptor in subsequent system calls without waiting
> for the response to the initial openat2 syscall.
> 
> The first patch is independent of the other two; it allows reserving
> file descriptors below a certain minimum for userspace-selected fd
> allocation only.
> 
> The second patch implements userspace-selected fd allocation for
> openat2, introducing a new O_SPECIFIC_FD flag and an fd field in struct
> open_how. In io_uring, this allows sequences like openat2/read/close
> without waiting for the openat2 to complete. Multiple such sequences can
> overlap, as long as each uses a distinct file descriptor.
> 
> The third patch adds userspace-selected fd allocation to pipe2 as well.
> I did this partly as a demonstration of how simple it is to wire up
> O_SPECIFIC_FD support for any fd-allocating system call, and partly in
> the hopes that this may make it more useful to wire up io_uring support
> for pipe2 in the future.
> 
> If this gets accepted, I'm happy to also write corresponding manpage
> patches.
> 
> v3:
> This new version has an API to atomically increase the minimum fd and
> return the previous minimum, rather than just getting and setting the
> minimum; this makes it easier to allocate a range. (A library that might
> initialize after the program has already opened other file descriptors
> may need to check for existing open fds in the range after reserving it,
> and reserve more fds if needed; this can be done entirely in userspace,
> and we can't really do anything simpler in the kernel due to limitations
> on file-descriptor semantics, so this patch series avoids introducing
> any extra complexity in the kernel.)
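
For illustration, the userspace check described above (scanning a freshly
reserved range for descriptors that were already open) could look roughly
like the sketch below. It relies only on fcntl(F_GETFD), which fails with
EBADF on a closed descriptor. The helper name first_open_fd_in_range is
hypothetical and not part of this series, and the sketch does not call the
new minimum-fd interface itself.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch series): return the first
 * descriptor in [lo, hi) that is currently open, or -1 if none is.
 */
static int first_open_fd_in_range(int lo, int hi)
{
	assert(lo >= 0 && lo <= hi);

	for (int fd = lo; fd < hi; fd++) {
		/* F_GETFD succeeds only on an open fd; a closed one yields -1/EBADF. */
		if (fcntl(fd, F_GETFD) != -1)
			return fd;
	}
	return -1;
}
```

A library would run this once after reserving its range and, if it finds an
open descriptor, reserve further fds and check again.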
> 
> This new version also supports a __get_specific_unused_fd_flags call
> which accepts the limit for RLIMIT_NOFILE as an argument, analogous to
> __get_unused_fd_flags, since io_uring needs that to correctly handle
> RLIMIT_NOFILE.
> 
> Josh Triplett (3):
>   fs: Support setting a minimum fd for "lowest available fd" allocation
>   fs: openat2: Extend open_how to allow userspace-selected fds
>   fs: pipe2: Support O_SPECIFIC_FD

Aside from my specific comments and questions, the changes to openat2
deserve at least one or two selftests.

>  fs/fcntl.c                       |  2 +-
>  fs/file.c                        | 62 ++++++++++++++++++++++++++++----
>  fs/io_uring.c                    |  3 +-
>  fs/open.c                        |  6 ++--
>  fs/pipe.c                        | 16 ++++++---
>  include/linux/fcntl.h            |  5 +--
>  include/linux/fdtable.h          |  1 +
>  include/linux/file.h             |  4 +++
>  include/uapi/asm-generic/fcntl.h |  4 +++
>  include/uapi/linux/openat2.h     |  2 ++
>  include/uapi/linux/prctl.h       |  3 ++
>  kernel/sys.c                     |  5 +++
>  12 files changed, 97 insertions(+), 16 deletions(-)

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>
