public inbox for [email protected]
 help / color / mirror / Atom feed
From: Josh Triplett <[email protected]>
To: Aleksa Sarai <[email protected]>
Cc: [email protected], [email protected],
	[email protected], [email protected],
	Alexander Viro <[email protected]>,
	Arnd Bergmann <[email protected]>, Jens Axboe <[email protected]>
Subject: Re: [PATCH v3 2/3] fs: openat2: Extend open_how to allow userspace-selected fds
Date: Wed, 8 Apr 2020 22:00:00 -0700	[thread overview]
Message-ID: <20200409050000.GE6149@localhost> (raw)
In-Reply-To: <[email protected]>

On Wed, Apr 08, 2020 at 10:23:30PM +1000, Aleksa Sarai wrote:
> On 2020-04-07, Josh Triplett <[email protected]> wrote:
> > Signed-off-by: Josh Triplett <[email protected]>
> > Signed-off-by: Jens Axboe <[email protected]>
> 
> Maybe I'm misunderstanding something, but this Signed-off-by looks
> strange -- was it Co-developed-by Jens?

No. I wrote v1, and Jens incorporated some of those patches into a v2
series that added signoffs. So when I took patch 2 from that series and
rebased it into this new series, I kept the signoff.

> > ---
> >  fs/fcntl.c                       |  2 +-
> >  fs/file.c                        | 39 ++++++++++++++++++++++++++++++++
> >  fs/io_uring.c                    |  3 ++-
> >  fs/open.c                        |  6 +++--
> >  include/linux/fcntl.h            |  5 ++--
> >  include/linux/file.h             |  3 +++
> >  include/uapi/asm-generic/fcntl.h |  4 ++++
> >  include/uapi/linux/openat2.h     |  2 ++
> >  8 files changed, 58 insertions(+), 6 deletions(-)

> > --- a/fs/file.c
> > +++ b/fs/file.c
> > @@ -567,6 +567,45 @@ void put_unused_fd(unsigned int fd)
> >  
> >  EXPORT_SYMBOL(put_unused_fd);
> >  
> > +int __get_specific_unused_fd_flags(unsigned int fd, unsigned int flags,
> > +				   unsigned long nofile)
> > +{
> > +	int ret;
> > +	struct fdtable *fdt;
> > +	struct files_struct *files = current->files;
> > +
> > +	if (!(flags & O_SPECIFIC_FD) || fd == -1)
> > +		return __get_unused_fd_flags(flags, nofile);
> 
> This check should just be (flags & O_SPECIFIC_FD) -- see my comment
> below about ->fd being negative.

This was intentional, and I think it makes sense to keep. I'd like to
maintain the same consistent behavior in any call that uses
O_SPECIFIC_FD: an fd of -1 means "go ahead and assign the lowest
available fd". This allows for calls like pipe2, or in the future
socketpair, where you might want to explicitly assign one end, but allow
the kernel to allocate the other end. -1 can never be a valid file
descriptor, since it would be interpreted as an errno instead. mmap
similarly treats -1 as an invalid file descriptor.

As an example use case: in one io_uring batch, you want to allocate a
pipe, send one end over a UNIX socket to another process, and close that
end in your process. So you need *that* end to be a userspace-allocated
file descriptor, so you know what fd to use in the subsequent sendmsg
and close calls. But for the other end that you plan to hold onto, you
don't want to use up one of your small number of preallocated file
descriptors (of which you only need enough for the operations in any one
io_uring batch).

However, in response to your comment below, I've changed this from -1 to
UINT_MAX.

> > +
> > +	if (fd >= nofile)
> > +		return -EBADF;
> > +
> > +	spin_lock(&files->file_lock);
> > +	ret = expand_files(files, fd);
> > +	if (unlikely(ret < 0))
> > +		goto out_unlock;
> > +	fdt = files_fdtable(files);
> > +	if (fdt->fd[fd]) {
> > +		ret = -EBUSY;
> > +		goto out_unlock;
> 
> It would be remiss of me to not mention that this is inconsistent with
> the other way of explicitly picking a file descriptor on Unix -- the dup
> family closes newfd if it's already used.
> 
> But that being said, I do actually prefer this behaviour since it means
> that two threads trying to open a file with the same specific file
> descriptor won't see the file descriptor change underneath them (leading
> to who knows how much head-scratching).

Exactly. I documented that difference explicitly in the commit message:

> The specified file descriptor must not already be open; if it is,
> get_specific_unused_fd_flags will fail with -EBUSY. This helps catch
> userspace errors.

I also intend for that to appear in the manpage.

> > --- a/fs/open.c
> > +++ b/fs/open.c
> > @@ -958,7 +958,7 @@ EXPORT_SYMBOL(open_with_fake_path);
> >  inline struct open_how build_open_how(int flags, umode_t mode)
> >  {
> >  	struct open_how how = {
> > -		.flags = flags & VALID_OPEN_FLAGS,
> > +		.flags = flags & VALID_OPEN_FLAGS & ~O_SPECIFIC_FD,
> 
> This is getting a little ugly, maybe filter out O_SPECIFIC_FD later on
> in build_open_how() -- where we handle O_PATH.

Sure; I'll move it down and add a comment.

> >  		.mode = mode & S_IALLUGO,
> >  	};
> >  
> > @@ -1143,7 +1143,7 @@ static long do_sys_openat2(int dfd, const char __user *filename,
> >  	if (IS_ERR(tmp))
> >  		return PTR_ERR(tmp);
> >  
> > -	fd = get_unused_fd_flags(how->flags);
> > +	fd = get_specific_unused_fd_flags(how->fd, how->flags);
> >  	if (fd >= 0) {
> >  		struct file *f = do_filp_open(dfd, tmp, &op);
> >  		if (IS_ERR(f)) {
> > @@ -1193,6 +1193,8 @@ SYSCALL_DEFINE4(openat2, int, dfd, const char __user *, filename,
> >  	err = copy_struct_from_user(&tmp, sizeof(tmp), how, usize);
> >  	if (err)
> >  		return err;
> > +	if (tmp.pad != 0)
> > +		return -EINVAL;
> 
> This check should be done in build_open_flags(), where the other sanity
> checks are done.

Good idea; done.

> In addition, there must be an additional check like
> 
>   if (!(flags & O_SPECIFIC_FD) && how->fd != 0)
>     return -EINVAL;
> 
> Since we must not allow garbage values to be passed and ignored by us in
> openat2().

Good catch! Added.

> > --- a/include/uapi/asm-generic/fcntl.h
> > +++ b/include/uapi/asm-generic/fcntl.h
> > @@ -89,6 +89,10 @@
> >  #define __O_TMPFILE	020000000
> >  #endif
> >  
> > +#ifndef O_SPECIFIC_FD
> > +#define O_SPECIFIC_FD	01000000000	/* open as specified fd */
> > +#endif
> 
> Maybe you've already done this (since you skipped several bits in the
> O_* flag space), but I would double-check that there is no conflict on
> other architectures. I faintly recall that FMODE_NOTIFY has a different
> value on sparc, and there was some oddness on alpha too... But as long
> as fcntl.c builds on all the arches then it's fine.

I checked other architectures carefully, and yes, I picked this value
because some of the intermediate values conflicted on specific
architectures.

> > --- a/include/uapi/linux/openat2.h
> > +++ b/include/uapi/linux/openat2.h
> > @@ -20,6 +20,8 @@ struct open_how {
> >  	__u64 flags;
> >  	__u64 mode;
> >  	__u64 resolve;
> > +	__s32 fd;
> > +	__u32 pad; /* Must be 0 in the current version */
> 
> I'm not sure I see why the new field is an s32 -- there should be no
> situation where a user specifies O_SPECIFIC_FD and a negative file
> descriptor (if we keep it as an s32, you should get an -EINVAL in that
> case). If you don't want O_SPECIFIC_FD, don't specify it.
>
> But I think this should be a u32.

See my explanation above for why I do want to allow `-1` here.

However, there's no reason I need to make the field signed just because
of that one reserved value. I'll change the field to a __u32.

- Josh Triplett

  reply	other threads:[~2020-04-09  5:00 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-08  6:56 [PATCH v3 0/3] Support userspace-selected fds Josh Triplett
2020-04-08  6:57 ` [PATCH v3 1/3] fs: Support setting a minimum fd for "lowest available fd" allocation Josh Triplett
2020-04-08 12:00   ` Aleksa Sarai
2020-04-09  3:17     ` Josh Triplett
2020-04-08  6:57 ` [PATCH v3 2/3] fs: openat2: Extend open_how to allow userspace-selected fds Josh Triplett
2020-04-08 12:23   ` Aleksa Sarai
2020-04-09  5:00     ` Josh Triplett [this message]
2020-04-09  8:10     ` Aleksa Sarai
2020-04-08  6:57 ` [PATCH v3 3/3] fs: pipe2: Support O_SPECIFIC_FD Josh Triplett
2020-04-08 12:26 ` [PATCH v3 0/3] Support userspace-selected fds Aleksa Sarai
2020-04-09  3:19   ` Josh Triplett
  -- strict thread matches above, loose matches on Subject: below --
2020-04-04  5:57 Josh Triplett
2020-04-04  5:58 ` [PATCH v3 2/3] fs: openat2: Extend open_how to allow " Josh Triplett

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200409050000.GE6149@localhost \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox