From: Hao Xu <[email protected]>
To: Stefan Metzmacher <[email protected]>, Jens Axboe <[email protected]>,
	[email protected]
Cc: [email protected], [email protected],
	"Ralph Böhme" <[email protected]>, vl <[email protected]>
Subject: Re: [PATCHSET v2 0/6] Allow allocated direct descriptors
Date: Fri, 10 Jun 2022 21:04:16 +0800
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>

On 6/10/22 19:28, Stefan Metzmacher wrote:
> 
> On 10.06.22 at 13:06, Hao Xu wrote:
>> Hi Stefan,
>> On 6/9/22 16:57, Stefan Metzmacher wrote:
>>>
>>> Hi Jens,
>>>
>>> this looks very useful, thanks!
>>>
>>> I have an additional feature request to make this even more useful...
>>>
>>> IORING_OP_ACCEPT allows a fixed descriptor for the listen socket
>>> and can then generate a fixed descriptor for the accepted connection,
>>> correct?
>>
>> Yes.
>>
>>>
>>> It would be extremely useful to also allow that pattern
>>> for IORING_OP_OPENAT[2], which currently cannot take
>>> a fixed descriptor for the dirfd argument (this also applies to
>>> IORING_OP_STATX, IORING_OP_UNLINKAT and all others taking a dirfd).
>>>
>>> Being able to use such a sequence:
>>>
>>> OPENAT2(AT_FDCWD, "directory") => 1 (fixed)
>>> STATX(1 (fixed))
>>> FGETXATTR(1 (fixed))
>>> OPENAT2(1 (fixed), "file") => 2 (fixed)
>>> STATX(2 (fixed))
>>> FGETXATTR(2 (fixed))
>>> CLOSE(1 (fixed))
>>> DUP(2 (fixed)) => per-process fd for ("file")
>>>
>>> I looked briefly at how to implement that,
>>> but set_nameidata() takes 'int dfd' and stores the value,
>>> which is then used somewhere deep down the stack.
>>> That makes it too complex for me to create patches :-(
>>>
>>
>> Indeed. dirfd is used in path_init() etc., and I have no idea how to
>> tackle it for now. We could surely register a fixed descriptor in the
>> process fdtable, but that defeats the purpose of fixed files.
> 
> I looked at it a bit more and the good thing is that
> 'struct nameidata' is private to namei.c, which simplifies
> getting an overview.
> 
> path_init() is actually the only user of nd.dfd

                               ^[1]

> and it's used to fill nd.path, either from get_fs_pwd()
> for AT_FDCWD or from f.file->f_path otherwise.
> 
> So we might be able to have a function that translates
> the fd to a struct path early and let the callers pass 'struct path'
> instead of 'int dfd'...

Yeah, if [1] is true. I wrote something for your reference
(totally unpolished and untested, just to show the idea):

diff --git a/fs/namei.c b/fs/namei.c
index 1f28d3f463c3..18e11717005c 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2423,21 +2423,30 @@ static const char *path_init(struct nameidata *nd, unsigned flags)
                         nd->inode = nd->path.dentry->d_inode;
                 }
         } else {
-               /* Caller must check execute permissions on the starting path component */
-               struct fd f = fdget_raw(nd->dfd);
                 struct dentry *dentry;

-               if (!f.file)
-                       return ERR_PTR(-EBADF);
+               if (nd->dfd != -1) {
+                       /* Caller must check execute permissions on the starting path component */
+                       struct fd f = fdget_raw(nd->dfd);

-               dentry = f.file->f_path.dentry;
+                       if (!f.file)
+                               return ERR_PTR(-EBADF);

-               if (*s && unlikely(!d_can_lookup(dentry))) {
-                       fdput(f);
-                       return ERR_PTR(-ENOTDIR);
+                       dentry = f.file->f_path.dentry;
+
+                       if (*s && unlikely(!d_can_lookup(dentry))) {
+                               fdput(f);
+                               return ERR_PTR(-ENOTDIR);
+                       }
+
+                       nd->path = f.file->f_path;
+               } else {
+                       dentry = nd->path.dentry;
+
+                       if (*s && unlikely(!d_can_lookup(dentry)))
+                               return ERR_PTR(-ENOTDIR);
                 }

-               nd->path = f.file->f_path;
                 if (flags & LOOKUP_RCU) {
                         nd->inode = nd->path.dentry->d_inode;
                         nd->seq = read_seqcount_begin(&nd->path.dentry->d_seq);
@@ -2445,7 +2454,9 @@ static const char *path_init(struct nameidata *nd, unsigned flags)
                         path_get(&nd->path);
                         nd->inode = nd->path.dentry->d_inode;
                 }
-               fdput(f);
+               if (nd->dfd != -1)
+                       fdput(f);
+
         }

         /* For scoped-lookups we need to set the root to the dirfd as well. */
@@ -3686,6 +3697,48 @@ struct file *do_filp_open(int dfd, struct filename *pathname,
         return filp;
  }

+static void __set_nameidata2(struct nameidata *p, struct path *path,
+                            struct filename *name)
+{
+       struct nameidata *old = current->nameidata;
+       p->stack = p->internal;
+       p->depth = 0;
+       p->dfd = -1;
+       p->name = name;
+       p->path = *path;
+       p->total_link_count = old ? old->total_link_count : 0;
+       p->saved = old;
+       current->nameidata = p;
+}
+
+static inline void set_nameidata2(struct nameidata *p, struct path *path,
+                                 struct filename *name, const struct path *root)
+{
+       __set_nameidata2(p, path, name);
+       p->state = 0;
+       if (unlikely(root)) {
+               p->state = ND_ROOT_PRESET;
+               p->root = *root;
+       }
+}
+
+struct file *do_filp_open_path(struct path *path, struct filename *pathname,
+               const struct open_flags *op)
+{
+       struct nameidata nd;
+       int flags = op->lookup_flags;
+       struct file *filp;
+
+       set_nameidata2(&nd, path, pathname, NULL);
+       filp = path_openat(&nd, op, flags | LOOKUP_RCU);
+       if (unlikely(filp == ERR_PTR(-ECHILD)))
+               filp = path_openat(&nd, op, flags);
+       if (unlikely(filp == ERR_PTR(-ESTALE)))
+               filp = path_openat(&nd, op, flags | LOOKUP_REVAL);
+       restore_nameidata();
+       return filp;
+}
+
  struct file *do_file_open_root(const struct path *root,
                 const char *name, const struct open_flags *op)
  {



