* [PATCH 1/2] fs: make do_mkdirat() take struct filename
2020-11-16 4:45 [PATCH 0/2] io_uring: add mkdirat support Dmitry Kadashev
@ 2020-11-16 4:45 ` Dmitry Kadashev
2021-01-25 4:38 ` Jens Axboe
2020-11-16 4:45 ` [PATCH 2/2] io_uring: add support for IORING_OP_MKDIRAT Dmitry Kadashev
` (2 subsequent siblings)
3 siblings, 1 reply; 22+ messages in thread
From: Dmitry Kadashev @ 2020-11-16 4:45 UTC (permalink / raw)
To: io-uring; +Cc: axboe, viro, linux-fsdevel, Dmitry Kadashev
Pass in the struct filename pointers instead of the user string, and
update the three callers to do the same. This is heavily based on
commit dbea8d345177 ("fs: make do_renameat2() take struct filename").
This behaves like do_unlinkat() and do_renameat2().
Signed-off-by: Dmitry Kadashev <[email protected]>
---
fs/internal.h | 1 +
fs/namei.c | 20 ++++++++++++++------
2 files changed, 15 insertions(+), 6 deletions(-)
diff --git a/fs/internal.h b/fs/internal.h
index 6fd14ea213c3..23b8b427dbd2 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -80,6 +80,7 @@ long do_unlinkat(int dfd, struct filename *name);
int may_linkat(struct path *link);
int do_renameat2(int olddfd, struct filename *oldname, int newdfd,
struct filename *newname, unsigned int flags);
+long do_mkdirat(int dfd, struct filename *name, umode_t mode);
/*
* namespace.c
diff --git a/fs/namei.c b/fs/namei.c
index 03d0e11e4f36..9d26a51f3f54 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3654,17 +3654,23 @@ int vfs_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode)
}
EXPORT_SYMBOL(vfs_mkdir);
-static long do_mkdirat(int dfd, const char __user *pathname, umode_t mode)
+long do_mkdirat(int dfd, struct filename *name, umode_t mode)
{
struct dentry *dentry;
struct path path;
int error;
unsigned int lookup_flags = LOOKUP_DIRECTORY;
+ if (IS_ERR(name))
+ return PTR_ERR(name);
+
retry:
- dentry = user_path_create(dfd, pathname, &path, lookup_flags);
- if (IS_ERR(dentry))
- return PTR_ERR(dentry);
+ name->refcnt++; /* filename_create() drops our ref */
+ dentry = filename_create(dfd, name, &path, lookup_flags);
+ if (IS_ERR(dentry)) {
+ error = PTR_ERR(dentry);
+ goto out;
+ }
if (!IS_POSIXACL(path.dentry->d_inode))
mode &= ~current_umask();
@@ -3676,17 +3682,19 @@ static long do_mkdirat(int dfd, const char __user *pathname, umode_t mode)
lookup_flags |= LOOKUP_REVAL;
goto retry;
}
+out:
+ putname(name);
return error;
}
SYSCALL_DEFINE3(mkdirat, int, dfd, const char __user *, pathname, umode_t, mode)
{
- return do_mkdirat(dfd, pathname, mode);
+ return do_mkdirat(dfd, getname(pathname), mode);
}
SYSCALL_DEFINE2(mkdir, const char __user *, pathname, umode_t, mode)
{
- return do_mkdirat(AT_FDCWD, pathname, mode);
+ return do_mkdirat(AT_FDCWD, getname(pathname), mode);
}
int vfs_rmdir(struct inode *dir, struct dentry *dentry)
--
2.28.0
^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH 1/2] fs: make do_mkdirat() take struct filename
2020-11-16 4:45 ` [PATCH 1/2] fs: make do_mkdirat() take struct filename Dmitry Kadashev
@ 2021-01-25 4:38 ` Jens Axboe
2021-01-26 22:55 ` Al Viro
0 siblings, 1 reply; 22+ messages in thread
From: Jens Axboe @ 2021-01-25 4:38 UTC (permalink / raw)
To: Dmitry Kadashev, io-uring; +Cc: viro, linux-fsdevel
On 11/15/20 9:45 PM, Dmitry Kadashev wrote:
> Pass in the struct filename pointers instead of the user string, and
> update the three callers to do the same. This is heavily based on
> commit dbea8d345177 ("fs: make do_renameat2() take struct filename").
>
> This behaves like do_unlinkat() and do_renameat2().
Al, are you OK with this patch? Leaving it quoted, though you should
have the original too.
>
> Signed-off-by: Dmitry Kadashev <[email protected]>
> ---
> fs/internal.h | 1 +
> fs/namei.c | 20 ++++++++++++++------
> 2 files changed, 15 insertions(+), 6 deletions(-)
>
> diff --git a/fs/internal.h b/fs/internal.h
> index 6fd14ea213c3..23b8b427dbd2 100644
> --- a/fs/internal.h
> +++ b/fs/internal.h
> @@ -80,6 +80,7 @@ long do_unlinkat(int dfd, struct filename *name);
> int may_linkat(struct path *link);
> int do_renameat2(int olddfd, struct filename *oldname, int newdfd,
> struct filename *newname, unsigned int flags);
> +long do_mkdirat(int dfd, struct filename *name, umode_t mode);
>
> /*
> * namespace.c
> diff --git a/fs/namei.c b/fs/namei.c
> index 03d0e11e4f36..9d26a51f3f54 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -3654,17 +3654,23 @@ int vfs_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode)
> }
> EXPORT_SYMBOL(vfs_mkdir);
>
> -static long do_mkdirat(int dfd, const char __user *pathname, umode_t mode)
> +long do_mkdirat(int dfd, struct filename *name, umode_t mode)
> {
> struct dentry *dentry;
> struct path path;
> int error;
> unsigned int lookup_flags = LOOKUP_DIRECTORY;
>
> + if (IS_ERR(name))
> + return PTR_ERR(name);
> +
> retry:
> - dentry = user_path_create(dfd, pathname, &path, lookup_flags);
> - if (IS_ERR(dentry))
> - return PTR_ERR(dentry);
> + name->refcnt++; /* filename_create() drops our ref */
> + dentry = filename_create(dfd, name, &path, lookup_flags);
> + if (IS_ERR(dentry)) {
> + error = PTR_ERR(dentry);
> + goto out;
> + }
>
> if (!IS_POSIXACL(path.dentry->d_inode))
> mode &= ~current_umask();
> @@ -3676,17 +3682,19 @@ static long do_mkdirat(int dfd, const char __user *pathname, umode_t mode)
> lookup_flags |= LOOKUP_REVAL;
> goto retry;
> }
> +out:
> + putname(name);
> return error;
> }
>
> SYSCALL_DEFINE3(mkdirat, int, dfd, const char __user *, pathname, umode_t, mode)
> {
> - return do_mkdirat(dfd, pathname, mode);
> + return do_mkdirat(dfd, getname(pathname), mode);
> }
>
> SYSCALL_DEFINE2(mkdir, const char __user *, pathname, umode_t, mode)
> {
> - return do_mkdirat(AT_FDCWD, pathname, mode);
> + return do_mkdirat(AT_FDCWD, getname(pathname), mode);
> }
>
> int vfs_rmdir(struct inode *dir, struct dentry *dentry)
>
--
Jens Axboe
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 1/2] fs: make do_mkdirat() take struct filename
2021-01-25 4:38 ` Jens Axboe
@ 2021-01-26 22:55 ` Al Viro
2021-02-01 11:09 ` Dmitry Kadashev
0 siblings, 1 reply; 22+ messages in thread
From: Al Viro @ 2021-01-26 22:55 UTC (permalink / raw)
To: Jens Axboe; +Cc: Dmitry Kadashev, io-uring, linux-fsdevel
On Sun, Jan 24, 2021 at 09:38:19PM -0700, Jens Axboe wrote:
> On 11/15/20 9:45 PM, Dmitry Kadashev wrote:
> > Pass in the struct filename pointers instead of the user string, and
> > update the three callers to do the same. This is heavily based on
> > commit dbea8d345177 ("fs: make do_renameat2() take struct filename").
> >
> > This behaves like do_unlinkat() and do_renameat2().
>
> Al, are you OK with this patch? Leaving it quoted, though you should
> have the original too.
> > -static long do_mkdirat(int dfd, const char __user *pathname, umode_t mode)
> > +long do_mkdirat(int dfd, struct filename *name, umode_t mode)
> > {
> > struct dentry *dentry;
> > struct path path;
> > int error;
> > unsigned int lookup_flags = LOOKUP_DIRECTORY;
> >
> > + if (IS_ERR(name))
> > + return PTR_ERR(name);
> > +
> > retry:
> > - dentry = user_path_create(dfd, pathname, &path, lookup_flags);
> > - if (IS_ERR(dentry))
> > - return PTR_ERR(dentry);
> > + name->refcnt++; /* filename_create() drops our ref */
> > + dentry = filename_create(dfd, name, &path, lookup_flags);
> > + if (IS_ERR(dentry)) {
> > + error = PTR_ERR(dentry);
> > + goto out;
> > + }
No. This is going to be a source of confusion from hell. If anything,
you want a variant of filename_create() that does not drop name on
success. With filename_create() itself being an inlined wrapper
for it.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 1/2] fs: make do_mkdirat() take struct filename
2021-01-26 22:55 ` Al Viro
@ 2021-02-01 11:09 ` Dmitry Kadashev
2021-02-01 15:00 ` Al Viro
0 siblings, 1 reply; 22+ messages in thread
From: Dmitry Kadashev @ 2021-02-01 11:09 UTC (permalink / raw)
To: Al Viro; +Cc: Jens Axboe, io-uring, linux-fsdevel
On Wed, Jan 27, 2021 at 5:55 AM Al Viro <[email protected]> wrote:
>
> On Sun, Jan 24, 2021 at 09:38:19PM -0700, Jens Axboe wrote:
> > On 11/15/20 9:45 PM, Dmitry Kadashev wrote:
> > > Pass in the struct filename pointers instead of the user string, and
> > > update the three callers to do the same. This is heavily based on
> > > commit dbea8d345177 ("fs: make do_renameat2() take struct filename").
> > >
> > > This behaves like do_unlinkat() and do_renameat2().
> >
> > Al, are you OK with this patch? Leaving it quoted, though you should
> > have the original too.
>
> > > -static long do_mkdirat(int dfd, const char __user *pathname, umode_t mode)
> > > +long do_mkdirat(int dfd, struct filename *name, umode_t mode)
> > > {
> > > struct dentry *dentry;
> > > struct path path;
> > > int error;
> > > unsigned int lookup_flags = LOOKUP_DIRECTORY;
> > >
> > > + if (IS_ERR(name))
> > > + return PTR_ERR(name);
> > > +
> > > retry:
> > > - dentry = user_path_create(dfd, pathname, &path, lookup_flags);
> > > - if (IS_ERR(dentry))
> > > - return PTR_ERR(dentry);
> > > + name->refcnt++; /* filename_create() drops our ref */
> > > + dentry = filename_create(dfd, name, &path, lookup_flags);
> > > + if (IS_ERR(dentry)) {
> > > + error = PTR_ERR(dentry);
> > > + goto out;
> > > + }
>
> No. This is going to be a source of confusion from hell. If anything,
> you want a variant of filename_create() that does not drop name on
> success. With filename_create() itself being an inlined wrapper
> for it.
Hi Al,
I think I need more guidance here. First of all, I've based that code on
commit 7cdfa44227b0 ("vfs: Fix refcounting of filenames in fs_parser"), which
does exactly the same refcount bump in fs_parser.c for filename_lookup(). I'm
not saying it's a good excuse to introduce more code like that if that's a bad
code though.
What I _am_ saying is we probably want to make the approaches consistent (at
least eventually), which means we'd need the same "don't drop the name" variant
of filename_lookup? And given the fact filename_parentat (used from
filename_create) drops the name on error it looks like we'd need another copy of
it too? Do you think it's really worth it or maybe all of these functions will
make things more confusing? (from the looks of it right now the convention is
that the `struct filename` ownership is always transferred when it is passed as
an arg)
Also, do you have a good name for such functions that do not drop the name?
And, just for my education, can you explain why the reference counting for
struct filename exists if it's considered a bad practice to increase the
reference counter (assuming the cleanup code is correct)?
Thanks.
--
Dmitry Kadashev
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 1/2] fs: make do_mkdirat() take struct filename
2021-02-01 11:09 ` Dmitry Kadashev
@ 2021-02-01 15:00 ` Al Viro
2021-02-01 15:29 ` Al Viro
2021-02-02 4:39 ` Dmitry Kadashev
0 siblings, 2 replies; 22+ messages in thread
From: Al Viro @ 2021-02-01 15:00 UTC (permalink / raw)
To: Dmitry Kadashev; +Cc: Jens Axboe, io-uring, linux-fsdevel
On Mon, Feb 01, 2021 at 06:09:01PM +0700, Dmitry Kadashev wrote:
> Hi Al,
>
> I think I need more guidance here. First of all, I've based that code on
> commit 7cdfa44227b0 ("vfs: Fix refcounting of filenames in fs_parser"), which
> does exactly the same refcount bump in fs_parser.c for filename_lookup(). I'm
> not saying it's a good excuse to introduce more code like that if that's a bad
> code though.
It is a bad code. If you look at that function, you'll see that the entire
mess around put_f is rather hard to follow and reason about. That's a function
with no users, and I'm not sure we want to keep it long-term.
> What I _am_ saying is we probably want to make the approaches consistent (at
> least eventually), which means we'd need the same "don't drop the name" variant
> of filename_lookup?
"don't drop the name on success", similar to what filename_parentat() does.
> And given the fact filename_parentat (used from
> filename_create) drops the name on error it looks like we'd need another copy of
> it too?
No need.
> Do you think it's really worth it or maybe all of these functions will
> make things more confusing? (from the looks of it right now the convention is
> that the `struct filename` ownership is always transferred when it is passed as
> an arg)
>
> Also, do you have a good name for such functions that do not drop the name?
>
> And, just for my education, can you explain why the reference counting for
> struct filename exists if it's considered a bad practice to increase the
> reference counter (assuming the cleanup code is correct)?
The last one is the easiest to answer - we want to keep the imported strings
around for audit. It's not so much a proper refcounting as it is "we might
want freeing delayed" implemented as refcount.
As for do_mkdirat(), you probably want semantics similar to do_unlinkat(), i.e.
have it consume the argument passed to it. The main complication comes
from ESTALE retries; want -ESTALE from ->mkdir() itself to trigger "redo
filename_parentat() with LOOKUP_REVAL, then try the rest one more time".
For which you need to keep filename around. OK, so you want a variant of
filename_create() that would _not_ consume the filename on success (i.e.
act as filename_parentat() itself does). Which is trivial to implement -
just rename filename_create() to __filename_create() and remove one of
two putname() in there, leaving just the one in failure exits. Then
filename_create() itself becomes simply
static inline struct dentry *filename_create(int dfd, struct filename *name,
struct path *path, unsigned int lookup_flags)
{
struct dentry *res = __filename_create(dfd, name, path, lookup_flags);
if (!IS_ERR(res))
putname(name);
return res;
}
and in your do_mkdirat() replacement use
dentry = __filename_create(dfd, filename, &path, lookup_flags);
instead of
dentry = user_path_create(dfd, pathname, &path, lookup_flags);
and add
putname(filename);
in the very end. All it takes...
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 1/2] fs: make do_mkdirat() take struct filename
2021-02-01 15:00 ` Al Viro
@ 2021-02-01 15:29 ` Al Viro
2021-03-31 16:28 ` Eric W. Biederman
2021-02-02 4:39 ` Dmitry Kadashev
1 sibling, 1 reply; 22+ messages in thread
From: Al Viro @ 2021-02-01 15:29 UTC (permalink / raw)
To: Dmitry Kadashev; +Cc: Jens Axboe, io-uring, linux-fsdevel
On Mon, Feb 01, 2021 at 03:00:42PM +0000, Al Viro wrote:
> The last one is the easiest to answer - we want to keep the imported strings
> around for audit. It's not so much a proper refcounting as it is "we might
> want freeing delayed" implemented as refcount.
BTW, regarding io_uring + audit interactions - just how is that supposed to
work if you offload any work that might lead to audit records (on permission
checks, etc.) to helper threads?
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 1/2] fs: make do_mkdirat() take struct filename
2021-02-01 15:29 ` Al Viro
@ 2021-03-31 16:28 ` Eric W. Biederman
2021-03-31 16:46 ` Al Viro
0 siblings, 1 reply; 22+ messages in thread
From: Eric W. Biederman @ 2021-03-31 16:28 UTC (permalink / raw)
To: Al Viro; +Cc: Dmitry Kadashev, Jens Axboe, io-uring, linux-fsdevel
Al Viro <[email protected]> writes:
> On Mon, Feb 01, 2021 at 03:00:42PM +0000, Al Viro wrote:
>
>> The last one is the easiest to answer - we want to keep the imported strings
>> around for audit. It's not so much a proper refcounting as it is "we might
>> want freeing delayed" implemented as refcount.
>
> BTW, regarding io_uring + audit interactions - just how is that supposed to
> work if you offload any work that might lead to audit records (on permission
> checks, etc.) to helper threads?
For people looking into these details. Things have gotten much better
recently.
The big change is that io_uring helper threads are now proper
threads of the process that is using io_uring. The io_uring helper
threads just happen to never execute any userspace code.
Eric
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 1/2] fs: make do_mkdirat() take struct filename
2021-03-31 16:28 ` Eric W. Biederman
@ 2021-03-31 16:46 ` Al Viro
0 siblings, 0 replies; 22+ messages in thread
From: Al Viro @ 2021-03-31 16:46 UTC (permalink / raw)
To: Eric W. Biederman; +Cc: Dmitry Kadashev, Jens Axboe, io-uring, linux-fsdevel
On Wed, Mar 31, 2021 at 11:28:04AM -0500, Eric W. Biederman wrote:
> Al Viro <[email protected]> writes:
>
> > On Mon, Feb 01, 2021 at 03:00:42PM +0000, Al Viro wrote:
> >
> >> The last one is the easiest to answer - we want to keep the imported strings
> >> around for audit. It's not so much a proper refcounting as it is "we might
> >> want freeing delayed" implemented as refcount.
> >
> > BTW, regarding io_uring + audit interactions - just how is that supposed to
> > work if you offload any work that might lead to audit records (on permission
> > checks, etc.) to helper threads?
>
> For people looking into these details. Things have gotten much better
> recently.
>
> The big change is that io_uring helper threads are now proper
> threads of the process that is using io_uring. The io_uring helper
> threads just happen to never execute any userspace code.
audit context is per-thread (as it has to be, obviously - multiple threads
can have overlapping syscalls), so getname()/putname() interplay with that
is still not obvious. I agree that these threads have gotten better,
though.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 1/2] fs: make do_mkdirat() take struct filename
2021-02-01 15:00 ` Al Viro
2021-02-01 15:29 ` Al Viro
@ 2021-02-02 4:39 ` Dmitry Kadashev
1 sibling, 0 replies; 22+ messages in thread
From: Dmitry Kadashev @ 2021-02-02 4:39 UTC (permalink / raw)
To: Al Viro; +Cc: Jens Axboe, io-uring, linux-fsdevel
On Mon, Feb 1, 2021 at 10:00 PM Al Viro <[email protected]> wrote:
>
> On Mon, Feb 01, 2021 at 06:09:01PM +0700, Dmitry Kadashev wrote:
>
> > Hi Al,
> >
> > I think I need more guidance here. First of all, I've based that code on
> > commit 7cdfa44227b0 ("vfs: Fix refcounting of filenames in fs_parser"), which
> > does exactly the same refcount bump in fs_parser.c for filename_lookup(). I'm
> > not saying it's a good excuse to introduce more code like that if that's a bad
> > code though.
>
> It is a bad code. If you look at that function, you'll see that the entire
> mess around put_f is rather hard to follow and reason about. That's a function
> with no users, and I'm not sure we want to keep it long-term.
But the reason for the put_f mess is the fact that the function accepts either a
string (which it resolves to a struct filename that it then owns) or a struct
filename (that it does not own), not the meddling with the refcount. I'm not
trying to argue that we should do the meddling though, I'm fine with the other
approach.
> > What I _am_ saying is we probably want to make the approaches consistent (at
> > least eventually), which means we'd need the same "don't drop the name" variant
> > of filename_lookup?
>
> "don't drop the name on success", similar to what filename_parentat() does.
OK, that makes things much simpler.
> > And given the fact filename_parentat (used from
> > filename_create) drops the name on error it looks like we'd need another copy of
> > it too?
>
> No need.
OK.
> > Do you think it's really worth it or maybe all of these functions will
> > make things more confusing? (from the looks of it right now the convention is
> > that the `struct filename` ownership is always transferred when it is passed as
> > an arg)
> >
> > Also, do you have a good name for such functions that do not drop the name?
> >
> > And, just for my education, can you explain why the reference counting for
> > struct filename exists if it's considered a bad practice to increase the
> > reference counter (assuming the cleanup code is correct)?
>
> The last one is the easiest to answer - we want to keep the imported strings
> around for audit. It's not so much a proper refcounting as it is "we might
> want freeing delayed" implemented as refcount.
>
> As for do_mkdirat(), you probably want semantics similar to do_unlinkat(), i.e.
> have it consume the argument passed to it. The main complication comes
> from ESTALE retries; want -ESTALE from ->mkdir() itself to trigger "redo
> filename_parentat() with LOOKUP_REVAL, then try the rest one more time".
> For which you need to keep filename around. OK, so you want a variant of
> filename_create() that would _not_ consume the filename on success (i.e.
> act as filename_parentat() itself does). Which is trivial to implement -
> just rename filename_create() to __filename_create() and remove one of
> two putname() in there, leaving just the one in failure exits. Then
> filename_create() itself becomes simply
>
> static inline struct dentry *filename_create(int dfd, struct filename *name,
> struct path *path, unsigned int lookup_flags)
> {
> struct dentry *res = __filename_create(dfd, name, path, lookup_flags);
> if (!IS_ERR(res))
> putname(name);
> return res;
> }
>
> and in your do_mkdirat() replacement use
> dentry = __filename_create(dfd, filename, &path, lookup_flags);
> instead of
> dentry = user_path_create(dfd, pathname, &path, lookup_flags);
> and add
> putname(filename);
> in the very end. All it takes...
Yeah, I just was not sure about naming or whether you'd prefer for other
functions to be changed too. You've answered pretty much all my questions and
even more :)
Thanks a lot Al! I'll post v2 soon (since the audit thing you've discovered does
not affect this patch directly).
--
Dmitry Kadashev
^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH 2/2] io_uring: add support for IORING_OP_MKDIRAT
2020-11-16 4:45 [PATCH 0/2] io_uring: add mkdirat support Dmitry Kadashev
2020-11-16 4:45 ` [PATCH 1/2] fs: make do_mkdirat() take struct filename Dmitry Kadashev
@ 2020-11-16 4:45 ` Dmitry Kadashev
2020-12-04 10:57 ` [PATCH 0/2] io_uring: add mkdirat support Dmitry Kadashev
2021-01-26 22:35 ` Jens Axboe
3 siblings, 0 replies; 22+ messages in thread
From: Dmitry Kadashev @ 2020-11-16 4:45 UTC (permalink / raw)
To: io-uring; +Cc: axboe, viro, linux-fsdevel, Dmitry Kadashev
IORING_OP_MKDIRAT behaves like mkdirat(2) and takes the same flags
and arguments.
Signed-off-by: Dmitry Kadashev <[email protected]>
---
fs/io_uring.c | 58 +++++++++++++++++++++++++++++++++++
include/uapi/linux/io_uring.h | 1 +
2 files changed, 59 insertions(+)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 365a583033c5..0848b6c18fa6 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -565,6 +565,13 @@ struct io_unlink {
struct filename *filename;
};
+struct io_mkdir {
+ struct file *file;
+ int dfd;
+ umode_t mode;
+ struct filename *filename;
+};
+
struct io_completion {
struct file *file;
struct list_head list;
@@ -692,6 +699,7 @@ struct io_kiocb {
struct io_shutdown shutdown;
struct io_rename rename;
struct io_unlink unlink;
+ struct io_mkdir mkdir;
/* use only after cleaning per-op data, see io_clean_op() */
struct io_completion compl;
};
@@ -979,6 +987,10 @@ static const struct io_op_def io_op_defs[] = {
.work_flags = IO_WQ_WORK_MM | IO_WQ_WORK_FILES |
IO_WQ_WORK_FS | IO_WQ_WORK_BLKCG,
},
+ [IORING_OP_MKDIRAT] = {
+ .work_flags = IO_WQ_WORK_MM | IO_WQ_WORK_FILES |
+ IO_WQ_WORK_FS | IO_WQ_WORK_BLKCG,
+ },
};
enum io_mem_account {
@@ -3745,6 +3757,44 @@ static int io_unlinkat(struct io_kiocb *req, bool force_nonblock)
return 0;
}
+static int io_mkdirat_prep(struct io_kiocb *req,
+ const struct io_uring_sqe *sqe)
+{
+ struct io_mkdir *mkd = &req->mkdir;
+ const char __user *fname;
+
+ if (unlikely(req->flags & REQ_F_FIXED_FILE))
+ return -EBADF;
+
+ mkd->dfd = READ_ONCE(sqe->fd);
+ mkd->mode = READ_ONCE(sqe->len);
+
+ fname = u64_to_user_ptr(READ_ONCE(sqe->addr));
+ mkd->filename = getname(fname);
+ if (IS_ERR(mkd->filename))
+ return PTR_ERR(mkd->filename);
+
+ req->flags |= REQ_F_NEED_CLEANUP;
+ return 0;
+}
+
+static int io_mkdirat(struct io_kiocb *req, bool force_nonblock)
+{
+ struct io_mkdir *mkd = &req->mkdir;
+ int ret;
+
+ if (force_nonblock)
+ return -EAGAIN;
+
+ ret = do_mkdirat(mkd->dfd, mkd->filename, mkd->mode);
+
+ req->flags &= ~REQ_F_NEED_CLEANUP;
+ if (ret < 0)
+ req_set_fail_links(req);
+ io_req_complete(req, ret);
+ return 0;
+}
+
static int io_shutdown_prep(struct io_kiocb *req,
const struct io_uring_sqe *sqe)
{
@@ -5956,6 +6006,8 @@ static int io_req_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
return io_renameat_prep(req, sqe);
case IORING_OP_UNLINKAT:
return io_unlinkat_prep(req, sqe);
+ case IORING_OP_MKDIRAT:
+ return io_mkdirat_prep(req, sqe);
}
printk_once(KERN_WARNING "io_uring: unhandled opcode %d\n",
@@ -6099,6 +6151,9 @@ static void __io_clean_op(struct io_kiocb *req)
case IORING_OP_UNLINKAT:
putname(req->unlink.filename);
break;
+ case IORING_OP_MKDIRAT:
+ putname(req->mkdir.filename);
+ break;
}
req->flags &= ~REQ_F_NEED_CLEANUP;
}
@@ -6214,6 +6269,9 @@ static int io_issue_sqe(struct io_kiocb *req, bool force_nonblock,
case IORING_OP_UNLINKAT:
ret = io_unlinkat(req, force_nonblock);
break;
+ case IORING_OP_MKDIRAT:
+ ret = io_mkdirat(req, force_nonblock);
+ break;
default:
ret = -EINVAL;
break;
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 6bb8229de892..bc256eab7809 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -137,6 +137,7 @@ enum {
IORING_OP_SHUTDOWN,
IORING_OP_RENAMEAT,
IORING_OP_UNLINKAT,
+ IORING_OP_MKDIRAT,
/* this goes last, obviously */
IORING_OP_LAST,
--
2.28.0
^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH 0/2] io_uring: add mkdirat support
2020-11-16 4:45 [PATCH 0/2] io_uring: add mkdirat support Dmitry Kadashev
2020-11-16 4:45 ` [PATCH 1/2] fs: make do_mkdirat() take struct filename Dmitry Kadashev
2020-11-16 4:45 ` [PATCH 2/2] io_uring: add support for IORING_OP_MKDIRAT Dmitry Kadashev
@ 2020-12-04 10:57 ` Dmitry Kadashev
2020-12-15 11:43 ` Dmitry Kadashev
2021-01-26 22:35 ` Jens Axboe
3 siblings, 1 reply; 22+ messages in thread
From: Dmitry Kadashev @ 2020-12-04 10:57 UTC (permalink / raw)
To: viro; +Cc: axboe, io-uring, linux-fsdevel
On Mon, Nov 16, 2020 at 11:45:27AM +0700, Dmitry Kadashev wrote:
> This adds mkdirat support to io_uring and is heavily based on recently
> added renameat() / unlinkat() support.
>
> The first patch is preparation with no functional changes, makes
> do_mkdirat accept struct filename pointer rather than the user string.
>
> The second one leverages that to implement mkdirat in io_uring.
>
> Based on for-5.11/io_uring.
>
> Dmitry Kadashev (2):
> fs: make do_mkdirat() take struct filename
> io_uring: add support for IORING_OP_MKDIRAT
>
> fs/internal.h | 1 +
> fs/io_uring.c | 58 +++++++++++++++++++++++++++++++++++
> fs/namei.c | 20 ++++++++----
> include/uapi/linux/io_uring.h | 1 +
> 4 files changed, 74 insertions(+), 6 deletions(-)
>
> --
> 2.28.0
>
Hi Al Viro,
Ping. Jens mentioned before that this looks fine by him, but you or
someone from fsdevel should approve the namei.c part first.
Thanks,
Dmitry
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 0/2] io_uring: add mkdirat support
2020-12-04 10:57 ` [PATCH 0/2] io_uring: add mkdirat support Dmitry Kadashev
@ 2020-12-15 11:43 ` Dmitry Kadashev
2020-12-15 16:20 ` Jens Axboe
0 siblings, 1 reply; 22+ messages in thread
From: Dmitry Kadashev @ 2020-12-15 11:43 UTC (permalink / raw)
To: viro; +Cc: Jens Axboe, io-uring, linux-fsdevel
On Fri, Dec 4, 2020 at 5:57 PM Dmitry Kadashev <[email protected]> wrote:
>
> On Mon, Nov 16, 2020 at 11:45:27AM +0700, Dmitry Kadashev wrote:
> > This adds mkdirat support to io_uring and is heavily based on recently
> > added renameat() / unlinkat() support.
> >
> > The first patch is preparation with no functional changes, makes
> > do_mkdirat accept struct filename pointer rather than the user string.
> >
> > The second one leverages that to implement mkdirat in io_uring.
> >
> > Based on for-5.11/io_uring.
> >
> > Dmitry Kadashev (2):
> > fs: make do_mkdirat() take struct filename
> > io_uring: add support for IORING_OP_MKDIRAT
> >
> > fs/internal.h | 1 +
> > fs/io_uring.c | 58 +++++++++++++++++++++++++++++++++++
> > fs/namei.c | 20 ++++++++----
> > include/uapi/linux/io_uring.h | 1 +
> > 4 files changed, 74 insertions(+), 6 deletions(-)
> >
> > --
> > 2.28.0
> >
>
> Hi Al Viro,
>
> Ping. Jens mentioned before that this looks fine by him, but you or
> someone from fsdevel should approve the namei.c part first.
Another ping.
Jens, you've mentioned the patch looks good to you, and with quite similar
changes (unlinkat, renameat) being sent for 5.11 is there anything that I can do
to help this to be accepted (not necessarily for 5.11 at this point)?
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 0/2] io_uring: add mkdirat support
2020-12-15 11:43 ` Dmitry Kadashev
@ 2020-12-15 16:20 ` Jens Axboe
2020-12-16 6:05 ` Dmitry Kadashev
2021-01-20 8:21 ` Dmitry Kadashev
0 siblings, 2 replies; 22+ messages in thread
From: Jens Axboe @ 2020-12-15 16:20 UTC (permalink / raw)
To: Dmitry Kadashev, viro; +Cc: io-uring, linux-fsdevel
On 12/15/20 4:43 AM, Dmitry Kadashev wrote:
> On Fri, Dec 4, 2020 at 5:57 PM Dmitry Kadashev <[email protected]> wrote:
>>
>> On Mon, Nov 16, 2020 at 11:45:27AM +0700, Dmitry Kadashev wrote:
>>> This adds mkdirat support to io_uring and is heavily based on recently
>>> added renameat() / unlinkat() support.
>>>
>>> The first patch is preparation with no functional changes, makes
>>> do_mkdirat accept struct filename pointer rather than the user string.
>>>
>>> The second one leverages that to implement mkdirat in io_uring.
>>>
>>> Based on for-5.11/io_uring.
>>>
>>> Dmitry Kadashev (2):
>>> fs: make do_mkdirat() take struct filename
>>> io_uring: add support for IORING_OP_MKDIRAT
>>>
>>> fs/internal.h | 1 +
>>> fs/io_uring.c | 58 +++++++++++++++++++++++++++++++++++
>>> fs/namei.c | 20 ++++++++----
>>> include/uapi/linux/io_uring.h | 1 +
>>> 4 files changed, 74 insertions(+), 6 deletions(-)
>>>
>>> --
>>> 2.28.0
>>>
>>
>> Hi Al Viro,
>>
>> Ping. Jens mentioned before that this looks fine by him, but you or
>> someone from fsdevel should approve the namei.c part first.
>
> Another ping.
>
> Jens, you've mentioned the patch looks good to you, and with quite
> similar changes (unlinkat, renameat) being sent for 5.11 is there
> anything that I can do to help this to be accepted (not necessarily
> for 5.11 at this point)?
Since we're aiming for 5.12 at this point, let's just hold off a bit and
see if Al gets time to ack/review the VFS side of things. There's no
immediate rush.
It's on my TODO list, so we'll get there eventually.
--
Jens Axboe
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 0/2] io_uring: add mkdirat support
2020-12-15 16:20 ` Jens Axboe
@ 2020-12-16 6:05 ` Dmitry Kadashev
2021-01-20 8:21 ` Dmitry Kadashev
1 sibling, 0 replies; 22+ messages in thread
From: Dmitry Kadashev @ 2020-12-16 6:05 UTC (permalink / raw)
To: Jens Axboe; +Cc: viro, io-uring, linux-fsdevel
On Tue, Dec 15, 2020 at 11:20 PM Jens Axboe <[email protected]> wrote:
>
> On 12/15/20 4:43 AM, Dmitry Kadashev wrote:
> > On Fri, Dec 4, 2020 at 5:57 PM Dmitry Kadashev <[email protected]> wrote:
> >>
> >> On Mon, Nov 16, 2020 at 11:45:27AM +0700, Dmitry Kadashev wrote:
> >>> This adds mkdirat support to io_uring and is heavily based on recently
> >>> added renameat() / unlinkat() support.
> >>>
> >>> The first patch is preparation with no functional changes, makes
> >>> do_mkdirat accept struct filename pointer rather than the user string.
> >>>
> >>> The second one leverages that to implement mkdirat in io_uring.
> >>>
> >>> Based on for-5.11/io_uring.
> >>>
> >>> Dmitry Kadashev (2):
> >>> fs: make do_mkdirat() take struct filename
> >>> io_uring: add support for IORING_OP_MKDIRAT
> >>>
> >>> fs/internal.h | 1 +
> >>> fs/io_uring.c | 58 +++++++++++++++++++++++++++++++++++
> >>> fs/namei.c | 20 ++++++++----
> >>> include/uapi/linux/io_uring.h | 1 +
> >>> 4 files changed, 74 insertions(+), 6 deletions(-)
> >>>
> >>> --
> >>> 2.28.0
> >>>
> >>
> >> Hi Al Viro,
> >>
> >> Ping. Jens mentioned before that this looks fine by him, but you or
> >> someone from fsdevel should approve the namei.c part first.
> >
> > Another ping.
> >
> > Jens, you've mentioned the patch looks good to you, and with quite
> > similar changes (unlinkat, renameat) being sent for 5.11 is there
> > anything that I can do to help this to be accepted (not necessarily
> > for 5.11 at this point)?
>
> Since we're aiming for 5.12 at this point, let's just hold off a bit and
> see if Al gets time to ack/review the VFS side of things. There's no
> immediate rush.
OK, sounds good, thanks.
--
Dmitry Kadashev
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 0/2] io_uring: add mkdirat support
2020-12-15 16:20 ` Jens Axboe
2020-12-16 6:05 ` Dmitry Kadashev
@ 2021-01-20 8:21 ` Dmitry Kadashev
1 sibling, 0 replies; 22+ messages in thread
From: Dmitry Kadashev @ 2021-01-20 8:21 UTC (permalink / raw)
To: Jens Axboe; +Cc: viro, io-uring, linux-fsdevel
On Tue, Dec 15, 2020 at 11:20 PM Jens Axboe <[email protected]> wrote:
>
> On 12/15/20 4:43 AM, Dmitry Kadashev wrote:
> > On Fri, Dec 4, 2020 at 5:57 PM Dmitry Kadashev <[email protected]> wrote:
> >>
> >> On Mon, Nov 16, 2020 at 11:45:27AM +0700, Dmitry Kadashev wrote:
> >>> This adds mkdirat support to io_uring and is heavily based on recently
> >>> added renameat() / unlinkat() support.
> >>>
> >>> The first patch is preparation with no functional changes, makes
> >>> do_mkdirat accept struct filename pointer rather than the user string.
> >>>
> >>> The second one leverages that to implement mkdirat in io_uring.
> >>>
> >>> Based on for-5.11/io_uring.
> >>>
> >>> Dmitry Kadashev (2):
> >>> fs: make do_mkdirat() take struct filename
> >>> io_uring: add support for IORING_OP_MKDIRAT
> >>>
> >>> fs/internal.h | 1 +
> >>> fs/io_uring.c | 58 +++++++++++++++++++++++++++++++++++
> >>> fs/namei.c | 20 ++++++++----
> >>> include/uapi/linux/io_uring.h | 1 +
> >>> 4 files changed, 74 insertions(+), 6 deletions(-)
> >>>
> >>> --
> >>> 2.28.0
> >>>
> >>
> >> Hi Al Viro,
> >>
> >> Ping. Jens mentioned before that this looks fine by him, but you or
> >> someone from fsdevel should approve the namei.c part first.
> >
> > Another ping.
> >
> > Jens, you've mentioned the patch looks good to you, and with quite
> > similar changes (unlinkat, renameat) being sent for 5.11 is there
> > anything that I can do to help this to be accepted (not necessarily
> > for 5.11 at this point)?
>
> Since we're aiming for 5.12 at this point, let's just hold off a bit and
> see if Al gets time to ack/review the VFS side of things. There's no
> immediate rush.
>
> It's on my TODO list, so we'll get there eventually.
Another reminder, since afaict 5.12 stuff is being merged now.
--
Dmitry Kadashev
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 0/2] io_uring: add mkdirat support
2020-11-16 4:45 [PATCH 0/2] io_uring: add mkdirat support Dmitry Kadashev
` (2 preceding siblings ...)
2020-12-04 10:57 ` [PATCH 0/2] io_uring: add mkdirat support Dmitry Kadashev
@ 2021-01-26 22:35 ` Jens Axboe
2021-01-27 11:06 ` Dmitry Kadashev
3 siblings, 1 reply; 22+ messages in thread
From: Jens Axboe @ 2021-01-26 22:35 UTC (permalink / raw)
To: Dmitry Kadashev, io-uring; +Cc: viro, linux-fsdevel
On 11/15/20 9:45 PM, Dmitry Kadashev wrote:
> This adds mkdirat support to io_uring and is heavily based on recently
> added renameat() / unlinkat() support.
>
> The first patch is preparation with no functional changes, makes
> do_mkdirat accept struct filename pointer rather than the user string.
>
> The second one leverages that to implement mkdirat in io_uring.
>
> Based on for-5.11/io_uring.
I want to tentatively queue this up. Do you have the liburing support
and test case(s) for it as well that you can send?
--
Jens Axboe
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 0/2] io_uring: add mkdirat support
2021-01-26 22:35 ` Jens Axboe
@ 2021-01-27 11:06 ` Dmitry Kadashev
2021-01-27 16:22 ` Jens Axboe
0 siblings, 1 reply; 22+ messages in thread
From: Dmitry Kadashev @ 2021-01-27 11:06 UTC (permalink / raw)
To: Jens Axboe; +Cc: io-uring, viro, linux-fsdevel
On Wed, Jan 27, 2021 at 5:35 AM Jens Axboe <[email protected]> wrote:
>
> On 11/15/20 9:45 PM, Dmitry Kadashev wrote:
> > This adds mkdirat support to io_uring and is heavily based on recently
> > added renameat() / unlinkat() support.
> >
> > The first patch is preparation with no functional changes, makes
> > do_mkdirat accept struct filename pointer rather than the user string.
> >
> > The second one leverages that to implement mkdirat in io_uring.
> >
> > Based on for-5.11/io_uring.
>
> I want to tentatively queue this up. Do you have the liburing support
> and test case(s) for it as well that you can send?
I do, I've sent it in the past, here it is:
https://lore.kernel.org/io-uring/[email protected]/
I need to (figure out the way to) fix the kernel / namei side after Al's
comments though.
--
Dmitry Kadashev
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 0/2] io_uring: add mkdirat support
2021-01-27 11:06 ` Dmitry Kadashev
@ 2021-01-27 16:22 ` Jens Axboe
0 siblings, 0 replies; 22+ messages in thread
From: Jens Axboe @ 2021-01-27 16:22 UTC (permalink / raw)
To: Dmitry Kadashev; +Cc: io-uring, viro, linux-fsdevel
On 1/27/21 4:06 AM, Dmitry Kadashev wrote:
> On Wed, Jan 27, 2021 at 5:35 AM Jens Axboe <[email protected]> wrote:
>>
>> On 11/15/20 9:45 PM, Dmitry Kadashev wrote:
>>> This adds mkdirat support to io_uring and is heavily based on recently
>>> added renameat() / unlinkat() support.
>>>
>>> The first patch is preparation with no functional changes, makes
>>> do_mkdirat accept struct filename pointer rather than the user string.
>>>
>>> The second one leverages that to implement mkdirat in io_uring.
>>>
>>> Based on for-5.11/io_uring.
>>
>> I want to tentatively queue this up. Do you have the liburing support
>> and test case(s) for it as well that you can send?
>
> I do, I've sent it in the past, here it is:
> https://lore.kernel.org/io-uring/[email protected]/
I thought so, thanks. I'll queue it up once we have agreement on the
kernel side.
> I need to (figure out the way to) fix the kernel / namei side after Al's
> comments though.
Thanks, yes please do and re-post it.
--
Jens Axboe
^ permalink raw reply [flat|nested] 22+ messages in thread