From: Lennert Buytenhek <[email protected]>
To: Pavel Begunkov <[email protected]>
Cc: Jens Axboe <[email protected]>, Al Viro <[email protected]>,
[email protected], [email protected],
[email protected], David Laight <[email protected]>,
Matthew Wilcox <[email protected]>
Subject: Re: [PATCH v3 2/2] io_uring: add support for IORING_OP_GETDENTS
Date: Fri, 19 Feb 2021 20:06:37 +0200
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>
On Fri, Feb 19, 2021 at 12:05:58PM +0000, Pavel Begunkov wrote:
> > IORING_OP_GETDENTS behaves much like getdents64(2) and takes the same
> > arguments, but with a small twist: it takes an additional offset
> > argument, and reading from the specified directory starts at the given
> > offset.
> >
> > For the first IORING_OP_GETDENTS call on a directory, the offset
> > parameter can be set to zero, and for subsequent calls, it can be
> > set to the ->d_off field of the last struct linux_dirent64 returned
> > by the previous IORING_OP_GETDENTS call.
> >
> > Internally, if necessary, IORING_OP_GETDENTS will vfs_llseek() to
> > the right directory position before calling vfs_getdents().
> >
> > IORING_OP_GETDENTS may or may not update the specified directory's
> > file offset, and the file offset should not be relied upon having
> > any particular value during or after an IORING_OP_GETDENTS call.
> >
> > Signed-off-by: Lennert Buytenhek <[email protected]>
> > ---
> > fs/io_uring.c | 73 +++++++++++++++++++++++++++++++++++
> > include/uapi/linux/io_uring.h | 1 +
> > 2 files changed, 74 insertions(+)
> >
> > diff --git a/fs/io_uring.c b/fs/io_uring.c
> > index 056bd4c90ade..6853bf48369a 100644
> > --- a/fs/io_uring.c
> > +++ b/fs/io_uring.c
> > @@ -635,6 +635,13 @@ struct io_mkdir {
> >  	struct filename *filename;
> >  };
> >
> [...]
> > +static int io_getdents(struct io_kiocb *req, unsigned int issue_flags)
> > +{
> > +	struct io_getdents *getdents = &req->getdents;
> > +	bool pos_unlock = false;
> > +	int ret = 0;
> > +
> > +	/* getdents always requires a blocking context */
> > +	if (issue_flags & IO_URING_F_NONBLOCK)
> > +		return -EAGAIN;
> > +
> > +	/* for vfs_llseek and to serialize ->iterate_shared() on this file */
> > +	if (file_count(req->file) > 1) {
>
> Looks racy, is it safe? E.g. can be concurrently dupped and used, or
> just several similar IORING_OP_GETDENTS requests.

I initially thought it was safe, but on reflection it is not: if you
use IORING_REGISTER_FILES to register the dirfd and then close the
dirfd, file_count() will be 1 even though you can still submit
concurrent operations against the registered file. So I'll remove the
conditional locking. Thanks!
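
To make the failing sequence concrete, here is a toy user-space model of
the reference counts involved. It is not kernel code: struct toy_file and
the toy_*() helpers are made up for illustration, and it assumes (as the
paragraph above implies) that requests against a registered file take no
additional per-request reference.

```c
#include <stdbool.h>

/* Toy stand-in for struct file: only the reference count is modelled. */
struct toy_file {
	long count;
};

/* open(2): the fdtable slot holds the first reference. */
void toy_open(struct toy_file *f)
{
	f->count = 1;
}

/* IORING_REGISTER_FILES: the ring's registered-file table takes a reference. */
void toy_register(struct toy_file *f)
{
	f->count++;
}

/* close(2): the fdtable reference is dropped. */
void toy_close(struct toy_file *f)
{
	f->count--;
}

/* The check from the patch: f_pos_lock is taken only if file_count() > 1. */
bool toy_would_lock(const struct toy_file *f)
{
	return f->count > 1;
}
```

After toy_open(), toy_register() and toy_close(), the count is back to 1
and toy_would_lock() returns false, yet nothing in that state prevents two
requests from being issued against the registered slot at the same time.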

(If not for IORING_REGISTER_FILES, it seems safe, because then
io_file_get() will hold at least one reference on the file while the
operation is in flight. So if file_count(req->file) == 1 here, the
file is no longer referenced by any fdtable, and nobody else should be
able to get a reference to it -- but that's a bit of a useless
optimization.)

(The logic was taken from __fdget_pos(), where it is safe for a
different reason: __fget_light() skips bumping the refcount only if
current->files is unshared.)

> > +		pos_unlock = true;
> > +		mutex_lock(&req->file->f_pos_lock);
> > +	}
> > +
> > +	if (req->file->f_pos != getdents->pos) {
> > +		loff_t res = vfs_llseek(req->file, getdents->pos, SEEK_SET);
>
> I may be missing the previous discussions, but can this ever become
> stateless, like passing an offset? Including readdir.c and beyond.

My aim was to make only the minimal required change initially, while
keeping that optimization possible in the future (e.g. by reserving the
right to either update or not update the file position) -- but I'll
try doing the optimization now.
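
As a user-space point of comparison (plain getdents64(2) plus lseek(2),
not the io_uring interface), the seek-then-read approach in this version
of the patch can be sketched as below. getdents_at() and
count_dir_entries() are hypothetical helper names; the struct layout is
the one documented in the getdents64(2) man page.

```c
#define _GNU_SOURCE
#define _FILE_OFFSET_BITS 64
#include <fcntl.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Record layout returned by getdents64(2), per its man page. */
struct linux_dirent64 {
	uint64_t	d_ino;
	int64_t		d_off;
	unsigned short	d_reclen;
	unsigned char	d_type;
	char		d_name[];
};

/*
 * Hypothetical helper: seek to pos only if the directory is not already
 * there, then read one batch of entries.  Returns the number of bytes
 * read, 0 at end of directory, or -1 on error.
 */
long getdents_at(int fd, off_t pos, void *buf, size_t count)
{
	if (lseek(fd, 0, SEEK_CUR) != pos && lseek(fd, pos, SEEK_SET) < 0)
		return -1;
	return syscall(SYS_getdents64, fd, buf, count);
}

/*
 * Hypothetical helper: walk a whole directory, starting at offset 0 and
 * passing the d_off of the last entry seen as the offset of the next call.
 * Returns the number of entries, or -1 on error.
 */
long count_dir_entries(const char *path)
{
	long long buf[512];	/* 4 KiB, suitably aligned for the records */
	off_t pos = 0;
	long total = 0, len;
	int fd = open(path, O_RDONLY | O_DIRECTORY);

	if (fd < 0)
		return -1;
	while ((len = getdents_at(fd, pos, buf, sizeof(buf))) > 0) {
		for (long i = 0; i < len; ) {
			const struct linux_dirent64 *d =
				(const void *)((const char *)buf + i);
			pos = d->d_off;		/* offset for the next call */
			i += d->d_reclen;
			total++;
		}
	}
	close(fd);
	return len < 0 ? -1 : total;
}
```

A stateless variant would hand pos directly to the directory iteration
instead of going through the file position, which is the optimization
discussed above.
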
> > +		if (res < 0)
> > +			ret = res;
> > +	}
> > +
> > +	if (ret == 0) {
> > +		ret = vfs_getdents(req->file, getdents->dirent,
> > +				   getdents->count);
> > +	}
> > +
> > +	if (pos_unlock)
> > +		mutex_unlock(&req->file->f_pos_lock);
> > +
> > +	if (ret < 0) {
> > +		if (ret == -ERESTARTSYS)
> > +			ret = -EINTR;
> > +		req_set_fail_links(req);
> > +	}
> > +	io_req_complete(req, ret);
> > +	return 0;
> > +}
> [...]
>
> --
> Pavel Begunkov