public inbox for [email protected]
 help / color / mirror / Atom feed
From: Christian Brauner <[email protected]>
To: Hao Xu <[email protected]>
Cc: Dave Chinner <[email protected]>,
	Pavel Begunkov <[email protected]>,
	[email protected], Jens Axboe <[email protected]>,
	[email protected],
	Dominique Martinet <[email protected]>,
	Alexander Viro <[email protected]>,
	Stefan Roesch <[email protected]>, Clay Harris <[email protected]>,
	[email protected], Wanpeng Li <[email protected]>,
	[email protected]
Subject: Re: [PATCH 3/5] io_uring: add support for getdents
Date: Mon, 31 Jul 2023 09:50:26 +0200	[thread overview]
Message-ID: <20230731-erwidern-geleast-d9033413683d@brauner> (raw)
In-Reply-To: <[email protected]>

On Mon, Jul 31, 2023 at 03:34:48PM +0800, Hao Xu wrote:
> On 7/31/23 09:58, Dave Chinner wrote:
> > On Thu, Jul 27, 2023 at 06:28:52PM +0200, Christian Brauner wrote:
> > > On Thu, Jul 27, 2023 at 05:17:30PM +0100, Pavel Begunkov wrote:
> > > > On 7/27/23 16:52, Christian Brauner wrote:
> > > > > On Thu, Jul 27, 2023 at 04:12:12PM +0100, Pavel Begunkov wrote:
> > > > > It would also solve it for writes which is what my kiocb_modified()
> > > > > comment was about. So right now you have:
> > > > 
> > > > Great, I assumed there are stricter requirements for mtime not
> > > > transiently failing.
> > > 
> > > But I mean then wouldn't this already be a problem today?
> > > kiocb_modified() can error out with EAGAIN today:
> > > 
> > >            ret = inode_needs_update_time(inode, &now);
> > >            if (ret <= 0)
> > >                    return ret;
> > >            if (flags & IOCB_NOWAIT)
> > >                    return -EAGAIN;
> > > 
> > >            return __file_update_time(file, &now, ret);
> > > 
> > > the thing is that it doesn't matter for ->write_iter() - for xfs at
> > > least - because xfs does it as part of preparatory checks before
> > > actually doing any real work. The problem happens when you do actual
> > > work and afterwards call kiocb_modified(). That's why I think (2) is
> > > preferable.
> > 
> > This has nothing to do with what "XFS does". It's actually an
> > IOCB_NOWAIT API design constraint.
> > 
> > That is, IOCB_NOWAIT means "complete the whole operation without
> > blocking or return -EAGAIN having done nothing".  If we have to do
> > something that might block (like a timestamp update) then we need to
> > punt the entire operation before anything has been modified.  This
> > requires all the "do we need to modify this" checks to be done up
> > front before we start modifying anything.
> > 
> > So while it looks like this might be "an XFS thing", that's because
> > XFS tends to be the first filesystem that most io_uring NOWAIT
> > functionality is implemented on. IOWs, what you see is XFS is doing
> > things the way IOCB_NOWAIT requires to be done. i.e. it's a
> > demonstration of how nonblocking filesystem modification operations
> > need to be run, not an "XFS thing"...
> > 
> > > > > I would prefer 2) which seems cleaner to me. But I might miss why this
> > > > > won't work. So input needed/wanted.
> > > > 
> > > > Maybe I didn't fully grasp the (2) idea
> > > > 
> > > > 2.1: all read_iter, write_iter, etc. callbacks should do file_accessed()
> > > > before doing IO, which sounds like a good option if everyone agrees with
> > > > that. Taking a look at direct block io, it's already like this.
> > > 
> > > Yes, that's what I'm talking about. I'm asking whether that's ok for xfs
> > > maintainers basically. i_op->write_iter() already works like that since
> > > the dawn of time but i_op->read_iter doesn't and I'm proposing to make
> > > it work like that and wondering if there's any issues I'm unaware of.
> > 
> > XFS already calls file_accessed() in the DIO read path before the
> > read gets issued. I don't see any problem with lifting it to before
> 
> Hi Dave,
> 
> Here I've a question, in DIO read path, if we update the time but
> later somehow got errors before actual reading, e.g. return -EAGAIN
> from the xfs_ilock_iocb(), shouldn't we revert the time update since
> we actually doesn't read the file? We can lazily update the time but
> on the contrary a false update sounds weird to me.

You've read the file and you failed but you did access the file. We have
the same logic kinda for readdir in the sense that we call
file_accessed() no matter if ->iterate_shared() or ->iterate() succeeded
or not.

Plus, you should just be able to reorder the read checks to mirror the
write checks. Afaict, the order of the write checks are:

xfs_file_write_checks()
-> xfs_ilock_iocb()
-> kiocb_modified()

so we can just do:

+ xfs_file_read_checks()
  +-> xfs_ilock_iocb()
      +-> file_accessed()/kiocb_accessed()

for the read checks. Then we managed to acquire the necessary lock.

  reply	other threads:[~2023-07-31  7:50 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-07-18 13:21 [PATCH v4 0/5] io_uring getdents Hao Xu
2023-07-18 13:21 ` [PATCH 1/5] fs: split off vfs_getdents function of getdents64 syscall Hao Xu
2023-07-18 13:21 ` [PATCH 2/5] vfs_getdents/struct dir_context: add flags field Hao Xu
2023-07-18 13:21 ` [PATCH 3/5] io_uring: add support for getdents Hao Xu
2023-07-19  8:56   ` Hao Xu
2023-07-26 15:00   ` Christian Brauner
2023-07-27 11:51     ` Hao Xu
2023-07-27 14:27       ` Christian Brauner
2023-07-27 15:12         ` Pavel Begunkov
2023-07-27 15:52           ` Christian Brauner
2023-07-27 16:17             ` Pavel Begunkov
2023-07-27 16:28               ` Christian Brauner
2023-07-31  1:58                 ` Dave Chinner
2023-07-31  7:34                   ` Hao Xu
2023-07-31  7:50                     ` Christian Brauner [this message]
2023-07-31  7:40                   ` Christian Brauner
2023-07-30 18:02         ` Hao Xu
2023-07-31  8:18           ` Christian Brauner
2023-07-31  9:31             ` Hao Xu
2023-07-31  1:33         ` Dave Chinner
2023-07-31  8:13           ` Christian Brauner
2023-07-31 15:26             ` Darrick J. Wong
2023-07-31 22:18               ` Dave Chinner
2023-08-01  0:28               ` Jens Axboe
2023-08-01  0:47                 ` Matthew Wilcox
2023-08-01  0:49                   ` Jens Axboe
2023-08-01  1:01                     ` Matthew Wilcox
2023-08-01  7:00                       ` Christian Brauner
2023-08-01  6:59                     ` Christian Brauner
2023-08-01  7:17                 ` Christian Brauner
2023-08-08  4:34                 ` Hao Xu
2023-08-08  5:18                   ` Hao Xu
2023-08-08  9:33                 ` Hao Xu
2023-08-08 22:55                   ` Jens Axboe
2023-08-01 18:39             ` Hao Xu
2023-07-18 13:21 ` [PATCH 4/5] xfs: add NOWAIT semantics for readdir Hao Xu
2023-07-19  2:35   ` kernel test robot
2023-07-18 13:21 ` [PATCH RFC 5/5] disable fixed file for io_uring getdents for now Hao Xu
2023-07-26 14:23   ` Christian Brauner
2023-07-27 12:09     ` Hao Xu
2023-07-19  6:04 ` [PATCH v4 0/5] io_uring getdents Christian Brauner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230731-erwidern-geleast-d9033413683d@brauner \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox