From: Dylan Yudaken <[email protected]>
To: "[email protected]" <[email protected]>,
"[email protected]" <[email protected]>,
"[email protected]" <[email protected]>
Cc: Kernel Team <[email protected]>,
"[email protected]" <[email protected]>
Subject: Re: [PATCH v2 4/4] io_uring: pre-increment f_pos on rw
Date: Tue, 22 Feb 2022 08:26:52 +0000
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>
On Mon, 2022-02-21 at 18:00 +0000, Pavel Begunkov wrote:
> On 2/21/22 14:16, Dylan Yudaken wrote:
> > In read/write ops, preincrement f_pos when no offset is specified, and
> > then attempt to fix up the position after IO completes if it completed
> > less than expected. This fixes the problem where multiple queued up IO
> > will all obtain the same f_pos, and so perform the same read/write.
> >
> > This is still not as consistent as sync r/w, as it is able to advance
> > the file offset past the end of the file. It seems it would be quite a
> > performance hit to work around this limitation - such as by keeping
> > track of concurrent operations - and the downside does not seem to be
> > too problematic.
> >
> > The attempt to fix up the f_pos after will at least mean that in
> > situations where a single operation is run, then the position will be
> > consistent.
> >
> > Co-developed-by: Jens Axboe <[email protected]>
> > Signed-off-by: Jens Axboe <[email protected]>
> > Signed-off-by: Dylan Yudaken <[email protected]>
> > ---
> > fs/io_uring.c | 81 ++++++++++++++++++++++++++++++++++++++++++----
> > -----
> > 1 file changed, 68 insertions(+), 13 deletions(-)
> >
> > diff --git a/fs/io_uring.c b/fs/io_uring.c
> > index abd8c739988e..a951d0754899 100644
> > --- a/fs/io_uring.c
> > +++ b/fs/io_uring.c
> > @@ -3066,21 +3066,71 @@ static inline void io_rw_done(struct kiocb *kiocb, ssize_t ret)
>
> [...]
>
> > + return false;
> > }
> > }
> > - return is_stream ? NULL : &kiocb->ki_pos;
> > + *ppos = is_stream ? NULL : &kiocb->ki_pos;
> > + return false;
> > +}
> > +
> > +static inline void
> > +io_kiocb_done_pos(struct io_kiocb *req, struct kiocb *kiocb, u64 actual)
>
> That's a lot of inlining, I wouldn't be surprised if the compiler
> will even refuse to do that.
>
> __io_kiocb_done_pos() {
> // rest of it
> }
>
> inline io_kiocb_done_pos() {
> if (!(flags & CUR_POS))
> return;
> __io_kiocb_done_pos();
> }
>
> io_kiocb_update_pos() is huge as well
Good idea, will split the slower paths out.
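To make the suggested split concrete, here is a minimal user-space sketch of the pattern: an inline wrapper that only tests the flag, and an out-of-line helper holding the rest. It is not the kernel code. `struct demo_req`, `DEMO_F_CUR_POS`, `demo_done_pos()` and `demo_done_pos_slow()` are hypothetical names for illustration, the locking from the patch is left out, and `__attribute__((noinline))` assumes GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for REQ_F_CUR_POS; not the kernel's value. */
#define DEMO_F_CUR_POS (1u << 0)

/* Hypothetical, trimmed-down request: only what the fix-up needs. */
struct demo_req {
	unsigned int flags;
	uint64_t len;     /* bytes requested */
	int64_t ki_pos;   /* position after the IO finished */
	int64_t *f_pos;   /* shared file position, pre-incremented by len */
};

/* Slow path: kept out of line so callers only pay for a flag test. */
static __attribute__((noinline)) void
demo_done_pos_slow(struct demo_req *req, uint64_t actual)
{
	uint64_t expected = req->len;

	assert(req->f_pos != NULL);
	if (actual >= expected)
		return;
	/* Rewind only if nothing else moved f_pos since the pre-increment. */
	if (*req->f_pos == req->ki_pos + (int64_t)(expected - actual))
		*req->f_pos = req->ki_pos;
}

/* Fast path: small enough that the compiler can reasonably inline it. */
static inline void demo_done_pos(struct demo_req *req, uint64_t actual)
{
	if (!(req->flags & DEMO_F_CUR_POS))
		return;
	demo_done_pos_slow(req, actual);
}
```

Requests without the flag return after a single test in the caller; only the flagged ones make the function call.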
>
> > +{
> > + u64 expected;
> > +
> > + if (likely(!(req->flags & REQ_F_CUR_POS)))
> > + return;
> > +
> > + expected = req->rw.len;
> > + if (actual >= expected)
> > + return;
> > +
> > + /*
> > + * It's not definitely safe to lock here, and the assumption is,
> > + * that if we cannot lock the position that it will be changing,
> > + * and if it will be changing - then we can't update it anyway
> > + */
> > + if (req->file->f_mode & FMODE_ATOMIC_POS
> > + && !mutex_trylock(&req->file->f_pos_lock))
> > + return;
> > +
> > + /*
> > + * now we want to move the pointer, but only if everything is consistent
> > + * with how we left it originally
> > + */
> > + if (req->file->f_pos == kiocb->ki_pos + (expected - actual))
> > + req->file->f_pos = kiocb->ki_pos;
>
> I wonder, is it good enough / safe to just assign it considering that
> the request was executed outside of locks? vfs_seek()?
No, I do not think so: with multiple r/w in flight the same thing will
happen, even with no vfs_seek().
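As an illustration of the arithmetic in the quoted hunk, here is a small user-space model of the pre-increment at submission and the conditional rewind at completion. It is only a sketch under assumptions: `struct demo_file`, `struct demo_io`, `demo_submit()` and `demo_complete()` are hypothetical names, there is no locking, and everything runs on one thread, so it shows the comparison only and not the race being discussed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a file that only tracks its shared position. */
struct demo_file {
	int64_t f_pos;
};

/* Hypothetical model of one queued read/write with no explicit offset. */
struct demo_io {
	int64_t start;   /* offset claimed at submission */
	uint64_t len;    /* bytes requested */
};

/* Submission: claim [f_pos, f_pos + len) and advance f_pos up front. */
static void demo_submit(struct demo_file *f, struct demo_io *io, uint64_t len)
{
	io->start = f->f_pos;
	io->len = len;
	f->f_pos += (int64_t)len;
}

/*
 * Completion: on a short transfer, rewind f_pos only if it still holds the
 * value this request left behind. Returns 1 if f_pos was rewound, else 0.
 */
static int demo_complete(struct demo_file *f, const struct demo_io *io,
			 uint64_t actual)
{
	int64_t ki_pos;

	assert(actual <= io->len);
	if (actual == io->len)
		return 0;
	ki_pos = io->start + (int64_t)actual;
	if (f->f_pos != ki_pos + (int64_t)(io->len - actual))
		return 0;	/* someone else moved f_pos; leave it alone */
	f->f_pos = ki_pos;
	return 1;
}
```

With a single 100-byte request that completes after 40 bytes, the check passes and the position is rewound to 40. With two 100-byte requests queued from position 0, the position is already 200 when the first one completes short, so the check fails and the position stays at 200.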
>
> > +
> > + /* else something else messed with f_pos and we can't do anything */
> > +
> > + if (req->file->f_mode & FMODE_ATOMIC_POS)
> > + mutex_unlock(&req->file->f_pos_lock);
> > }
>
> Do we even care about races while reading it? E.g.
> pos = READ_ONCE();
I think so: if I remove all the locks, the test cases fail.
>
> >
> > - ppos = io_kiocb_update_pos(req, kiocb);
> > -
> > ret = rw_verify_area(READ, req->file, ppos, req->result);
> > if (unlikely(ret)) {
> > kfree(iovec);
> > + io_kiocb_done_pos(req, kiocb, 0);
>
> Why do we update it on failure?
>
> [...]
>
> > - ppos = io_kiocb_update_pos(req, kiocb);
> > -
> > ret = rw_verify_area(WRITE, req->file, ppos, req->result);
> > if (unlikely(ret))
> > goto out_free;
> > @@ -3858,6 +3912,7 @@ static int io_write(struct io_kiocb *req, unsigned int issue_flags)
> > return ret ?: -EAGAIN;
> > }
> > out_free:
> > + io_kiocb_done_pos(req, kiocb, 0);
>
> Looks weird. It appears we don't need it on failure and
> successes are covered by kiocb_done() / ->ki_complete
>
> > /* it's reportedly faster than delegating the null check to kfree() */
> > if (iovec)
> > kfree(iovec);
>