From: Linus Torvalds <[email protected]>
To: Jeremy Allison <[email protected]>
Cc: Andy Lutomirski <[email protected]>, Jens Axboe <[email protected]>,
Linux API Mailing List <[email protected]>,
Dave Chinner <[email protected]>,
"[email protected]" <[email protected]>,
Matthew Wilcox <[email protected]>,
Stefan Metzmacher <[email protected]>,
Al Viro <[email protected]>,
linux-fsdevel <[email protected]>,
Samba Technical <[email protected]>,
io-uring <[email protected]>
Subject: Re: copy on write for splice() from file to pipe?
Date: Fri, 10 Feb 2023 11:42:17 -0800
Message-ID: <CAHk-=wgztjawo+nPjnJJdOe8rHcTYznD6u34TBzdstuTjpotbg@mail.gmail.com>
In-Reply-To: <Y+aat8sggTtgff+A@jeremy-acer>
On Fri, Feb 10, 2023 at 11:27 AM Jeremy Allison <[email protected]> wrote:
>
> 1). Client opens file with a lease. Hurrah, we think we can use splice() !
> 2). Client writes into file.
> 3). Client calls SMB_FLUSH to ensure data is on disk.
> 4). Client reads the data just written to ensure it's good.
> 5). Client overwrites the previously written data.
>
> Now when client issues (4), the read request, if we
> zero-copy using splice() - I don't think there's a way
> we get notified when the data has finally left the
> system and the mapped splice memory in the buffer cache
> is safe to overwrite by the write (5).
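
The scenario in steps (2) through (5) can be sketched in user space as
follows. This is only an illustration under stated assumptions, not
Samba's code path: the function name `splice_then_overwrite`, the
4096-byte size and the temporary file are all hypothetical, and a pipe
stands in for the network send. Whether the pipe yields the old or the
new bytes depends on the kernel and filesystem, so the sketch does not
assume either result.

```python
import os
import tempfile


def splice_then_overwrite(size=4096):
    """Splice file data into a pipe, overwrite the file, then drain the pipe.

    Returns the bytes read back from the pipe, or None where os.splice
    is unavailable (it needs Linux and Python 3.10 or later).
    """
    if not hasattr(os, "splice"):
        return None
    fd, path = tempfile.mkstemp()
    r, w = os.pipe()
    try:
        os.write(fd, b"A" * size)                  # step (2): client write
        os.fsync(fd)                               # step (3): flush to disk
        n = os.splice(fd, w, size, offset_src=0)   # step (4): zero-copy "read"
        os.pwrite(fd, b"B" * size, 0)              # step (5): overwrite
        # The pipe is drained only now, after the overwrite has happened.
        return os.read(r, n)
    finally:
        os.close(r)
        os.close(w)
        os.close(fd)
        os.unlink(path)
```

If the returned buffer contains `B` bytes, the overwrite in step (5)
was visible through the pages already queued in the pipe, which is the
behaviour being asked about.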

Well, but we know that either:

 (a) the client has already gotten the read reply, and does the write
     afterwards. So (4) has already not just left the network stack, but
     actually made it all the way to the client.

OR

 (b) (4) and (5) clearly aren't ordered on the client side (ie your
     "client" is not one single thread, and did an independent read and
     overlapping write), and the client can't rely on one happening before
     the other _anyway_.

So if it's (b), then you might as well do the write first, because
there's simply no ordering between the two. If you have a concurrent
read and a concurrent write to the same file, the read result is going
to be random anyway.

(And yes, you can find POSIX language specifying that writes are atomic
"all or nothing" operations, but Linux has actually never done that,
and it's literally a nonsensical requirement and not actually true in
any system: try doing a single gigabyte "write()" system call, and at
a minimum you'll see the file size grow when doing "stat()" calls in
another window. So it's purely "POSIX says that, but it bears no
relationship to the truth".)
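
The "stat() in another window" experiment can be approximated in one
process. This is a rough sketch under assumptions: the name
`observe_sizes_during_write` is made up for illustration, a polling
thread stands in for the second window, and the default is 64 MiB
rather than a full gigabyte to keep the run short. How many
intermediate sizes show up, if any, depends on the filesystem and on
timing.

```python
import os
import tempfile
import threading


def observe_sizes_during_write(total=64 * 1024 * 1024):
    """Issue one large write() while another thread polls stat().

    Returns (bytes_written, list_of_observed_file_sizes).
    """
    fd, path = tempfile.mkstemp()
    sizes = []
    done = threading.Event()

    def poll():
        # Record the file size repeatedly while the write is in flight.
        while not done.is_set():
            sizes.append(os.stat(path).st_size)
        # One last sample after the write has returned.
        sizes.append(os.stat(path).st_size)

    watcher = threading.Thread(target=poll)
    watcher.start()
    try:
        # A single write() system call covering the whole buffer.
        written = os.write(fd, bytes(total))
    finally:
        done.set()
        watcher.join()
        os.close(fd)
        os.unlink(path)
    return written, sizes
```

Any sampled size strictly between 0 and the returned byte count is a
partially completed write that another observer was able to see.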

Or am I missing something?

               Linus