From: Stefan Metzmacher <[email protected]>
To: Jeremy Allison <[email protected]>,
Linus Torvalds <[email protected]>
Cc: Andy Lutomirski <[email protected]>, Jens Axboe <[email protected]>,
Linux API Mailing List <[email protected]>,
Dave Chinner <[email protected]>,
"[email protected]" <[email protected]>,
Matthew Wilcox <[email protected]>,
Al Viro <[email protected]>,
linux-fsdevel <[email protected]>,
Samba Technical <[email protected]>,
io-uring <[email protected]>
Subject: Re: copy on write for splice() from file to pipe?
Date: Fri, 10 Feb 2023 20:42:05 +0100 [thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <Y+aat8sggTtgff+A@jeremy-acer>
Am 10.02.23 um 20:27 schrieb Jeremy Allison:
> On Fri, Feb 10, 2023 at 11:18:05AM -0800, Linus Torvalds via samba-technical wrote:
>>
>> We should point the fingers at either the _user_ of splice - as Jeremy
>> Allison has done a couple of times - or we should point it at the sink
>> that cannot deal with unstable sources.
>> ....
>> - it sounds like the particular user in question (samba) already very
>> much has a reasonable model for "I have exclusive access to this" that
>> just wasn't used
>
> Having said that, I just had a phone discussion with Ralph Boehme
> on the Samba Team, who has been following along with this in
> read-only mode, and he did point out one case I had missed.
>
> 1). Client opens file with a lease. Hurrah, we think we can use splice() !
> 2). Client writes into file.
> 3). Client calls SMB_FLUSH to ensure data is on disk.
> 4). Client reads the data just wrtten to ensure it's good.
> 5). Client overwrites the previously written data.
>
> Now when client issues (4), the read request, if we
> zero-copy using splice() - I don't think theres a way
> we get notified when the data has finally left the
> system and the mapped splice memory in the buffer cache
> is safe to overwrite by the write (5).
>
> So the read in (4) could potentially return the data
> written in (5), if the buffer cache mapped memory has
> not yet been sent out over the network.
>
> That is certainly unexpected behavior for the client,
> even if the client leased the file.
>
> If that's the case, then splice() is unusable for
> Samba even in the leased file case.
I think we just need some coordination in userspace.
What might be helpful in addition would be some kind of
notification that all pages are no longer used by the network
layer, IORING_OP_SENDMSG_ZC already supports such a notification,
maybe we can build something similar.
>> Maybe this thread raised some awareness of it for some people, but
>> more realistically - maybe we can really document this whole issue
>> somewhere much more clearly
>
> Complete comprehensive documentation on this would
> be extremely helpful (to say the least :-).
Yes, good documentation is always good :-)
next prev parent reply other threads:[~2023-02-10 19:43 UTC|newest]
Thread overview: 64+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-02-09 13:55 copy on write for splice() from file to pipe? Stefan Metzmacher
2023-02-09 14:11 ` Matthew Wilcox
2023-02-09 14:29 ` Stefan Metzmacher
2023-02-09 16:41 ` Linus Torvalds
2023-02-09 19:17 ` Stefan Metzmacher
2023-02-09 19:36 ` Linus Torvalds
2023-02-09 19:48 ` Linus Torvalds
2023-02-09 20:33 ` Jeremy Allison
2023-02-10 20:45 ` Stefan Metzmacher
2023-02-10 20:51 ` Linus Torvalds
2023-02-10 2:16 ` Dave Chinner
2023-02-10 4:06 ` Dave Chinner
2023-02-10 4:44 ` Matthew Wilcox
2023-02-10 6:57 ` Dave Chinner
2023-02-10 15:14 ` Andy Lutomirski
2023-02-10 16:33 ` Linus Torvalds
2023-02-10 17:57 ` Andy Lutomirski
2023-02-10 18:19 ` Jeremy Allison
2023-02-10 19:29 ` Stefan Metzmacher
2023-02-10 18:37 ` Linus Torvalds
2023-02-10 19:01 ` Andy Lutomirski
2023-02-10 19:18 ` Linus Torvalds
2023-02-10 19:27 ` Jeremy Allison
2023-02-10 19:42 ` Stefan Metzmacher [this message]
2023-02-10 19:42 ` Linus Torvalds
2023-02-10 19:54 ` Stefan Metzmacher
2023-02-10 19:29 ` Linus Torvalds
2023-02-13 9:07 ` Herbert Xu
2023-02-10 19:55 ` Andy Lutomirski
2023-02-10 20:27 ` Linus Torvalds
2023-02-10 20:32 ` Jens Axboe
2023-02-10 20:36 ` Linus Torvalds
2023-02-10 20:39 ` Jens Axboe
2023-02-10 20:44 ` Linus Torvalds
2023-02-10 20:50 ` Jens Axboe
2023-02-10 21:14 ` Andy Lutomirski
2023-02-10 21:27 ` Jens Axboe
2023-02-10 21:51 ` Jens Axboe
2023-02-10 22:08 ` Linus Torvalds
2023-02-10 22:16 ` Jens Axboe
2023-02-10 22:17 ` Linus Torvalds
2023-02-10 22:25 ` Jens Axboe
2023-02-10 22:35 ` Linus Torvalds
2023-02-10 22:51 ` Jens Axboe
2023-02-11 3:18 ` Ming Lei
2023-02-11 6:17 ` Ming Lei
2023-02-11 14:13 ` Jens Axboe
2023-02-11 15:05 ` Ming Lei
2023-02-11 15:33 ` Jens Axboe
2023-02-11 18:57 ` Linus Torvalds
2023-02-12 2:46 ` Jens Axboe
2023-02-10 4:47 ` Linus Torvalds
2023-02-10 6:19 ` Dave Chinner
2023-02-10 17:23 ` Linus Torvalds
2023-02-10 17:47 ` Linus Torvalds
2023-02-13 9:28 ` Herbert Xu
2023-02-10 22:41 ` David Laight
2023-02-10 22:51 ` Jens Axboe
2023-02-13 9:30 ` Herbert Xu
2023-02-13 9:25 ` Herbert Xu
2023-02-13 18:01 ` Andy Lutomirski
2023-02-14 1:22 ` Herbert Xu
2023-02-17 23:13 ` Andy Lutomirski
2023-02-20 4:54 ` Herbert Xu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox