public inbox for [email protected]
 help / color / mirror / Atom feed
From: Jens Axboe <[email protected]>
To: Andy Lutomirski <[email protected]>
Cc: Linus Torvalds <[email protected]>,
	Dave Chinner <[email protected]>,
	Matthew Wilcox <[email protected]>,
	Stefan Metzmacher <[email protected]>,
	linux-fsdevel <[email protected]>,
	Linux API Mailing List <[email protected]>,
	io-uring <[email protected]>,
	"[email protected]" <[email protected]>,
	Al Viro <[email protected]>,
	Samba Technical <[email protected]>
Subject: Re: copy on write for splice() from file to pipe?
Date: Fri, 10 Feb 2023 14:27:17 -0700	[thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <CALCETrVx4cj7KrhaevtFN19rf=A6kauFTr7UPzQVage0MsBLrg@mail.gmail.com>

On 2/10/23 2:14?PM, Andy Lutomirski wrote:
> On Fri, Feb 10, 2023 at 12:50 PM Jens Axboe <[email protected]> wrote:
>>
>> On 2/10/23 1:44?PM, Linus Torvalds wrote:
>>> On Fri, Feb 10, 2023 at 12:39 PM Jens Axboe <[email protected]> wrote:
>>>>
>>>> Right, I'm referencing doing zerocopy data sends with io_uring, using
>>>> IORING_OP_SEND_ZC. This isn't from a file, it's from a memory location,
>>>> but the important bit here is the split notifications and how you
>>>> could wire up a OP_SENDFILE similarly to what Andy described.
>>>
>>> Sure, I think it's much more reasonable with io_uring than with splice itself.
>>>
>>> So I was mainly just reacting to the "strict-splice" thing where Andy
>>> was talking about tracking the page refcounts. I don't think anything
>>> like that can be done at a splice() level, but higher levels that
>>> actually know about the whole IO might be able to do something like
>>> that.
>>>
>>> Maybe we're just talking past each other.
>>
>> Maybe slightly, as I was not really intending to comment on the strict
>> splice thing. But yeah I agree on splice, it would not be trivial to do
>> there. At least with io_uring we have the communication channel we need.
>> And tracking page refcounts seems iffy and fraught with potential
>> issues.
>>
> 
> Hmm.
> 
> Are there any real-world use cases for zero-copy splice() that
> actually depend on splicing from a file to a pipe and then later from
> the pipe to a socket (or file or whatever)?  Or would everything
> important be covered by a potential new io_uring operation that copies
> from one fd directly to another fd?

I think it makes sense. As Linus has referenced, the sex appeal of
splice is the fact that it is dealing with pipes, and you can access
these internal buffers through other means. But that is probably largely
just something that is sexy design wise, nothing that _really_ matters
in practice. And the pipes do get in the way, for example I had to add
pipe resizing fcntl helpers to bump the size. If you're doing a plain
sendfile, the pipes just kind of get in the way too imho.

Another upside (from the io_uring) perspective is that splice isn't very
efficient through io_uring, as it requires offload to io-wq. This could
obviously be solved by some refactoring in terms of non-blocking, but it
hasn't really been that relevant (and nobody has complained about it). A
new sendfile op would nicely get around that too as it could be designed
with async in nature, rather than the classic sync syscall model that
splice follows.

-- 
Jens Axboe


  reply	other threads:[~2023-02-10 21:27 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-09 13:55 copy on write for splice() from file to pipe? Stefan Metzmacher
2023-02-09 14:11 ` Matthew Wilcox
2023-02-09 14:29   ` Stefan Metzmacher
2023-02-09 16:41 ` Linus Torvalds
2023-02-09 19:17   ` Stefan Metzmacher
2023-02-09 19:36     ` Linus Torvalds
2023-02-09 19:48       ` Linus Torvalds
2023-02-09 20:33         ` Jeremy Allison
2023-02-10 20:45         ` Stefan Metzmacher
2023-02-10 20:51           ` Linus Torvalds
2023-02-10  2:16   ` Dave Chinner
2023-02-10  4:06     ` Dave Chinner
2023-02-10  4:44       ` Matthew Wilcox
2023-02-10  6:57         ` Dave Chinner
2023-02-10 15:14           ` Andy Lutomirski
2023-02-10 16:33             ` Linus Torvalds
2023-02-10 17:57               ` Andy Lutomirski
2023-02-10 18:19                 ` Jeremy Allison
2023-02-10 19:29                   ` Stefan Metzmacher
2023-02-10 18:37                 ` Linus Torvalds
2023-02-10 19:01                   ` Andy Lutomirski
2023-02-10 19:18                     ` Linus Torvalds
2023-02-10 19:27                       ` Jeremy Allison
2023-02-10 19:42                         ` Stefan Metzmacher
2023-02-10 19:42                         ` Linus Torvalds
2023-02-10 19:54                           ` Stefan Metzmacher
2023-02-10 19:29                       ` Linus Torvalds
2023-02-13  9:07                         ` Herbert Xu
2023-02-10 19:55                       ` Andy Lutomirski
2023-02-10 20:27                         ` Linus Torvalds
2023-02-10 20:32                           ` Jens Axboe
2023-02-10 20:36                             ` Linus Torvalds
2023-02-10 20:39                               ` Jens Axboe
2023-02-10 20:44                                 ` Linus Torvalds
2023-02-10 20:50                                   ` Jens Axboe
2023-02-10 21:14                                     ` Andy Lutomirski
2023-02-10 21:27                                       ` Jens Axboe [this message]
2023-02-10 21:51                                         ` Jens Axboe
2023-02-10 22:08                                           ` Linus Torvalds
2023-02-10 22:16                                             ` Jens Axboe
2023-02-10 22:17                                             ` Linus Torvalds
2023-02-10 22:25                                               ` Jens Axboe
2023-02-10 22:35                                                 ` Linus Torvalds
2023-02-10 22:51                                                   ` Jens Axboe
2023-02-11  3:18                                             ` Ming Lei
2023-02-11  6:17                                               ` Ming Lei
2023-02-11 14:13                                               ` Jens Axboe
2023-02-11 15:05                                                 ` Ming Lei
2023-02-11 15:33                                                   ` Jens Axboe
2023-02-11 18:57                                                     ` Linus Torvalds
2023-02-12  2:46                                                       ` Jens Axboe
2023-02-10  4:47       ` Linus Torvalds
2023-02-10  6:19         ` Dave Chinner
2023-02-10 17:23           ` Linus Torvalds
2023-02-10 17:47             ` Linus Torvalds
2023-02-13  9:28               ` Herbert Xu
2023-02-10 22:41             ` David Laight
2023-02-10 22:51               ` Jens Axboe
2023-02-13  9:30               ` Herbert Xu
2023-02-13  9:25           ` Herbert Xu
2023-02-13 18:01             ` Andy Lutomirski
2023-02-14  1:22               ` Herbert Xu
2023-02-17 23:13                 ` Andy Lutomirski
2023-02-20  4:54                   ` Herbert Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox