public inbox for [email protected]
 help / color / mirror / Atom feed
From: Ming Lei <[email protected]>
To: Miklos Szeredi <[email protected]>
Cc: Linus Torvalds <[email protected]>,
	Jens Axboe <[email protected]>,
	[email protected], [email protected],
	[email protected],
	Alexander Viro <[email protected]>,
	Stefan Hajnoczi <[email protected]>,
	Miklos Szeredi <[email protected]>,
	Bernd Schubert <[email protected]>,
	Nitesh Shetty <[email protected]>,
	Christoph Hellwig <[email protected]>,
	Ziyang Zhang <[email protected]>,
	[email protected]
Subject: Re: [PATCH 1/4] fs/splice: enhance direct pipe & splice for moving pages in kernel
Date: Tue, 14 Feb 2023 22:35:24 +0800	[thread overview]
Message-ID: <Y+ucLFG/ap8uqwPG@T590> (raw)
In-Reply-To: <CAJfpegtOetw46DvR1PeuX5L9-fe7Qk75mq5L4tGwpS_wuEz=1g@mail.gmail.com>

Hi Miklos,

On Tue, Feb 14, 2023 at 12:03:44PM +0100, Miklos Szeredi wrote:
> On Mon, 13 Feb 2023 at 21:04, Linus Torvalds
> <[email protected]> wrote:
> >
> > On Sat, Feb 11, 2023 at 5:39 PM Ming Lei <[email protected]> wrote:
> > >
> > > >
> > > >  (a) what's the point of MAY_READ? A non-readable page sounds insane
> > > > and wrong. All sinks expect to be able to read.
> > >
> > > For example, it is one page which needs sink end to fill data, so
> > > we needn't to zero it in source end every time, just for avoiding
> > > leak kernel data if (unexpected)sink end simply tried to read from
> > > the spliced page instead of writing data to page.
> >
> > I still don't understand.
> >
> > A sink *reads* the data. It doesn't write the data.
> 
> I think Ming is trying to generalize splice to allow flowing data in
> the opposite direction.

I think it isn't opposite direction, it is just that sink may be
WRITE to buffer, and the model is:

device(produce buffer in ->splice_read()) -> direct pipe ->
	file(consume buffer via READ or WRITE)

Follows kernel side implementation:

	splice_direct_to_actor(pipe, sd, source_actor)

	direct_actor():
		__splice_from_pipe(pipe, sd, sink_actor)

	sink_actor():
		get_page()

then read from file/socket to page.

The current userspace owns the whole buffer, so I understand the buffer
ownership can be transferred to consumer/sink side.

> So yes, sink would be writing to the buffer.
> And it MUST NOT be reading the data since the buffer may be
> uninitialized.

The added SPLICE_F_PRIV_FOR_READ[WRITE] is enough to avoid
un-expected READ, but the source end needs to confirm the buffer
ownership can be transferred to consumer, probably PIPE_BUF_FLAG_GIFT
can be used for this purpose.

> 
> The problem is how to tell the original source that the buffer is
> ready?  PG_uptodate comes to mind, but pipe buffers allow partial
> pages to be passed around, and there's no mechanism to describe a
> partially uptodate buffer.

I understand it isn't one issue from block device driver viewpoint at
least, since the WRITE to buffer in sink end can be thought as DMA
to buffer from device, and it is the upper layer(FS)'s responsibility
for updating page flag. And block driver needn't to handle page
status update.

So seems it is one fuse specific issue?


Thanks,
Ming


  reply	other threads:[~2023-02-14 14:36 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-10 15:32 [PATCH 0/4] io_uring: add IORING_OP_READ[WRITE]_SPLICE_BUF Ming Lei
2023-02-10 15:32 ` [PATCH 1/4] fs/splice: enhance direct pipe & splice for moving pages in kernel Ming Lei
2023-02-11 15:42   ` Ming Lei
2023-02-11 18:57     ` Linus Torvalds
2023-02-12  1:39       ` Ming Lei
2023-02-13 20:04         ` Linus Torvalds
2023-02-14  0:52           ` Ming Lei
2023-02-14  2:35             ` Ming Lei
2023-02-14 11:03           ` Miklos Szeredi
2023-02-14 14:35             ` Ming Lei [this message]
2023-02-14 15:39               ` Miklos Szeredi
2023-02-15  0:11                 ` Ming Lei
2023-02-15 10:36                   ` Miklos Szeredi
2023-02-10 15:32 ` [PATCH 2/4] fs/splice: allow to ignore signal in __splice_from_pipe Ming Lei
2023-02-10 15:32 ` [PATCH 3/4] io_uring: add IORING_OP_READ[WRITE]_SPLICE_BUF Ming Lei
2023-02-11 15:45   ` Jens Axboe
2023-02-11 16:12     ` Ming Lei
2023-02-11 16:52       ` Jens Axboe
2023-02-12  3:22         ` Ming Lei
2023-02-12  3:55           ` Jens Axboe
2023-02-13  1:06             ` Ming Lei
2023-02-11 17:13   ` Jens Axboe
2023-02-12  1:48     ` Ming Lei
2023-02-12  2:42       ` Jens Axboe
2023-02-10 15:32 ` [PATCH 4/4] ublk_drv: support splice based read/write zero copy Ming Lei
2023-02-10 21:54 ` [PATCH 0/4] io_uring: add IORING_OP_READ[WRITE]_SPLICE_BUF Jens Axboe
2023-02-10 22:19   ` Jens Axboe
2023-02-11  5:13   ` Ming Lei
2023-02-11 15:45     ` Jens Axboe
2023-02-14 16:36 ` Stefan Hajnoczi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y+ucLFG/ap8uqwPG@T590 \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox