From: Ming Lei <[email protected]>
To: Pavel Begunkov <[email protected]>
Cc: Jens Axboe <[email protected]>,
	[email protected], [email protected],
	Uday Shankar <[email protected]>,
	Akilesh Kailash <[email protected]>
Subject: Re: [PATCH V8 5/7] io_uring: support leased group buffer with REQ_F_GROUP_KBUF
Date: Mon, 4 Nov 2024 08:16:24 +0800
Message-ID: <ZygSWB08t1PPyPyv@fedora>
In-Reply-To: <[email protected]>

On Sun, Nov 03, 2024 at 10:31:25PM +0000, Pavel Begunkov wrote:
> On 11/1/24 01:04, Ming Lei wrote:
> > On Thu, Oct 31, 2024 at 01:16:07PM +0000, Pavel Begunkov wrote:
> > > On 10/30/24 02:04, Ming Lei wrote:
> > > > On Wed, Oct 30, 2024 at 01:25:33AM +0000, Pavel Begunkov wrote:
> > > > > On 10/30/24 00:45, Ming Lei wrote:
> > > > > > On Tue, Oct 29, 2024 at 04:47:59PM +0000, Pavel Begunkov wrote:
> > > > > > > On 10/25/24 13:22, Ming Lei wrote:
> > > > > > > ...
> > > > > > > > diff --git a/io_uring/rw.c b/io_uring/rw.c
> > > > > > > > index 4bc0d762627d..5a2025d48804 100644
> > > > > > > > --- a/io_uring/rw.c
> > > > > > > > +++ b/io_uring/rw.c
> > > > > > > > @@ -245,7 +245,8 @@ static int io_prep_rw_setup(struct io_kiocb *req, int ddir, bool do_import)
> > > > > > > >      	if (io_rw_alloc_async(req))
> > > > > > > >      		return -ENOMEM;
> > > > > > > > -	if (!do_import || io_do_buffer_select(req))
> > > > > > > > +	if (!do_import || io_do_buffer_select(req) ||
> > > > > > > > +	    io_use_leased_grp_kbuf(req))
> > > > > > > >      		return 0;
> > > > > > > >      	rw = req->async_data;
> > > > > > > > @@ -489,6 +490,11 @@ static bool __io_complete_rw_common(struct io_kiocb *req, long res)
> > > > > > > >      		}
> > > > > > > >      		req_set_fail(req);
> > > > > > > >      		req->cqe.res = res;
> > > > > > > > +		if (io_use_leased_grp_kbuf(req)) {
> > > > > > > 
> > > > > > > That's what I'm talking about, we're pushing more and more
> > > > > > > into the generic paths (or patching every single hot opcode
> > > > > > > there is). You said it's fine for ublk the way it was, i.e.
> > > > > > > without tracking, so let's then pretend it's a ublk specific
> > > > > > > feature, kill that addition and settle at that if that's the
> > > > > > > way to go.
> > > > > > 
> > > > > > As I mentioned before, it isn't ublk specific; zeroing is required
> > > > > > because the buffer is a kernel buffer, that is all. Any other approach
> > > > > > needs this kind of handling too. The coming fuse zc needs it as well.
> > > > > > 
> > > > > > And it can't be done on the driver side, because the driver has no idea
> > > > > > how to consume the kernel buffer.
> > > > > > 
> > > > > > Also it is only required in case of a short read/recv, which isn't a
> > > > > > hot path, not to mention it is just one check on a request flag.
> > > > > 
> > > > > I agree, it's not hot, it's a failure path, and the recv side
> > > > > is of medium hotness, but the main concern is that the feature
> > > > > is too actively leaking into other requests.
> > > > The point is whether you'd like to support kernel buffers. If yes, this
> > > > kind of change can't be avoided.
> > > 
> > > There is no guarantee with the patchset that there will be any IO done
> > > with that buffer, e.g. place a nop into the group, and even then you
> > 
> > Yes, here it depends on the user. In the case of ublk, the application has
> > to be trusted, and the situation is the same as with other user-emulated
> > storage, such as qemu.
> > 
> > > have offsets and length, so it's not clear what the zeroing is supposed
> > > to achieve.
> > 
> > The buffer may be a page cache page; if it isn't initialized
> > completely, kernel data may be leaked to userspace via mmap.
> > 
> > > Either the buffer comes fully "initialised", i.e. free of
> > > kernel private data, or we need to track what parts of the buffer were
> > > used.
> > 
> > That is why the only workable way is to zero the remainder in
> > consumer of OP, imo.
> 
> If it can leak kernel data in some way, I'm afraid zeroing of the
> remainder alone won't be enough to prevent it, e.g. the recv/read
> len doesn't have to match the buffer size.

The leased kernel buffer size is fixed, and the recv/read length is known.
In case of a short read/recv the remainder is therefore known too, so can
you explain why zeroing the remainder alone isn't enough?
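
For concreteness, here is a minimal user-space sketch of what "zeroing the
remainder" means in this discussion. It is not the kernel code from the
patch: zero_short_remainder() and its parameters are hypothetical names
made up for illustration, and a plain byte array stands in for the leased
buffer. It clears only the bytes between what the operation produced and
what it asked for, so anything in the buffer outside the requested range
is left untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from the patch: after a read/recv that
 * asked for `requested` bytes but produced only `done`, clear the bytes
 * the operation did not fill. Returns the number of bytes zeroed.
 *
 * Assumption: `buf` points at the start of the range the request covers
 * and holds at least `requested` bytes.
 */
static size_t zero_short_remainder(unsigned char *buf, size_t requested,
				   size_t done)
{
	size_t remainder;

	/* A completion can never report more bytes than were requested. */
	assert(done <= requested);

	remainder = requested - done;
	memset(buf + done, 0, remainder);
	return remainder;
}
```

With an 8-byte buffer, a 6-byte request and 2 bytes actually read, the call
zeroes bytes 2..5 and does not touch bytes 6..7.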



Thanks,
Ming

