From: Bernd Schubert <[email protected]>
To: Kent Overstreet <[email protected]>,
	Bernd Schubert <[email protected]>
Cc: Miklos Szeredi <[email protected]>,
	Amir Goldstein <[email protected]>,
	"[email protected]" <[email protected]>,
	Andrew Morton <[email protected]>,
	"[email protected]" <[email protected]>,
	Ingo Molnar <[email protected]>,
	Peter Zijlstra <[email protected]>,
	Andrei Vagin <[email protected]>,
	"[email protected]" <[email protected]>
Subject: Re: [PATCH RFC v2 00/19] fuse: fuse-over-io-uring
Date: Wed, 12 Jun 2024 15:40:14 +0000
Message-ID: <[email protected]>
In-Reply-To: <olaitdmh662osparvdobr267qgjitygkl7lt7zdiyyi6ee6jlc@xaashssdxwxm>

On 6/12/24 16:19, Kent Overstreet wrote:
> On Wed, Jun 12, 2024 at 03:53:42PM GMT, Bernd Schubert wrote:
>> I will definitely look at it this week. Although I don't like the idea
>> to have a new kthread. We already have an application thread and have
>> the fuse server thread, why do we need another one?
> 
> Ok, I hadn't found the fuse server thread - that should be fine.
> 
>>>
>>> The next thing I was going to look at is how you guys are using splice,
>>> we want to get away from that too.
>>
>> Well, Ming Lei is working on that for ublk_drv and I guess that new approach
>> could be adapted as well onto the current way of io-uring.
>> It _probably_ wouldn't work with IORING_OP_READV/IORING_OP_WRITEV.
>>
>> https://lore.gnuweeb.org/io-uring/[email protected]/T/
>>
>>>
>>> Brian was also saying the fuse virtio_fs code may be worth
>>> investigating, maybe that could be adapted?
>>
>> I need to check, but really, the majority of the new additions
>> is just to set up things, shutdown and to have sanity checks.
>> Request sending/completing to/from the ring is not that much new lines.
> 
> What I'm wondering is how read/write requests are handled. Are the data
> payloads going in the same ringbuffer as the commands? That could work,
> if the ringbuffer is appropriately sized, but alignment is an issue.

That is exactly the big discussion Miklos and I are having. Basically, in my
series another buffer is vmalloced, mmapped and then assigned to ring entries.
Fuse meta headers and application payload go into that buffer, in both
kernel/userspace directions. io-uring only allows 80B of command data in the
SQE, so only a really small request would fit into it.
Legacy /dev/fuse has an alignment issue, as the payload follows directly after
the fuse header - intrinsically fixed in the ring patches.
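
To make the alignment point concrete, here is a minimal sketch in C. The
structs are illustrative stand-ins (the demo_ names are made up for this
example, they are not the definitions from <linux/fuse.h>); they only mimic
what I believe are the 40 byte request header and the 40 byte WRITE argument
struct. The "ring" layout is an assumption that simply rounds the payload
start up to a chosen boundary; the actual layout in the series may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-ins, NOT the real uapi structs from <linux/fuse.h>. */
struct demo_in_header {
	uint32_t len;
	uint32_t opcode;
	uint64_t unique;
	uint64_t nodeid;
	uint32_t uid;
	uint32_t gid;
	uint32_t pid;
	uint32_t padding;
};

struct demo_write_in {
	uint64_t fh;
	uint64_t offset;
	uint32_t size;
	uint32_t write_flags;
	uint64_t lock_owner;
	uint32_t flags;
	uint32_t padding;
};

/* Returns 1 if off is a multiple of align; align must be a power of two. */
static int demo_is_aligned(size_t off, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (off & (align - 1)) == 0;
}

/* Legacy-style layout: the payload starts right behind the two headers. */
static size_t demo_legacy_payload_offset(void)
{
	return sizeof(struct demo_in_header) + sizeof(struct demo_write_in);
}

/* Assumed ring-style layout: payload start rounded up to align. */
static size_t demo_ring_payload_offset(size_t align)
{
	size_t hdr = demo_legacy_payload_offset();

	assert(align != 0 && (align & (align - 1)) == 0);
	return (hdr + align - 1) & ~(align - 1);
}
```

With these stand-ins the legacy payload offset is 80 bytes, which satisfies
4 or 8 byte alignment but not a 512 byte or page boundary, even when the
receive buffer itself is page aligned.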

I will now try without mmap and just provide a user buffer as a pointer in the
80B section.
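
As a rough illustration of passing the buffer by pointer, the sketch below
packs a user buffer address and length into an 80 byte command area. All
names here (demo_ring_cmd, its fields, DEMO_SQE_CMD_BYTES) are hypothetical
and only show that such a descriptor fits comfortably; this is not the
layout used by the patches.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Size of the command area discussed above (assumed for this sketch). */
#define DEMO_SQE_CMD_BYTES 80

/* Hypothetical descriptor: where the server's buffer is and how big. */
struct demo_ring_cmd {
	uint64_t buf_addr;	/* userspace address of headers + payload */
	uint32_t buf_len;	/* size of that buffer in bytes */
	uint32_t qid;		/* made-up queue id, for illustration only */
	uint64_t flags;
};

_Static_assert(sizeof(struct demo_ring_cmd) <= DEMO_SQE_CMD_BYTES,
	       "descriptor must fit into the command area");

/* Write the descriptor into a zeroed command area. */
static void demo_cmd_pack(uint8_t area[DEMO_SQE_CMD_BYTES], void *buf,
			  uint32_t len, uint32_t qid)
{
	struct demo_ring_cmd cmd = {
		.buf_addr = (uint64_t)(uintptr_t)buf,
		.buf_len = len,
		.qid = qid,
		.flags = 0,
	};

	assert(buf != NULL);
	memset(area, 0, DEMO_SQE_CMD_BYTES);
	memcpy(area, &cmd, sizeof(cmd));
}

/* Read the descriptor back, as the receiving side would. */
static struct demo_ring_cmd demo_cmd_unpack(const uint8_t area[DEMO_SQE_CMD_BYTES])
{
	struct demo_ring_cmd cmd;

	memcpy(&cmd, area, sizeof(cmd));
	return cmd;
}
```

The descriptor takes 24 bytes here, so the payload never has to fit into the
command area itself; only the pointer and length travel through it.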


> 
> We just looked up the device DMA requirements and with modern NVME only
> 4 byte alignment is required, but the block layer likely isn't set up to
> handle that.

I think existing fuse headers and their data have a 4 byte alignment.
Maybe even 8 byte, I don't remember without looking through all request types.
If you try a simple O_DIRECT read/write to libfuse/example_passthrough_hp
without the ring patches, it will fail because of alignment. That needs to be
fixed in legacy fuse and would also avoid the compat issues we had in libfuse
when the kernel header was updated.
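
For reference, a small C sketch of the kind of check that O_DIRECT I/O
typically has to pass: buffer address, length and file offset all being
multiples of some block size. The helper name demo_odirect_ok and the idea of
passing the block size as a parameter are assumptions for illustration; the
real requirement depends on the kernel, the filesystem and the device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/*
 * Returns 1 if buf, len and file_off are all multiples of blk.
 * blk must be a power of two (e.g. 512 or 4096); the value to use is
 * an assumption of the caller, not something this helper discovers.
 */
static int demo_odirect_ok(const void *buf, size_t len, off_t file_off,
			   size_t blk)
{
	size_t mask;

	assert(blk != 0 && (blk & (blk - 1)) == 0);
	mask = blk - 1;

	return ((uintptr_t)buf & mask) == 0 &&
	       (len & mask) == 0 &&
	       ((uint64_t)file_off & mask) == 0;
}
```

A page aligned receive buffer passes this check, while a pointer 80 bytes
into that same buffer (where a payload behind the headers would start) does
not, which matches the failure described above.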

> 
> So - prearranged buffer? Or are you using splice to get pages that
> userspace has read into into the kernel pagecache?

I didn't even try to use splice yet, because for the DDN (my employer) use case
we cannot use zero copy, at least not without violating the rule that one
cannot access the application buffer in userspace.

I will definitely look into Ming's work, as it will be useful for others.


Cheers,
Bernd
