public inbox for [email protected]
 help / color / mirror / Atom feed
From: Bernd Schubert <[email protected]>
To: Jens Axboe <[email protected]>, Bernd Schubert <[email protected]>,
	Miklos Szeredi <[email protected]>,
	Amir Goldstein <[email protected]>,
	"[email protected]" <[email protected]>
Cc: "[email protected]" <[email protected]>
Subject: Re: [PATCH RFC v2 19/19] fuse: {uring} Optimize async sends
Date: Sat, 1 Jun 2024 18:37:01 +0200	[thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>



On 5/31/24 21:10, Jens Axboe wrote:
> On 5/31/24 11:36 AM, Bernd Schubert wrote:
>> On 5/31/24 18:24, Jens Axboe wrote:
>>> On 5/29/24 12:00 PM, Bernd Schubert wrote:
>>>> This is to avoid using async completion tasks
>>>> (i.e. context switches) when not needed.
>>>>
>>>> Cc: [email protected]
>>>> Signed-off-by: Bernd Schubert <[email protected]>
>>>
>>> This patch is very confusing, even after having pulled the other
>>> changes. In general, would be great if the io_uring list was CC'ed on
>>
>> Hmm, let me try to explain. And yes, I definitely need to add these details 
>> to the commit message
>>
>> Without the patch:
>>
>> <sending a struct fuse_req> 
>>
>> fuse_uring_queue_fuse_req
>>     fuse_uring_send_to_ring
>>         io_uring_cmd_complete_in_task
>>         
>> <async task runs>
>>     io_uring_cmd_done()
> 
> And this is a worthwhile optimization, you always want to complete it
> line if at all possible. But none of this logic or code belongs in fuse,
> it really should be provided by io_uring helpers.
> 
> I would just drop this patch for now and focus on the core
> functionality. Send out a version with that, and then we'll be happy to
> help this as performant as it can be. This is where the ask on "how to
> reproduce your numbers" comes from - with that, it's usually trivial to
> spot areas where things could be improved. And I strongly suspect that
> will involve providing you with the right API to use here, and perhaps
> refactoring a bit on the fuse side. Making up issue_flags is _really_
> not something a user should do.

Great that you agree, I don't like the issue_flag handling in fuse code either. 
I will also follow your suggestion to drop this patch. 


> 
>> 1) (current == queue->server_task)
>> fuse_uring_cmd (IORING_OP_URING_CMD) received a completion for a 
>> previous fuse_req, after completion it fetched the next fuse_req and 
>> wants to send it - for 'current == queue->server_task' issue flags
>> got stored in struct fuse_ring_queue::uring_cmd_issue_flags
> 
> And queue->server_task is the owner of the ring? Then yes that is safe

Yeah, it is the thread that submits SQEs - should be the owner of the ring, 
unless daemon side does something wrong (given that there are several
userspace implementation and not a single libfuse only, we need to expect
and handle implementation errors, though).

>>
>> 2) 'else if (current->io_uring)'
>>
>> (actually documented in the code)
>>
>> 2.1 This might be through IORING_OP_URING_CMD as well, but then server 
>> side uses multiple threads to access the same ring - not nice. We only
>> store issue_flags into the queue for 'current == queue->server_task', so
>> we do not know issue_flags - sending through task is needed.
> 
> What's the path leading to you not having the issue_flags?

We get issue flags here, but I want to keep changes to libfuse small and want
to avoid changing non uring related function signatures. Which is the the
why we store issue_flags for the presumed ring owner thread in the queue data
structure, but we don't have it for possible other threads then

Example:

IORING_OP_URING_CMD
   fuse_uring_cmd
       fuse_uring_commit_and_release
           fuse_uring_req_end_and_get_next --> until here issue_flags passed
               fuse_request_end -> generic fuse function,  issue_flags not passed
                   req->args->end() / fuse_writepage_end
                       fuse_simple_background
                           fuse_request_queue_background
                               fuse_request_queue_background_uring
                                   fuse_uring_queue_fuse_req
                                       fuse_uring_send_to_ring
                                           io_uring_cmd_done
                   
      
I.e. we had issue_flags up to fuse_uring_req_end_and_get_next(), but then
call into generic fuse functions and stop passing through issue_flags.
For the ring-owner we take issue flags stored by fuse_uring_cmd()
into struct fuse_ring_queue, but if daemon side uses multiple threads to
access the ring we won't have that. Well, we could allow it and store
it into an array or rb-tree, but I don't like that multiple threads access
something that is optimized to have a thread per core already.

> 
>> 2.2 This might be an application request through the mount point, through
>> the io-uring interface. We do know issue flags either.
>> (That one was actually a surprise for me, when xfstests caught it.
>> Initially I had a condition to send without the extra task then lockdep
>> caught that.
> 
> In general, if you don't know the context (eg you don't have issue_flags
> passed in), you should probably assume the only way is to sanely proceed
> is to have it processed by the task itself.
> 
>>
>> In both cases it has to use a tasks.
>>
>>
>> My question here is if 'current->io_uring' is reliable.
> 
> Yes that will be reliable in the sense that it tells you that the
> current task has (at least) one io_uring context setup. But it doesn't
> tell you anything beyond that, like if it's the owner of this request.

Yeah, you can see that it just checks for current->io_uring and then
uses a task.

> 
>> 3) everything else
>>
>> 3.1) For async requests, interesting are cached reads and writes here. At a minimum
>> writes a holding a spin lock and that lock conflicts with the mutex io-uring is taking - 
>> we need a task as well
>>
>> 3.2) sync - no lock being hold, it can send without the extra task.
> 
> As mentioned, let's drop this patch 19 for now. Send out what you have
> with instructions on how to test it, and I'll give it a spin and see
> what we can do about this.
> 
>>> Outside of that, would be super useful to include a blurb on how you set
>>> things up for testing, and how you run the testing. That would really
>>> help in terms of being able to run and test it, and also to propose
>>> changes that might make a big difference.
>>>
>>
>> Will do in the next version. 
>> You basically need my libfuse uring branch
>> (right now commit history is not cleaned up) and follow
>> instructions in <libfuse>/xfstests/README.md how to run xfstests.
>> Missing is a slight patch for that dir to set extra daemon parameters,
>> like direct-io (fuse' FOPEN_DIRECT_IO) and io-uring. Will add that libfuse
>> during the next days.
> 
> I'll leave the xfstests to you for now, but running some perf testing
> just to verify how it's being used would be useful and help improve it
> for sure.
> 

Ah you meant performance tests. I used libfuse/example/passthrough_hp from
my uring branch and then fio on top of that for reads/writes and mdtest from
the ior repo for metadata. Maybe I should upload my scripts somewhere.


Thanks,
Beernd

  reply	other threads:[~2024-06-01 16:37 UTC|newest]

Thread overview: 56+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-29 18:00 [PATCH RFC v2 00/19] fuse: fuse-over-io-uring Bernd Schubert
2024-05-29 18:00 ` [PATCH RFC v2 19/19] fuse: {uring} Optimize async sends Bernd Schubert
2024-05-31 16:24   ` Jens Axboe
2024-05-31 17:36     ` Bernd Schubert
2024-05-31 19:10       ` Jens Axboe
2024-06-01 16:37         ` Bernd Schubert [this message]
2024-05-30  7:07 ` [PATCH RFC v2 00/19] fuse: fuse-over-io-uring Amir Goldstein
2024-05-30 12:09   ` Bernd Schubert
2024-05-30 15:36 ` Kent Overstreet
2024-05-30 16:02   ` Bernd Schubert
2024-05-30 16:10     ` Kent Overstreet
2024-05-30 16:17       ` Bernd Schubert
2024-05-30 17:30         ` Kent Overstreet
2024-05-30 19:09         ` Josef Bacik
2024-05-30 20:05           ` Kent Overstreet
2024-05-31  3:53         ` [PATCH] fs: sys_ringbuffer() (WIP) Kent Overstreet
2024-05-31 13:11           ` kernel test robot
2024-05-31 15:49           ` kernel test robot
2024-05-30 16:21     ` [PATCH RFC v2 00/19] fuse: fuse-over-io-uring Jens Axboe
2024-05-30 16:32       ` Bernd Schubert
2024-05-30 17:26         ` Jens Axboe
2024-05-30 17:16       ` Kent Overstreet
2024-05-30 17:28         ` Jens Axboe
2024-05-30 17:58           ` Kent Overstreet
2024-05-30 18:48             ` Jens Axboe
2024-05-30 19:35               ` Kent Overstreet
2024-05-31  0:11                 ` Jens Axboe
2024-06-04 23:45       ` Ming Lei
2024-05-30 20:47 ` Josef Bacik
2024-06-11  8:20 ` Miklos Szeredi
2024-06-11 10:26   ` Bernd Schubert
2024-06-11 15:35     ` Miklos Szeredi
2024-06-11 17:37       ` Bernd Schubert
2024-06-11 23:35         ` Kent Overstreet
2024-06-12 13:53           ` Bernd Schubert
2024-06-12 14:19             ` Kent Overstreet
2024-06-12 15:40               ` Bernd Schubert
2024-06-12 15:55                 ` Kent Overstreet
2024-06-12 16:15                   ` Bernd Schubert
2024-06-12 16:24                     ` Kent Overstreet
2024-06-12 16:44                       ` Bernd Schubert
2024-06-12  7:39         ` Miklos Szeredi
2024-06-12 13:32           ` Bernd Schubert
2024-06-12 13:46             ` Bernd Schubert
2024-06-12 14:07             ` Miklos Szeredi
2024-06-12 14:56               ` Bernd Schubert
2024-08-02 23:03                 ` Bernd Schubert
2024-08-29 22:32                 ` Bernd Schubert
2024-08-30 13:12                   ` Jens Axboe
2024-08-30 13:28                     ` Bernd Schubert
2024-08-30 13:33                       ` Jens Axboe
2024-08-30 14:55                         ` Pavel Begunkov
2024-08-30 15:10                           ` Bernd Schubert
2024-08-30 20:08                           ` Jens Axboe
2024-08-31  0:02                             ` Bernd Schubert
2024-08-31  0:49                               ` Bernd Schubert

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox