From: Bernd Schubert <[email protected]>
To: Jens Axboe <[email protected]>,
"[email protected]" <[email protected]>
Cc: Ming Lei <[email protected]>, Pavel Begunkov <[email protected]>
Subject: Re: SQPOLL / uring_cmd_iopoll
Date: Sat, 22 Apr 2023 13:40:24 +0000
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>
On 4/22/23 04:13, Jens Axboe wrote:
> On 4/21/23 4:09 PM, Bernd Schubert wrote:
>> Hello,
>>
>> I was wondering if I could set up SQPOLL for fuse/IORING_OP_URING_CMD
>> and what would be the latency win. Now I get a bit confused what the
>> f_op->uring_cmd_iopoll() function is supposed to do.
>
> Certainly, you can use SQPOLL with anything. Whether or not it'd be a
> win depends a lot on what you're doing, rate of IO, etc.
>
> IOPOLL and SQPOLL are two different things. SQPOLL has a kernel side
> submission thread that polls for new SQ entries and submits them when it
> sees them. IOPOLL is a method for avoiding sleeping on waiting on CQ
> entries, where it will instead poll the target for completion instead.
> That's where ->uring_cmd_iopoll() comes in, that's the hook for polling
> for uring commands. For normal fs path read/write requests,
> ->iopoll() is the hook that performs the same kind of action.
>
>> Is it just there to check if SQEs can be completed as CQEs? In rw.c
>
> Not sure I follow what you're trying to convey here, maybe you can
> expand on that? And maybe some of the confusion here is because of
> mixing up SQPOLL and IOPOLL?
Thanks a lot for your help!
I was confused when f_op->uring_cmd_iopoll gets called - if I needed to
check myself if the SQE was submitted or if this is for CQE submission.
You already resolved my confusion with your comments below. Thanks a
lot for your help!
>
>> io_do_iopoll() it looks like this. I don't follow all code paths in
>> __io_sq_thread yet, but it looks like it already checks if the ring
>> has new entries
>>
>> to_submit = io_sqring_entries(ctx);
>> ...
>> ret = io_submit_sqes(ctx, to_submit);
>>
>> --> it will eventually call into ->uring_cmd() ?
>
> The SQPOLL thread will pull off new SQEs, and those will then at some
> point hit ->issue() which is an opcode dependent method for issuing the
> actual request. Once it's been issued, if the ring is IOPOLL, then
> io_iopoll_req_issued() will get called which adds the request to an
> internal poll list. When someone does io_uring_enter(2) to wait for
> events on a ring with IOPOLL, it will iterate that list and call
> ->uring_cmd_iopoll() for uring_cmd requests, and ->iopoll() for
> "normal" requests.
Thanks, that basically answered my SQE confusion - it does it all itself.
>
> If the ring is using SQPOLL|IOPOLL, then the SQPOLL thread is also the
> one that does the polling. See __io_sq_thread() -> io_do_iopoll().
>
>> And then io_do_iopoll -> file->f_op->uring_cmd_iopoll is supposed to
>> check for available cq entries and will submit these? I.e. I just return
>> 1 when the request is ready? And also ensure that
>> req->iopoll_completed is set?
>
> The callback polls for a completion on the target side, which will mark
> it as ->iopoll_completed = true. That still leaves them on the iopoll
> list, and io_do_iopoll() will spot that and post CQEs for them.
>
>> I'm also not sure what I should do with struct io_comp_batch * - I don't
>> have struct request *req_list anywhere in my fuse-uring changes, seems
>> to be blk-mq specific? So I should just ignore that parameter?
>
> Hard to say since the above is a bit confusing and I haven't seen your
> code, but you can always start off just passing NULL. That's fine and
> just doesn't do any completion batching. The latter may or may not be
> useful for your case, but in any case, it's fine to pass NULL.
I had sent patches to fsdevel and, given it is mostly fuse related, didn't
add you to CC
https://lwn.net/Articles/926773/
The code is also here
https://github.com/bsbernd/linux/tree/fuse-uring-for-6.2
https://github.com/bsbernd/libfuse/tree/uring
I got SQPOLL working using this simple function
/**
 * This is called for requests when the ring is configured with
 * IORING_SETUP_IOPOLL.
 */
int fuse_uring_cmd_poll(struct io_uring_cmd *cmd, struct io_comp_batch *iob,
			unsigned int poll_flags)
{
	/* Not much to be done here, when IORING_SETUP_IOPOLL is set
	 * io_uring_cmd_done() already sets req->iopoll_completed.
	 * The caller (io_do_iopoll) already checks this flag
	 * and won't enter this function at all then.
	 * When we get called we just need to return 0 and tell the
	 * caller that the cmd is not ready yet.
	 */
	return 0;
}
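
To illustrate why returning 0 is enough, here is a small userspace toy
model of the flow as I understand it from the comment above. It is not
kernel code: struct toy_req and the toy_*() functions are made-up names,
and the real io_do_iopoll() works on the ring's poll list and is more
involved than this loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a request sitting on the IOPOLL list (hypothetical). */
struct toy_req {
	int id;
	bool iopoll_completed;	/* set by the completion side */
	bool cqe_posted;	/* set once the poller has reaped it */
};

/* Same shape as fuse_uring_cmd_poll(): nothing to poll, report "not ready". */
static int toy_cmd_poll(struct toy_req *req)
{
	(void)req;
	return 0;
}

/* Stand-in for io_uring_cmd_done() on an IOPOLL ring: only flags the request. */
static void toy_cmd_done(struct toy_req *req)
{
	req->iopoll_completed = true;
}

/*
 * Stand-in for the poller: the hook is only called for requests that
 * are still pending, and a "CQE" is posted for every request that has
 * been flagged as completed. Returns the number of CQEs posted.
 */
static int toy_do_iopoll(struct toy_req *reqs, size_t n)
{
	int posted = 0;

	assert(reqs != NULL);
	for (size_t i = 0; i < n; i++) {
		struct toy_req *req = &reqs[i];

		if (req->cqe_posted)
			continue;
		if (!req->iopoll_completed)
			toy_cmd_poll(req);
		if (req->iopoll_completed) {
			req->cqe_posted = true;
			posted++;
		}
	}
	return posted;
}
```

In this model, toy_do_iopoll() posts nothing while the requests are
pending and posts exactly one completion per request after
toy_cmd_done() has flagged it, without the poll hook doing any work.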
I just gave it a quick file creation/removal run with single-threaded bonnie++,
and performance is actually lower than before: around 8000 creates/s without
IORING_SETUP_SQPOLL (adding IORING_SETUP_IOPOLL doesn't help either) versus
about 5000 creates/s with IORING_SETUP_SQPOLL. With plain /dev/fuse
it is about 2000 creates/s.
The main improvement comes from ensuring that request submission
(application) and request handling (ring/thread) are on the same core.
I'm running into some scheduler issues, which I work around for now using
migrate_disable()/migrate_enable() before/after the fuse request waitq;
without that, performance for metadata requests is similar to plain
/dev/fuse.
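
For reference, a minimal sketch of how the userspace side might fill in
the ring setup parameters for these experiments. The helper name
sketch_sqpoll_params() is hypothetical. IORING_SETUP_SQ_AFF and
sq_thread_cpu are the standard way to pin the SQPOLL thread to a CPU,
but whether pinning it that way helps with the same-core effect
described above is an assumption, not something I have measured.

```c
#include <assert.h>
#include <string.h>
#include <linux/io_uring.h>

/*
 * Hypothetical helper: fill io_uring_params for an SQPOLL ring whose
 * submission thread is bound to one CPU. The resulting struct would be
 * passed to io_uring_setup(2).
 */
static void sketch_sqpoll_params(struct io_uring_params *p, unsigned int cpu,
				 unsigned int idle_ms, int want_iopoll)
{
	assert(p != NULL);
	memset(p, 0, sizeof(*p));

	/* Kernel-side submission thread, bound to sq_thread_cpu. */
	p->flags = IORING_SETUP_SQPOLL | IORING_SETUP_SQ_AFF;
	p->sq_thread_cpu = cpu;
	/* Milliseconds the SQPOLL thread keeps spinning before it sleeps. */
	p->sq_thread_idle = idle_ms;

	/* Optionally poll for completions instead of sleeping on them. */
	if (want_iopoll)
		p->flags |= IORING_SETUP_IOPOLL;
}
```

With IORING_SETUP_SQPOLL | IORING_SETUP_IOPOLL both set, the SQPOLL
thread is also the one doing the completion polling, as described
earlier in the thread.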
I will soon post an RFC v2 series with more benchmark results including
read and write.
>
>> Btw, this might be useful for ublk as well?
>
> Not sure what "this" is :-)
Ming already replied.
Thanks,
Bernd