From: Pavel Begunkov <[email protected]>
To: Ming Lei <[email protected]>, Jens Axboe <[email protected]>
Cc: [email protected], [email protected],
Uday Shankar <[email protected]>,
Akilesh Kailash <[email protected]>
Subject: Re: [PATCH V8 0/8] io_uring: support sqe group and leased group kbuf
Date: Thu, 31 Oct 2024 13:42:55 +0000 [thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <ZyLxJdn7bboZMAcs@fedora>
On 10/31/24 02:53, Ming Lei wrote:
> On Wed, Oct 30, 2024 at 07:20:48AM -0600, Jens Axboe wrote:
>> On 10/29/24 10:11 PM, Ming Lei wrote:
>>> On Wed, Oct 30, 2024 at 11:08:16AM +0800, Ming Lei wrote:
>>>> On Tue, Oct 29, 2024 at 08:43:39PM -0600, Jens Axboe wrote:
>>>
>>> ...
>>>
>>>>> You could avoid the OP dependency with just a flag, if you really wanted
>>>>> to. But I'm not sure it makes a lot of sense. And it's a hell of a lot
>>>>
>>>> Yes, IO_LINK won't work for submitting multiple IOs concurrently, extra
>>>> syscall makes application too complicated, and IO latency is increased.
>>>>
>>>>> simpler than the sqe group scheme, which I'm a bit worried about as it's
>>>>> a bit complicated in how deep it needs to go in the code. This one
>>>>> stands alone, so I'd strongly encourage we pursue this a bit further and
>>>>> iron out the kinks. Maybe it won't work in the end, I don't know, but it
>>>>> seems pretty promising and it's soooo much simpler.
>>>>
>>>> If buffer register and lookup are always done in ->prep(), OP dependency
>>>> may be avoided.
>>>
>>> Even all buffer register and lookup are done in ->prep(), OP dependency
>>> still can't be avoided completely, such as:
>>>
>>> 1) two local buffers for sending to two sockets
>>>
>>> 2) group 1: IORING_OP_LOCAL_KBUF1 & [send(sock1), send(sock2)]
>>>
>>> 3) group 2: IORING_OP_LOCAL_KBUF2 & [send(sock1), send(sock2)]
>>>
>>> group 1 and group 2 needs to be linked, but inside each group, the two
>>> sends may be submitted in parallel.
>>
>> That is where groups of course work, in that you can submit 2 groups and
>> have each member inside each group run independently. But I do think we
>> need to decouple the local buffer and group concepts entirely. For the
>> first step, getting local buffers working with zero copy would be ideal,
>> and then just live with the fact that group 1 needs to be submitted
>> first and group 2 once the first ones are done.
>
> IMHO, it is one _kernel_ zero copy(_performance_) feature, which often imply:
>
> - better performance expectation
> - no big change on existed application for using this feature
Yes, the feature doesn't make sense without appropriate performance
wins, but I outright disagree with "there should be no big uapi
changes to use it". It might be nice if the user doesn't have to
change anything, but I find it of lower priority than performance,
clarity of the overall design and so on.
> Application developer is less interested in sort of crippled or immature
> feature, especially need big change on existed code logic(then two code paths
> need to maintain), with potential performance regression.
Then we just need avoid creating a "crippled" feature, I believe
everyone is on the same page here. As for maturity, features don't
get there at the same pace, extra layers of complexity definitely
do make getting into shape much slower. You can argue you like how
the uapi turned to be, though I believe there are still rough edges
if we consider it a generic feature, but the kernel side of things is
fairly complicated.
> With sqe group and REQ_F_GROUP_KBUF, application just needs lines of
> code change for using the feature, and it is pretty easy to evaluate
> the feature since no any extra logic change & no extra syscall/wait
> introduced. The whole patchset has been mature enough, unfortunately
> blocked without obvious reasons.
>
>>
>> Once local buffers are done, we can look at doing the sqe grouping in a
>> nice way. I do think it's a potentially powerful concept, but we're
>> going to make a lot more progress on this issue if we carefully separate
>> dependencies and get each of them done separately.
>
> One fundamental difference between local buffer and REQ_F_GROUP_KBUF is
>
> - local buffer has to be provided and used in ->prep()
> - REQ_F_GROUP_KBUF needs to be provided in ->issue() instead of ->prep()
I'd need to take a look at that local buffer patch to say, but likely
there is a way to shift all of it to ->issue(), which would be more
aligned with fixed file resolution and how links use it.
--
Pavel Begunkov
next prev parent reply other threads:[~2024-10-31 13:42 UTC|newest]
Thread overview: 36+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-10-25 12:22 [PATCH V8 0/8] io_uring: support sqe group and leased group kbuf Ming Lei
2024-10-25 12:22 ` [PATCH V8 1/7] io_uring: add io_link_req() helper Ming Lei
2024-10-25 12:22 ` [PATCH V8 2/7] io_uring: add io_submit_fail_link() helper Ming Lei
2024-10-25 12:22 ` [PATCH V8 3/7] io_uring: add helper of io_req_commit_cqe() Ming Lei
2024-10-25 12:22 ` [PATCH V8 4/7] io_uring: support SQE group Ming Lei
2024-10-29 0:12 ` Jens Axboe
2024-10-29 1:50 ` Ming Lei
2024-10-29 16:38 ` Pavel Begunkov
2024-10-25 12:22 ` [PATCH V8 5/7] io_uring: support leased group buffer with REQ_F_GROUP_KBUF Ming Lei
2024-10-29 16:47 ` Pavel Begunkov
2024-10-30 0:45 ` Ming Lei
2024-10-30 1:25 ` Pavel Begunkov
2024-10-30 2:04 ` Ming Lei
2024-10-31 13:16 ` Pavel Begunkov
2024-10-25 12:22 ` [PATCH V8 6/7] io_uring/uring_cmd: support leasing device kernel buffer to io_uring Ming Lei
2024-10-25 12:22 ` [PATCH V8 7/7] ublk: support leasing io " Ming Lei
2024-10-29 17:01 ` [PATCH V8 0/8] io_uring: support sqe group and leased group kbuf Pavel Begunkov
2024-10-29 17:04 ` Jens Axboe
2024-10-29 19:18 ` Jens Axboe
2024-10-29 20:06 ` Jens Axboe
2024-10-29 21:26 ` Jens Axboe
2024-10-30 2:03 ` Ming Lei
2024-10-30 2:43 ` Jens Axboe
2024-10-30 3:08 ` Ming Lei
2024-10-30 4:11 ` Ming Lei
2024-10-30 13:20 ` Jens Axboe
2024-10-31 2:53 ` Ming Lei
2024-10-31 13:35 ` Jens Axboe
2024-10-31 15:07 ` Jens Axboe
2024-10-31 13:42 ` Pavel Begunkov [this message]
2024-10-30 13:18 ` Jens Axboe
2024-10-31 13:25 ` Pavel Begunkov
2024-10-31 14:29 ` Jens Axboe
2024-10-31 15:25 ` Pavel Begunkov
2024-10-31 15:42 ` Jens Axboe
2024-10-31 16:29 ` Pavel Begunkov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox