From: Hao Xu <[email protected]>
To: Pavel Begunkov <[email protected]>, Jens Axboe <[email protected]>
Cc: [email protected], Joseph Qi <[email protected]>
Subject: Re: [POC RFC 0/3] support graph like dependent sqes
Date: Sat, 18 Dec 2021 14:57:08 +0800
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>
On 2021/12/18 3:33 AM, Pavel Begunkov wrote:
> On 12/16/21 16:55, Hao Xu wrote:
>> On 2021/12/15 2:16 AM, Pavel Begunkov wrote:
>>> On 12/14/21 16:53, Hao Xu wrote:
>>>> On 2021/12/14 11:21 PM, Pavel Begunkov wrote:
>>>>> On 12/14/21 05:57, Hao Xu wrote:
>>>>>> This is just a proof of concept and still incomplete; sending it
>>>>>> early for thoughts and suggestions.
>>>>>>
>>>>>> We already have IOSQE_IO_LINK to describe linear dependency
>>>>>> relationships between sqes, while this patchset provides a new
>>>>>> feature to support DAG dependencies. For instance, 4 sqes may have
>>>>>> a relationship as below:
>>>>>>
>>>>>>        --> 2 --
>>>>>>       /        \
>>>>>> 1 ---           ---> 4
>>>>>>       \        /
>>>>>>        --> 3 --
>>>>>>
>>>>>> IOSQE_IO_LINK serializes them to 1-->2-->3-->4, which unnecessarily
>>>>>> serializes 2 and 3, whereas a DAG can describe it exactly.
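
For illustration, a minimal liburing sketch (the fd, buffers and offsets
are made up for the example) of how the four requests above have to be
chained with IOSQE_IO_LINK today; the link flag forces 2 and 3 to run
back to back even though each only depends on 1:

/*
 * Illustrative sketch only: four reads chained with IOSQE_IO_LINK,
 * i.e. 1 -> 2 -> 3 -> 4. Error handling omitted.
 */
#include <liburing.h>

static void queue_linked_reads(struct io_uring *ring, int fd,
			       char bufs[4][4096])
{
	for (int i = 0; i < 4; i++) {
		struct io_uring_sqe *sqe = io_uring_get_sqe(ring);

		io_uring_prep_read(sqe, fd, bufs[i], 4096, i * 4096ULL);
		if (i < 3)	/* chain this sqe to the next one */
			sqe->flags |= IOSQE_IO_LINK;
	}
	io_uring_submit(ring);
}

With a graph-style dependency, 2 and 3 would only have to wait for 1 and
could run in parallel, with 4 waiting on both.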
>>>>>>
>>>>>> For detailed usage, see the following patches' messages.
>>>>>>
>>>>>> Tested it with 100 direct read sqes, each reading a BS=4k block
>>>>>> from the same file; blocks do not overlap. These sqes form a graph:
>>>>>>
>>>>>>       2
>>>>>>       3
>>>>>> 1 --> 4 --> 100
>>>>>>      ...
>>>>>>      99
>>>>>>
>>>>>> This is an extreme case, just to show the idea.
>>>>>>
>>>>>> results below:
>>>>>> io_link:
>>>>>> IOPS: 15898251
>>>>>> graph_link:
>>>>>> IOPS: 29325513
>>>>>> io_link:
>>>>>> IOPS: 16420361
>>>>>> graph_link:
>>>>>> IOPS: 29585798
>>>>>> io_link:
>>>>>> IOPS: 18148820
>>>>>> graph_link:
>>>>>> IOPS: 27932960
>>>>>
>>>>> Hmm, what are we comparing here? IIUC,
>>>>> "io_link" is one huge link of 100 requests, around 15898251 IOPS;
>>>>> "graph_link" is a graph of diameter 3, around 29585798 IOPS.
>>>
>>> Diam 2 graph, my bad
>>>
>>>
>>>>> Is that right? If so, it'd be more fair to compare with similar
>>>>> graph-like scheduling on the userspace side.
>>>>
>>>> The above test is more to show the disadvantage of LINK.
>>>
>>> Oh yeah, links can be slow, especially when they kill potential
>>> parallelism or need extra allocations for keeping state, as with
>>> READV and WRITEV.
>>>
>>>
>>>> But yes, it's better to test against similar userspace scheduling,
>>>> since LINK is definitely not a good choice, so we have to prove the
>>>> graph stuff beats userspace scheduling. Will test that soon. Thanks.
>>>
>>> Would also be great if you could post the benchmark once
>>> it's done.
>>
>> Wrote a new test with nop sqes forming a full binary tree of
>> (2^10)-1 nodes, which I think is a more general case. Turns out the
>> result is still not stable and the kernel-side graph link is much
>> slower. I'll try to optimize it.
>
> That's expected, unfortunately. And without reacting to the results
> of previous requests, it's hard to imagine it being useful. BPF may
> have helped, e.g. not keeping an explicit graph but just generating
> new requests from the kernel... But apparently even with that it's
> hard to compete with just leaving it in userspace.
>
Tried to exclude the memory allocation stuff; it seems a bit better
than the userspace graph.
For result delivery, I was thinking of attaching a BPF program to an
sqe rather than creating a separate BPF-type sqe. Then we could have
data flowing through the graph or link chain, but I don't have a clear
draft for it yet.
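
For reference, the userspace-graph comparison target can be sketched as
plain DAG scheduling with per-node indegree counters on top of liburing's
completion loop. The sketch below uses nop requests and made-up names,
and is only meant to show the shape of that comparison, not the exact
benchmark code:

/*
 * Illustrative sketch of userspace DAG scheduling with liburing:
 * submit nodes whose parents are all done, and on each completion
 * decrement the children's indegree and submit any that hit zero.
 * A real test would also handle a full SQ ring and errors.
 */
#include <liburing.h>

struct node {
	int indegree;			/* unfinished parents */
	int nchildren;
	struct node **children;
};

static void submit_node(struct io_uring *ring, struct node *n)
{
	struct io_uring_sqe *sqe = io_uring_get_sqe(ring);

	io_uring_prep_nop(sqe);
	io_uring_sqe_set_data(sqe, n);	/* tag the CQE with the node */
}

static void run_graph(struct io_uring *ring, struct node *nodes, int nr)
{
	int inflight = 0;

	for (int i = 0; i < nr; i++) {
		if (nodes[i].indegree == 0) {
			submit_node(ring, &nodes[i]);
			inflight++;
		}
	}
	io_uring_submit(ring);

	while (inflight) {
		struct io_uring_cqe *cqe;
		struct node *done;

		io_uring_wait_cqe(ring, &cqe);
		done = io_uring_cqe_get_data(cqe);
		io_uring_cqe_seen(ring, cqe);
		inflight--;

		for (int i = 0; i < done->nchildren; i++) {
			struct node *c = done->children[i];

			if (--c->indegree == 0) {
				submit_node(ring, c);
				inflight++;
			}
		}
		io_uring_submit(ring);
	}
}

The (2^10)-1 node full binary tree above is just one instance of such a
graph: every non-root node has indegree 1, so this scheduler degenerates
to submitting a node as soon as its parent completes.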
>> Btw, is there any comparison data between the current IO link feature
>> and userspace scheduling?
>
> Don't remember. I'd try looking up the cover letter for the patches
> implementing it; I believe there should have been some numbers and
> hopefully a test description.
>
> FWIW, before the io_uring mailing list was established, patches etc.
> mostly went through the linux-block mailing list. The links are old,
> so the patches might be there.
>
Thread overview: 15+ messages
2021-12-14 5:57 [POC RFC 0/3] support graph like dependent sqes Hao Xu
2021-12-14 5:57 ` [PATCH 1/3] io_uring: add data structure for graph sqe feature Hao Xu
2021-12-14 5:57 ` [PATCH 2/3] io_uring: implement new sqe opcode to build graph like links Hao Xu
2021-12-14 5:57 ` [PATCH 3/3] io_uring: implement logic of IOSQE_GRAPH request Hao Xu
2021-12-14 15:21 ` [POC RFC 0/3] support graph like dependent sqes Pavel Begunkov
2021-12-14 16:53 ` Hao Xu
2021-12-14 18:16 ` Pavel Begunkov
2021-12-16 16:55 ` Hao Xu
2021-12-17 19:33 ` Pavel Begunkov
2021-12-18 6:57 ` Hao Xu [this message]
2021-12-21 16:19 ` Pavel Begunkov
2021-12-23 4:14 ` Hao Xu
2021-12-23 10:06 ` Christian Dietrich
2021-12-27 3:27 ` Hao Xu
2021-12-27 5:49 ` Christian Dietrich