public inbox for [email protected]
 help / color / mirror / Atom feed
From: Hao Xu <[email protected]>
To: Pavel Begunkov <[email protected]>, Jens Axboe <[email protected]>
Cc: [email protected], Joseph Qi <[email protected]>
Subject: Re: [POC RFC 0/3] support graph like dependent sqes
Date: Sat, 18 Dec 2021 14:57:08 +0800	[thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>


在 2021/12/18 上午3:33, Pavel Begunkov 写道:
> On 12/16/21 16:55, Hao Xu wrote:
>> 在 2021/12/15 上午2:16, Pavel Begunkov 写道:
>>> On 12/14/21 16:53, Hao Xu wrote:
>>>> 在 2021/12/14 下午11:21, Pavel Begunkov 写道:
>>>>> On 12/14/21 05:57, Hao Xu wrote:
>>>>>> This is just a proof of concept which is incompleted, send it 
>>>>>> early for
>>>>>> thoughts and suggestions.
>>>>>>
>>>>>> We already have IOSQE_IO_LINK to describe linear dependency
>>>>>> relationship sqes. While this patchset provides a new feature to
>>>>>> support DAG dependency. For instance, 4 sqes have a relationship
>>>>>> as below:
>>>>>>        --> 2 --
>>>>>>       /        \
>>>>>> 1 ---          ---> 4
>>>>>>       \        /
>>>>>>        --> 3 --
>>>>>> IOSQE_IO_LINK serializes them to 1-->2-->3-->4, which unneccessarily
>>>>>> serializes 2 and 3. But a DAG can fully describe it.
>>>>>>
>>>>>> For the detail usage, see the following patches' messages.
>>>>>>
>>>>>> Tested it with 100 direct read sqes, each one reads a BS=4k block 
>>>>>> data
>>>>>> in a same file, blocks are not overlapped. These sqes form a graph:
>>>>>>        2
>>>>>>        3
>>>>>> 1 --> 4 --> 100
>>>>>>       ...
>>>>>>        99
>>>>>>
>>>>>> This is an extreme case, just to show the idea.
>>>>>>
>>>>>> results below:
>>>>>> io_link:
>>>>>> IOPS: 15898251
>>>>>> graph_link:
>>>>>> IOPS: 29325513
>>>>>> io_link:
>>>>>> IOPS: 16420361
>>>>>> graph_link:
>>>>>> IOPS: 29585798
>>>>>> io_link:
>>>>>> IOPS: 18148820
>>>>>> graph_link:
>>>>>> IOPS: 27932960
>>>>>
>>>>> Hmm, what do we compare here? IIUC,
>>>>> "io_link" is a huge link of 100 requests. Around 15898251 IOPS
>>>>> "graph_link" is a graph of diameter 3. Around 29585798 IOPS
>>>
>>> Diam 2 graph, my bad
>>>
>>>
>>>>> Is that right? If so it'd more more fair to compare with a
>>>>> similar graph-like scheduling on the userspace side.
>>>>
>>>> The above test is more like to show the disadvantage of LINK
>>>
>>> Oh yeah, links can be slow, especially when it kills potential
>>> parallelism or need extra allocations for keeping state, like
>>> READV and WRITEV.
>>>
>>>
>>>> But yes, it's better to test the similar userspace  scheduling since
>>>>
>>>> LINK is definitely not a good choice so have to prove the graph stuff
>>>>
>>>> beat the userspace scheduling. Will test that soon. Thanks.
>>>
>>> Would be also great if you can also post the benchmark once
>>> it's done
>>
>> Wrote a new test to test nop sqes forming a full binary tree with 
>> (2^10)-1 nodes,
>> which I think it a more general case.  Turns out the result is still 
>> not stable and
>> the kernel side graph link is much slow. I'll try to optimize it.
>
> That's expected unfortunately. And without reacting on results
> of previous requests, it's hard to imagine to be useful. BPF may
> have helped, e.g. not keeping an explicit graph but just generating
> new requests from the kernel... But apparently even with this it's
> hard to compete with just leaving it in userspace.
>
Tried to exclude the memory allocation stuff, seems it's a bit better 
than the user graph.

For the result delivery, I was thinking of attaching BPF program within 
a sqe, not creating

a single BPF type sqe. Then we can have data flow in the graph or 
linkchain. But I haven't

had a clear draft for it

>> Btw, is there any comparison data between the current io link feature 
>> and the
>> userspace scheduling.
>
> Don't remember. I'd try to look up the cover-letter for the patches
> implementing it, I believe there should've been some numbers and
> hopefully test description.
>
> fwiw, before io_uring mailing list got established patches/etc.
> were mostly going through linux-block mailing list. Links are old, so
> patches might be there.
>

  reply	other threads:[~2021-12-18  6:57 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-14  5:57 [POC RFC 0/3] support graph like dependent sqes Hao Xu
2021-12-14  5:57 ` [PATCH 1/3] io_uring: add data structure for graph sqe feature Hao Xu
2021-12-14  5:57 ` [PATCH 2/3] io_uring: implement new sqe opcode to build graph like links Hao Xu
2021-12-14  5:57 ` [PATCH 3/3] io_uring: implement logic of IOSQE_GRAPH request Hao Xu
2021-12-14 15:21 ` [POC RFC 0/3] support graph like dependent sqes Pavel Begunkov
2021-12-14 16:53   ` Hao Xu
2021-12-14 18:16     ` Pavel Begunkov
2021-12-16 16:55       ` Hao Xu
2021-12-17 19:33         ` Pavel Begunkov
2021-12-18  6:57           ` Hao Xu [this message]
2021-12-21 16:19             ` Pavel Begunkov
2021-12-23  4:14               ` Hao Xu
2021-12-23 10:06 ` Christian Dietrich
2021-12-27  3:27   ` Hao Xu
2021-12-27  5:49     ` Christian Dietrich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=96155b9c-9f35-53b8-456a-8623fc850b03@linux.alibaba.com \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox