From: Pavel Begunkov <[email protected]>
To: Hao Xu <[email protected]>, Jens Axboe <[email protected]>
Cc: [email protected], Joseph Qi <[email protected]>
Subject: Re: [PATCH for-5.16 v3 0/8] task work optimization
Date: Thu, 28 Oct 2021 20:08:58 +0100
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>

On 10/28/21 12:46, Hao Xu wrote:
> On 2021/10/28 2:07 PM, Hao Xu wrote:
>> On 2021/10/28 2:15 AM, Pavel Begunkov wrote:
>>> On 10/27/21 15:02, Hao Xu wrote:
>>>> Tested this patchset by manually replacing __io_queue_sqe() in
>>>> io_queue_sqe() with io_req_task_queue() to construct 'heavy' task works.
>>>> Then tested with fio:
>>>
>>> If submissions and completions are done by the same task it doesn't
>>> really matter in which order they're executed because the task won't
>>> get back to userspace execution to see CQEs until tw returns.
>> It may matter; it depends on the time cost of submission
>> and the DMA I/O time. Take SQPOLL mode as an example,
>> where we submit 10 reqs:
>> t1 io_submit_sqes
>> -->io_req_task_queue
>> t2 io_task_work_run
>> We actually do the submission in t2, but if the workload
>> is big enough, the 'irq completion TW' will be inserted
>> into the TW list after t2 is fully done, and those
>> 'irq completion TW' will be delayed to the next round.
>> With this patchset, we can handle them first.
>>> Furthermore, it even might be worse because the earlier you submit
>>> the better with everything else equal.
>>>
>>> IIRC, that's how it's with fio, right? If so, you may get better
>>> numbers with a test that does submissions and completions in
>>> different threads.
>> Because of the completion cache, I doubt it would work.
>> For a single ctx, it seems we always update the cqring
>> pointer after all the TWs in the list are done.
> I suddenly realized that SQPOLL mode does submissions and completions
> in different threads, and in this situation this patchset always
> does commit_cqring() right after handling the TWs in the priority list.
> So this is the right test; am I missing something?
Yep, should be it. So the scope of the feature is SQPOLL or
completion/submission with different tasks.
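
To make the ordering concrete, here is a toy user-space model in C. It is
not the kernel code: it only assumes two lists, one for IRQ completion task
works and one for everything else, with the priority list run first in each
round. The names tw_queue(), tw_run_round() and enum tw_kind are made up for
illustration; the real series uses the wq_list helpers in fs/io-wq.h, which
are not reproduced here.

```c
#include <stddef.h>

/* Toy model only; every name in this block is hypothetical. */
enum tw_kind { TW_SUBMIT, TW_IRQ_COMPLETION };

struct tw_node {
	enum tw_kind kind;
	struct tw_node *next;
};

struct tw_list {
	struct tw_node *first;
	struct tw_node *last;
};

/* Append a node to the tail of a singly linked list. */
static void tw_list_add_tail(struct tw_list *l, struct tw_node *n)
{
	n->next = NULL;
	if (l->last)
		l->last->next = n;
	else
		l->first = n;
	l->last = n;
}

/* IRQ completions go to the priority list, everything else to the normal one. */
static void tw_queue(struct tw_list *prior, struct tw_list *normal,
		     struct tw_node *n)
{
	tw_list_add_tail(n->kind == TW_IRQ_COMPLETION ? prior : normal, n);
}

/*
 * Run one round: the priority list first, then the normal list.
 * The execution order is recorded in out[] (at most cap entries),
 * both lists are emptied, and the number of recorded entries is returned.
 */
static size_t tw_run_round(struct tw_list *prior, struct tw_list *normal,
			   enum tw_kind *out, size_t cap)
{
	struct tw_node *it;
	size_t n = 0;

	for (it = prior->first; it && n < cap; it = it->next)
		out[n++] = it->kind;
	/* Assumed point at which completions would become visible to userspace. */
	for (it = normal->first; it && n < cap; it = it->next)
		out[n++] = it->kind;

	prior->first = prior->last = NULL;
	normal->first = normal->last = NULL;
	return n;
}
```

Queuing a submission, then a completion, then another submission and calling
tw_run_round() records the completion first, which is the effect that only
matters when the consumer of CQEs is a different task from the one running
the task work.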
>>>
>>> Also interesting to find an explanation for your numbers assuming
>> The reason may be what I said above, but I don't have a
>> strict proof now.
>>> they're stable. 7/8 batching? How often does it go down this path?
>>> If only one task submits requests it should already be covered
>>> with existing batching.
>> The problem with the existing batching is (given there is only
>> one ctx):
>> 1. we flush it after all the TWs are done
>> 2. we batch them if we hold the uring lock.
>> The new batching:
>> 1. doesn't care about the uring lock
>> 2. can flush the completions in the priority list
>> in advance (which means userland can see them earlier).
>>>
>>>
>>>> ioengine=io_uring
>>>> sqpoll=1
>>>> thread=1
>>>> bs=4k
>>>> direct=1
>>>> rw=randread
>>>> time_based=1
>>>> runtime=600
>>>> randrepeat=0
>>>> group_reporting=1
>>>> filename=/dev/nvme0n1
>>>>
>>>> 2/8 sets an unlimited priority_task_list; 8/8 sets a limit of
>>>> 1/3 * (len_priority_list + len_normal_list). Data below:
>>>> depth    no 8/8      include 8/8   before this patchset
>>>> 1        7.05        7.82          7.10
>>>> 2        8.47        8.48          8.60
>>>> 4        10.42       9.99          10.42
>>>> 8        13.78       13.13         13.22
>>>> 16       27.41       27.92         24.33
>>>> 32       49.40       46.16         53.08
>>>> 64       102.53      105.68        103.36
>>>> 128      196.98      202.76        205.61
>>>> 256      372.99      375.61        414.88
>>>> 512      747.23      763.95        791.30
>>>> 1024     1472.59     1527.46       1538.72
>>>> 2048     3153.49     3129.22       3329.01
>>>> 4096     6387.86     5899.74       6682.54
>>>> 8192     12150.25    12433.59      12774.14
>>>> 16384    23085.58    24342.84      26044.71
>>>>
>>>> It seems 2/8 is better. I haven't tried choices other than 1/3,
>>>> but still put 8/8 here for people's further thoughts.
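
The 1/3 limit quoted above can be written down as a small C helper. This is
only a sketch of the arithmetic as stated in the cover letter; how patch 8/8
actually tracks the lengths and applies the limit is not shown in this
message. The function names and the use of truncating integer division are
assumptions made for illustration.

```c
#include <stddef.h>

/*
 * Hypothetical helper: upper bound on the priority list length,
 * taken as one third of all queued task works (integer division,
 * an assumption of this sketch).
 */
static size_t prior_list_cap(size_t len_prior, size_t len_normal)
{
	return (len_prior + len_normal) / 3;
}

/*
 * Hypothetical helper: nonzero if one more task work may still be
 * added to the priority list under that cap.
 */
static int prior_list_has_room(size_t len_prior, size_t len_normal)
{
	return len_prior < prior_list_cap(len_prior, len_normal);
}
```

With 1 priority and 5 normal entries the cap is 2, so another priority entry
is allowed; with 2 priority and 4 normal entries the cap is reached and new
work would go to the normal list instead.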
>>>>
>>>> Hao Xu (8):
>>>> io-wq: add helper to merge two wq_lists
>>>> io_uring: add a priority tw list for irq completion work
>>>> io_uring: add helper for task work execution code
>>>> io_uring: split io_req_complete_post() and add a helper
>>>> io_uring: move up io_put_kbuf() and io_put_rw_kbuf()
>>>> io_uring: add nr_ctx to record the number of ctx in a task
>>>> io_uring: batch completion in prior_task_list
>>>> io_uring: add limited number of TWs to priority task list
>>>>
>>>> fs/io-wq.h | 21 +++++++
>>>> fs/io_uring.c | 168 +++++++++++++++++++++++++++++++++++---------------
>>>> 2 files changed, 138 insertions(+), 51 deletions(-)
>>>>
>>>
>
--
Pavel Begunkov