public inbox for [email protected]
From: Hao Xu <[email protected]>
To: Pavel Begunkov <[email protected]>, Jens Axboe <[email protected]>
Cc: [email protected], Joseph Qi <[email protected]>
Subject: Re: [PATCH for-5.16 v3 0/8] task work optimization
Date: Thu, 28 Oct 2021 14:07:44 +0800
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>

On 2021/10/28 2:15 AM, Pavel Begunkov wrote:
> On 10/27/21 15:02, Hao Xu wrote:
>> Tested this patchset by manually replacing __io_queue_sqe() in
>> io_queue_sqe() with io_req_task_queue() to construct 'heavy' task works.
>> Then tested with fio:
> 
> If submissions and completions are done by the same task it doesn't
> really matter in which order they're executed because the task won't
> get back to userspace execution to see CQEs until tw returns.
It may matter; it depends on the time cost of submission
and the DMA IO time. Take sqpoll mode as an example,
where we submit 10 reqs:
t1          io_submit_sqes
             -->io_req_task_queue
t2          io_task_work_run
We actually do the submission in t2, but if the workload
is big enough, the 'irq completion TW' will be inserted
into the TW list after t2 is fully done, and those
'irq completion TW' will then be delayed to the next round.
With this patchset, we can handle them first.
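As a rough illustration of the ordering argument above, here is a toy
userspace model (not kernel code). The names tw_kind, tw_round_order() and
last_completion_pos() are made up for this sketch, and it assumes that one
round simply drains the priority list before the normal list:

```c
#include <assert.h>
#include <stddef.h>

/* Toy userspace model, not kernel code; all names here are hypothetical. */
enum tw_kind { TW_SUBMIT, TW_IRQ_COMPLETE };

/*
 * Fill out[] with the order in which one round runs the n queued items.
 * use_prio == 0: a single FIFO list, items run in arrival order.
 * use_prio != 0: irq completion items are kept on a separate priority
 * list that is drained before the normal list.
 */
static void tw_round_order(const enum tw_kind *in, size_t n, int use_prio,
                           enum tw_kind *out)
{
    size_t i, k = 0;

    if (!use_prio) {
        for (i = 0; i < n; i++)
            out[k++] = in[i];
        return;
    }
    for (i = 0; i < n; i++)
        if (in[i] == TW_IRQ_COMPLETE)
            out[k++] = in[i];
    for (i = 0; i < n; i++)
        if (in[i] == TW_SUBMIT)
            out[k++] = in[i];
    assert(k == n); /* every item landed on exactly one list */
}

/* 1-based position of the last irq completion item in the run order, 0 if none. */
static size_t last_completion_pos(const enum tw_kind *order, size_t n)
{
    size_t i, pos = 0;

    for (i = 0; i < n; i++)
        if (order[i] == TW_IRQ_COMPLETE)
            pos = i + 1;
    return pos;
}
```

For an arrival order of submit, submit, complete, submit, complete, submit,
the single list handles the last completion as the 5th item, while the
priority list handles it as the 2nd.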
> Furthermore, it even might be worse because the earlier you submit
> the better with everything else equal.
> 
> IIRC, that's how it's with fio, right? If so, you may get better
> numbers with a test that does submissions and completions in
> different threads.
Because of the completion cache, I doubt that would work.
With a single ctx, it seems we always update the cqring
pointer only after all the TWs in the list are done.
> 
> Also interesting to find an explanation for your numbers assuming
> they're stable.
The reason may be what I said above, but I don't have a
rigorous proof yet.
> 7/8 batching? How often does it go down this path?
> If only one task submits requests it should already be covered
> with existing batching.
The problems with the existing batching are (given there is only
one ctx):
1. we flush it only after all the TWs are done
2. we batch them only if we hold the uring lock.
The new batching:
1. doesn't care about the uring lock
2. can flush the completions in the priority list
    in advance (which means userland can see them earlier).
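To make the flush-timing difference concrete, a toy userspace model (not
kernel code) follows. The function names and the per-TW cost arrays are
invented for this sketch; it assumes one round contains both lists and that
"visible" means the point at which the cqring pointer is updated:

```c
#include <assert.h>
#include <stddef.h>

/* Toy model, not kernel code: cost[i] is the time to run the i-th TW. */
static unsigned long sum_cost(const unsigned long *cost, size_t n)
{
    unsigned long total = 0;
    size_t i;

    assert(cost != NULL || n == 0);
    for (i = 0; i < n; i++)
        total += cost[i];
    return total;
}

/*
 * Existing batching as described above: completions are flushed only
 * after every TW in the round has run, so they become visible at the
 * end of the whole round.
 */
static unsigned long visible_at_existing(const unsigned long *prio_cost,
                                         size_t n_prio,
                                         const unsigned long *normal_cost,
                                         size_t n_normal)
{
    return sum_cost(prio_cost, n_prio) + sum_cost(normal_cost, n_normal);
}

/*
 * New batching as described above: completions on the priority list
 * are flushed as soon as that list has been drained.
 */
static unsigned long visible_at_new(const unsigned long *prio_cost,
                                    size_t n_prio)
{
    return sum_cost(prio_cost, n_prio);
}
```

With three completion TWs costing 1 each and two other TWs costing 5 each,
this model makes the completions visible at time 13 with the existing
batching and at time 3 with the new one.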
> 
> 
>> ioengine=io_uring
>> sqpoll=1
>> thread=1
>> bs=4k
>> direct=1
>> rw=randread
>> time_based=1
>> runtime=600
>> randrepeat=0
>> group_reporting=1
>> filename=/dev/nvme0n1
>>
>> 2/8 sets an unlimited priority_task_list, 8/8 sets a limit of
>> 1/3 * (len_priority_list + len_normal_list); data below:
>>     depth     no 8/8   include 8/8      before this patchset
>>      1        7.05         7.82              7.10
>>      2        8.47         8.48              8.60
>>      4        10.42        9.99              10.42
>>      8        13.78        13.13             13.22
>>      16       27.41        27.92             24.33
>>      32       49.40        46.16             53.08
>>      64       102.53       105.68            103.36
>>      128      196.98       202.76            205.61
>>      256      372.99       375.61            414.88
>>      512      747.23       763.95            791.30
>>      1024     1472.59      1527.46           1538.72
>>      2048     3153.49      3129.22           3329.01
>>      4096     6387.86      5899.74           6682.54
>>      8192     12150.25     12433.59          12774.14
>>      16384    23085.58     24342.84          26044.71
>>
>> It seems 2/8 is better, haven't tried other choices other than 1/3,
>> still put 8/8 here for people's further thoughts.
>>
>> Hao Xu (8):
>>    io-wq: add helper to merge two wq_lists
>>    io_uring: add a priority tw list for irq completion work
>>    io_uring: add helper for task work execution code
>>    io_uring: split io_req_complete_post() and add a helper
>>    io_uring: move up io_put_kbuf() and io_put_rw_kbuf()
>>    io_uring: add nr_ctx to record the number of ctx in a task
>>    io_uring: batch completion in prior_task_list
>>    io_uring: add limited number of TWs to priority task list
>>
>>   fs/io-wq.h    |  21 +++++++
>>   fs/io_uring.c | 168 +++++++++++++++++++++++++++++++++++---------------
>>   2 files changed, 138 insertions(+), 51 deletions(-)
>>
> 


