From: Jens Axboe <[email protected]>
To: JeffleXu <[email protected]>,
Heinz Mauelshagen <[email protected]>,
Mikulas Patocka <[email protected]>
Cc: Mike Snitzer <[email protected]>,
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected]
Subject: Re: [dm-devel] [PATCH 4/4] dm: support I/O polling
Date: Sun, 7 Mar 2021 20:55:13 -0700 [thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
On 3/7/21 8:54 PM, JeffleXu wrote:
>
>
> On 3/6/21 1:56 AM, Heinz Mauelshagen wrote:
>>
>> On 3/5/21 6:46 PM, Heinz Mauelshagen wrote:
>>> On 3/5/21 10:52 AM, JeffleXu wrote:
>>>>
>>>> On 3/3/21 6:09 PM, Mikulas Patocka wrote:
>>>>>
>>>>> On Wed, 3 Mar 2021, JeffleXu wrote:
>>>>>
>>>>>>
>>>>>> On 3/3/21 3:05 AM, Mikulas Patocka wrote:
>>>>>>
>>>>>>> Support I/O polling if submit_bio_noacct_mq_direct returned non-empty
>>>>>>> cookie.
>>>>>>>
>>>>>>> Signed-off-by: Mikulas Patocka <[email protected]>
>>>>>>>
>>>>>>> ---
>>>>>>> drivers/md/dm.c | 5 +++++
>>>>>>> 1 file changed, 5 insertions(+)
>>>>>>>
>>>>>>> Index: linux-2.6/drivers/md/dm.c
>>>>>>> ===================================================================
>>>>>>> --- linux-2.6.orig/drivers/md/dm.c 2021-03-02
>>>>>>> 19:26:34.000000000 +0100
>>>>>>> +++ linux-2.6/drivers/md/dm.c 2021-03-02 19:26:34.000000000 +0100
>>>>>>> @@ -1682,6 +1682,11 @@ static void __split_and_process_bio(stru
>>>>>>> }
>>>>>>> }
>>>>>>> + if (ci.poll_cookie != BLK_QC_T_NONE) {
>>>>>>> + while (atomic_read(&ci.io->io_count) > 1 &&
>>>>>>> + blk_poll(ci.poll_queue, ci.poll_cookie, true)) ;
>>>>>>> + }
>>>>>>> +
>>>>>>> /* drop the extra reference count */
>>>>>>> dec_pending(ci.io, errno_to_blk_status(error));
>>>>>>> }
>>>>>> It seems that the general idea of your design is to
>>>>>> 1) submit *one* split bio
>>>>>> 2) blk_poll(), waiting the previously submitted split bio complets
>>>>> No, I submit all the bios and poll for the last one.
>>>>>
>>>>>> and then submit next split bio, repeating the above process. I'm
>>>>>> afraid
>>>>>> the performance may be an issue here, since the batch every time
>>>>>> blk_poll() reaps may decrease.
>>>>> Could you benchmark it?
>>>> I only tested dm-linear.
>>>>
>>>> The configuration (dm table) of dm-linear is:
>>>> 0 1048576 linear /dev/nvme0n1 0
>>>> 1048576 1048576 linear /dev/nvme2n1 0
>>>> 2097152 1048576 linear /dev/nvme5n1 0
>>>>
>>>>
>>>> fio script used is:
>>>> ```
>>>> $cat fio.conf
>>>> [global]
>>>> name=iouring-sqpoll-iopoll-1
>>>> ioengine=io_uring
>>>> iodepth=128
>>>> numjobs=1
>>>> thread
>>>> rw=randread
>>>> direct=1
>>>> registerfiles=1
>>>> hipri=1
>>>> runtime=10
>>>> time_based
>>>> group_reporting
>>>> randrepeat=0
>>>> filename=/dev/mapper/testdev
>>>> bs=4k
>>>>
>>>> [job-1]
>>>> cpus_allowed=14
>>>> ```
>>>>
>>>> IOPS (IRQ mode) | IOPS (iopoll mode (hipri=1))
>>>> --------------- | --------------------
>>>> 213k | 19k
>>>>
>>>> At least, it doesn't work well with io_uring interface.
>>>>
>>>>
>>>
>>>
>>> Jeffle,
>>>
>>> I ran your above fio test on a linear LV split across 3 NVMes to
>>> second your split mapping
>>> (system: 32 core Intel, 256GiB RAM) comparing io engines sync, libaio
>>> and io_uring,
>>> the latter w/ and w/o hipri (sync+libaio obviously w/o registerfiles
>>> and hipri) which resulted ok:
>>>
>>>
>>>
>>> sync | libaio | IRQ mode (hipri=0) | iopoll (hipri=1)
>>> ------|----------|---------------------|----------------- 56.3K |
>>> 290K | 329K | 351K I can't second your
>>> drastic hipri=1 drop here...
>>
>>
>> Sorry, email mess.
>>
>>
>> sync | libaio | IRQ mode (hipri=0) | iopoll (hipri=1)
>> -------|----------|---------------------|-----------------
>> 56.3K | 290K | 329K | 351K
>>
>>
>>
>> I can't second your drastic hipri=1 drop here...
>>
>
> Hummm, that's indeed somewhat strange...
>
> My test environment:
> - CPU: 128 cores, though only one CPU core is used since
> 'cpus_allowed=14' in fio configuration
> - memory: 983G memory free
> - NVMe: Huawai ES3510P (HWE52P434T0L005N), with 'nvme.poll_queues=3'
>
> Maybe you didn't specify 'nvme.poll_queues=XXX'? In this case, IO still
> goes into IRQ mode, even you have specified 'hipri=1'?
That would be my guess too, and the patches also have a very suspicious
clear of HIPRI which shouldn't be there (which would let that fly through).
--
Jens Axboe
next prev parent reply other threads:[~2021-03-08 3:56 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-03-02 19:05 [PATCH 4/4] dm: support I/O polling Mikulas Patocka
2021-03-03 2:53 ` [dm-devel] " JeffleXu
2021-03-03 10:09 ` Mikulas Patocka
2021-03-04 2:57 ` JeffleXu
2021-03-04 10:09 ` Mikulas Patocka
2021-03-05 18:21 ` Jens Axboe
2021-03-04 15:01 ` Jeff Moyer
2021-03-04 15:11 ` Mike Snitzer
2021-03-04 15:12 ` [dm-devel] " Mikulas Patocka
2021-03-05 9:52 ` JeffleXu
2021-03-05 17:46 ` Heinz Mauelshagen
2021-03-05 17:56 ` Heinz Mauelshagen
2021-03-05 18:09 ` Mike Snitzer
2021-03-05 18:19 ` [dm-devel] " Heinz Mauelshagen
2021-03-08 3:54 ` JeffleXu
2021-03-08 3:55 ` Jens Axboe [this message]
2021-03-09 11:42 ` Heinz Mauelshagen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox