From: Christian Loehle <[email protected]>
To: Bart Van Assche <[email protected]>, [email protected]
Cc: [email protected], [email protected], [email protected],
[email protected], [email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], Qais Yousef <[email protected]>
Subject: Re: [RFC PATCH 0/2] Introduce per-task io utilization boost
Date: Wed, 6 Mar 2024 10:49:22 +0000 [thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
Hi Bart,
On 05/03/2024 18:36, Bart Van Assche wrote:
> On 3/5/24 01:13, Christian Loehle wrote:
>> On 05/03/2024 00:20, Bart Van Assche wrote:
>>> On 3/4/24 12:16, Christian Loehle wrote:
>>>> - Higher cap is not always beneficial, we might place the task away
>>>> from the CPU where the interrupt handler is running, making it run
>>>> on an unboosted CPU which may have a bigger impact than the difference
>>>> between the CPU's capacity the task moved to. (Of course the boost will
>>>> then be reverted again, but a ping-pong every interval is possible).
>>>
>>> In the above I see "the interrupt handler". Does this mean that the NVMe
>>> controller in the test setup only supports one completion interrupt for
>>> all completion queues instead of one completion interrupt per completion
>>> queue? There are already Android phones and developer boards available
>>> that support the latter, namely the boards equipped with a UFSHCI 4.0 controller.
>>
>> No, both NVMe test setups have one completion interrupt per completion queue,
>> so this caveat doesn't affect them, higher capacity CPU is strictly better.
>> The UFS and both mmc setups (eMMC with CQE and sdcard) only have one completion
>> interrupt (on CPU0 on my setup).
>
> I think that measurements should be provided in the cover letter for the
> two types of storage controllers: one series of measurements for a
> storage controller with a single completion interrupt and a second
> series of measurements for storage controllers with one completion
> interrupt per CPU.
Of the same type of storage controller? Or what is missing for you in
the cover letter exactly (ufs/emmc: single completion interrupt,
nvme: one completion interrupt per CPU).
>
>> FWIW you do gain an additional ~20% (in my specific setup) if you move the ufshcd
>> interrupt to a big CPU, too. Similarly for the mmc.
>> Unfortunately the infrastructure is far from being there for the scheduler to move the
>> interrupt to the same performance domain as the task, which is often optimal both in
>> terms of throughput and in terms of power.
>> I'll go looking for a stable testing platform with UFS as you mentioned, benefits of this
>> patch will of course be greatly increased.
>
> I'm not sure whether making the completion interrupt follow the workload
> is a good solution. I'm concerned that this would increase energy
> consumption by keeping the big cores active longer than necessary. I
> like this solution better (improves storage performance on at least
> devices with a UFSHCI 3.0 controller): "[PATCH v2 0/2] sched: blk:
> Handle HMP systems when completing IO"
> (https://lore.kernel.org/linux-block/[email protected]/).
That patch is good, don't get me wrong, but you still lose out by running everything
up to blk_mq_complete_request() on (potentially) a LITTlE (that might be run on a low OPP),
while having a big CPU available at a high OPP anyway ("for free").
It is only adjacent to the series but I've done some measurements (Pixel6 again, same device
as cover letter, Base is Android 6.6 mainline kernel (so without my series, but I somewhat forced
the effects by task pinning), Applied is with both of sched: blk: Handle HMP systems when completing IO):
Pretty numbers (IOPS):
Base irq@CPU0 median: 6969
Base irq@CPU6 median: 8407 (+20.6%)
Applied irq@CPU0 median: 7144 (+2.5%)
Applied irq@CPU6 median: 8288 (18.9%)
This is with psyncx1 4K Random Read again, of course anything with queue depth
takes advantage of batch completions to significantly reduce irq pressure.
Not so pretty numbers and full list commands used:
w/o patch:
irq on CPU0 (default):
psyncx1: 7000 6969 7025 6954 6964
io_uring4x128: 28766 28280 28339 28310 28349
irq on CPU6:
psyncx1: 8342 8492 8355 8407 8532
io_uring4x128: 28641 28356 25908 25787 25853
with patch:
irq on CPU0:
psyncx1: 7672 7144 7301 6976 6889
io_uring4x128: 28266 26314 27648 24482 25301
irq on CPU6:
psyncx1: 8208 8401 8351 8221 8288
io_uring4x128: 25603 25438 25453 25514 25402
for i in $(seq 0 4); do taskset c0 /data/local/tmp/fio_aosp_build --name=test --rw=randread --bs=4k --runtime=30 --time_based --filename=/dev/block/sda --minimal | awk -F ";" '{print $8}'; sleep 30; done
for i in $(seq 0 4); do taskset c0 /data/local/tmp/fio_aosp_build --name=test --rw=randread --bs=4k --runtime=30 --time_based --filename=/dev/block/sda --ioengine=io_uring --iodepth=128 --numjobs=4 --group_reporting --minimal | awk -F ";" '{print $8}'; sleep 30; done
echo 6 > /proc/irq/296/smp_affinity_list
Kind Regards,
Christian
next prev parent reply other threads:[~2024-03-06 10:49 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-03-04 20:16 [RFC PATCH 0/2] Introduce per-task io utilization boost Christian Loehle
2024-03-04 20:16 ` [RFC PATCH 1/2] sched/fair: Introduce per-task io util boost Christian Loehle
2024-03-25 3:30 ` Qais Yousef
2024-03-04 20:16 ` [RFC PATCH 2/2] cpufreq/schedutil: Remove iowait boost Christian Loehle
2024-03-18 14:07 ` Rafael J. Wysocki
2024-03-18 16:40 ` Christian Loehle
2024-03-18 17:08 ` Rafael J. Wysocki
2024-03-19 13:58 ` Christian Loehle
2024-03-25 2:37 ` Qais Yousef
2024-04-19 13:42 ` Christian Loehle
2024-04-29 11:18 ` Qais Yousef
2024-05-07 15:19 ` Christian Loehle
2024-05-12 15:29 ` Qais Yousef
2024-03-05 0:20 ` [RFC PATCH 0/2] Introduce per-task io utilization boost Bart Van Assche
2024-03-05 9:13 ` Christian Loehle
2024-03-05 18:36 ` Bart Van Assche
2024-03-06 10:49 ` Christian Loehle [this message]
2024-03-21 12:39 ` Qais Yousef
2024-03-21 17:57 ` Christian Loehle
2024-03-21 19:52 ` Bart Van Assche
2024-03-25 12:06 ` Christian Loehle
2024-03-25 17:23 ` Bart Van Assche
2024-03-25 2:53 ` Qais Yousef
2024-03-22 18:08 ` Vincent Guittot
2024-03-25 2:20 ` Qais Yousef
2024-03-25 17:18 ` Christian Loehle
2024-03-25 12:24 ` Christian Loehle
2024-03-28 10:09 ` Vincent Guittot
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox