From: Christian Loehle <[email protected]>
To: "Rafael J. Wysocki" <[email protected]>
Cc: [email protected], [email protected],
[email protected], [email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected], [email protected]
Subject: Re: [RFC PATCH 5/8] cpufreq/schedutil: Remove iowait boost
Date: Thu, 3 Oct 2024 10:10:41 +0100
Message-ID: <[email protected]>
In-Reply-To: <CAJZ5v0hJWwsErT193i394bHOczvCQwU_5AVVTJ1oKDe7kTW82g@mail.gmail.com>
On 9/30/24 17:34, Rafael J. Wysocki wrote:
> On Thu, Sep 5, 2024 at 11:27 AM Christian Loehle
> <[email protected]> wrote:
>>
>> iowait boost in schedutil was introduced by
>> commit 21ca6d2c52f8 ("cpufreq: schedutil: Add iowait boosting"),
>> with it more or less following intel_pstate's approach to increase
>> frequency after an iowait wakeup.
>> Behaviour that is piggy-backed onto iowait boost is problematic
>> due to a lot of reasons, so remove it.
>>
>> For schedutil specifically these are some of the reasons:
>> 1. Boosting is applied even in scenarios where it doesn't improve
>> throughput.
>
> Well, I wouldn't argue this way because it is kind of like saying that
> air conditioning is used even when it doesn't really help. It is
> sometimes hard to know in advance whether or not it will help though.
Right, what I was trying to say is that it's a heuristic that is often
wrong and costs energy whenever it triggers.
>
>> 2. The boost is not accounted for in EAS: a) feec() will only consider
>> the actual task utilization for task placement, but another CPU might
>> be more energy-efficient at that capacity than the boosted one.
>> b) When placing a non-IO task while a CPU is boosted compute_energy()
>> assumes a lower OPP than what is actually applied. This leads to
>> wrong EAS decisions.
>
> That's a very good point IMV and so is the one regarding UCLAMP_MAX (8
> in your list).
>
> If the goal is to set the adequate performance for a given utilization
> level (either actual or prescribed), boosting doesn't really play well
> with this and it shouldn't be used at least in these cases.
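To put rough numbers on 2b), here is a small sketch of the mismatch. It is not
kernel code: the OPP table layout and pick_opp() are hypothetical, and the 25%
headroom is only assumed here as the utilization-to-frequency mapping.

```c
#include <assert.h>
#include <stddef.h>

#define SCHED_CAPACITY_SCALE 1024u

/* Hypothetical OPP table entry: a frequency and the power drawn at it. */
struct opp {
	unsigned int freq_khz;
	unsigned int power_mw;
};

/*
 * Pick the lowest OPP that covers 'util' plus 25% headroom (assumed mapping).
 * 'table' must be sorted by ascending frequency and hold 'n' > 0 entries.
 */
const struct opp *pick_opp(const struct opp *table, size_t n, unsigned int util)
{
	unsigned long long max_freq, target;
	size_t i;

	assert(n > 0 && util <= SCHED_CAPACITY_SCALE);

	max_freq = table[n - 1].freq_khz;
	target = (max_freq + (max_freq >> 2)) * util / SCHED_CAPACITY_SCALE;

	for (i = 0; i < n; i++) {
		if (table[i].freq_khz >= target)
			return &table[i];
	}

	/* Nothing is fast enough, so run at the highest OPP. */
	return &table[n - 1];
}
```

With a made-up table of 500 MHz, 1 GHz and 2 GHz, a utilization of 200 maps to
the 500 MHz OPP, which is what an energy estimate based on utilization alone
would assume. If the CPU is fully boosted at that moment, the same lookup with
1024 gives the 2 GHz OPP, so the estimate is made for an OPP that is not the
one actually running.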
>
>> 3. Actual IO heavy workloads are hardly distinguished from infrequent
>> in_iowait wakeups.
>
> Do infrequent in_iowait wakeups really cause the boosting to be
> applied at full swing?
Maybe not at full swing, but the relatively high rate_limit_us and TICK_NSEC
found on Android devices do indeed lead to occasional boosting periods
even for 'infrequent'/unrelated wakeups.
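To make the ramp concrete, here is a rough userspace model of the
doubling/halving behaviour. It is a simplified sketch and not the schedutil
code itself: struct boost_model, model_update() and model_apply() are
hypothetical names, and the tick length is passed in as a parameter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SCHED_CAPACITY_SHIFT	10
#define SCHED_CAPACITY_SCALE	(1u << SCHED_CAPACITY_SHIFT)
#define IOWAIT_BOOST_MIN	(SCHED_CAPACITY_SCALE / 8)

/* Hypothetical per-CPU state for the model. */
struct boost_model {
	unsigned int boost;	 /* current boost, 0..SCHED_CAPACITY_SCALE */
	bool pending;		 /* boost requested since the last apply */
	uint64_t last_update_ns; /* time of the last utilization update */
};

/* Called on every utilization update; 'iowait' marks an in_iowait wakeup. */
void model_update(struct boost_model *m, uint64_t now_ns, uint64_t tick_ns,
		  bool iowait)
{
	/* The boost is only dropped outright after more than a tick of silence. */
	if (m->boost && now_ns - m->last_update_ns > tick_ns) {
		m->boost = 0;
		m->pending = false;
	}
	m->last_update_ns = now_ns;

	/* Boost at most once per frequency update. */
	if (!iowait || m->pending)
		return;
	m->pending = true;

	if (m->boost) {
		/* Double the boost, saturating at full scale. */
		m->boost <<= 1;
		if (m->boost > SCHED_CAPACITY_SCALE)
			m->boost = SCHED_CAPACITY_SCALE;
	} else {
		m->boost = IOWAIT_BOOST_MIN;
	}
}

/* Called when a frequency is selected; returns the utilization to use. */
unsigned int model_apply(struct boost_model *m, unsigned int util,
			 unsigned int max_cap)
{
	unsigned int boosted;

	assert(max_cap <= SCHED_CAPACITY_SCALE);

	if (!m->boost)
		return util;

	if (!m->pending) {
		/* No new request since the last apply: halve the boost. */
		m->boost >>= 1;
		if (m->boost < IOWAIT_BOOST_MIN) {
			m->boost = 0;
			return util;
		}
	}
	m->pending = false;

	/* The boost is scaled to the local CPU's capacity. */
	boosted = (m->boost * max_cap) >> SCHED_CAPACITY_SHIFT;
	return boosted > util ? boosted : util;
}
```

Assuming a 4 ms tick and a CPU capacity of 1024, in_iowait wakeups at 0, 3, 6
and 9 ms, each followed by one frequency update, walk the boost through 128,
256, 512 and 1024 in this model. Four wakeups spaced just under a tick apart
are enough to reach full boost, and calling model_apply() less often (a longer
rate limit) only slows down the decay.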
>
>> 4. The boost isn't accounted for in task placement.
>
> I'm not sure what exactly this means. "Big" vs "little" or something else?
That should be "[...] in task placement for HMP", you're right.
Essentially, if we consider a task worth boosting to 100% of capacity,
we need to take that into account at task placement. Right now the boost caps
out at the capacity of the local CPU, which might be rather small (~10% of the
biggest CPU on mobile).
Logically this argument (a CAS argument, essentially) should probably come
before the EAS one to make more sense.
>> 5. The boost isn't associated with a task, it therefore lingers on the
>> rq even after the responsible task has migrated / stopped.
>
> Fair enough, but this is rather a problem with the implementation of
> boosting and not with the basic idea of it.
Unfortunately the lingering (or to use a term with less negative connotation:
holding) is almost a necessity, too, as described in the cover letter.
If we only boost at enqueue (and immediately scale down on dequeue) we lose
out massively, as the interrupt isn't boosted and we have to run at the lower
frequency for the DVFS transition delay (even if on x86 that may be close to
negligible). IMO this is the main reason why the mechanism can't evolve (into
something like a per-task strategy).
Even a per-task strategy would need to a) set a timer in case the iowait
period is too long and b) remove boost from prev_cpu if enqueued somewhere
else.
>
>> 6. The boost isn't associated with a task, it therefore needs to ramp
>> up again when migrated.
>
> Well, that again is somewhat implementation-related IMV, and it need
> not be problematic in principle. Namely, if a task migrates and it is
> not the only one in the "new" CPUs runqueue, and the other tasks in
> there don't use in_iowait, maybe it's better to not boost it?
Agreed, this can be argued about (and also isn't a huge problem in
practice).
>
> It also means that boosting is not very consistent, though, which is a
> valid point.
>
>> 7. Since schedutil doesn't know which task is getting woken up,
>> multiple unrelated in_iowait tasks lead to boosting.
>
> Well, that's by design: it boosts, when "there is enough IO pressure
> in the runqueue", so to speak.
>
> Basically, it is a departure from the "make performance follow
> utilization" general idea and it is based on the observation that in
> some cases performance can be improved by taking additional
> information into account.
>
> It is also about pure performance, not about energy efficiency.
And the lines between those become more and more blurry, see the GFX
regression. There are very few free lunches up for grabs these days: if
you're boosting performance on X, you're likely paying for it on Y.
That is fine as long as boosting X is deliberate, which iowait boosting
very much is not.
>
>> 8. Boosting is hard to control with UCLAMP_MAX (which is only active
>> when the task is on the rq, which for boosted tasks is usually not
>> the case for most of the time).
>>
>> One benefit of schedutil specifically is the reliance on the
>> scheduler's utilization signals, which have evolved a lot since its
>> original introduction. Some cases that benefitted from iowait boosting
>> in the past can now be covered by e.g. util_est.
>
> And it would be good to give some examples of this.
>
> IMV you have a clean-cut argument in the EAS and UCLAMP_MAX cases, but
> apart from that it is all a bit hand-wavy.
Thanks Rafael, you brought up some good points!