Re: [RFC PATCH 2/2] cpufreq/schedutil: Remove iowait boost

From: Christian Loehle <[email protected]>
To: Qais Yousef <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>,
	[email protected], [email protected],
	[email protected], [email protected],
	[email protected], [email protected],
	[email protected], [email protected],
	[email protected], [email protected],
	[email protected], [email protected],
	[email protected], [email protected],
	[email protected]
Subject: Re: [RFC PATCH 2/2] cpufreq/schedutil: Remove iowait boost
Date: Tue, 7 May 2024 16:19:20 +0100
Message-ID: <[email protected]>
In-Reply-To: <20240429111816.mqok5biihvy46eba@airbuntu>

On 29/04/2024 12:18, Qais Yousef wrote:
> On 04/19/24 14:42, Christian Loehle wrote:
> 
>>> I think the major thing we need to be careful about is the behavior when the
>>> task is sleeping. I think the boosting will be removed when the task is
>>> dequeued and I can bet there will be systems out there where the BLOCK softirq
>>> being boosted when the task is sleeping will matter.
>>
>> Currently I see this mainly protected by the sugov rate_limit_us.
>> With the enqueues being the dominating cpufreq updates it's not really an
>> issue, the boost is expected to survive the sleep duration, during which it
>> wouldn't be active.
>> I did experiment with some sort of 'stickiness' of the boost to the rq, but
>> it is somewhat of a pain to deal with if we want to remove it once enqueued
>> on a different rq. A sugov 1ms timer is much simpler of course.
>> Currently it's not necessary IMO, but for the sake of being future-proof in
>> terms of more frequent freq updates I might include it in v2.
> 
> Making sure things work with purpose would be really great. This implicit
> dependency is not great IMHO and makes both testing and reasoning about why
> things are good or bad harder when analysing real workloads, especially for
> non-kernel developers.

Agreed.
Even without your proposed changes [1], relying on sugov rate_limit_us is
unfortunate.
There is a more general problem with an arbitrarily low rate_limit_us, not
just because we kind of rely on the CPU being boosted right before the task is
actually enqueued (for the interrupt/softirq part of it), but also because of
the latency between requesting a higher frequency and actually running at that
frequency. If the task is 90% done by the time it sees the improvement and
the frequency will be updated (back to a lower one) before the next enqueue,
then that's hardly worth the effort.
Currently this is covered by rate_limit_us probabilistically and that seems
to be good enough in practice, but it's not very pleasing (and also EAS can't
take it into consideration).
That's not exclusive to iowait wakeup tasks of course, but in theory applies to
any task that is off the rq frequently (and still requests a higher frequency
than it can realistically build up through util_avg, e.g. through uclamp_min).
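
To put rough numbers on the latency argument, here is a minimal userspace
sketch, not kernel code. The first helper is modelled on the delta check that
schedutil uses for rate limiting, heavily simplified. The second is a purely
hypothetical model that assumes a fixed amount of work and a fixed delay
between the frequency request and the CPU actually running at the new
frequency. All function and parameter names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Simplified model of the rate-limit check: a new frequency update is only
 * allowed once rate_limit_ns has elapsed since the last one.
 * (Illustrative names, not the kernel's data structures.)
 */
static bool rate_limit_allows_update(uint64_t now_ns, uint64_t last_update_ns,
				     uint64_t rate_limit_ns)
{
	return now_ns - last_update_ns >= rate_limit_ns;
}

/*
 * Hypothetical model: percentage of a task's work that actually runs at the
 * boosted frequency, given that the CPU keeps running at f_low_mhz for
 * ramp_latency_us after the boost was requested.
 */
static unsigned int boosted_work_percent(uint64_t work_cycles,
					 uint64_t f_low_mhz,
					 uint64_t ramp_latency_us)
{
	uint64_t done_at_low;

	assert(work_cycles > 0);

	/* MHz * us = cycles completed before the higher frequency kicks in */
	done_at_low = f_low_mhz * ramp_latency_us;
	if (done_at_low >= work_cycles)
		return 0;

	return (unsigned int)((work_cycles - done_at_low) * 100 / work_cycles);
}
```

With 1000000 cycles of work, 500 MHz before the boost and 1800 us until the
higher frequency takes effect, boosted_work_percent(1000000, 500, 1800)
yields 10, i.e. the task is 90% done before it sees any benefit.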

>>>
>>> FWIW I do have an implementation for per-task iowait boost where I went a step
>>> further and converted intel_pstate too and like Christian didn't notice
>>> a regression. But I am not sure (rather don't think) I triggered this use case.
>>> I can't tell when the systems truly have per-cpu cpufreq control or just appear
>>> so and they are actually shared but not visible at linux level.
>>
>> Please do share your intel_pstate proposal!
> 
> This is what I had. I haven't been working on this for the past few months, but
> I remember I tried several tests on different machines then without a problem.
> I tried to re-order patches at some point though and I hope I didn't break
> something accidentally and forgot the state.
> 
> https://github.com/torvalds/linux/compare/master...qais-yousef:linux:uclamp-max-aggregation
> 

Thanks for sharing, consolidating it into uclamp_min looks reasonable.
A couple of thoughts on yours. I'm sure you're aware of these, but consider it
me thinking out loud:
- iowait boost is taken into consideration for task placement, but with just
the 4 steps that makes it more aggressive on HMP. (Potentially 2-3 consecutive
iowait wakeups to land on the big instead of running at max OPP of a LITTLE).
- Whether the current iowait boost decay is sensible is questionable, but there
should probably be some decay. Taken to the extreme, no decay would mean that
something like blk_wait_io() demands 1024 utilization if it waits for a very
long time.
Repeating myself here, but iowait wakeups themselves are tricky to work with
(and I try to work around that).
- The intel_pstate solution will increase boost even if
previous_wakeup->iowait_boost > current->iowait_boost
right? But using current->iowait_boost is a clever idea.
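
For reference, a minimal userspace sketch of the doubling and halving ladder,
assuming the "4 steps" above refer to the progression schedutil currently uses
for its iowait boost (start at SCHED_CAPACITY_SCALE / 8, double on each
consecutive iowait wakeup, halve on updates without one). The helper names are
hypothetical and the time-based reset of the boost is left out.

```c
#include <assert.h>

#define SCHED_CAPACITY_SCALE	1024u
#define IOWAIT_BOOST_MIN	(SCHED_CAPACITY_SCALE / 8)

/* One iowait wakeup: start at the minimum, then double up to the cap. */
static unsigned int iowait_boost_wakeup(unsigned int boost)
{
	assert(boost <= SCHED_CAPACITY_SCALE);

	if (!boost)
		return IOWAIT_BOOST_MIN;

	boost <<= 1;
	return boost > SCHED_CAPACITY_SCALE ? SCHED_CAPACITY_SCALE : boost;
}

/* One update without a pending iowait wakeup: halve, drop to 0 below min. */
static unsigned int iowait_boost_decay(unsigned int boost)
{
	boost >>= 1;
	return boost < IOWAIT_BOOST_MIN ? 0 : boost;
}

/* Consecutive iowait wakeups needed to reach 'target', starting from 0. */
static unsigned int wakeups_to_reach(unsigned int target)
{
	unsigned int boost = 0, n = 0;

	assert(target <= SCHED_CAPACITY_SCALE);

	while (boost < target) {
		boost = iowait_boost_wakeup(boost);
		n++;
	}
	return n;
}
```

In this model the boost walks 128, 256, 512, 1024, so wakeups_to_reach(1024)
is 4, and the decay walks the same ladder back down before dropping to 0.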

[1]
https://lore.kernel.org/lkml/[email protected]/T/

Kind Regards,
Christian
