From: Ammar Faizi
To: Dietmar Eggemann, Linux Kernel Mailing List
Cc: Ben Segall, Daniel Bristot de Oliveira, GNU/Weeb Mailing List,
    Ingo Molnar, Juri Lelli, Mel Gorman, Peter Zijlstra, Steven Rostedt,
    Vincent Guittot
Subject: Re: [Linux 5.18-rc1] WARNING: CPU: 1 PID: 0 at kernel/sched/fair.c:3355 update_blocked_averages
Date: Tue, 5 Apr 2022 20:13:42 +0700
Message-ID: <675544de-3369-e26e-65ba-3b28fff5c126@gnuweeb.org>

On 4/5/22 7:21 PM, Dietmar Eggemann wrote:
> Tried to recreate the issue but no success so far.
> I used your config file, clang-14 and a Xeon CPU E5-2690 v2 (2 sockets
> 40 CPUs) with 20 two-level cgroupv1 taskgroups '/X/Y' with 'hackbench
> (10 groups, 40 fds) + idling' running in all '/X/Y/'.
>
> What userspace are you running?

HP laptop, Intel i7-1165G7, 8 CPUs, 16 GB of RAM, running Ubuntu 21.10.
I use it as a daily workstation: compiling kernels, browsing, and coding.

> There seemed to be some pressure on your machine when it happened?

Yeah, there might have been. I don't fully remember what I was doing
when it happened, though.

>> <6>[13420.623334][ C7] perf: interrupt took too long (2530 > 2500), lowering kernel.perf_event_max_sample_rate to 78900
>
> Maybe you could split the SCHED_WARN_ON so we know which signal causes this?

OK, I will apply the diff on top of 5.18-rc1 and start using it for my
daily routine tomorrow morning. Let's see if I can hit this bug again.
I will send an update later.

Thank you.

> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index d4bd299d67ab..0d45e09e5bfc 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -3350,9 +3350,9 @@ static inline bool cfs_rq_is_decayed(struct cfs_rq *cfs_rq)
>  	 * Make sure that rounding and/or propagation of PELT values never
>  	 * break this.
>  	 */
> -	SCHED_WARN_ON(cfs_rq->avg.load_avg ||
> -		    cfs_rq->avg.util_avg ||
> -		    cfs_rq->avg.runnable_avg);
> +	SCHED_WARN_ON(cfs_rq->avg.load_avg);
> +	SCHED_WARN_ON(cfs_rq->avg.util_avg);
> +	SCHED_WARN_ON(cfs_rq->avg.runnable_avg);
>
>  	return true;
>  }
>
> [...]

-- 
Ammar Faizi