* [PATCH v4 0/2] Two x86 fixes @ 2022-03-01 9:46 Ammar Faizi 2022-03-01 9:46 ` [PATCH v4 1/2] x86/delay: Fix the wrong asm constraint in `delay_loop()` Ammar Faizi 2022-03-01 9:46 ` [PATCH v4 2/2] x86/mce/amd: Fix memory leak when `threshold_create_bank()` fails Ammar Faizi 0 siblings, 2 replies; 18+ messages in thread From: Ammar Faizi @ 2022-03-01 9:46 UTC (permalink / raw) To: Borislav Petkov Cc: Dave Hansen, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Tony Luck, linux-edac, linux-kernel, gwml, x86, stable, Alviro Iskandar Setiawan, Jiri Hladky, Greg Kroah-Hartman, Ammar Faizi Hi, Two fixes for x86 arch. ## Changelog v4: - Address comment from Greg, sha1 commit Fixes only needs to be 12 chars. - Add the author of the fixed commit to the CC list. v3: - Fold in changes from Alviro, the previous version is still leaking @bank[n]. v2: - Fix wrong copy/paste. ## Short Summary Patch 1, fixes the wrong asm constraint in delay_loop function. Fortunately, the constraint violation that's fixed by patch 1 doesn't yield any bug due to the nature of System V ABI. Should we backport this? Patch 2, fixes memory leak in mce/amd code. Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tony Luck <[email protected]> Signed-off-by: Alviro Iskandar Setiawan <[email protected]> Signed-off-by: Ammar Faizi <[email protected]> --- Ammar Faizi (2): x86/delay: Fix the wrong asm constraint in `delay_loop()` x86/mce/amd: Fix memory leak when `threshold_create_bank()` fails arch/x86/kernel/cpu/mce/amd.c | 16 ++++++++++------ arch/x86/lib/delay.c | 4 ++-- 2 files changed, 12 insertions(+), 8 deletions(-) base-commit: 7e57714cd0ad2d5bb90e50b5096a0e671dec1ef3 -- 2.32.0 ^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v4 1/2] x86/delay: Fix the wrong asm constraint in `delay_loop()` 2022-03-01 9:46 [PATCH v4 0/2] Two x86 fixes Ammar Faizi @ 2022-03-01 9:46 ` Ammar Faizi 2022-03-01 9:54 ` David Laight 2022-03-01 11:33 ` Alviro Iskandar Setiawan 2022-03-01 9:46 ` [PATCH v4 2/2] x86/mce/amd: Fix memory leak when `threshold_create_bank()` fails Ammar Faizi 1 sibling, 2 replies; 18+ messages in thread From: Ammar Faizi @ 2022-03-01 9:46 UTC (permalink / raw) To: Borislav Petkov Cc: Dave Hansen, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Tony Luck, linux-edac, linux-kernel, gwml, x86, stable, Alviro Iskandar Setiawan, Jiri Hladky, Greg Kroah-Hartman, Ammar Faizi From: Ammar Faizi <[email protected]> The asm constraint does not reflect that the asm statement can modify the value of @loops. But the asm statement in delay_loop() does change the @loops. If by any chance the compiler inlines this function, it may clobber random stuff (e.g. local variable, important temporary value in reg, etc.). Fortunately, delay_loop() is only called indirectly (so it can't inline), and then the register it clobbers is %rax (which is by the nature of the calling convention, it's a caller saved register), so it didn't yield any bug. ^ That shouldn't be an excuse for using the wrong constraint anyway. This changes "a" (as an input) to "+a" (as an input and output). Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Jiri Hladky <[email protected]> Fixes: e01b70ef3eb3 ("x86: fix bug in arch/i386/lib/delay.c file, delay_loop function") Signed-off-by: Ammar Faizi <[email protected]> --- Fortunately, the constraint violation that's fixed by patch 1 doesn't yield any bug due to the nature of System V ABI. Should we backport this? arch/x86/lib/delay.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/x86/lib/delay.c b/arch/x86/lib/delay.c index 65d15df6212d..0e65d00e2339 100644 --- a/arch/x86/lib/delay.c +++ b/arch/x86/lib/delay.c @@ -54,8 +54,8 @@ static void delay_loop(u64 __loops) " jnz 2b \n" "3: dec %0 \n" - : /* we don't need output */ - :"a" (loops) + : "+a" (loops) + : ); } -- 2.32.0 ^ permalink raw reply related [flat|nested] 18+ messages in thread
* RE: [PATCH v4 1/2] x86/delay: Fix the wrong asm constraint in `delay_loop()` 2022-03-01 9:46 ` [PATCH v4 1/2] x86/delay: Fix the wrong asm constraint in `delay_loop()` Ammar Faizi @ 2022-03-01 9:54 ` David Laight 2022-03-03 0:14 ` Ammar Faizi 2022-03-01 11:33 ` Alviro Iskandar Setiawan 1 sibling, 1 reply; 18+ messages in thread From: David Laight @ 2022-03-01 9:54 UTC (permalink / raw) To: 'Ammar Faizi', Borislav Petkov Cc: Dave Hansen, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Tony Luck, [email protected], [email protected], [email protected], [email protected], [email protected], Alviro Iskandar Setiawan, Jiri Hladky, Greg Kroah-Hartman From: Ammar Faizi > Sent: 01 March 2022 09:46 > > The asm constraint does not reflect that the asm statement can modify > the value of @loops. But the asm statement in delay_loop() does change > the @loops. > > If by any chance the compiler inlines this function, it may clobber > random stuff (e.g. local variable, important temporary value in reg, > etc.). > > Fortunately, delay_loop() is only called indirectly (so it can't > inline), and then the register it clobbers is %rax (which is by the > nature of the calling convention, it's a caller saved register), so it > didn't yield any bug. Both the function pointers in that code need killing. They only have two options (each) so conditional branches will almost certainly always have been better. I also wonder how well the comment The additional jump magic is needed to get the timing stable on all the CPU' we have to worry about. applies to any modern cpu! The code is unchanged since (at least) 2.6.27. (It might have been moved from another file.) David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales) ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v4 1/2] x86/delay: Fix the wrong asm constraint in `delay_loop()` 2022-03-01 9:54 ` David Laight @ 2022-03-03 0:14 ` Ammar Faizi 0 siblings, 0 replies; 18+ messages in thread From: Ammar Faizi @ 2022-03-03 0:14 UTC (permalink / raw) To: David Laight, Borislav Petkov Cc: Dave Hansen, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Tony Luck, [email protected], [email protected], [email protected], [email protected], [email protected], Alviro Iskandar Setiawan, Jiri Hladky, Greg Kroah-Hartman On 3/1/22 4:54 PM, David Laight wrote: > Both the function pointers in that code need killing. > They only have two options (each) so conditional branches > will almost certainly always have been better. Yes, I agree with simply using conditional branches to handle this case. But to keep the changes minimal for the stable tree, let's fix the obvious real bug first. Someone can refactor it later, but I don't see that as an urgent thing to refactor. > I also wonder how well the comment > The additional jump magic is needed to get the timing stable > on all the CPU' we have to worry about. > applies to any modern cpu! > The code is unchanged since (at least) 2.6.27. > (It might have been moved from another file.) Not sure about that... Thanks for the feedback. -- Ammar Faizi ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v4 1/2] x86/delay: Fix the wrong asm constraint in `delay_loop()` 2022-03-01 9:46 ` [PATCH v4 1/2] x86/delay: Fix the wrong asm constraint in `delay_loop()` Ammar Faizi 2022-03-01 9:54 ` David Laight @ 2022-03-01 11:33 ` Alviro Iskandar Setiawan 2022-03-03 0:06 ` Ammar Faizi 2022-03-03 0:35 ` David Laight 1 sibling, 2 replies; 18+ messages in thread From: Alviro Iskandar Setiawan @ 2022-03-01 11:33 UTC (permalink / raw) To: Ammar Faizi Cc: Borislav Petkov, Dave Hansen, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Tony Luck, linux-edac, Linux Kernel Mailing List, GNU/Weeb Mailing List, x86, stable, Jiri Hladky, Greg Kroah-Hartman On Tue, Mar 1, 2022 at 4:46 PM Ammar Faizi wrote: > Fortunately, the constraint violation that's fixed by patch 1 doesn't > yield any bug due to the nature of System V ABI. Should we backport > this? hi sir, it might also be interesting to know that even if it never be inlined, it's still potential to break. for example this code (https://godbolt.org/z/xWMTxhTET) __attribute__((__noinline__)) static void x(int a) { asm("xorl\t%%r8d, %%r8d"::"a"(a)); } extern int p(void); int f(void) { int ret = p(); x(ret); return ret; } translates to this asm x: movl %edi, %eax xorl %r8d, %r8d ret f: subq $8, %rsp call p movl %eax, %r8d movl %eax, %edi call x movl %r8d, %eax addq $8, %rsp ret See the %r8d? It should be clobbered by a function call too. But since no one tells the compiler that we clobber %r8d, it assumes %r8d never changes after that call. The compiler thinks x() is static and will not clobber %r8d, even the ABI says %r8d will be clobbered by a function call. So i think it should be backported to the stable kernel, it's still a fix -- Viro ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v4 1/2] x86/delay: Fix the wrong asm constraint in `delay_loop()` 2022-03-01 11:33 ` Alviro Iskandar Setiawan @ 2022-03-03 0:06 ` Ammar Faizi 2022-03-03 0:35 ` David Laight 1 sibling, 0 replies; 18+ messages in thread From: Ammar Faizi @ 2022-03-03 0:06 UTC (permalink / raw) To: Alviro Iskandar Setiawan Cc: Borislav Petkov, Dave Hansen, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Tony Luck, linux-edac, Linux Kernel Mailing List, GNU/Weeb Mailing List, x86, stable, Jiri Hladky, Greg Kroah-Hartman, David Laight On 3/1/22 6:33 PM, Alviro Iskandar Setiawan wrote: > hi sir, it might also be interesting to know that even if it never be > inlined, it's still potential to break. > > for example this code (https://godbolt.org/z/xWMTxhTET) > > __attribute__((__noinline__)) static void x(int a) > { > asm("xorl\t%%r8d, %%r8d"::"a"(a)); > } > > extern int p(void); > > int f(void) > { > int ret = p(); > x(ret); > return ret; > } > > translates to this asm > > x: > movl %edi, %eax > xorl %r8d, %r8d > ret > f: > subq $8, %rsp > call p > movl %eax, %r8d > movl %eax, %edi > call x > movl %r8d, %eax > addq $8, %rsp > ret > > See the %r8d? It should be clobbered by a function call too. But since > no one tells the compiler that we clobber %r8d, it assumes %r8d never > changes after that call. The compiler thinks x() is static and will > not clobber %r8d, even the ABI says %r8d will be clobbered by a > function call. So i think it should be backported to the stable > kernel, it's still a fix Thanks. I will add CC stable in the v5. -- Ammar Faizi ^ permalink raw reply [flat|nested] 18+ messages in thread
* RE: [PATCH v4 1/2] x86/delay: Fix the wrong asm constraint in `delay_loop()` 2022-03-01 11:33 ` Alviro Iskandar Setiawan 2022-03-03 0:06 ` Ammar Faizi @ 2022-03-03 0:35 ` David Laight 1 sibling, 0 replies; 18+ messages in thread From: David Laight @ 2022-03-03 0:35 UTC (permalink / raw) To: 'Alviro Iskandar Setiawan', Ammar Faizi Cc: Borislav Petkov, Dave Hansen, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Tony Luck, [email protected], Linux Kernel Mailing List, GNU/Weeb Mailing List, [email protected], [email protected], Jiri Hladky, Greg Kroah-Hartman From: Alviro Iskandar Setiawan > Sent: 01 March 2022 11:34 > > On Tue, Mar 1, 2022 at 4:46 PM Ammar Faizi wrote: > > Fortunately, the constraint violation that's fixed by patch 1 doesn't > > yield any bug due to the nature of System V ABI. Should we backport > > this? > > hi sir, it might also be interesting to know that even if it never be > inlined, it's still potential to break. > > for example this code (https://godbolt.org/z/xWMTxhTET) > > __attribute__((__noinline__)) static void x(int a) > { > asm("xorl\t%%r8d, %%r8d"::"a"(a)); > } But this code isn't doing that. In your example the compiler has looked at the static function and realised that is doesn't use r8 so it need not be saved even though it is a volatile register. In this code the compiler knows %ax is being used, it just doesn't know it is changed - so could assume the value is unchanged. The only code that is likely to break is: int f(int d) { d += 10; __delay(d); return d; } Which might manage to return the value of %eax modified by the asm. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales) ^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v4 2/2] x86/mce/amd: Fix memory leak when `threshold_create_bank()` fails 2022-03-01 9:46 [PATCH v4 0/2] Two x86 fixes Ammar Faizi 2022-03-01 9:46 ` [PATCH v4 1/2] x86/delay: Fix the wrong asm constraint in `delay_loop()` Ammar Faizi @ 2022-03-01 9:46 ` Ammar Faizi 2022-03-02 17:26 ` Yazen Ghannam 1 sibling, 1 reply; 18+ messages in thread From: Ammar Faizi @ 2022-03-01 9:46 UTC (permalink / raw) To: Borislav Petkov Cc: Dave Hansen, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Tony Luck, linux-edac, linux-kernel, gwml, x86, stable, Alviro Iskandar Setiawan, Jiri Hladky, Greg Kroah-Hartman, Ammar Faizi From: Ammar Faizi <[email protected]> @bp is a local variable, calling mce_threshold_remove_device() when threshold_create_bank() fails will not free the @bp. Note that mce_threshold_remove_device() frees the @bp only if it's already stored in the @threshold_banks per-CPU variable. At that point, the @threshold_banks per-CPU variable is still NULL, so the mce_threshold_remove_device() will just be a no-op and the @bp is leaked. Fix this by storing @bp to @threshold_banks before the loop, so in case we fail, mce_threshold_remove_device() will free the @bp. This bug is introduced by commit 6458de97fc15530b544 ("x86/mce/amd: Straighten CPU hotplug path") [1]. Link: https://lore.kernel.org/all/[email protected] [1] v4: - Add the link to the commit reference again. v3: - Fold in changes from Alviro, the previous version is still leaking @bank[n]. v2: - No changes. Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tony Luck <[email protected]> Cc: [email protected] # v5.8+ Fixes: 6458de97fc15 ("x86/mce/amd: Straighten CPU hotplug path") Co-authored-by: Alviro Iskandar Setiawan <[email protected]> Signed-off-by: Alviro Iskandar Setiawan <[email protected]> Signed-off-by: Ammar Faizi <[email protected]> --- arch/x86/kernel/cpu/mce/amd.c | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-) diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c index 9f4b508886dd..a5ef161facd9 100644 --- a/arch/x86/kernel/cpu/mce/amd.c +++ b/arch/x86/kernel/cpu/mce/amd.c @@ -1346,19 +1346,23 @@ int mce_threshold_create_device(unsigned int cpu) if (!bp) return -ENOMEM; + /* + * If we fail, mce_threshold_remove_device() will free the @bp + * via @threshold_banks. + */ + this_cpu_write(threshold_banks, bp); + for (bank = 0; bank < numbanks; ++bank) { if (!(this_cpu_read(bank_map) & (1 << bank))) continue; err = threshold_create_bank(bp, cpu, bank); - if (err) - goto out_err; + if (err) { + mce_threshold_remove_device(cpu); + return err; + } } - this_cpu_write(threshold_banks, bp); if (thresholding_irq_en) mce_threshold_vector = amd_threshold_interrupt; return 0; -out_err: - mce_threshold_remove_device(cpu); - return err; } -- 2.32.0 ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH v4 2/2] x86/mce/amd: Fix memory leak when `threshold_create_bank()` fails 2022-03-01 9:46 ` [PATCH v4 2/2] x86/mce/amd: Fix memory leak when `threshold_create_bank()` fails Ammar Faizi @ 2022-03-02 17:26 ` Yazen Ghannam 2022-03-02 23:20 ` Ammar Faizi 0 siblings, 1 reply; 18+ messages in thread From: Yazen Ghannam @ 2022-03-02 17:26 UTC (permalink / raw) To: Ammar Faizi Cc: Borislav Petkov, Dave Hansen, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Tony Luck, linux-edac, linux-kernel, gwml, x86, stable, Alviro Iskandar Setiawan, Jiri Hladky, Greg Kroah-Hartman On Tue, Mar 01, 2022 at 04:46:08PM +0700, Ammar Faizi wrote: > From: Ammar Faizi <[email protected]> > Hi Ammar, ... > diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c > index 9f4b508886dd..a5ef161facd9 100644 > --- a/arch/x86/kernel/cpu/mce/amd.c > +++ b/arch/x86/kernel/cpu/mce/amd.c > @@ -1346,19 +1346,23 @@ int mce_threshold_create_device(unsigned int cpu) > if (!bp) > return -ENOMEM; > > + /* > + * If we fail, mce_threshold_remove_device() will free the @bp > + * via @threshold_banks. > + */ > + this_cpu_write(threshold_banks, bp); > + > for (bank = 0; bank < numbanks; ++bank) { > if (!(this_cpu_read(bank_map) & (1 << bank))) > continue; > err = threshold_create_bank(bp, cpu, bank); > - if (err) > - goto out_err; > + if (err) { > + mce_threshold_remove_device(cpu); > + return err; > + } > } > - this_cpu_write(threshold_banks, bp); > The threshold interrupt handler uses this pointer. I think the goal here is to set this pointer when the list is fully formed and clear this pointer before making any changes to the list. Otherwise, the interrupt handler will operate on incomplete data if an interrupt comes in the middle of these updates. The changes below should deal with memory leak issue while avoiding a race with the threshold interrupt. What do you think? Thanks, Yazen diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c index 1940d305db1c..8f3b7859331d 100644 --- a/arch/x86/kernel/cpu/mce/amd.c +++ b/arch/x86/kernel/cpu/mce/amd.c @@ -1294,10 +1294,22 @@ static void threshold_remove_bank(struct threshold_bank *bank) kfree(bank); } +void _mce_threshold_remove_device(struct threshold_bank **bp) +{ + unsigned int bank, numbanks = this_cpu_read(mce_num_banks); + + for (bank = 0; bank < numbanks; bank++) { + if (bp[bank]) { + threshold_remove_bank(bp[bank]); + bp[bank] = NULL; + } + } + kfree(bp); +} + int mce_threshold_remove_device(unsigned int cpu) { struct threshold_bank **bp = this_cpu_read(threshold_banks); - unsigned int bank, numbanks = this_cpu_read(mce_num_banks); if (!bp) return 0; @@ -1308,13 +1320,7 @@ int mce_threshold_remove_device(unsigned int cpu) */ this_cpu_write(threshold_banks, NULL); - for (bank = 0; bank < numbanks; bank++) { - if (bp[bank]) { - threshold_remove_bank(bp[bank]); - bp[bank] = NULL; - } - } - kfree(bp); + _mce_threshold_remove_device(bp); return 0; } @@ -1360,6 +1366,6 @@ int mce_threshold_create_device(unsigned int cpu) mce_threshold_vector = amd_threshold_interrupt; return 0; out_err: - mce_threshold_remove_device(cpu); + _mce_threshold_remove_device(bp); return err; } ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH v4 2/2] x86/mce/amd: Fix memory leak when `threshold_create_bank()` fails 2022-03-02 17:26 ` Yazen Ghannam @ 2022-03-02 23:20 ` Ammar Faizi 2022-03-02 23:27 ` Ammar Faizi 0 siblings, 1 reply; 18+ messages in thread From: Ammar Faizi @ 2022-03-02 23:20 UTC (permalink / raw) To: Yazen Ghannam Cc: Borislav Petkov, Dave Hansen, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Tony Luck, linux-edac, linux-kernel, gwml, x86, stable, Alviro Iskandar Setiawan, Jiri Hladky, Greg Kroah-Hartman On 3/3/22 12:26 AM, Yazen Ghannam wrote: > Hi Ammar, Hi Yazen, > ... > The threshold interrupt handler uses this pointer. I think the goal here is to > set this pointer when the list is fully formed and clear this pointer before > making any changes to the list. Otherwise, the interrupt handler will operate > on incomplete data if an interrupt comes in the middle of these updates. > > The changes below should deal with memory leak issue while avoiding a race > with the threshold interrupt. What do you think? Thanks for taking a look into this. I didn't notice that before. The changes look good to me, extra improvements: 1) _mce_threshold_remove_device() should be static as we don't use it in another translation unit. 2) Minor cleanup, we don't need "goto out_err", just early return directly. I will fold them in... -- Ammar Faizi ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v4 2/2] x86/mce/amd: Fix memory leak when `threshold_create_bank()` fails 2022-03-02 23:20 ` Ammar Faizi @ 2022-03-02 23:27 ` Ammar Faizi 2022-03-03 1:58 ` Alviro Iskandar Setiawan 0 siblings, 1 reply; 18+ messages in thread From: Ammar Faizi @ 2022-03-02 23:27 UTC (permalink / raw) To: Yazen Ghannam Cc: Borislav Petkov, Dave Hansen, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Tony Luck, linux-edac, linux-kernel, gwml, x86, stable, Alviro Iskandar Setiawan, Jiri Hladky, Greg Kroah-Hartman On 3/3/22 6:20 AM, Ammar Faizi wrote: > On 3/3/22 12:26 AM, Yazen Ghannam wrote: >> Hi Ammar, > > Hi Yazen, > >> ... >> The threshold interrupt handler uses this pointer. I think the goal here is to >> set this pointer when the list is fully formed and clear this pointer before >> making any changes to the list. Otherwise, the interrupt handler will operate >> on incomplete data if an interrupt comes in the middle of these updates. >> >> The changes below should deal with memory leak issue while avoiding a race >> with the threshold interrupt. What do you think? > > Thanks for taking a look into this. I didn't notice that before. The > changes look good to me, extra improvements: > > 1) _mce_threshold_remove_device() should be static as we don't use it > in another translation unit. > 2) Minor cleanup, we don't need "goto out_err", just early return > directly. > > I will fold them in... > Please review the patch below, if you think it looks good, I will send this for the v5 series. I added your sign-off. From cae3965734a67d11a5286c612dfddf52398defc8 Mon Sep 17 00:00:00 2001 From: Ammar Faizi <[email protected]> Date: Thu, 3 Mar 2022 05:07:38 +0700 Subject: [PATCH v5 2/2] x86/MCE/AMD: Fix memory leak when `threshold_create_bank()` fails In mce_threshold_create_device(), when threshold_create_bank() fails, the @bp will be leaked, because mce_threshold_remove_device() will not free the @bp. It only frees the @bp when we've already written the @bp to the @threshold_banks per-CPU variable, but at the point, we haven't. Fix this by extracting the cleanup part into a new static function _mce_threshold_remove_device(), then use it from create and remove device function. Also, eliminate the "goto out_err". Just early return inside the loop when we fail. Cc: Borislav Petkov <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: [email protected] # v5.8+ Fixes: 6458de97fc15 ("x86/mce/amd: Straighten CPU hotplug path") Co-authored-by: Alviro Iskandar Setiawan <[email protected]> Signed-off-by: Alviro Iskandar Setiawan <[email protected]> Co-authored-by: Yazen Ghannam <[email protected]> Signed-off-by: Yazen Ghannam <[email protected]> Signed-off-by: Ammar Faizi <[email protected]> --- arch/x86/kernel/cpu/mce/amd.c | 31 ++++++++++++++++++------------- 1 file changed, 18 insertions(+), 13 deletions(-) diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c index 9f4b508886dd..ac7246a4de08 100644 --- a/arch/x86/kernel/cpu/mce/amd.c +++ b/arch/x86/kernel/cpu/mce/amd.c @@ -1293,10 +1293,22 @@ static void threshold_remove_bank(struct threshold_bank *bank) kfree(bank); } +static void _mce_threshold_remove_device(struct threshold_bank **bp) +{ + unsigned int bank, numbanks = this_cpu_read(mce_num_banks); + + for (bank = 0; bank < numbanks; bank++) { + if (bp[bank]) { + threshold_remove_bank(bp[bank]); + bp[bank] = NULL; + } + } + kfree(bp); +} + int mce_threshold_remove_device(unsigned int cpu) { struct threshold_bank **bp = this_cpu_read(threshold_banks); - unsigned int bank, numbanks = this_cpu_read(mce_num_banks); if (!bp) return 0; @@ -1307,13 +1319,7 @@ int mce_threshold_remove_device(unsigned int cpu) */ this_cpu_write(threshold_banks, NULL); - for (bank = 0; bank < numbanks; bank++) { - if (bp[bank]) { - threshold_remove_bank(bp[bank]); - bp[bank] = NULL; - } - } - kfree(bp); + _mce_threshold_remove_device(bp); return 0; } @@ -1350,15 +1356,14 @@ int mce_threshold_create_device(unsigned int cpu) if (!(this_cpu_read(bank_map) & (1 << bank))) continue; err = threshold_create_bank(bp, cpu, bank); - if (err) - goto out_err; + if (err) { + _mce_threshold_remove_device(bp); + return err; + } } this_cpu_write(threshold_banks, bp); if (thresholding_irq_en) mce_threshold_vector = amd_threshold_interrupt; return 0; -out_err: - mce_threshold_remove_device(cpu); - return err; } -- Ammar Faizi ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH v4 2/2] x86/mce/amd: Fix memory leak when `threshold_create_bank()` fails 2022-03-02 23:27 ` Ammar Faizi @ 2022-03-03 1:58 ` Alviro Iskandar Setiawan 2022-03-03 2:07 ` Ammar Faizi 0 siblings, 1 reply; 18+ messages in thread From: Alviro Iskandar Setiawan @ 2022-03-03 1:58 UTC (permalink / raw) To: Ammar Faizi, Yazen Ghannam Cc: Borislav Petkov, Dave Hansen, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Tony Luck, linux-edac, linux-kernel, gwml, x86, stable, Alviro Iskandar Setiawan, Jiri Hladky, Greg Kroah-Hartman On Thu, 3 Mar 2022 06:27:33 +0700, Ammar Faizi wrote: > +static void _mce_threshold_remove_device(struct threshold_bank **bp) > +{ > + unsigned int bank, numbanks = this_cpu_read(mce_num_banks); > + > + for (bank = 0; bank < numbanks; bank++) { > + if (bp[bank]) { > + threshold_remove_bank(bp[bank]); > + bp[bank] = NULL; > + } > + } > + kfree(bp); > +} hi sir, i think this can be improved again, we can avoid calling this_cpu_read(mce_num_banks) twice if we pass the numbanks as an argument, plz review the changes below diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c index 9f4b508886dd..e492efab68a2 100644 --- a/arch/x86/kernel/cpu/mce/amd.c +++ b/arch/x86/kernel/cpu/mce/amd.c @@ -1293,10 +1293,23 @@ static void threshold_remove_bank(struct threshold_bank *bank) kfree(bank); } +static void _mce_threshold_remove_device(struct threshold_bank **bp, + unsigned int numbanks) +{ + unsigned int bank; + + for (bank = 0; bank < numbanks; bank++) { + if (bp[bank]) { + threshold_remove_bank(bp[bank]); + bp[bank] = NULL; + } + } + kfree(bp); +} + int mce_threshold_remove_device(unsigned int cpu) { struct threshold_bank **bp = this_cpu_read(threshold_banks); - unsigned int bank, numbanks = this_cpu_read(mce_num_banks); if (!bp) return 0; @@ -1307,13 +1320,7 @@ int mce_threshold_remove_device(unsigned int cpu) */ this_cpu_write(threshold_banks, NULL); - for (bank = 0; bank < numbanks; bank++) { - if (bp[bank]) { - threshold_remove_bank(bp[bank]); - bp[bank] = NULL; - } - } - kfree(bp); + _mce_threshold_remove_device(bp, this_cpu_read(mce_num_banks)); return 0; } @@ -1350,15 +1357,14 @@ int mce_threshold_create_device(unsigned int cpu) if (!(this_cpu_read(bank_map) & (1 << bank))) continue; err = threshold_create_bank(bp, cpu, bank); - if (err) - goto out_err; + if (err) { + _mce_threshold_remove_device(bp, numbanks); + return err; + } } this_cpu_write(threshold_banks, bp); if (thresholding_irq_en) mce_threshold_vector = amd_threshold_interrupt; return 0; -out_err: - mce_threshold_remove_device(cpu); - return err; } -- Viro ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH v4 2/2] x86/mce/amd: Fix memory leak when `threshold_create_bank()` fails 2022-03-03 1:58 ` Alviro Iskandar Setiawan @ 2022-03-03 2:07 ` Ammar Faizi 2022-03-03 2:32 ` Ammar Faizi 0 siblings, 1 reply; 18+ messages in thread From: Ammar Faizi @ 2022-03-03 2:07 UTC (permalink / raw) To: Alviro Iskandar Setiawan, Yazen Ghannam Cc: Borislav Petkov, Dave Hansen, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Tony Luck, linux-edac, linux-kernel, gwml, x86, stable, Alviro Iskandar Setiawan, Jiri Hladky, Greg Kroah-Hartman On 3/3/22 8:58 AM, Alviro Iskandar Setiawan wrote: > hi sir, i think this can be improved again, we can avoid calling > this_cpu_read(mce_num_banks) twice if we pass the numbanks as an > argument, plz review the changes below OK, nice improvement. I will fold this in... -- Ammar Faizi ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v4 2/2] x86/mce/amd: Fix memory leak when `threshold_create_bank()` fails 2022-03-03 2:07 ` Ammar Faizi @ 2022-03-03 2:32 ` Ammar Faizi 2022-03-03 2:51 ` Alviro Iskandar Setiawan 2022-03-07 0:27 ` Ammar Faizi 0 siblings, 2 replies; 18+ messages in thread From: Ammar Faizi @ 2022-03-03 2:32 UTC (permalink / raw) To: Alviro Iskandar Setiawan, Yazen Ghannam Cc: Borislav Petkov, Dave Hansen, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Tony Luck, linux-edac, linux-kernel, gwml, x86, stable, Alviro Iskandar Setiawan, Jiri Hladky, Greg Kroah-Hartman On 3/3/22 9:07 AM, Ammar Faizi wrote: > On 3/3/22 8:58 AM, Alviro Iskandar Setiawan wrote: >> hi sir, i think this can be improved again, we can avoid calling >> this_cpu_read(mce_num_banks) twice if we pass the numbanks as an >> argument, plz review the changes below > > OK, nice improvement. I will fold this in... > It looks like this now. Yazen, Alviro, please review the following patch. If you think it looks good, I will submit it for the v5. From 91a447f837d502b7a040cd68f333fb98f4b941d9 Mon Sep 17 00:00:00 2001 From: Ammar Faizi <[email protected]> Date: Thu, 3 Mar 2022 09:22:17 +0700 Subject: [PATCH v5 2/2] x86/MCE/AMD: Fix memory leak when `threshold_create_bank()` fails In mce_threshold_create_device(), if threshold_create_bank() fails, the @bp will be leaked, because mce_threshold_remove_device() will not free the @bp. It only frees the @bp when we've already written the @bp to the @threshold_banks per-CPU variable, but at the point, we haven't. Fix this by extracting the cleanup part into a new static function _mce_threshold_remove_device(), then use it from create/remove device function. Also, eliminate the "goto out_err". Just early return inside the loop if we fail. Cc: Borislav Petkov <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: [email protected] # v5.8+ Fixes: 6458de97fc15 ("x86/mce/amd: Straighten CPU hotplug path") Co-authored-by: Alviro Iskandar Setiawan <[email protected]> Signed-off-by: Alviro Iskandar Setiawan <[email protected]> Co-authored-by: Yazen Ghannam <[email protected]> Signed-off-by: Yazen Ghannam <[email protected]> Signed-off-by: Ammar Faizi <[email protected]> --- arch/x86/kernel/cpu/mce/amd.c | 32 +++++++++++++++++++------------- 1 file changed, 19 insertions(+), 13 deletions(-) diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c index 9f4b508886dd..e492efab68a2 100644 --- a/arch/x86/kernel/cpu/mce/amd.c +++ b/arch/x86/kernel/cpu/mce/amd.c @@ -1293,10 +1293,23 @@ static void threshold_remove_bank(struct threshold_bank *bank) kfree(bank); } +static void _mce_threshold_remove_device(struct threshold_bank **bp, + unsigned int numbanks) +{ + unsigned int bank; + + for (bank = 0; bank < numbanks; bank++) { + if (bp[bank]) { + threshold_remove_bank(bp[bank]); + bp[bank] = NULL; + } + } + kfree(bp); +} + int mce_threshold_remove_device(unsigned int cpu) { struct threshold_bank **bp = this_cpu_read(threshold_banks); - unsigned int bank, numbanks = this_cpu_read(mce_num_banks); if (!bp) return 0; @@ -1307,13 +1320,7 @@ int mce_threshold_remove_device(unsigned int cpu) */ this_cpu_write(threshold_banks, NULL); - for (bank = 0; bank < numbanks; bank++) { - if (bp[bank]) { - threshold_remove_bank(bp[bank]); - bp[bank] = NULL; - } - } - kfree(bp); + _mce_threshold_remove_device(bp, this_cpu_read(mce_num_banks)); return 0; } @@ -1350,15 +1357,14 @@ int mce_threshold_create_device(unsigned int cpu) if (!(this_cpu_read(bank_map) & (1 << bank))) continue; err = threshold_create_bank(bp, cpu, bank); - if (err) - goto out_err; + if (err) { + _mce_threshold_remove_device(bp, numbanks); + return err; + } } this_cpu_write(threshold_banks, bp); if (thresholding_irq_en) mce_threshold_vector = amd_threshold_interrupt; return 0; -out_err: - mce_threshold_remove_device(cpu); - return err; } -- Ammar Faizi ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH v4 2/2] x86/mce/amd: Fix memory leak when `threshold_create_bank()` fails 2022-03-03 2:32 ` Ammar Faizi @ 2022-03-03 2:51 ` Alviro Iskandar Setiawan 2022-03-07 0:27 ` Ammar Faizi 1 sibling, 0 replies; 18+ messages in thread From: Alviro Iskandar Setiawan @ 2022-03-03 2:51 UTC (permalink / raw) To: Ammar Faizi, Yazen Ghannam Cc: Borislav Petkov, Dave Hansen, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Tony Luck, linux-edac, Linux Kernel Mailing List, GNU/Weeb Mailing List, x86, stable, Jiri Hladky, Greg Kroah-Hartman, Alviro Iskandar Setiawan On Thu, Mar 3, 2022 at 9:32 AM Ammar Faizi wrote: > On 3/3/22 9:07 AM, Ammar Faizi wrote: > > On 3/3/22 8:58 AM, Alviro Iskandar Setiawan wrote: > > > hi sir, i think this can be improved again, we can avoid calling > > > this_cpu_read(mce_num_banks) twice if we pass the numbanks as an > > > argument, plz review the changes below > > > > OK, nice improvement. I will fold this in... > > > > It looks like this now. Yazen, Alviro, please review the > following patch. If you think it looks good, I will submit > it for the v5. > i think it looks good, thanks sir -- Viro ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v4 2/2] x86/mce/amd: Fix memory leak when `threshold_create_bank()` fails 2022-03-03 2:32 ` Ammar Faizi 2022-03-03 2:51 ` Alviro Iskandar Setiawan @ 2022-03-07 0:27 ` Ammar Faizi 2022-03-09 20:55 ` Yazen Ghannam 1 sibling, 1 reply; 18+ messages in thread From: Ammar Faizi @ 2022-03-07 0:27 UTC (permalink / raw) To: Yazen Ghannam Cc: Borislav Petkov, Dave Hansen, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Tony Luck, linux-edac, linux-kernel, gwml, x86, stable, Alviro Iskandar Setiawan, Jiri Hladky, Greg Kroah-Hartman, Alviro Iskandar Setiawan On 3/3/22 9:32 AM, Ammar Faizi wrote: > It looks like this now. Yazen, Alviro, please review the > following patch. If you think it looks good, I will submit > it for the v5. > > From 91a447f837d502b7a040cd68f333fb98f4b941d9 Mon Sep 17 00:00:00 2001 > From: Ammar Faizi <[email protected]> > Date: Thu, 3 Mar 2022 09:22:17 +0700 > Subject: [PATCH v5 2/2] x86/MCE/AMD: Fix memory leak when `threshold_create_bank()` fails > > In mce_threshold_create_device(), if threshold_create_bank() fails, the > @bp will be leaked, because mce_threshold_remove_device() will not free > the @bp. It only frees the @bp when we've already written the @bp to > the @threshold_banks per-CPU variable, but at the point, we haven't. > > Fix this by extracting the cleanup part into a new static function > _mce_threshold_remove_device(), then use it from create/remove device > function. Also, eliminate the "goto out_err". Just early return inside > the loop if we fail. > > Cc: Borislav Petkov <[email protected]> > Cc: Thomas Gleixner <[email protected]> > Cc: Greg Kroah-Hartman <[email protected]> > Cc: [email protected] # v5.8+ > Fixes: 6458de97fc15 ("x86/mce/amd: Straighten CPU hotplug path") > Co-authored-by: Alviro Iskandar Setiawan <[email protected]> > Signed-off-by: Alviro Iskandar Setiawan <[email protected]> > Co-authored-by: Yazen Ghannam <[email protected]> > Signed-off-by: Yazen Ghannam <[email protected]> > Signed-off-by: Ammar Faizi <[email protected]> Hi, It's Monday morning... Friendly ping for Yazen from AMD. What do you think of this patch? If it looks good, I will submit it for the v5 revision. See the ref below if you lost track of the full message. Ref: https://lore.kernel.org/lkml/[email protected]/ Thanks! -- Ammar Faizi ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v4 2/2] x86/mce/amd: Fix memory leak when `threshold_create_bank()` fails 2022-03-07 0:27 ` Ammar Faizi @ 2022-03-09 20:55 ` Yazen Ghannam 2022-03-10 1:56 ` Ammar Faizi 0 siblings, 1 reply; 18+ messages in thread From: Yazen Ghannam @ 2022-03-09 20:55 UTC (permalink / raw) To: Ammar Faizi Cc: Borislav Petkov, Dave Hansen, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Tony Luck, linux-edac, linux-kernel, gwml, x86, stable, Alviro Iskandar Setiawan, Jiri Hladky, Greg Kroah-Hartman, Alviro Iskandar Setiawan On Mon, Mar 07, 2022 at 07:27:44AM +0700, Ammar Faizi wrote: > On 3/3/22 9:32 AM, Ammar Faizi wrote: > > It looks like this now. Yazen, Alviro, please review the > > following patch. If you think it looks good, I will submit > > it for the v5. > > > > From 91a447f837d502b7a040cd68f333fb98f4b941d9 Mon Sep 17 00:00:00 2001 > > From: Ammar Faizi <[email protected]> > > Date: Thu, 3 Mar 2022 09:22:17 +0700 > > Subject: [PATCH v5 2/2] x86/MCE/AMD: Fix memory leak when `threshold_create_bank()` fails > > > > In mce_threshold_create_device(), if threshold_create_bank() fails, the > > @bp will be leaked, because mce_threshold_remove_device() will not free > > the @bp. It only frees the @bp when we've already written the @bp to > > the @threshold_banks per-CPU variable, but at the point, we haven't. > > > > Fix this by extracting the cleanup part into a new static function > > _mce_threshold_remove_device(), then use it from create/remove device > > function. Also, eliminate the "goto out_err". Just early return inside > > the loop if we fail. > > The commit message should use passive voice: no "I" or "we". Otherwise, I think the patch looks good. Thanks, Yazen ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v4 2/2] x86/mce/amd: Fix memory leak when `threshold_create_bank()` fails 2022-03-09 20:55 ` Yazen Ghannam @ 2022-03-10 1:56 ` Ammar Faizi 0 siblings, 0 replies; 18+ messages in thread From: Ammar Faizi @ 2022-03-10 1:56 UTC (permalink / raw) To: Yazen Ghannam Cc: Borislav Petkov, Dave Hansen, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Tony Luck, linux-edac, linux-kernel, gwml, x86, stable, Alviro Iskandar Setiawan, Jiri Hladky, Greg Kroah-Hartman, Alviro Iskandar Setiawan On 3/10/22 3:55 AM, Yazen Ghannam wrote: > The commit message should use passive voice: no "I" or "we". > > Otherwise, I think the patch looks good. Fixed in the v5 revision, thanks! Link: https://lore.kernel.org/lkml/[email protected]/ -- Ammar Faizi ^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2022-03-10 1:56 UTC | newest] Thread overview: 18+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2022-03-01 9:46 [PATCH v4 0/2] Two x86 fixes Ammar Faizi 2022-03-01 9:46 ` [PATCH v4 1/2] x86/delay: Fix the wrong asm constraint in `delay_loop()` Ammar Faizi 2022-03-01 9:54 ` David Laight 2022-03-03 0:14 ` Ammar Faizi 2022-03-01 11:33 ` Alviro Iskandar Setiawan 2022-03-03 0:06 ` Ammar Faizi 2022-03-03 0:35 ` David Laight 2022-03-01 9:46 ` [PATCH v4 2/2] x86/mce/amd: Fix memory leak when `threshold_create_bank()` fails Ammar Faizi 2022-03-02 17:26 ` Yazen Ghannam 2022-03-02 23:20 ` Ammar Faizi 2022-03-02 23:27 ` Ammar Faizi 2022-03-03 1:58 ` Alviro Iskandar Setiawan 2022-03-03 2:07 ` Ammar Faizi 2022-03-03 2:32 ` Ammar Faizi 2022-03-03 2:51 ` Alviro Iskandar Setiawan 2022-03-07 0:27 ` Ammar Faizi 2022-03-09 20:55 ` Yazen Ghannam 2022-03-10 1:56 ` Ammar Faizi
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox