* [PATCH 0/1] x86 change mov 0, %reg to xor %reg, %reg
@ 2022-08-04 15:26 Kanna Scarlet
  2022-08-04 15:26 ` [PATCH 1/1] x86: Change mov $0, %reg with " Kanna Scarlet
  0 siblings, 1 reply; 13+ messages in thread
From: Kanna Scarlet @ 2022-08-04 15:26 UTC
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, x86
  Cc: Kanna Scarlet, Ard Biesheuvel, Bill Metzenthen, Brijesh Singh,
	Joerg Roedel, Josh Poimboeuf, Kirill A. Shutemov, Mark Rutland,
	Michael Roth, Peter Zijlstra, Sean Christopherson, Steven Rostedt,
	Ammar Faizi, GNU/Weeb Mailing List, Linux Kernel Mailing List

Hello Linux x86 maintainers,

I'm a 19-year-old informatics student. I am still studying the
open-source Linux kernel in the GNU/Weeb community. I want to be a
Linux kernel dev in the future.

This is my first time sending a patch to the Linux kernel, and I am
still learning how the community works. I may have made a mistake in
this email; please correct me if I am wrong.

I want to improve the x86-64 assembly code with this patch. This patch
changes mov $0, %reg to xor %reg, %reg because xor %reg, %reg has a
shorter encoding, so it saves space.

asm:
	ba 00 00 00 00	mov    $0x0,%edx
	31 d2		xor    %edx,%edx

Regards,

Signed-off-by: Kanna Scarlet <[email protected]>
---

Kanna Scarlet (1):
  x86: Change mov $0, %reg with xor %reg, %reg

 arch/x86/boot/compressed/head_64.S     | 2 +-
 arch/x86/boot/compressed/mem_encrypt.S | 2 +-
 arch/x86/kernel/ftrace_32.S            | 4 ++--
 arch/x86/kernel/head_64.S              | 2 +-
 arch/x86/math-emu/div_Xsig.S           | 2 +-
 arch/x86/math-emu/reg_u_sub.S          | 2 +-
 6 files changed, 7 insertions(+), 7 deletions(-)

base-commit: ff89dd08c0f0a3fd330c9ef9d775e880f82c291e
-- 
Kanna Scarlet

* [PATCH 1/1] x86: Change mov $0, %reg with xor %reg, %reg
  2022-08-04 15:26 [PATCH 0/1] x86 change mov 0, %reg to xor %reg, %reg Kanna Scarlet
@ 2022-08-04 15:26 ` Kanna Scarlet
  2022-08-04 15:53   ` Borislav Petkov
  0 siblings, 1 reply; 13+ messages in thread
From: Kanna Scarlet @ 2022-08-04 15:26 UTC
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, x86
  Cc: Kanna Scarlet, Ard Biesheuvel, Bill Metzenthen, Brijesh Singh,
	Joerg Roedel, Josh Poimboeuf, Kirill A. Shutemov, Mark Rutland,
	Michael Roth, Peter Zijlstra, Sean Christopherson, Steven Rostedt,
	Ammar Faizi, GNU/Weeb Mailing List, Linux Kernel Mailing List

Change mov $0, %reg with xor %reg, %reg because xor %reg, %reg is
smaller so it is good to save space

asm:
	ba 00 00 00 00	movl   $0x0,%edx
	31 d2		xorl   %edx,%edx

Suggested-by: Ammar Faizi <[email protected]>
Signed-off-by: Kanna Scarlet <[email protected]>
---
 arch/x86/boot/compressed/head_64.S     | 2 +-
 arch/x86/boot/compressed/mem_encrypt.S | 2 +-
 arch/x86/kernel/ftrace_32.S            | 4 ++--
 arch/x86/kernel/head_64.S              | 2 +-
 arch/x86/math-emu/div_Xsig.S           | 2 +-
 arch/x86/math-emu/reg_u_sub.S          | 2 +-
 6 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index d33f060900d2..39442e7f5993 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -666,7 +666,7 @@ SYM_CODE_START(trampoline_32bit_src)
 	movl	%cr4, %eax
 	andl	$X86_CR4_MCE, %eax
 #else
-	movl	$0, %eax
+	xorl	%eax, %eax
 #endif
 
 	/* Enable PAE and LA57 (if required) paging modes */
diff --git a/arch/x86/boot/compressed/mem_encrypt.S b/arch/x86/boot/compressed/mem_encrypt.S
index a73e4d783cae..d1e4d3aa8395 100644
--- a/arch/x86/boot/compressed/mem_encrypt.S
+++ b/arch/x86/boot/compressed/mem_encrypt.S
@@ -111,7 +111,7 @@ SYM_CODE_START(startup32_vc_handler)
 	cmpl	$0x72, 16(%esp)
 	jne	.Lfail
 
-	movl	$0, %eax		# Request CPUID[fn].EAX
+	xorl	%eax, %eax		# Request CPUID[fn].EAX
 	movl	%ebx, %edx		# CPUID fn
 	call	sev_es_req_cpuid	# Call helper
 	testl	%eax, %eax		# Check return code
diff --git a/arch/x86/kernel/ftrace_32.S b/arch/x86/kernel/ftrace_32.S
index a0ed0e4a2c0c..cff7decb58be 100644
--- a/arch/x86/kernel/ftrace_32.S
+++ b/arch/x86/kernel/ftrace_32.S
@@ -171,7 +171,7 @@ SYM_CODE_START(ftrace_graph_caller)
 	movl	3*4(%esp), %eax
 	/* Even with frame pointers, fentry doesn't have one here */
 	lea	4*4(%esp), %edx
-	movl	$0, %ecx
+	xorl	%ecx, %ecx
 	subl	$MCOUNT_INSN_SIZE, %eax
 	call	prepare_ftrace_return
 	popl	%edx
@@ -184,7 +184,7 @@ SYM_CODE_END(ftrace_graph_caller)
 return_to_handler:
 	pushl	%eax
 	pushl	%edx
-	movl	$0, %eax
+	xorl	%eax, %eax
 	call	ftrace_return_to_handler
 	movl	%eax, %ecx
 	popl	%edx
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index d860d437631b..eeb06047e30a 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -184,7 +184,7 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
 	movq	%cr4, %rcx
 	andl	$X86_CR4_MCE, %ecx
 #else
-	movl	$0, %ecx
+	xorl	%ecx, %ecx
 #endif
 
 	/* Enable PAE mode, PGE and LA57 */
diff --git a/arch/x86/math-emu/div_Xsig.S b/arch/x86/math-emu/div_Xsig.S
index 8c270ab415be..5767b4d23954 100644
--- a/arch/x86/math-emu/div_Xsig.S
+++ b/arch/x86/math-emu/div_Xsig.S
@@ -122,7 +122,7 @@ SYM_FUNC_START(div_Xsig)
 	movl	XsigLL(%esi),%eax
 	rcrl	%eax
 	movl	%eax,FPU_accum_1
-	movl	$0,%eax
+	xorl	%eax,%eax
 	rcrl	%eax
 	movl	%eax,FPU_accum_0
 
diff --git a/arch/x86/math-emu/reg_u_sub.S b/arch/x86/math-emu/reg_u_sub.S
index 4c900c29e4ff..130b49fa1ca2 100644
--- a/arch/x86/math-emu/reg_u_sub.S
+++ b/arch/x86/math-emu/reg_u_sub.S
@@ -212,7 +212,7 @@ L_must_be_zero:
 L_shift_32:
 	movl	%ebx,%eax
 	movl	%edx,%ebx
-	movl	$0,%edx
+	xorl	%edx,%edx
 	subw	$32,EXP(%edi)	/* Can get underflow here */
 
 /* We need to shift left by 1 - 31 bits */
-- 
Kanna Scarlet

* Re: [PATCH 1/1] x86: Change mov $0, %reg with xor %reg, %reg
  2022-08-04 15:26 ` [PATCH 1/1] x86: Change mov $0, %reg with " Kanna Scarlet
@ 2022-08-04 15:53   ` Borislav Petkov
  2022-08-04 18:08     ` Kanna Scarlet
  0 siblings, 1 reply; 13+ messages in thread
From: Borislav Petkov @ 2022-08-04 15:53 UTC
  To: Kanna Scarlet
  Cc: Thomas Gleixner, Ingo Molnar, Dave Hansen, H. Peter Anvin, x86,
	Ard Biesheuvel, Bill Metzenthen, Brijesh Singh, Joerg Roedel,
	Josh Poimboeuf, Kirill A. Shutemov, Mark Rutland, Michael Roth,
	Peter Zijlstra, Sean Christopherson, Steven Rostedt, Ammar Faizi,
	GNU/Weeb Mailing List, Linux Kernel Mailing List

On Thu, Aug 04, 2022 at 03:26:55PM +0000, Kanna Scarlet wrote:
> Change mov $0, %reg with xor %reg, %reg because xor %reg, %reg is
> smaller so it is good to save space

Bonus points if you find out what other advantage

	XOR reg,reg

has when it comes to clearing integer registers.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

* Re: [PATCH 1/1] x86: Change mov $0, %reg with xor %reg, %reg
  2022-08-04 15:53   ` Borislav Petkov
@ 2022-08-04 18:08     ` Kanna Scarlet
  2022-08-05  9:26       ` David Laight
  2022-08-05  9:54       ` Borislav Petkov
  0 siblings, 2 replies; 13+ messages in thread
From: Kanna Scarlet @ 2022-08-04 18:08 UTC
  To: Borislav Petkov
  Cc: Kanna Scarlet, Thomas Gleixner, Ingo Molnar, Dave Hansen,
	H. Peter Anvin, x86, Ard Biesheuvel, Bill Metzenthen, Brijesh Singh,
	Joerg Roedel, Josh Poimboeuf, Kirill A. Shutemov, Mark Rutland,
	Michael Roth, Peter Zijlstra, Sean Christopherson, Steven Rostedt,
	Ammar Faizi, GNU/Weeb Mailing List, Linux Kernel Mailing List

On 8/4/22 10:53 PM, Borislav Petkov wrote:
> Bonus points if you find out what other advantage
>
> 	XOR reg,reg
>
> has when it comes to clearing integer registers.

Hello sir Borislav,

Thank you for your response. I tried to find out other advantages of
xor reg,reg on Google and found this:
https://stackoverflow.com/a/33668295/7275114

"xor (being a recognized zeroing idiom, unlike mov reg, 0) has some
obvious and some subtle advantages:

1. smaller code-size than mov reg,0. (All CPUs)
2. avoids partial-register penalties for later code.
   (Intel P6-family and SnB-family).
3. doesn't use an execution unit, saving power and freeing up
   execution resources. (Intel SnB-family)
4. smaller uop (no immediate data) leaves room in the uop cache-line
   for nearby instructions to borrow if needed. (Intel SnB-family).
5. doesn't use up entries in the physical register file. (Intel
   SnB-family (and P4) at least, possibly AMD as well since they use
   a similar PRF design instead of keeping register state in the ROB
   like Intel P6-family microarchitectures.)"

Should I add all in the explanation sir? I will send v2 revision
tomorrow.

We also find more files to patch with this command:

   grep -rE "mov.?\s+\\$\\0\s*," arch/x86

it shows many immediate zero moves to 64-bit register in file
arch/x86/crypto/curve25519-x86_64.c, but the next instruction may depend
on the previous %rflags value, we are afraid to change this because
xor touches %rflags. We will try to change it to movl $0, %r32 to
reduce the code size.

Example cmovc needs %rflags

	" adcx %1, %%r11;"
	" movq %%r11, 24(%2);"

	/* Step 3: Fold the carry bit back in; guaranteed not to carry at this point */
	" mov $0, %%rax;"
	" cmovc %%rdx, %%rax;"

Thanks.

Regards,
-- 
Kanna Scarlet

* RE: [PATCH 1/1] x86: Change mov $0, %reg with xor %reg, %reg
  2022-08-04 18:08     ` Kanna Scarlet
@ 2022-08-05  9:26       ` David Laight
  2022-08-05  9:42         ` Joerg Roedel
                           ` (2 more replies)
  2022-08-05  9:54       ` Borislav Petkov
  1 sibling, 3 replies; 13+ messages in thread
From: David Laight @ 2022-08-05 9:26 UTC
  To: 'Kanna Scarlet', Borislav Petkov
  Cc: Thomas Gleixner, Ingo Molnar, Dave Hansen, H. Peter Anvin,
	[email protected], Ard Biesheuvel, Bill Metzenthen, Brijesh Singh,
	Joerg Roedel, Josh Poimboeuf, Kirill A. Shutemov, Mark Rutland,
	Michael Roth, Peter Zijlstra, Sean Christopherson, Steven Rostedt,
	Ammar Faizi, GNU/Weeb Mailing List, Linux Kernel Mailing List

From: Kanna Scarlet
> Sent: 04 August 2022 19:08
>
> On 8/4/22 10:53 PM, Borislav Petkov wrote:
> > Bonus points if you find out what other advantage
> >
> > 	XOR reg,reg
> >
> > has when it comes to clearing integer registers.
>
> Hello sir Borislav,
>
> Thank you for your response. I tried to find out other advantages of
> xor reg,reg on Google and found this:
> https://stackoverflow.com/a/33668295/7275114
>
> "xor (being a recognized zeroing idiom, unlike mov reg, 0) has some
> obvious and some subtle advantages:
>
> 1. smaller code-size than mov reg,0. (All CPUs)
> 2. avoids partial-register penalties for later code.
>    (Intel P6-family and SnB-family).
> 3. doesn't use an execution unit, saving power and freeing up
>    execution resources. (Intel SnB-family)
> 4. smaller uop (no immediate data) leaves room in the uop cache-line
>    for nearby instructions to borrow if needed. (Intel SnB-family).
> 5. doesn't use up entries in the physical register file. (Intel
>    SnB-family (and P4) at least, possibly AMD as well since they use
>    a similar PRF design instead of keeping register state in the ROB
>    like Intel P6-family microarchitectures.)"

You missed one, and an additional change:

Use "xor %rax,%rax" instead of "xor %eax,%eax" to save
the 'reg' prefix.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

* Re: [PATCH 1/1] x86: Change mov $0, %reg with xor %reg, %reg
  2022-08-05  9:26       ` David Laight
@ 2022-08-05  9:42         ` Joerg Roedel
  2022-08-08 16:45           ` Kanna Scarlet
  2022-08-08 16:38         ` Kanna Scarlet
  2022-08-08 18:59         ` H. Peter Anvin
  2 siblings, 1 reply; 13+ messages in thread
From: Joerg Roedel @ 2022-08-05 9:42 UTC
  To: David Laight
  Cc: 'Kanna Scarlet', Borislav Petkov, Thomas Gleixner, Ingo Molnar,
	Dave Hansen, H. Peter Anvin, [email protected], Ard Biesheuvel,
	Bill Metzenthen, Brijesh Singh, Josh Poimboeuf, Kirill A. Shutemov,
	Mark Rutland, Michael Roth, Peter Zijlstra, Sean Christopherson,
	Steven Rostedt, Ammar Faizi, GNU/Weeb Mailing List,
	Linux Kernel Mailing List

On Fri, Aug 05, 2022 at 09:26:02AM +0000, David Laight wrote:
> Use "xor %rax,%rax" instead of "xor %eax,%eax" to save
> the 'reg' prefix.

Also, some places explicitly use the mov variant to zero a register
without touching rflags. Please be careful to not change those.

Regards,

-- 
Jörg Rödel
[email protected]

SUSE Software Solutions Germany GmbH
Frankenstraße 146
90461 Nürnberg
Germany

(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman

* Re: [PATCH 1/1] x86: Change mov $0, %reg with xor %reg, %reg
  2022-08-05  9:42         ` Joerg Roedel
@ 2022-08-08 16:45           ` Kanna Scarlet
  2022-08-08 18:59             ` H. Peter Anvin
  0 siblings, 1 reply; 13+ messages in thread
From: Kanna Scarlet @ 2022-08-08 16:45 UTC
  To: Joerg Roedel
  Cc: David Laight, Borislav Petkov, Thomas Gleixner, Ingo Molnar,
	Dave Hansen, H. Peter Anvin, [email protected], Ard Biesheuvel,
	Bill Metzenthen, Brijesh Singh, Josh Poimboeuf, Kirill A. Shutemov,
	Mark Rutland, Michael Roth, Peter Zijlstra, Sean Christopherson,
	Steven Rostedt, Ammar Faizi, GNU/Weeb Mailing List,
	Linux Kernel Mailing List

On 8/5/22 4:42 PM, Joerg Roedel wrote:
> On Fri, Aug 05, 2022 at 09:26:02AM +0000, David Laight wrote:
>> Use "xor %rax,%rax" instead of "xor %eax,%eax" to save
>> the 'reg' prefix.
>
> Also, some places explicitly use the mov variant to zero a register
> without touching rflags. Please be careful to not change those.

thank you for reminder, i will check again to make myself more sure
the patch doesn't break this %rflags dependency situation

Regards,
-- 
Kanna Scarlet

* Re: [PATCH 1/1] x86: Change mov $0, %reg with xor %reg, %reg
  2022-08-08 16:45           ` Kanna Scarlet
@ 2022-08-08 18:59             ` H. Peter Anvin
  0 siblings, 0 replies; 13+ messages in thread
From: H. Peter Anvin @ 2022-08-08 18:59 UTC
  To: Kanna Scarlet, Joerg Roedel
  Cc: David Laight, Borislav Petkov, Thomas Gleixner, Ingo Molnar,
	Dave Hansen, [email protected], Ard Biesheuvel, Bill Metzenthen,
	Brijesh Singh, Josh Poimboeuf, Kirill A. Shutemov, Mark Rutland,
	Michael Roth, Peter Zijlstra, Sean Christopherson, Steven Rostedt,
	Ammar Faizi, GNU/Weeb Mailing List, Linux Kernel Mailing List

On August 8, 2022 9:45:45 AM PDT, Kanna Scarlet <[email protected]> wrote:
>On 8/5/22 4:42 PM, Joerg Roedel wrote:
>> On Fri, Aug 05, 2022 at 09:26:02AM +0000, David Laight wrote:
>>> Use "xor %rax,%rax" instead of "xor %eax,%eax" to save
>>> the 'reg' prefix.
>>
>> Also, some places explicitly use the mov variant to zero a register
>> without touching rflags. Please be careful to not change those.
>
>thank you for reminder, i will check again to make myself more sure
>the patch doesn't break this %rflags dependency situation
>
>Regards,

In some cases you can hoist the zeroing to avoid that (and sometimes
improve performance in the process), but be very careful in general when
messing with hand-optimized assembly code like crypto; for those pieces
of code benchmarking the change is mandatory.

* Re: [PATCH 1/1] x86: Change mov $0, %reg with xor %reg, %reg
  2022-08-05  9:26       ` David Laight
  2022-08-05  9:42         ` Joerg Roedel
@ 2022-08-08 16:38         ` Kanna Scarlet
  2022-08-08 18:59         ` H. Peter Anvin
  2 siblings, 0 replies; 13+ messages in thread
From: Kanna Scarlet @ 2022-08-08 16:38 UTC
  To: David Laight
  Cc: Borislav Petkov, Thomas Gleixner, Ingo Molnar, Dave Hansen,
	H. Peter Anvin, [email protected], Ard Biesheuvel, Bill Metzenthen,
	Brijesh Singh, Joerg Roedel, Josh Poimboeuf, Kirill A. Shutemov,
	Mark Rutland, Michael Roth, Peter Zijlstra, Sean Christopherson,
	Steven Rostedt, Ammar Faizi, GNU/Weeb Mailing List,
	Linux Kernel Mailing List

On 8/5/22 4:26 PM, David Laight wrote:
> Use "xor %rax,%rax" instead of "xor %eax,%eax" to save
> the 'reg' prefix.

Hello David Laight,

"xor %rax,%rax" is bigger because it needs a REX prefix;
"xor %eax,%eax" is smaller because it doesn't need one.

asm:
	0:  48 31 c0	xor    %rax,%rax
	3:  31 c0	xor    %eax,%eax

So I think that, to avoid the REX prefix, we should use xor %eax,%eax
instead of xor %rax,%rax.

Best regards,
-- 
Kanna Scarlet

* Re: [PATCH 1/1] x86: Change mov $0, %reg with xor %reg, %reg
  2022-08-05  9:26       ` David Laight
  2022-08-05  9:42         ` Joerg Roedel
  2022-08-08 16:38         ` Kanna Scarlet
@ 2022-08-08 18:59         ` H. Peter Anvin
  2022-08-09  7:38           ` David Laight
  2 siblings, 1 reply; 13+ messages in thread
From: H. Peter Anvin @ 2022-08-08 18:59 UTC
  To: David Laight, 'Kanna Scarlet', Borislav Petkov
  Cc: Thomas Gleixner, Ingo Molnar, Dave Hansen, [email protected],
	Ard Biesheuvel, Bill Metzenthen, Brijesh Singh, Joerg Roedel,
	Josh Poimboeuf, Kirill A. Shutemov, Mark Rutland, Michael Roth,
	Peter Zijlstra, Sean Christopherson, Steven Rostedt, Ammar Faizi,
	GNU/Weeb Mailing List, Linux Kernel Mailing List

On August 5, 2022 2:26:02 AM PDT, David Laight <[email protected]> wrote:
>From: Kanna Scarlet
>> Sent: 04 August 2022 19:08
>>
>> On 8/4/22 10:53 PM, Borislav Petkov wrote:
>> > Bonus points if you find out what other advantage
>> >
>> > 	XOR reg,reg
>> >
>> > has when it comes to clearing integer registers.
>>
>> Hello sir Borislav,
>>
>> Thank you for your response. I tried to find out other advantages of
>> xor reg,reg on Google and found this:
>> https://stackoverflow.com/a/33668295/7275114
>>
>> "xor (being a recognized zeroing idiom, unlike mov reg, 0) has some
>> obvious and some subtle advantages:
>>
>> 1. smaller code-size than mov reg,0. (All CPUs)
>> 2. avoids partial-register penalties for later code.
>>    (Intel P6-family and SnB-family).
>> 3. doesn't use an execution unit, saving power and freeing up
>>    execution resources. (Intel SnB-family)
>> 4. smaller uop (no immediate data) leaves room in the uop cache-line
>>    for nearby instructions to borrow if needed. (Intel SnB-family).
>> 5. doesn't use up entries in the physical register file. (Intel
>>    SnB-family (and P4) at least, possibly AMD as well since they use
>>    a similar PRF design instead of keeping register state in the ROB
>>    like Intel P6-family microarchitectures.)"
>
>You missed one, and an additional change:
>
>Use "xor %rax,%rax" instead of "xor %eax,%eax" to save
>the 'reg' prefix.
>
>	David
>
>-
>Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
>Registration No: 1397386 (Wales)
>
>

You mean the other way around...

* RE: [PATCH 1/1] x86: Change mov $0, %reg with xor %reg, %reg
  2022-08-08 18:59         ` H. Peter Anvin
@ 2022-08-09  7:38           ` David Laight
  0 siblings, 0 replies; 13+ messages in thread
From: David Laight @ 2022-08-09 7:38 UTC
  To: 'H. Peter Anvin', 'Kanna Scarlet', Borislav Petkov
  Cc: Thomas Gleixner, Ingo Molnar, Dave Hansen, [email protected],
	Ard Biesheuvel, Bill Metzenthen, Brijesh Singh, Joerg Roedel,
	Josh Poimboeuf, Kirill A. Shutemov, Mark Rutland, Michael Roth,
	Peter Zijlstra, Sean Christopherson, Steven Rostedt, Ammar Faizi,
	GNU/Weeb Mailing List, Linux Kernel Mailing List

From: H. Peter Anvin
> Sent: 08 August 2022 20:00
>
> On August 5, 2022 2:26:02 AM PDT, David Laight <[email protected]>
> wrote:
> >From: Kanna Scarlet
> >> Sent: 04 August 2022 19:08
> >>
> >> On 8/4/22 10:53 PM, Borislav Petkov wrote:
> >> > Bonus points if you find out what other advantage
> >> >
> >> > 	XOR reg,reg
> >> >
> >> > has when it comes to clearing integer registers.
> >>
> >> Hello sir Borislav,
> >>
> >> Thank you for your response. I tried to find out other advantages of
> >> xor reg,reg on Google and found this:
> >> https://stackoverflow.com/a/33668295/7275114
> >>
> >> "xor (being a recognized zeroing idiom, unlike mov reg, 0) has some
> >> obvious and some subtle advantages:
> >>
> >> 1. smaller code-size than mov reg,0. (All CPUs)
> >> 2. avoids partial-register penalties for later code.
> >>    (Intel P6-family and SnB-family).
> >> 3. doesn't use an execution unit, saving power and freeing up
> >>    execution resources. (Intel SnB-family)
> >> 4. smaller uop (no immediate data) leaves room in the uop cache-line
> >>    for nearby instructions to borrow if needed. (Intel SnB-family).
> >> 5. doesn't use up entries in the physical register file. (Intel
> >>    SnB-family (and P4) at least, possibly AMD as well since they use
> >>    a similar PRF design instead of keeping register state in the ROB
> >>    like Intel P6-family microarchitectures.)"
> >
> >You missed one, and an additional change:
> >
> >Use "xor %rax,%rax" instead of "xor %eax,%eax" to save
> >the 'reg' prefix.
> >
> >	David
> >
> >-
> >Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> >Registration No: 1397386 (Wales)
> >
> >
>
> You mean the other way around...

Maybe :-(

The 32bit versions are best.

Somehow the register naming convention ended up getting sort of 'backwards'.
'register' is bigger than 'extended'.

I've 'only' been writing x86 asm since 1982!

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

* Re: [PATCH 1/1] x86: Change mov $0, %reg with xor %reg, %reg
  2022-08-04 18:08     ` Kanna Scarlet
  2022-08-05  9:26       ` David Laight
@ 2022-08-05  9:54       ` Borislav Petkov
  2022-08-08 16:57         ` Kanna Scarlet
  1 sibling, 1 reply; 13+ messages in thread
From: Borislav Petkov @ 2022-08-05 9:54 UTC
  To: Kanna Scarlet
  Cc: Thomas Gleixner, Ingo Molnar, Dave Hansen, H. Peter Anvin, x86,
	Ard Biesheuvel, Bill Metzenthen, Brijesh Singh, Joerg Roedel,
	Josh Poimboeuf, Kirill A. Shutemov, Mark Rutland, Michael Roth,
	Peter Zijlstra, Sean Christopherson, Steven Rostedt, Ammar Faizi,
	GNU/Weeb Mailing List, Linux Kernel Mailing List

On Thu, Aug 04, 2022 at 06:08:05PM +0000, Kanna Scarlet wrote:
> Hello sir Borislav,

Please, no "sir" - just Boris or Borislav,

> Thank you for your response. I tried to find out other advantages of
> xor reg,reg on Google and found this:
> https://stackoverflow.com/a/33668295/7275114
>
> "xor (being a recognized zeroing idiom, unlike mov reg, 0) has some
> obvious and some subtle advantages:
>
> 1. smaller code-size than mov reg,0. (All CPUs)
> 2. avoids partial-register penalties for later code.
>    (Intel P6-family and SnB-family).
> 3. doesn't use an execution unit, saving power and freeing up
>    execution resources. (Intel SnB-family)
> 4. smaller uop (no immediate data) leaves room in the uop cache-line
>    for nearby instructions to borrow if needed. (Intel SnB-family).
> 5. doesn't use up entries in the physical register file. (Intel
>    SnB-family (and P4) at least, possibly AMD as well since they use
>    a similar PRF design instead of keeping register state in the ROB
>    like Intel P6-family microarchitectures.)"
>
> Should I add all in the explanation sir?

You should try to understand what this means and write the gist of it in
your own words. This is how you can learn something.

> We also find more files to patch with this command:
>
>    grep -rE "mov.?\s+\\$\\0\s*," arch/x86
>
> it shows many immediate zero moves to 64-bit register in file
> arch/x86/crypto/curve25519-x86_64.c, but the next instruction may depend
> on the previous %rflags value, we are afraid to change this because
> xor touches %rflags. We will try to change it to movl $0, %r32 to
> reduce the code size.

I don't think you need to do that - you can do this one patch in order
to go through the whole process of creating and submitting a patch but
you should not go on a "let's convert everything" spree just for the
sake of it.

Because maintainers barely have time to look at patches, you don't have
to send them more when they're not really needed.

Rather, I'd suggest you go and try to fix real bugs. This has some ideas
what to do:

https://www.linux.com/news/three-ways-beginners-contribute-linux-kernel/

Looking at the kernel bugzilla and trying to understand and reproduce a
bug from there would get you a long way. And you'll learn a lot.

Also, you should peruse

https://www.kernel.org/doc/html/latest/process/index.html

which has a lot of information about how this whole community thing
works.

I sincerely hope that helps.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

* Re: [PATCH 1/1] x86: Change mov $0, %reg with xor %reg, %reg
  2022-08-05  9:54       ` Borislav Petkov
@ 2022-08-08 16:57         ` Kanna Scarlet
  0 siblings, 0 replies; 13+ messages in thread
From: Kanna Scarlet @ 2022-08-08 16:57 UTC
  To: Borislav Petkov
  Cc: Thomas Gleixner, Ingo Molnar, Dave Hansen, H. Peter Anvin, x86,
	Ard Biesheuvel, Bill Metzenthen, Brijesh Singh, Joerg Roedel,
	Josh Poimboeuf, Kirill A. Shutemov, Mark Rutland, Michael Roth,
	Peter Zijlstra, Sean Christopherson, Steven Rostedt, Ammar Faizi,
	GNU/Weeb Mailing List, Linux Kernel Mailing List

On 8/5/22 4:54 PM, Borislav Petkov wrote:
> On Thu, Aug 04, 2022 at 06:08:05PM +0000, Kanna Scarlet wrote:
>> Hello sir Borislav,
>
> Please, no "sir" - just Boris or Borislav,

OK, sorry.

> I don't think you need to do that - you can do this one patch in order
> to go through the whole process of creating and submitting a patch but
> you should not go on a "let's convert everything" spree just for the
> sake of it.

OK, I will finish the process for this one patch in order to learn how
submitting works. After that I will avoid similar small improvements
and focus on real kernel bugs and issues. I'll send a v2 revision that
only improves the commit message.

> Because maintainers barely have time to look at patches, you don't have
> to send them more when they're not really needed.
>
> Rather, I'd suggest you go and try to fix real bugs. This has some ideas
> what to do:
>
> https://www.linux.com/news/three-ways-beginners-contribute-linux-kernel/
>
> Looking at the kernel bugzilla and trying to understand and reproduce a
> bug from there would get you a long way. And you'll learn a lot.
>
> Also, you should peruse
>
> https://www.kernel.org/doc/html/latest/process/index.html
>
> which has a lot of information about how this whole community thing
> works.
>
> I sincerely hope that helps.
>
> Thx.

Thank you for the guide, I'm following it.

Regards,
-- 
Kanna Scarlet
