From: Ammar Faizi <[email protected]>
To: Borislav Petkov <[email protected]>
Cc: Ammar Faizi <[email protected]>,
Andy Shevchenko <[email protected]>,
Dave Hansen <[email protected]>,
"H. Peter Anvin" <[email protected]>, Ingo Molnar <[email protected]>,
Josh Poimboeuf <[email protected]>,
Juergen Gross <[email protected]>,
Kees Cook <[email protected]>,
Peter Zijlstra <[email protected]>,
Thomas Gleixner <[email protected]>,
Tony Luck <[email protected]>,
Youquan Song <[email protected]>,
[email protected], [email protected],
[email protected], [email protected]
Subject: [PATCH v1 0/2] x86: Avoid using INC and DEC instructions on hot paths
Date: Mon, 7 Mar 2022 18:45:56 +0700
Message-ID: <[email protected]>

Hi,

To take full advantage of out-of-order execution, avoid the INC and DEC
instructions on hot paths. INC and DEC write only part of the flags
register (they leave the carry flag untouched), which can cause a
partial flags register stall or a false dependency on earlier flag
writes. This series replaces INC/DEC with ADD/SUB in those places.

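As a rough illustration of the point above (not part of the patches),
the difference can be modelled in plain C. This is a simplified sketch
that tracks only CF and ZF; the names struct flags, model_add1,
model_inc and flags_model_selfcheck are made up for this example. The
INC model has to read the previous flags to produce its result flags,
while the ADD model does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified model: only CF and ZF of the x86 flags register. */
struct flags {
	bool cf;
	bool zf;
};

/* ADD $1: writes every modelled flag, so the old flags are never read. */
static uint32_t model_add1(uint32_t v, struct flags *out)
{
	uint32_t r = v + 1u;

	out->cf = (r == 0);	/* carry out of bit 31 */
	out->zf = (r == 0);
	return r;
}

/* INC: writes ZF but keeps CF, so the result flags depend on 'in'. */
static uint32_t model_inc(uint32_t v, const struct flags *in,
			  struct flags *out)
{
	uint32_t r = v + 1u;

	out->cf = in->cf;	/* CF is preserved from the earlier writer */
	out->zf = (r == 0);
	return r;
}

/* Same input value, different flag results when CF was set before. */
static void flags_model_selfcheck(void)
{
	struct flags prev = { .cf = true, .zf = false };
	struct flags out;

	assert(model_add1(41u, &out) == 42u && !out.cf && !out.zf);
	assert(model_inc(41u, &prev, &out) == 42u && out.cf && !out.zf);
}
```

The extra 'in' parameter of model_inc is the dependency the manuals
quoted in this mail describe. The flip side is that swapping INC for
ADD is only safe where no later instruction relies on CF surviving the
increment.
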
Agner Fog's optimization manual says [1]:
"""
The INC and DEC instructions are inefficient on some CPUs because they
write to only part of the flags register (excluding the carry flag).
Use ADD or SUB instead to avoid false dependences or inefficient
splitting of the flags register, especially if they are followed by
an instruction that reads the flags.
"""
Intel's optimization manual 3.5.1.1 says [2]:
"""
The INC and DEC instructions modify only a subset of the bits in the
flag register. This creates a dependence on all previous writes of
the flag register. This is especially problematic when these
instructions are on the critical path because they are used to change
an address for a load on which many other instructions depend.
Assembly/Compiler Coding Rule 33. (M impact, H generality) INC and DEC
instructions should be replaced with ADD or SUB instructions, because
ADD and SUB overwrite all flags, whereas INC and DEC do not, therefore
creating false dependencies on earlier instructions that set the flags.
"""
Newer compilers also make this substitution when targeting a generic
x86-64 CPU (https://godbolt.org/z/rjsfbdx54).

# C code:

  int fy_inc(int a, int b, int c)
  {
	a++; b++; c++;
	return a * b * c;
  }

# ASM

## GCC 4.1.2 and older use INC (old).

  fy_inc:
	incl	%edi
	incl	%esi
	leal	1(%rdx), %eax
	imull	%esi, %edi
	imull	%edi, %eax
	ret

## GCC 4.4.7 to GCC 11.2 use ADD (new).

  fy_inc:
	addl	$1, %edi
	addl	$1, %esi
	addl	$1, %edx
	imull	%esi, %edi
	movl	%edi, %eax
	imull	%edx, %eax
	ret

## Clang 5.0.2 and older use INC (old).

  fy_inc:
	incl	%edi
	leal	1(%rsi), %eax
	imull	%edi, %eax
	incl	%edx
	imull	%edx, %eax
	retq

## Clang 6.0.0 to Clang 13.0.1 use ADD (new).

  fy_inc:
	addl	$1, %edi
	leal	1(%rsi), %eax
	imull	%edi, %eax
	addl	$1, %edx
	imull	%edx, %eax
	retq

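For hand-written asm, the same substitution could look roughly like the
sketch below. It is a hypothetical helper and is not taken from the
patches in this series: sum_bytes is an invented name, and the non-x86
fallback exists only so the example builds elsewhere. The loop advances
the pointer with ADD and counts down with SUB, and the JNZ that follows
reads the ZF written by SUB.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not taken from the patches: sum the n bytes at p.
 * The x86 path uses ADD/SUB where older hand-written asm would
 * typically use INC/DEC.
 */
static unsigned long sum_bytes(const unsigned char *p, unsigned long n)
{
	unsigned long sum = 0;

	assert(p != NULL || n == 0);
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
	unsigned long tmp;

	__asm__ __volatile__(
		"test	%[n], %[n]\n\t"
		"jz	2f\n"
		"1:\n\t"
		"movzbl	(%[p]), %k[tmp]\n\t"	/* zero-extending byte load */
		"add	%[tmp], %[sum]\n\t"
		"add	$1, %[p]\n\t"		/* instead of: inc %[p] */
		"sub	$1, %[n]\n\t"		/* instead of: dec %[n] */
		"jnz	1b\n"			/* reads ZF written by SUB */
		"2:"
		: [sum] "+r" (sum), [p] "+r" (p), [n] "+r" (n),
		  [tmp] "=&r" (tmp)
		:
		: "cc", "memory");
#else
	/* Portable fallback so the sketch builds on other targets. */
	while (n) {
		sum += *p;
		p += 1;
		n -= 1;
	}
#endif
	return sum;
}
```

Since JNZ only looks at ZF, SUB and DEC are interchangeable here as far
as correctness goes; the difference is that SUB rewrites all the
arithmetic flags and so does not depend on whichever instruction set CF
last.
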
[1]: https://www.agner.org/optimize/optimizing_assembly.pdf
[2]: https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf
Signed-off-by: Ammar Faizi <[email protected]>
---
Ammar Faizi (2):
  x86/include/asm: Avoid using INC and DEC instructions on hot paths
  x86/lib: Avoid using INC and DEC instructions on hot paths

 arch/x86/include/asm/xor_32.h | 16 ++++++++--------
 arch/x86/lib/copy_mc_64.S     | 14 +++++++-------
 arch/x86/lib/copy_user_64.S   | 26 +++++++++++++-------------
 arch/x86/lib/memset_64.S      |  6 +++---
 arch/x86/lib/string_32.c      | 20 ++++++++++----------
 arch/x86/lib/strstr_32.c      |  4 ++--
 arch/x86/lib/usercopy_64.c    | 12 ++++++------
 7 files changed, 49 insertions(+), 49 deletions(-)

base-commit: ffb217a13a2eaf6d5bd974fc83036a53ca69f1e2
--
Ammar Faizi