* [PATCH v3 0/1] Fix a stack misalign bug on _start
@ 2023-08-30 1:02 Ammar Faizi
2023-08-30 1:02 ` [PATCH v3 1/1] tools/nolibc: i386: " Ammar Faizi
2023-08-30 3:45 ` [PATCH v3 0/1] " Willy Tarreau
0 siblings, 2 replies; 4+ messages in thread
From: Ammar Faizi @ 2023-08-30 1:02 UTC (permalink / raw)
To: Willy Tarreau, Thomas Weißschuh
Cc: Ammar Faizi, Zhangjin Wu, Nicholas Rosenberg,
Michael William Jonathan, GNU/Weeb Mailing List,
Linux Kernel Mailing List
Hi Willy,
This is a v3 revision.
The ABI mandates that the %esp register must be a multiple of 16 when
executing a 'call' instruction.
Commit 2ab446336b17 ("tools/nolibc: i386: shrink _start with _start_c")
simplified the _start function, but it didn't take care of the %esp
alignment, causing SIGSEGV on SSE and AVX programs that use aligned move
instruction (e.g., movdqa, movaps, and vmovdqa).
$eax : 0x56559000 → 0x00003f90
$ebx : 0x56559000 → 0x00003f90
$ecx : 0x1
$edx : 0xf7fcaaa0 → endbr32
$esp : 0xffffcdbc → 0x00000001
$ebp : 0x0
$esi : 0xffffce7c → 0xffffd096
$edi : 0x56556060 → <_start+0> xor %ebp, %ebp
$eip : 0x56556489 → <sse_pq_add+25> movaps %xmm0, 0x30(%esp)
<sse_pq_add+11> pop %eax
<sse_pq_add+12> add $0x2b85, %eax
<sse_pq_add+18> movups -0x1fd0(%eax), %xmm0
→ <sse_pq_add+25> movaps %xmm0, 0x30(%esp) <== trapping instruction
<sse_pq_add+30> movups -0x1fe0(%eax), %xmm1
<sse_pq_add+37> movaps %xmm1, 0x20(%esp)
<sse_pq_add+42> movups -0x1ff0(%eax), %xmm2
<sse_pq_add+49> movaps %xmm2, 0x10(%esp)
<sse_pq_add+54> movups -0x2000(%eax), %xmm3
[#0] Id 1, Name: "test", stopped 0x56556489 in sse_pq_add (), reason: SIGSEGV
(gdb) bt
#0 0x56556489 in sse_pq_add ()
Ensure the %esp is a multiple of 16 when executing the call instruction.
Changes since v2:
- Avoid over-estimating the stack size (per Willy).
- Add the link to a test program to validate the alignment (per Zhangjin).
Changes since v1:
- Change 'sub $12, %esp' to 'sub $(16 - 4), %esp' (per Zhangjin).
- Fix the reference format (per Thomas).
- Explain more about the logic behind the fix (per Thomas).
- Append an Acked-by tag from Thomas.
Signed-off-by: Ammar Faizi <[email protected]>
---
Ammar Faizi (1):
tools/nolibc: i386: Fix a stack misalign bug on _start
tools/include/nolibc/arch-i386.h | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
base-commit: 6269320850097903b30be8f07a5c61d9f7592393
--
Ammar Faizi
^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH v3 1/1] tools/nolibc: i386: Fix a stack misalign bug on _start
2023-08-30 1:02 [PATCH v3 0/1] Fix a stack misalign bug on _start Ammar Faizi
@ 2023-08-30 1:02 ` Ammar Faizi
2023-08-30 3:45 ` [PATCH v3 0/1] " Willy Tarreau
1 sibling, 0 replies; 4+ messages in thread
From: Ammar Faizi @ 2023-08-30 1:02 UTC (permalink / raw)
To: Willy Tarreau, Thomas Weißschuh
Cc: Ammar Faizi, Zhangjin Wu, Nicholas Rosenberg,
Michael William Jonathan, GNU/Weeb Mailing List,
Linux Kernel Mailing List
The ABI mandates that the %esp register must be a multiple of 16 when
executing a 'call' instruction.
Commit 2ab446336b17 ("tools/nolibc: i386: shrink _start with _start_c")
simplified the _start function, but it didn't take care of the %esp
alignment, causing SIGSEGV on SSE and AVX programs that use aligned move
instruction (e.g., movdqa, movaps, and vmovdqa).
The 'and $-16, %esp' aligns the %esp at a multiple of 16. Then 'push
%eax' will subtract the %esp by 4; thus, it breaks the 16-byte
alignment. Make sure the %esp is correctly aligned after the push by
subtracting 12 before the push.
Extra:
Add 'add $12, %esp' before the 'and $-16, %esp' to avoid over-estimating
for particular cases as suggested by Willy.
A test program to validate the %esp alignment on _start can be found at:
https://lore.kernel.org/lkml/[email protected]
Cc: Zhangjin Wu <[email protected]>
Fixes: 2ab446336b17aad362c6decee29b4efd83a01979 ("tools/nolibc: i386: shrink _start with _start_c")
Reported-by: Nicholas Rosenberg <[email protected]>
Acked-by: Thomas Weißschuh <[email protected]>
Signed-off-by: Ammar Faizi <[email protected]>
---
tools/include/nolibc/arch-i386.h | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/tools/include/nolibc/arch-i386.h b/tools/include/nolibc/arch-i386.h
index 64415b9fac77f996..28c26a00a7625f55 100644
--- a/tools/include/nolibc/arch-i386.h
+++ b/tools/include/nolibc/arch-i386.h
@@ -167,7 +167,9 @@ void __attribute__((weak, noreturn, optimize("Os", "omit-frame-pointer"))) __no_
__asm__ volatile (
"xor %ebp, %ebp\n" /* zero the stack frame */
"mov %esp, %eax\n" /* save stack pointer to %eax, as arg1 of _start_c */
- "and $-16, %esp\n" /* last pushed argument must be 16-byte aligned */
+ "add $12, %esp\n" /* avoid over-estimating after the 'and' & 'sub' below */
+ "and $-16, %esp\n" /* the %esp must be 16-byte aligned on 'call' */
+ "sub $12, %esp\n" /* sub 12 to keep it aligned after the push %eax */
"push %eax\n" /* push arg1 on stack to support plain stack modes too */
"call _start_c\n" /* transfer to c runtime */
"hlt\n" /* ensure it does not return */
--
Ammar Faizi
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH v3 0/1] Fix a stack misalign bug on _start
2023-08-30 1:02 [PATCH v3 0/1] Fix a stack misalign bug on _start Ammar Faizi
2023-08-30 1:02 ` [PATCH v3 1/1] tools/nolibc: i386: " Ammar Faizi
@ 2023-08-30 3:45 ` Willy Tarreau
2023-08-30 4:07 ` Alviro Iskandar Setiawan
1 sibling, 1 reply; 4+ messages in thread
From: Willy Tarreau @ 2023-08-30 3:45 UTC (permalink / raw)
To: Ammar Faizi
Cc: Thomas Weißschuh, Zhangjin Wu, Nicholas Rosenberg,
Michael William Jonathan, GNU/Weeb Mailing List,
Linux Kernel Mailing List
On Wed, Aug 30, 2023 at 08:02:22AM +0700, Ammar Faizi wrote:
> Hi Willy,
>
> This is a v3 revision.
>
> The ABI mandates that the %esp register must be a multiple of 16 when
> executing a 'call' instruction.
(...)
Thanks Ammar, now queued into the fixes branch.
Willy
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH v3 0/1] Fix a stack misalign bug on _start
2023-08-30 3:45 ` [PATCH v3 0/1] " Willy Tarreau
@ 2023-08-30 4:07 ` Alviro Iskandar Setiawan
0 siblings, 0 replies; 4+ messages in thread
From: Alviro Iskandar Setiawan @ 2023-08-30 4:07 UTC (permalink / raw)
To: Willy Tarreau
Cc: Ammar Faizi, Thomas Weißschuh, Zhangjin Wu,
Nicholas Rosenberg, Michael William Jonathan,
GNU/Weeb Mailing List, Linux Kernel Mailing List
On Wed, 30 Aug 2023 05:45:16 +0200, Willy Tarreau wrote:
> Thanks Ammar, now queued into the fixes branch.
Reviewed-by: Alviro Iskandar Setiawan <[email protected]>
-- Viro
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2023-08-30 4:07 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-08-30 1:02 [PATCH v3 0/1] Fix a stack misalign bug on _start Ammar Faizi
2023-08-30 1:02 ` [PATCH v3 1/1] tools/nolibc: i386: " Ammar Faizi
2023-08-30 3:45 ` [PATCH v3 0/1] " Willy Tarreau
2023-08-30 4:07 ` Alviro Iskandar Setiawan
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox