public inbox for [email protected]
 help / color / mirror / Atom feed
* [PATCH v2 0/1] Fix a stack misalign bug on _start
@ 2023-08-28  7:02 Ammar Faizi
  2023-08-28  7:02 ` [PATCH v2 1/1] tools/nolibc: i386: " Ammar Faizi
  0 siblings, 1 reply; 7+ messages in thread
From: Ammar Faizi @ 2023-08-28  7:02 UTC (permalink / raw)
  To: Willy Tarreau, Thomas Weißschuh
  Cc: Ammar Faizi, Zhangjin Wu, Nicholas Rosenberg,
	Michael William Jonathan, GNU/Weeb Mailing List,
	Linux Kernel Mailing List

Hi Willy,

This is a v2 revision.

The ABI mandates that the %esp register must be a multiple of 16 when
executing a 'call' instruction.

Commit 2ab446336b17 ("tools/nolibc: i386: shrink _start with _start_c")
simplified the _start function, but it didn't take care of the %esp
alignment, causing SIGSEGV on SSE and AVX programs that use aligned move
instruction (e.g., movdqa, movaps, and vmovdqa).

  $eax   : 0x56559000  →  0x00003f90
  $ebx   : 0x56559000  →  0x00003f90
  $ecx   : 0x1
  $edx   : 0xf7fcaaa0  →   endbr32 
  $esp   : 0xffffcdbc  →  0x00000001
  $ebp   : 0x0
  $esi   : 0xffffce7c  →  0xffffd096
  $edi   : 0x56556060  →  <_start+0> xor %ebp, %ebp
  $eip   : 0x56556489  →  <sse_pq_add+25> movaps %xmm0, 0x30(%esp)

    <sse_pq_add+11>  pop    %eax
    <sse_pq_add+12>  add    $0x2b85, %eax
    <sse_pq_add+18>  movups -0x1fd0(%eax), %xmm0
  → <sse_pq_add+25>  movaps %xmm0, 0x30(%esp)     <== trapping instruction
    <sse_pq_add+30>  movups -0x1fe0(%eax), %xmm1
    <sse_pq_add+37>  movaps %xmm1, 0x20(%esp)
    <sse_pq_add+42>  movups -0x1ff0(%eax), %xmm2
    <sse_pq_add+49>  movaps %xmm2, 0x10(%esp)
    <sse_pq_add+54>  movups -0x2000(%eax), %xmm3

  [#0] Id 1, Name: "test", stopped 0x56556489 in sse_pq_add (), reason: SIGSEGV

  (gdb)  bt
  #0  0x56556489 in sse_pq_add ()

Ensure the %esp is a multiple of 16 when executing the call instruction.

Changes since v1:
  - Change 'sub $12, %esp' to 'sub $(16 - 4), %esp' (per Zhangjin).
  - Fix the reference format (per Thomas).
  - Explain more about the logic behind the fix (per Thomas).
  - Append an Acked-by tag from Thomas.

Signed-off-by: Ammar Faizi <[email protected]>
---

Ammar Faizi (1):
  tools/nolibc: i386: Fix a stack misalign bug on _start

 tools/include/nolibc/arch-i386.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)


base-commit: 6269320850097903b30be8f07a5c61d9f7592393
-- 
Ammar Faizi


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH v2 1/1] tools/nolibc: i386: Fix a stack misalign bug on _start
  2023-08-28  7:02 [PATCH v2 0/1] Fix a stack misalign bug on _start Ammar Faizi
@ 2023-08-28  7:02 ` Ammar Faizi
  2023-08-29  6:21   ` Willy Tarreau
  2023-08-29  8:37   ` Zhangjin Wu
  0 siblings, 2 replies; 7+ messages in thread
From: Ammar Faizi @ 2023-08-28  7:02 UTC (permalink / raw)
  To: Willy Tarreau, Thomas Weißschuh
  Cc: Ammar Faizi, Zhangjin Wu, Nicholas Rosenberg,
	Michael William Jonathan, GNU/Weeb Mailing List,
	Linux Kernel Mailing List

The ABI mandates that the %esp register must be a multiple of 16 when
executing a 'call' instruction.

Commit 2ab446336b17 ("tools/nolibc: i386: shrink _start with _start_c")
simplified the _start function, but it didn't take care of the %esp
alignment, causing SIGSEGV on SSE and AVX programs that use aligned move
instruction (e.g., movdqa, movaps, and vmovdqa).

The 'and $-16, %esp' aligns the %esp at a multiple of 16. Then 'push
%eax' will subtract the %esp by 4; thus, it breaks the 16-byte
alignment. Make sure the %esp is correctly aligned after the push by
subtracting 12 before the push.

Cc: Zhangjin Wu <[email protected]>
Fixes: 2ab446336b17aad362c6decee29b4efd83a01979 ("tools/nolibc: i386: shrink _start with _start_c")
Reported-by: Nicholas Rosenberg <[email protected]>
Acked-by: Thomas Weißschuh <[email protected]>
Signed-off-by: Ammar Faizi <[email protected]>
---
 tools/include/nolibc/arch-i386.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/include/nolibc/arch-i386.h b/tools/include/nolibc/arch-i386.h
index 64415b9fac77f996..8079974424fa18b0 100644
--- a/tools/include/nolibc/arch-i386.h
+++ b/tools/include/nolibc/arch-i386.h
@@ -167,7 +167,8 @@ void __attribute__((weak, noreturn, optimize("Os", "omit-frame-pointer"))) __no_
 	__asm__ volatile (
 		"xor  %ebp, %ebp\n"       /* zero the stack frame                                */
 		"mov  %esp, %eax\n"       /* save stack pointer to %eax, as arg1 of _start_c     */
-		"and  $-16, %esp\n"       /* last pushed argument must be 16-byte aligned        */
+		"and  $-16, %esp\n"       /* align stack to 16 bytes                             */
+		"sub  $(16 - 4), %esp\n"  /* push %eax breaks 16-byte alignment, so sub 12 bytes */
 		"push %eax\n"             /* push arg1 on stack to support plain stack modes too */
 		"call _start_c\n"         /* transfer to c runtime                               */
 		"hlt\n"                   /* ensure it does not return                           */
-- 
Ammar Faizi


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 1/1] tools/nolibc: i386: Fix a stack misalign bug on _start
  2023-08-28  7:02 ` [PATCH v2 1/1] tools/nolibc: i386: " Ammar Faizi
@ 2023-08-29  6:21   ` Willy Tarreau
  2023-08-29 12:17     ` Ammar Faizi
  2023-08-29  8:37   ` Zhangjin Wu
  1 sibling, 1 reply; 7+ messages in thread
From: Willy Tarreau @ 2023-08-29  6:21 UTC (permalink / raw)
  To: Ammar Faizi
  Cc: Thomas Weißschuh, Zhangjin Wu, Nicholas Rosenberg,
	Michael William Jonathan, GNU/Weeb Mailing List,
	Linux Kernel Mailing List

Hi Ammar,

On Mon, Aug 28, 2023 at 02:02:40PM +0700, Ammar Faizi wrote:
> The ABI mandates that the %esp register must be a multiple of 16 when
> executing a 'call' instruction.
> 
> Commit 2ab446336b17 ("tools/nolibc: i386: shrink _start with _start_c")
> simplified the _start function, but it didn't take care of the %esp
> alignment, causing SIGSEGV on SSE and AVX programs that use aligned move
> instruction (e.g., movdqa, movaps, and vmovdqa).
> 
> The 'and $-16, %esp' aligns the %esp at a multiple of 16. Then 'push
> %eax' will subtract the %esp by 4; thus, it breaks the 16-byte
> alignment. Make sure the %esp is correctly aligned after the push by
> subtracting 12 before the push.

Indeed, good catch! However if we want to do it cleany (i.e not punch a
16 to 28 byes hole in the stack), we should do this instead:

		add  $12, %esp   /* the stack must be aligned to 16 ... */
		and  $-16, %esp  /* ... bytes after eax is pushed and ... */
		sub  $12, %esp   /* ... before the call. */

This will only add 0 to 12 bytes depending on the existing alignment.

What do you think ?

thanks,
Willy

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH v2 1/1] tools/nolibc: i386: Fix a stack misalign bug on _start
  2023-08-28  7:02 ` [PATCH v2 1/1] tools/nolibc: i386: " Ammar Faizi
  2023-08-29  6:21   ` Willy Tarreau
@ 2023-08-29  8:37   ` Zhangjin Wu
  2023-08-29 12:25     ` Ammar Faizi
  1 sibling, 1 reply; 7+ messages in thread
From: Zhangjin Wu @ 2023-08-29  8:37 UTC (permalink / raw)
  To: ammarfaizi2; +Cc: falcon, gwml, inori, linux-kernel, linux, moe, w

Hi, Ammar

> The ABI mandates that the %esp register must be a multiple of 16 when
> executing a 'call' instruction.
> 
> Commit 2ab446336b17 ("tools/nolibc: i386: shrink _start with _start_c")
> simplified the _start function, but it didn't take care of the %esp
> alignment, causing SIGSEGV on SSE and AVX programs that use aligned move
> instruction (e.g., movdqa, movaps, and vmovdqa).
>

As Thomas suggested and you replied, since there is no public report
link, what about uses the link which has test code at the end of above
paragraph.

    ...
    instruction (e.g., movdqa, movaps, and vmovdqa) [1].

> The 'and $-16, %esp' aligns the %esp at a multiple of 16. Then 'push
> %eax' will subtract the %esp by 4; thus, it breaks the 16-byte
> alignment. Make sure the %esp is correctly aligned after the push by
> subtracting 12 before the push.
> 

And at the end of commit message:

    [1]: https://lore.kernel.org/lkml/[email protected]/

This test code may be important for future change verification.

Thanks,
Zhangjin

> Cc: Zhangjin Wu <[email protected]>
> Fixes: 2ab446336b17aad362c6decee29b4efd83a01979 ("tools/nolibc: i386: shrink _start with _start_c")
> Reported-by: Nicholas Rosenberg <[email protected]>
> Acked-by: Thomas Weißschuh <[email protected]>
> Signed-off-by: Ammar Faizi <[email protected]>
> ---
>  tools/include/nolibc/arch-i386.h | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/include/nolibc/arch-i386.h b/tools/include/nolibc/arch-i386.h
> index 64415b9fac77f996..8079974424fa18b0 100644
> --- a/tools/include/nolibc/arch-i386.h
> +++ b/tools/include/nolibc/arch-i386.h
> @@ -167,7 +167,8 @@ void __attribute__((weak, noreturn, optimize("Os", "omit-frame-pointer"))) __no_
>  	__asm__ volatile (
>  		"xor  %ebp, %ebp\n"       /* zero the stack frame                                */
>  		"mov  %esp, %eax\n"       /* save stack pointer to %eax, as arg1 of _start_c     */
> -		"and  $-16, %esp\n"       /* last pushed argument must be 16-byte aligned        */
> +		"and  $-16, %esp\n"       /* align stack to 16 bytes                             */
> +		"sub  $(16 - 4), %esp\n"  /* push %eax breaks 16-byte alignment, so sub 12 bytes */
>  		"push %eax\n"             /* push arg1 on stack to support plain stack modes too */
>  		"call _start_c\n"         /* transfer to c runtime                               */
>  		"hlt\n"                   /* ensure it does not return                           */
> -- 
> Ammar Faizi

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 1/1] tools/nolibc: i386: Fix a stack misalign bug on _start
  2023-08-29  6:21   ` Willy Tarreau
@ 2023-08-29 12:17     ` Ammar Faizi
  2023-08-29 12:24       ` Willy Tarreau
  0 siblings, 1 reply; 7+ messages in thread
From: Ammar Faizi @ 2023-08-29 12:17 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Thomas Weißschuh, Zhangjin Wu, Nicholas Rosenberg,
	Michael William Jonathan, GNU/Weeb Mailing List,
	Linux Kernel Mailing List

On Tue, Aug 29, 2023 at 08:21:47AM +0200, Willy Tarreau wrote:
> Indeed, good catch! However if we want to do it cleany (i.e not punch a
> 16 to 28 byes hole in the stack), we should do this instead:
> 
> 		add  $12, %esp   /* the stack must be aligned to 16 ... */
> 		and  $-16, %esp  /* ... bytes after eax is pushed and ... */
> 		sub  $12, %esp   /* ... before the call. */
> 
> This will only add 0 to 12 bytes depending on the existing alignment.
> 
> What do you think ?

Good point. I'll send a v3 revision tomorrow.

I just saw that Linus has pulled the PR from Shuah that contains this
bug. IOW, I missed this fix for the 6.6 merge window. Let's see if it
can go to 6.6-rc2. Or maybe sooner than that.

-- 
Ammar Faizi


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 1/1] tools/nolibc: i386: Fix a stack misalign bug on _start
  2023-08-29 12:17     ` Ammar Faizi
@ 2023-08-29 12:24       ` Willy Tarreau
  0 siblings, 0 replies; 7+ messages in thread
From: Willy Tarreau @ 2023-08-29 12:24 UTC (permalink / raw)
  To: Ammar Faizi
  Cc: Thomas Weißschuh, Zhangjin Wu, Nicholas Rosenberg,
	Michael William Jonathan, GNU/Weeb Mailing List,
	Linux Kernel Mailing List

On Tue, Aug 29, 2023 at 07:17:10PM +0700, Ammar Faizi wrote:
> On Tue, Aug 29, 2023 at 08:21:47AM +0200, Willy Tarreau wrote:
> > Indeed, good catch! However if we want to do it cleany (i.e not punch a
> > 16 to 28 byes hole in the stack), we should do this instead:
> > 
> > 		add  $12, %esp   /* the stack must be aligned to 16 ... */
> > 		and  $-16, %esp  /* ... bytes after eax is pushed and ... */
> > 		sub  $12, %esp   /* ... before the call. */
> > 
> > This will only add 0 to 12 bytes depending on the existing alignment.
> > 
> > What do you think ?
> 
> Good point. I'll send a v3 revision tomorrow.

OK thanks!

> I just saw that Linus has pulled the PR from Shuah that contains this
> bug. IOW, I missed this fix for the 6.6 merge window. Let's see if it
> can go to 6.6-rc2. Or maybe sooner than that.

No worries, we all know that -rc1 gets more exposure than individual
branches and raises bugs like this one.

Cheers,
Willy

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 1/1] tools/nolibc: i386: Fix a stack misalign bug on _start
  2023-08-29  8:37   ` Zhangjin Wu
@ 2023-08-29 12:25     ` Ammar Faizi
  0 siblings, 0 replies; 7+ messages in thread
From: Ammar Faizi @ 2023-08-29 12:25 UTC (permalink / raw)
  To: Zhangjin Wu
  Cc: Willy Tarreau, Thomas Weißschuh, Nicholas Rosenberg,
	Michael William Jonathan, GNU/Weeb Mailing List,
	Linux Kernel Mailing List

On Tue, Aug 29, 2023 at 04:37:32PM +0800, Zhangjin Wu wrote:
> As Thomas suggested and you replied, since there is no public report
> link, what about uses the link which has test code at the end of above
> paragraph.
> 
>     ...
>     instruction (e.g., movdqa, movaps, and vmovdqa) [1].

OK, I'll add the link in v3.

-- 
Ammar Faizi


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2023-08-29 12:25 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-08-28  7:02 [PATCH v2 0/1] Fix a stack misalign bug on _start Ammar Faizi
2023-08-28  7:02 ` [PATCH v2 1/1] tools/nolibc: i386: " Ammar Faizi
2023-08-29  6:21   ` Willy Tarreau
2023-08-29 12:17     ` Ammar Faizi
2023-08-29 12:24       ` Willy Tarreau
2023-08-29  8:37   ` Zhangjin Wu
2023-08-29 12:25     ` Ammar Faizi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox