GNU/Weeb Mailing List <[email protected]>
From: Willy Tarreau <[email protected]>
To: Ammar Faizi <[email protected]>
Cc: Shuah Khan <[email protected]>,
	"Paul E. McKenney" <[email protected]>,
	Gilang Fachrezy <[email protected]>,
	VNLX Kernel Department <[email protected]>,
	Alviro Iskandar Setiawan <[email protected]>,
	Kanna Scarlet <[email protected]>,
	Muhammad Rizki <[email protected]>,
	GNU/Weeb Mailing List <[email protected]>,
	Linux Kernel Mailing List <[email protected]>,
	Linux Kselftest Mailing List <[email protected]>
Subject: Re: [RFC PATCH v1 0/8] nolibc signal handling support
Date: Tue, 27 Dec 2022 07:26:40 +0100
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>

Hi Ammar,

On Thu, Dec 22, 2022 at 08:46:15PM +0700, Ammar Faizi wrote:
> I agree with following the @envp pointer to get the auxv. I was
> trying to wire up a new function '__start' (with double underscores)
> written in C that accepts @argc, @argv and @envp. Then it calls 'main'.
> Then we call '__start' instead of 'main' from '_start'. This way, we
> can arrange nolibc-defined data without touching Assembly much in
> '__start' (before main).
> 
> But then I noticed that it wouldn't work because we may have users
> who define the 'main' function differently, e.g.:
> 
>     int main(void);
>     int main(int argc, char **argv);
>     int main(int argc, char **argv, char **envp);
> 
> So '__start' can't call main. We still need to call the main from the
> inline Assembly (from '_start').

Yes, and quite frankly I'd prefer to keep that as simple as possible.
Doing just a simple loop in the _start code is trivial. The main
concern was where to store the data. Until now the .bss section was
optional: we didn't save environ, and errno was optional. But let's
be honest, while that does allow for writing the smallest programs,
most programs will have at least one global variable and will get
this section anyway, so we don't save anything in practice. This
concern was valid back when I was building tiny executables to run
from floppies, where each byte mattered, but it's pointless now.
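
For illustration, the loop in question could look roughly like this if
written in C. This is only a sketch: find_auxv() is a made-up name, and it
assumes the usual Linux initial stack layout where the auxiliary vector
starts right after the NULL that terminates envp.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of nolibc): given envp as found on the
 * initial stack, skip all environment strings and return the address of
 * the first word after the terminating NULL, which is where the kernel
 * places the auxiliary vector on Linux.
 */
static long *find_auxv(char **envp)
{
	char **p = envp;

	while (*p)		/* walk over every "NAME=value" pointer */
		p++;

	return (long *)(p + 1);	/* first word after the NULL terminator */
}
```

With an empty environment the function returns the word right after envp[0],
since the only thing to skip is the NULL itself.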

Thus what I'm proposing is to switch to weak symbol definitions for
errno, environ, and auxv. I did a quick test to make sure that the same
symbol is properly used when accessed from two units, and that's OK: I'm
seeing the same instance in all of them (which is better than the current
situation where errno is static, hence per-unit).
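
As a rough sketch of the idea, assuming GCC or Clang: a variable defined in
a header with the weak attribute can be included from several units, and the
linker keeps a single instance instead of complaining about duplicates. The
demo_* names below are invented for the example so that they do not clash
with a hosted libc's errno or environ.

```c
#include <stddef.h>

/*
 * Header-style definitions: every unit including this text emits a weak
 * definition, and the linker merges them into one object. A "static int"
 * here would instead give each unit its own private copy.
 */
int demo_errno __attribute__((weak));
char **demo_environ __attribute__((weak));

/* Hypothetical accessors, only there to exercise the shared variable. */
static inline void demo_set_errno(int v)
{
	demo_errno = v;
}

static inline int demo_get_errno(void)
{
	return demo_errno;
}
```

A value stored through demo_set_errno() in one unit is then visible through
demo_get_errno() in any other unit that includes the same definitions.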

My quick-and-dirty test looks like this:

diff --git a/arch-x86_64.h b/arch-x86_64.h
index e780fdf..73f7b5f 100644
--- a/arch-x86_64.h
+++ b/arch-x86_64.h
@@ -209,6 +209,9 @@ struct sys_stat_struct {
        _ret;                                                                 \
 })
 
+char **environ __attribute__((weak,unused));
+long *auxv __attribute__((weak,unused));
+
 /* startup code */
 /*
  * x86-64 System V ABI mandates:
@@ -218,11 +221,17 @@ struct sys_stat_struct {
  */
 asm(".section .text\n"
     ".weak _start\n"
     "_start:\n"
     "pop %rdi\n"                // argc   (first arg, %rdi)
     "mov %rsp, %rsi\n"          // argv[] (second arg, %rsi)
     "lea 8(%rsi,%rdi,8),%rdx\n" // then a NULL then envp (third arg, %rdx)
+    "mov %rdx, environ\n"       // save environ
     "xor %ebp, %ebp\n"          // zero the stack frame
+    "mov %rdx, %rax\n"          // search for auxv (follows NULL after last en>
+    "0: add $8, %rax\n"
+    "   cmp -8(%rax), %rbp\n"
+    "   jnz 0b\n"
+    "mov %rax, auxv\n"          // save auxv
     "and $-16, %rsp\n"          // x86 ABI : esp must be 16-byte aligned befor>
     "call main\n"               // main() returns the status code, we'll exit >
     "mov %eax, %edi\n"          // retrieve exit code (32 bit)

diff --git a/errno.h b/errno.h
index df0e473..9781077 100644
--- a/errno.h
+++ b/errno.h
@@ -29,7 +29,8 @@
 #include <asm/errno.h>
 
 /* this way it will be removed if unused */
-static int errno;
+//static int errno;
+int errno __attribute__((weak));
 
 #ifndef NOLIBC_IGNORE_ERRNO
 #define SET_ERRNO(v) do { errno = (v); } while (0)

$ cat a.c
#include "nolibc.h"

extern void b(void);

int main(int argc, char **argv, char **envp)
{
        //environ = envp;
        errno = 1234;
        printf("main(): errno=%d env(TERM)=%s auxv=%p auxv[0].t=0x%lx auxv[0].v=0x%lx\n",
               errno, getenv("TERM"), auxv, auxv?auxv[0]:0, auxv?auxv[1]:0);
        b();
        return 0;
}

$ cat b.c
#include "nolibc.h"

void b(void)
{
        long *v = auxv;

        printf("b(): errno=%d env(TERM)=%s auxv=%p auxv[0].t=0x%lx auxv[0].v=0x%lx\n",
               errno, getenv("TERM"), auxv, auxv?auxv[0]:0, auxv?auxv[1]:0);

        printf("auxv:\n");
        while (v && v[0]) {
                printf("  0x%lx: 0x%lx\n", v[0], v[1]);
                v += 2;
        }
}

$ gcc -Os -fno-asynchronous-unwind-tables -include /g/public/nolibc/nolibc.h -Wall -nostdlib -static  -o ab a.c b.c

$ nm --size ab
0000000000000004 V errno
0000000000000008 V auxv
0000000000000008 V environ
0000000000000014 W memset
0000000000000018 W memcpy
0000000000000018 W raise
000000000000001b W abort
0000000000000030 W memmove
0000000000000053 t u64toa_r
0000000000000053 t u64toa_r
0000000000000082 T main
00000000000000a4 T b
0000000000000289 t printf
000000000000028c t printf.constprop.0

$ ./ab
main(): errno=1234 env(TERM)=xterm auxv=0x7ffdd0c31df8 auxv[0].t=0x21 auxv[0].v=0x7ffdd0d56000
b(): errno=1234 env(TERM)=xterm auxv=0x7ffdd0c31df8 auxv[0].t=0x21 auxv[0].v=0x7ffdd0d56000
auxv:
  0x21: 0x7ffdd0d56000
  0x10: 0xbfebfbff
  0x6: 0x1000
  0x11: 0x64
  0x3: 0x400040
  0x4: 0x38
  0x5: 0x7
  0x7: 0x0
  0x8: 0x0
  0x9: 0x401082
  0xb: 0x1fd
  0xc: 0x1fd
  0xd: 0x64
  0xe: 0x64
  0x17: 0x0
  0x19: 0x7ffdd0c31f39
  0x1a: 0x2
  0x1f: 0x7ffdd0c33ff3
  0xf: 0x7ffdd0c31f49

Note that I could verify that some of the entries above are valid
(e.g. "x86_64" in 0xf = AT_PLATFORM).

Thus now my focus will be on storing these variables where relevant
for all archs, so that your getauxval() implementation works on top
of it. It will be much cleaner, and it will also make programs easier
to write and more reliable.
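
To make the intent concrete, a lookup built on a saved auxv pointer might
look something like the sketch below. It is not the getauxval() from the
series: sketch_getauxval() is a made-up name, the vector is passed as an
argument rather than read from the global, and unlike the libc function it
does not set errno when the type is missing.

```c
#include <stddef.h>

/*
 * Hypothetical lookup over an auxiliary vector laid out as consecutive
 * (type, value) pairs of longs, terminated by a pair whose type is 0
 * (AT_NULL). Returns the value for @type, or 0 if it is not present.
 */
static unsigned long sketch_getauxval(const long *auxv, unsigned long type)
{
	if (!auxv)
		return 0;

	for (; auxv[0]; auxv += 2) {
		if ((unsigned long)auxv[0] == type)
			return (unsigned long)auxv[1];
	}

	return 0;	/* not found; a real getauxval() also sets errno */
}
```

Fed with the dump above, looking up type 0x6 (AT_PAGESZ) would yield 0x1000.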

Cheers,
Willy

PS: maybe we should trim the Cc list for future exchanges.

