public inbox for [email protected]
From: Ammar Faizi <[email protected]>
To: "Willy Tarreau" <[email protected]>, "Thomas Weißschuh" <[email protected]>
Cc: Ammar Faizi <[email protected]>,
	David Laight <[email protected]>,
	Nicholas Rosenberg <[email protected]>,
	Alviro Iskandar Setiawan <[email protected]>,
	Michael William Jonathan <[email protected]>,
	GNU/Weeb Mailing List <[email protected]>,
	Linux Kernel Mailing List <[email protected]>
Subject: [RFC PATCH v2 0/4] nolibc x86-64 string functions
Date: Sat,  2 Sep 2023 12:50:41 +0700
Message-ID: <[email protected]>

Hi Willy,

This is an RFC patchset v2 for nolibc x86-64 string functions.

Changes in v2:
  - Shrink the memset code size:
      - Use pushq %rdi / popq %rax (Alviro).
      - Use xchg %eax, %esi (Willy).
  - Drop the memcmp patch (it needs more thought).
  - Fix the broken memmove implementation (David).

There are 4 patches in this series:

## Patch 1-2: Use `rep movsb`, `rep stosb` for:
    - memcpy() and memmove()
    - memset()
respectively. This simplifies and shrinks the generated assembly code.
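
For readers unfamiliar with the string instructions, here is a minimal
sketch of the idea, written as GCC-style extended inline asm. This is an
illustration only, not the code from the patches: the names
sketch_memcpy() and sketch_memset() are hypothetical, the form of the
asm is an assumption, and the non-x86-64 branch exists only so the
sketch builds elsewhere. It also assumes the direction flag is clear on
entry, as the x86-64 System V ABI requires.

```c
#include <stddef.h>

/* Hypothetical helper: copy len bytes forward with `rep movsb`. */
static void *sketch_memcpy(void *dst, const void *src, size_t len)
{
	void *ret = dst;

#if defined(__x86_64__)
	/* rep movsb copies %rcx bytes from (%rsi) to (%rdi). */
	__asm__ volatile ("rep movsb"
			  : "+D"(dst), "+S"(src), "+c"(len)
			  :
			  : "memory");
#else
	/* Portable fallback so the sketch compiles on other targets. */
	unsigned char *d = dst;
	const unsigned char *s = src;

	while (len--)
		*d++ = *s++;
#endif
	return ret;
}

/* Hypothetical helper: fill len bytes with `rep stosb`. */
static void *sketch_memset(void *dst, int c, size_t len)
{
	void *ret = dst;

#if defined(__x86_64__)
	/* rep stosb stores %al to (%rdi), %rcx times. */
	__asm__ volatile ("rep stosb"
			  : "+D"(dst), "+c"(len)
			  : "a"(c)
			  : "memory");
#else
	/* Portable fallback so the sketch compiles on other targets. */
	unsigned char *d = dst;

	while (len--)
		*d++ = (unsigned char)c;
#endif
	return ret;
}
```

Both helpers save the destination first because the string instructions
advance %rdi, while the C functions must return the original pointer.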

Patches 3 and 4 are unrelated to the above; they are small cleanups.

## Patch 3: Remove the `_nolibc_memcpy_down()` function
This nolibc-internal function is unused, so delete it. It was probably
meant to handle memmove(), but memmove() now has its own
implementation.

## Patch 4: Remove the `_nolibc_memcpy_up()` function
This function is only called by memcpy(), so there is no real reason to
keep it as a separate wrapper. Delete it and move its code into memcpy()
directly.

Before this series:
```
  0000000000401058 <memmove>:
    401058: 48 89 f8              movq   %rdi,%rax
    40105b: 31 c9                 xorl   %ecx,%ecx
    40105d: 48 39 f7              cmpq   %rsi,%rdi
    401060: 48 83 d1 ff           adcq   $0xffffffffffffffff,%rcx
    401064: 48 85 d2              testq  %rdx,%rdx
    401067: 74 25                 je     40108e <memmove+0x36>
    401069: 48 83 c9 01           orq    $0x1,%rcx
    40106d: 48 39 f0              cmpq   %rsi,%rax
    401070: 48 c7 c7 ff ff ff ff  movq   $0xffffffffffffffff,%rdi
    401077: 48 0f 43 fa           cmovaeq %rdx,%rdi
    40107b: 48 01 cf              addq   %rcx,%rdi
    40107e: 44 8a 04 3e           movb   (%rsi,%rdi,1),%r8b
    401082: 44 88 04 38           movb   %r8b,(%rax,%rdi,1)
    401086: 48 01 cf              addq   %rcx,%rdi
    401089: 48 ff ca              decq   %rdx
    40108c: 75 f0                 jne    40107e <memmove+0x26>
    40108e: c3                    retq

  000000000040108f <memcpy>:
    40108f: 48 89 f8              movq   %rdi,%rax
    401092: 48 85 d2              testq  %rdx,%rdx
    401095: 74 12                 je     4010a9 <memcpy+0x1a>
    401097: 31 c9                 xorl   %ecx,%ecx
    401099: 40 8a 3c 0e           movb   (%rsi,%rcx,1),%dil
    40109d: 40 88 3c 08           movb   %dil,(%rax,%rcx,1)
    4010a1: 48 ff c1              incq   %rcx
    4010a4: 48 39 ca              cmpq   %rcx,%rdx
    4010a7: 75 f0                 jne    401099 <memcpy+0xa>
    4010a9: c3                    retq

  00000000004010aa <memset>:
    4010aa: 48 89 f8              movq   %rdi,%rax
    4010ad: 48 85 d2              testq  %rdx,%rdx
    4010b0: 74 0e                 je     4010c0 <memset+0x16>
    4010b2: 31 c9                 xorl   %ecx,%ecx
    4010b4: 40 88 34 08           movb   %sil,(%rax,%rcx,1)
    4010b8: 48 ff c1              incq   %rcx
    4010bb: 48 39 ca              cmpq   %rcx,%rdx
    4010be: 75 f4                 jne    4010b4 <memset+0xa>
    4010c0: c3                    retq
```

After this series:
```
  0000000000401040 <memmove>:
    401040: 48 89 d1              movq   %rdx,%rcx
    401043: 48 89 fa              movq   %rdi,%rdx
    401046: 48 89 f8              movq   %rdi,%rax
    401049: 48 29 f2              subq   %rsi,%rdx
    40104c: 48 39 ca              cmpq   %rcx,%rdx
    40104f: 73 0f                 jae    401060 <memmove+0x20>
    401051: 48 8d 7c 0f ff        leaq   -0x1(%rdi,%rcx,1),%rdi
    401056: 48 8d 74 0e ff        leaq   -0x1(%rsi,%rcx,1),%rsi
    40105b: fd                    std
    40105c: f3 a4                 rep movsb %ds:(%rsi),%es:(%rdi)
    40105e: fc                    cld
    40105f: c3                    retq
    401060: f3 a4                 rep movsb %ds:(%rsi),%es:(%rdi)
    401062: c3                    retq

  0000000000401063 <memcpy>:
    401063: 48 89 f8              movq   %rdi,%rax
    401066: 48 89 d1              movq   %rdx,%rcx
    401069: f3 a4                 rep movsb %ds:(%rsi),%es:(%rdi)
    40106b: c3                    retq

  000000000040106c <memset>:
    40106c: 96                    xchgl  %eax,%esi
    40106d: 48 89 d1              movq   %rdx,%rcx
    401070: 57                    pushq  %rdi
    401071: f3 aa                 rep stosb %al,%es:(%rdi)
    401073: 58                    popq   %rax
    401074: c3                    retq
```
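
The direction check in the new memmove listing above (subq, cmpq, jae)
can be read as a single unsigned comparison: copy forward when
(dst - src) >= len, otherwise copy backward. When dst is below src the
subtraction wraps to a huge value, so the forward path is taken. A
minimal C sketch of that logic follows. The name sketch_memmove() is
hypothetical, and plain byte loops stand in for `rep movsb`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper mirroring the direction check in the listing. */
static void *sketch_memmove(void *dst, const void *src, size_t len)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	/*
	 * Unsigned (dst - src) >= len means a forward copy cannot
	 * overwrite source bytes that have not been read yet.
	 */
	if ((uintptr_t)d - (uintptr_t)s >= len) {
		for (size_t i = 0; i < len; i++)
			d[i] = s[i];
	} else {
		/* dst overlaps the tail of src: copy from the last byte down. */
		while (len--)
			d[len] = s[len];
	}
	return dst;
}
```

The backward loop corresponds to the std / rep movsb / cld sequence,
where both pointers are first moved to the last byte of each buffer.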

Signed-off-by: Ammar Faizi <[email protected]>
---

Ammar Faizi (4):
  tools/nolibc: x86-64: Use `rep movsb` for `memcpy()` and `memmove()`
  tools/nolibc: x86-64: Use `rep stosb` for `memset()`
  tools/nolibc: string: Remove the `_nolibc_memcpy_down()` function
  tools/nolibc: string: Remove the `_nolibc_memcpy_up()` function

 tools/include/nolibc/arch-x86_64.h | 48 ++++++++++++++++++++++++++++++
 tools/include/nolibc/string.h      | 36 ++++++++--------------
 2 files changed, 61 insertions(+), 23 deletions(-)


base-commit: 3c9b7c4a228bf8cca2f92abb65575cdd54065302
-- 
Ammar Faizi


Thread overview: 12+ messages
2023-09-02  5:50 Ammar Faizi [this message]
2023-09-02  5:50 ` [RFC PATCH v2 1/4] tools/nolibc: x86-64: Use `rep movsb` for `memcpy()` and `memmove()` Ammar Faizi
2023-09-02  6:07   ` Alviro Iskandar Setiawan
2023-09-02  6:11     ` Ammar Faizi
2023-09-02  6:22       ` Willy Tarreau
2023-09-02  6:37         ` Ammar Faizi
2023-09-02 12:29           ` Alviro Iskandar Setiawan
2023-09-02 12:36             ` Ammar Faizi
2023-09-03 20:35               ` David Laight
2023-09-02  5:50 ` [RFC PATCH v2 2/4] tools/nolibc: x86-64: Use `rep stosb` for `memset()` Ammar Faizi
2023-09-02  5:50 ` [RFC PATCH v2 3/4] tools/nolibc: string: Remove the `_nolibc_memcpy_down()` function Ammar Faizi
2023-09-02  5:50 ` [RFC PATCH v2 4/4] tools/nolibc: string: Remove the `_nolibc_memcpy_up()` function Ammar Faizi
