From: Ammar Faizi <[email protected]>
To: "Willy Tarreau" <[email protected]>, "Thomas Weißschuh" <[email protected]>
Cc: Ammar Faizi <[email protected]>,
David Laight <[email protected]>,
Nicholas Rosenberg <[email protected]>,
Alviro Iskandar Setiawan <[email protected]>,
Michael William Jonathan <[email protected]>,
GNU/Weeb Mailing List <[email protected]>,
Linux Kernel Mailing List <[email protected]>
Subject: [RFC PATCH v3 0/4] nolibc x86-64 string functions
Date: Sat, 2 Sep 2023 20:35:01 +0700
Message-ID: <[email protected]>
Hi Willy,
This is an RFC patchset v3 for nolibc x86-64 string functions.
There are 4 patches in this series:
## Patch 1-2: Use `rep movsb` and `rep stosb`
- `rep movsb` for memcpy() and memmove()
- `rep stosb` for memset()
Both simplify the generated ASM code.
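For readers unfamiliar with the instruction, here is a minimal sketch of what a `rep stosb` based memset() can look like in GNU C inline assembly. It is not the code from patch 2: the name `sketch_memset` is hypothetical, the constraints are one possible choice, and the byte loop is only a fallback so the sketch builds on other architectures. It assumes the direction flag is clear on entry, as the x86-64 ABI requires.

```c
#include <stddef.h>

/* Hypothetical name for illustration; not the function added by the patch. */
static void *sketch_memset(void *dst, int c, size_t len)
{
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
	void *p = dst;

	/*
	 * rep stosb stores %al to (%rdi) %rcx times and advances %rdi.
	 * Both %rdi and %rcx are modified, hence the "+" constraints.
	 */
	__asm__ volatile ("rep stosb"
			  : "+D"(p), "+c"(len)
			  : "a"(c)
			  : "memory");
#else
	/* Portable fallback: plain byte loop. */
	unsigned char *p = dst;

	while (len--)
		*p++ = (unsigned char)c;
#endif
	/* memset() returns the original destination pointer. */
	return dst;
}
```

Calling `sketch_memset(buf, 0, sizeof(buf))` zeroes `buf` and returns `buf`, like memset().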
Patches 3 and 4 are unrelated small cleanups.
## Patch 3: Remove the `_nolibc_memcpy_down()` function
This nolibc internal function is unused. Delete it. It was probably
meant to handle memmove(), but memmove() now has its own
implementation.
## Patch 4: Remove the `_nolibc_memcpy_up()` function
This function is only called by memcpy(), so there is no real reason to
keep the wrapper. Delete it and move its code into memcpy() directly.
Before this series:
```
00000000004013aa <memmove>:
4013aa: f3 0f 1e fa endbr64
4013ae: 48 39 f7 cmpq %rsi,%rdi
4013b1: 48 c7 c1 ff ff ff ff movq $0xffffffffffffffff,%rcx
4013b8: 48 89 f8 movq %rdi,%rax
4013bb: 48 0f 43 ca cmovaeq %rdx,%rcx
4013bf: 48 19 ff sbbq %rdi,%rdi
4013c2: 83 e7 02 andl $0x2,%edi
4013c5: 48 ff cf decq %rdi
4013c8: 48 85 d2 testq %rdx,%rdx
4013cb: 74 10 je 4013dd <memmove+0x33>
4013cd: 48 01 f9 addq %rdi,%rcx
4013d0: 48 ff ca decq %rdx
4013d3: 44 8a 04 0e movb (%rsi,%rcx,1),%r8b
4013d7: 44 88 04 08 movb %r8b,(%rax,%rcx,1)
4013db: eb eb jmp 4013c8 <memmove+0x1e>
4013dd: c3 retq
00000000004013de <memcpy>:
4013de: f3 0f 1e fa endbr64
4013e2: 48 89 f8 movq %rdi,%rax
4013e5: 31 c9 xorl %ecx,%ecx
4013e7: 48 39 ca cmpq %rcx,%rdx
4013ea: 74 0d je 4013f9 <memcpy+0x1b>
4013ec: 40 8a 3c 0e movb (%rsi,%rcx,1),%dil
4013f0: 40 88 3c 08 movb %dil,(%rax,%rcx,1)
4013f4: 48 ff c1 incq %rcx
4013f7: eb ee jmp 4013e7 <memcpy+0x9>
4013f9: c3 retq
00000000004013fa <memset>:
4013fa: f3 0f 1e fa endbr64
4013fe: 48 89 f8 movq %rdi,%rax
401401: 31 c9 xorl %ecx,%ecx
401403: 48 39 ca cmpq %rcx,%rdx
401406: 74 09 je 401411 <memset+0x17>
401408: 40 88 34 08 movb %sil,(%rax,%rcx,1)
40140c: 48 ff c1 incq %rcx
40140f: eb f2 jmp 401403 <memset+0x9>
401411: c3 retq
```
After this series:
```
// `memmove` is an alias for `memcpy`
000000000040149c <memcpy>:
40149c: 48 89 d1 movq %rdx,%rcx
40149f: 48 89 f8 movq %rdi,%rax
4014a2: 48 89 fa movq %rdi,%rdx
4014a5: 48 29 f2 subq %rsi,%rdx
4014a8: 48 39 ca cmpq %rcx,%rdx
4014ab: 72 03 jb 4014b0 <memcpy+0x14>
4014ad: f3 a4 rep movsb %ds:(%rsi),%es:(%rdi)
4014af: c3 retq
4014b0: 48 8d 7c 0f ff leaq -0x1(%rdi,%rcx,1),%rdi
4014b5: 48 8d 74 0e ff leaq -0x1(%rsi,%rcx,1),%rsi
4014ba: fd std
4014bb: f3 a4 rep movsb %ds:(%rsi),%es:(%rdi)
4014bd: fc cld
4014be: c3 retq
00000000004014bf <memset>:
4014bf: 96 xchgl %eax,%esi
4014c0: 48 89 d1 movq %rdx,%rcx
4014c3: 57 pushq %rdi
4014c4: f3 aa rep stosb %al,%es:(%rdi)
4014c6: 58 popq %rax
4014c7: c3 retq
```
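The direction check in the new memcpy/memmove listing (`subq`, `cmpq`, `jb`) relies on unsigned wrap-around: `dst - src >= len` is true both when `dst` is below `src` and when `dst` is at or past `src + len`, so a single comparison selects the forward copy. A portable C sketch of the same logic, with a hypothetical name and plain byte loops standing in for `rep movsb`:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name for illustration; not the function added by the patch. */
static void *sketch_memmove(void *dst, const void *src, size_t len)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	/*
	 * Unsigned subtraction wraps when dst < src, giving a huge value,
	 * so this one test covers "dst below src" and "no overlap at all".
	 */
	if ((uintptr_t)d - (uintptr_t)s >= len) {
		/* Forward copy (the likely case). */
		while (len--)
			*d++ = *s++;
	} else {
		/* dst lies inside [src, src + len): copy from the end. */
		d += len;
		s += len;
		while (len--)
			*--d = *--s;
	}
	return dst;
}
```

In the assembly version the backward branch is the one that needs `std` before and `cld` after `rep movsb`; the sketch expresses the same thing by walking the pointers downwards.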
## Changelog
Changes in v3:
- Make memmove an alias for memcpy (Willy).
- Make the forward copy the likely case (Alviro).
Changes in v2:
- Shrink the memset code size:
- Use pushq %rdi / popq %rax (Alviro).
- Use xchg %eax, %esi (Willy).
- Drop the memcmp patch (needs more pondering).
- Fix the broken memmove implementation (David).
Signed-off-by: Ammar Faizi <[email protected]>
---
Ammar Faizi (4):
tools/nolibc: x86-64: Use `rep movsb` for `memcpy()` and `memmove()`
tools/nolibc: x86-64: Use `rep stosb` for `memset()`
tools/nolibc: string: Remove the `_nolibc_memcpy_down()` function
tools/nolibc: string: Remove the `_nolibc_memcpy_up()` function
tools/include/nolibc/arch-x86_64.h | 42 ++++++++++++++++++++++++++++++
tools/include/nolibc/string.h | 36 +++++++++----------------
2 files changed, 55 insertions(+), 23 deletions(-)
base-commit: 3c9b7c4a228bf8cca2f92abb65575cdd54065302
--
Ammar Faizi