Date: Wed, 30 Aug 2023 23:26:57 +0200
From: Willy Tarreau
To: Ammar Faizi
Cc: Thomas Weißschuh, Nicholas Rosenberg, Alviro Iskandar Setiawan,
    Michael William Jonathan, GNU/Weeb Mailing List,
    Linux Kernel Mailing List
Subject: Re: [RFC PATCH v1 3/5] tools/nolibc: x86-64: Use `rep cmpsb` for `memcmp()`
References: <20230830135726.1939997-1-ammarfaizi2@gnuweeb.org>
    <20230830135726.1939997-4-ammarfaizi2@gnuweeb.org>
In-Reply-To: <20230830135726.1939997-4-ammarfaizi2@gnuweeb.org>

On Wed, Aug 30, 2023 at 08:57:24PM +0700, Ammar Faizi wrote:
> Simplify memcmp() on the x86-64 arch.
>
> The x86-64 arch has a 'rep cmpsb' instruction, which can be used to
> implement the memcmp() function.
>
>     %rdi = source 1
>     %rsi = source 2
>     %rcx = length
>
> Signed-off-by: Ammar Faizi
> ---
>  tools/include/nolibc/arch-x86_64.h | 19 +++++++++++++++++++
>  tools/include/nolibc/string.h      |  2 ++
>  2 files changed, 21 insertions(+)
>
> diff --git a/tools/include/nolibc/arch-x86_64.h b/tools/include/nolibc/arch-x86_64.h
> index 42f2674ad1ecdd64..6c1b54ba9f774e7b 100644
> --- a/tools/include/nolibc/arch-x86_64.h
> +++ b/tools/include/nolibc/arch-x86_64.h
> @@ -214,4 +214,23 @@ __asm__ (
>  	"retq\n"
>  );
>
> +#define NOLIBC_ARCH_HAS_MEMCMP
> +static int memcmp(const void *s1, const void *s2, size_t n)
> +{
> +	const unsigned char *p1 = s1;
> +	const unsigned char *p2 = s2;
> +
> +	if (!n)
> +		return 0;
> +
> +	__asm__ volatile (
> +		"rep cmpsb"
> +		: "+D"(p2), "+S"(p1), "+c"(n)
> +		: "m"(*(const unsigned char (*)[n])s1),
> +		  "m"(*(const unsigned char (*)[n])s2)
> +	);
> +
> +	return p1[-1] - p2[-1];
> +}

Out of curiosity, given that you implemented the 3 other ones directly
in an asm statement, is there a particular reason this one mixes a bit
of C and asm?

It would probably be something around this, in the same vein:

  memcmp:
    xchg  %esi,%eax   // source1
    mov   %rdx,%rcx   // count
    rep   cmpsb       // source2 in rdi; sets ZF on equal, CF if src1 < src2
    seta  %al         // 0 if src2 <= src1, 1 if src2 > src1
    sbb   $0, %al     // 0 if src2 == src1, -1 if src2 < src1, 1 if src2 > src1
    movsx %al, %eax   // sign extend to %eax
    ret

Note that the output logic may need to be revisited; I'm not certain,
but at first glance it looks valid.

Regards,
Willy
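
For illustration only, here is a minimal sketch of how the flag-based
return value discussed above could be expressed as GCC-style inline asm.
It is not the code from the patch or from the reply: the names
sketch_memcmp() and sketch_memcmp_selftest() are hypothetical, the
operand roles are my assumption (s1 in %rsi, s2 in %rdi, so that CF
means "s1 byte below s2 byte"), and a portable C fallback is included so
the snippet also builds on non-x86-64 targets.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical sketch, not nolibc code: returns -1, 0 or 1 depending on
 * the first differing byte, compared as unsigned char.
 */
static int sketch_memcmp(const void *s1, const void *s2, size_t n)
{
#if defined(__x86_64__)
	int ret;

	__asm__ volatile (
		/* ret = 0; also leaves ZF=1 and CF=0, so n == 0 yields 0 */
		"xorl %%eax, %%eax\n\t"
		/* compare (%rsi) with (%rdi) while bytes are equal and %rcx != 0 */
		"repe cmpsb\n\t"
		/* al = 1 if the last s1 byte was above the s2 byte (CF=0, ZF=0) */
		"seta %%al\n\t"
		/* subtract CF: gives -1 if the last s1 byte was below the s2 byte */
		"sbbl $0, %%eax\n\t"
		: "=&a"(ret), "+S"(s1), "+D"(s2), "+c"(n)
		:
		: "cc", "memory"
	);
	return ret;
#else
	/* Portable fallback that mirrors the same -1/0/1 result. */
	const unsigned char *p1 = s1;
	const unsigned char *p2 = s2;

	while (n--) {
		unsigned char a = *p1++;
		unsigned char b = *p2++;

		if (a != b)
			return (a > b) - (a < b);
	}
	return 0;
#endif
}

/* Small self-check covering the equal, below, above and empty cases. */
static void sketch_memcmp_selftest(void)
{
	assert(sketch_memcmp("abc", "abc", 3) == 0);
	assert(sketch_memcmp("abc", "abd", 3) == -1);
	assert(sketch_memcmp("abd", "abc", 3) == 1);
	assert(sketch_memcmp("a", "b", 0) == 0);
}
```

Zeroing %eax first serves two purposes in this sketch: it makes the
n == 0 case fall through with ZF set and CF clear, and it lets the
32-bit sbb produce the final value directly, without a separate sign
extension.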