Date: Fri, 1 Sep 2023 05:35:08 +0200
From: Willy Tarreau
To: Ammar Faizi
Cc: Thomas Weißschuh, Nicholas Rosenberg, Alviro Iskandar Setiawan,
    Michael William Jonathan, GNU/Weeb Mailing List,
    Linux Kernel Mailing List
Subject: Re: [RFC PATCH v1 3/5] tools/nolibc: x86-64: Use `rep cmpsb` for `memcmp()`
References: <20230830135726.1939997-1-ammarfaizi2@gnuweeb.org>
    <20230830135726.1939997-4-ammarfaizi2@gnuweeb.org>

On Fri, Sep 01, 2023 at 10:24:42AM +0700, Ammar Faizi wrote:
> On Wed, Aug 30, 2023 at 11:26:57PM +0200, Willy Tarreau wrote:
> > Out of curiosity, given that you implemented the 3 other ones directly
> > in an asm statement, is there a particular reason this one mixes a bit
> > of C and asm?
>
> Because this one may be unused. The others are explicitly exported.

Makes sense, indeed.

> > It would probably be something around this, in the same vein:
> >
> >   memcmp:
> >     xchg  %esi,%eax   // source1
> >     mov   %rdx,%rcx   // count
> >     rep   cmpsb       // source2 in rdi; sets ZF on equal, CF if src1 < src2
> >     seta  %al         // 0 if src2 <= src1, 1 if src2 > src1
> >     sbb   $0, %al     // 0 if src2 == src1, -1 if src2 < src1, 1 if src2 > src1
> >     movsx %al, %eax   // sign extend to %eax
> >     ret
> >
> > Note that the output logic could have to be revisited, I'm not certain but
> > at first glance it looks valid.
>
> After thinking about this more, I think I'll drop the memcmp() patch
> because it will prevent optimization when comparing a small value.
>
> For example, without __asm__:
>
>     memcmp(var, "abcd", 4);
>
> may compile to:
>
>     cmpl $0x64636261, %reg
>     ...something...
>
> But with __asm__, the compiler can't do that. Thus, it's not worth
> optimizing the memcmp() in this case.

Ah you're totally right!

Willy
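
Ammar's point about small constant comparisons can be illustrated with a
minimal sketch in plain C. The names sketch_memcmp() and starts_with_abcd()
are hypothetical and are not taken from nolibc; the byte loop is only an
assumed stand-in for a C-level memcmp(). Because the body is ordinary C that
the compiler can see through, an optimizing compiler may inline the call and
reduce the 4-byte comparison against "abcd" to a single 32-bit compare,
which an opaque __asm__ implementation would prevent.

```c
#include <stddef.h>

/*
 * Hypothetical plain-C memcmp(), named sketch_memcmp() to avoid clashing
 * with any libc symbol. The body is visible to the compiler, so it is free
 * to inline and constant-fold calls with a small, fixed size.
 */
static inline int sketch_memcmp(const void *s1, const void *s2, size_t n)
{
	const unsigned char *a = s1;
	const unsigned char *b = s2;
	size_t i;

	for (i = 0; i < n; i++) {
		/* The first differing byte decides the sign of the result. */
		if (a[i] != b[i])
			return (int)a[i] - (int)b[i];
	}
	return 0;
}

/* Returns 1 if the first four bytes at p are "abcd", 0 otherwise. */
static inline int starts_with_abcd(const char *p)
{
	return sketch_memcmp(p, "abcd", 4) == 0;
}
```

Whether a given compiler actually emits the single compare depends on the
compiler version and optimization level; inspecting the generated assembly
(for example with -O2 -S) is the way to confirm it for a particular build.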