Date: Fri, 21 Oct 2022 13:53:08 +0700
To: Muhammad Rizki
Cc: Alviro Iskandar Setiawan, GNU/Weeb Mailing List
References: <20221020083845.907-1-kiizuha@gnuweeb.org> <20221020083845.907-5-kiizuha@gnuweeb.org>
From: Ammar Faizi
Subject: Re: [PATCH v2 4/8] atom: Small change for fix_utf8_char()
In-Reply-To: <20221020083845.907-5-kiizuha@gnuweeb.org>

On 10/20/22 3:38 PM, Muhammad Rizki wrote:
> Change the parameter to unescape with boolean type and change from
> html.escape to html.unescape for the Discord bot.
>
> Signed-off-by: Muhammad Rizki
> ---
>  daemon/atom/utils.py | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/daemon/atom/utils.py b/daemon/atom/utils.py
> index a30d5cb..c95612e 100644
> --- a/daemon/atom/utils.py
> +++ b/daemon/atom/utils.py
> @@ -206,7 +206,7 @@ def create_template(thread: Message, platform: str, to=None, cc=None):
>      if len(ret) >= substr:
>          ret = ret[:substr] + "..."
>
> -    ret = fix_utf8_char(ret, platform == "telegram")
> +    ret = fix_utf8_char(ret, platform == "discord")
>      ret += border
>
>      return ret, files, is_patch
> @@ -242,10 +242,10 @@ def remove_patch(tmp):
>      shutil.rmtree(tmp)
>
>
> -def fix_utf8_char(text: str, html_escape: bool = True):
> +def fix_utf8_char(text: str, unescape: bool = True):
>      t = text.rstrip().replace("�"," ")
> -    if html_escape:
> -        t = html.escape(html.escape(text))
> +    if unescape:
> +        t = html.unescape(html.unescape(text))
>      return t

This is broken. I tested this series earlier, but didn't have time to
bisect it then. Now that I have bisected it, I found that the issue is
introduced by this patch.

See the before and after below: anything inside the angle brackets is
gone. The sample uses this email:

  https://lore.kernel.org/io-uring/f905c8cb-702f-6b2c-8954-1a736feb1ee7@kernel.dk/raw

-------------
Before this patch:

#ml
From: Jens Axboe
To: Ammar Faizi
To: Dylan Yudaken
Cc: Pavel Begunkov
Cc: GNU/Weeb Mailing List
Cc: io-uring Mailing List
Cc: Facebook Kernel Team
Cc: Dylan Yudaken

-------------
After this patch:

#ml
From: Jens Axboe
To: Ammar Faizi
To: Dylan Yudaken
Cc: Pavel Begunkov
Cc: GNU/Weeb Mailing List
Cc: io-uring Mailing List
Cc: Facebook Kernel Team
Cc: Dylan Yudaken

-- 
Ammar Faizi
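
For reference, here is a minimal sketch of why switching from html.escape to html.unescape can lose the angle-bracket parts. The header line and address are made-up placeholders, and strip_tags() is a hypothetical stand-in for whatever markup parsing happens on the receiving side. It is an assumption for illustration, not code from the daemon.

```python
import html
import re

# Placeholder header line; the address is invented for this example.
line = "From: Some Name <someone@example.com>"


def strip_tags(s: str) -> str:
    """Hypothetical consumer that treats anything in <...> as a tag and drops it."""
    return re.sub(r"<[^>]*>", "", s)


# html.escape() turns the brackets into entities, so they are no longer
# seen as markup by the consumer.
escaped = html.escape(line)

# html.unescape() only decodes entities. A raw "<" or ">" passes through
# unchanged, no matter how many times it is applied.
unescaped = html.unescape(html.unescape(line))

print(strip_tags(escaped))    # entities survive the tag stripping
print(strip_tags(unescaped))  # the bracketed part is dropped
```

If the Telegram side really does interpret the text as HTML, the escaping has to stay for that platform, and the unescape path should only apply to Discord.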