Message-ID: <6fd38326-a7b1-38ea-d9f1-1da90ed6ff19@gnuweeb.org>
Date: Sun, 20 Nov 2022 12:23:46 +0700
To: Muhammad Rizki
Cc: Alviro Iskandar Setiawan, GNU/Weeb Mailing List
References: <20221109025002.258-1-kiizuha@gnuweeb.org> <20221109025002.258-7-kiizuha@gnuweeb.org>
From: Ammar Faizi
Subject: Re: [PATCH v2 06/17] utils: Improve fix_utf8_char()
In-Reply-To: <20221109025002.258-7-kiizuha@gnuweeb.org>

On 11/9/22 9:49 AM, Muhammad Rizki wrote:
> Improvement for the fix_utf8_char() to ensure the `&gt;` will be
> unescaped, because if not use the html.unescape(), the email payload
> will contain `&gt;` for the Discord bot.
>
> Also, change on the html.escape() to use it only once.
> From the past issue bb8855bf, some email message doesn't escaped
> correctly, so I use the html.escape() twice. Within the current
> version, this issue should be fixed and can call the html.escape()
> just once.
>
> Fixes: bb8855bf ("Fix the storage management after the refactor was happened")
> Signed-off-by: Muhammad Rizki
> ---
>  daemon/atom/utils.py | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/daemon/atom/utils.py b/daemon/atom/utils.py
> index dd9e1a6..c21a4b5 100644
> --- a/daemon/atom/utils.py
> +++ b/daemon/atom/utils.py
> @@ -258,8 +258,8 @@ def remove_patch(tmp: Union[str, list]):
>  def fix_utf8_char(text: str, html_escape: bool = True):
>      t = text.rstrip().replace("�"," ")
>      if html_escape:
> -        t = html.escape(html.escape(text))
> -    return t
> +        return html.escape(text)
> +    return html.unescape(t)

Please stop trying random things to make your output look good without
understanding what went wrong. This stupid path has been turned on and
off forever, since the beginning. What exactly is the underlying issue
behind this?

I want a real understanding of why this issue happens. From now on, I
will reject fixes that are not well understood.

For this patch, I want you to:

  1. Understand what went wrong in the past.
  2. Explain how it went wrong.
  3. Explain how this patch acts as a real fix.

The double escape was just a random attempt, and it didn't actually fix
the issue well enough. Why? Because your fix was not based on an
understanding of the problem. It only targeted one particular output,
and you hacked it to make that output look good while throwing away the
generic cases. You can't explain the technical reason why you did the
double escape. This patch has the same problem.

I don't want us to work this way forever.

-- 
Ammar Faizi
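
For reference, the behavior in question can be checked in isolation.
The following is only a minimal sketch, not the project's code: the
names fix_utf8_char_old and fix_utf8_char_new are hypothetical labels
for the two sides of the quoted hunk, and the replaced character is
assumed to be U+FFFD.

```python
import html


def fix_utf8_char_old(text: str, html_escape: bool = True) -> str:
    # "-" side of the quoted hunk: escape twice when html_escape is set.
    t = text.rstrip().replace("\ufffd", " ")  # assumption: U+FFFD
    if html_escape:
        t = html.escape(html.escape(text))
    return t


def fix_utf8_char_new(text: str, html_escape: bool = True) -> str:
    # "+" side of the quoted hunk: escape once, otherwise unescape.
    t = text.rstrip().replace("\ufffd", " ")  # assumption: U+FFFD
    if html_escape:
        return html.escape(text)
    return html.unescape(t)


raw = "> quoted line"
once = fix_utf8_char_new(raw)   # "&gt; quoted line"
twice = fix_utf8_char_old(raw)  # "&amp;gt; quoted line"

# A consumer that unescapes exactly once recovers the raw text only
# from the single-escaped form.
after_one_pass_once = html.unescape(once)    # "> quoted line"
after_one_pass_twice = html.unescape(twice)  # "&gt; quoted line"

# With html_escape=False, the new version turns entities back into
# plain characters after stripping trailing whitespace.
plain = fix_utf8_char_new("&gt; quoted line  ", html_escape=False)

# In both versions the html_escape branch escapes `text`, not `t`, so
# the rstrip() and the replacement are discarded on that branch.
untouched = fix_utf8_char_new("x\ufffd  ")
```

Two things are visible here. A double-escaped string survives one
unescape pass as a literal entity, so it only looks right if the
consumer unescapes twice. Also, the cleanup stored in `t` has no effect
whenever html_escape is true. Neither observation says where the extra
escaping or unescaping happens in the real pipeline; that still needs
to be explained.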