From: Ammar Faizi <[email protected]>
To: Muhammad Rizki <[email protected]>
Cc: Alviro Iskandar Setiawan <[email protected]>,
GNU/Weeb Mailing List <[email protected]>
Subject: Re: [PATCH v1 4/7] atom: Improve fix_utf8_char()
Date: Wed, 19 Oct 2022 23:59:24 +0700
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>
On 10/18/22 3:16 PM, Muhammad Rizki wrote:
> -def fix_utf8_char(text: str, html_escape: bool = True):
> +def fix_utf8_char(text: str, unescape: bool = True):
> t = text.rstrip().replace("�"," ")
> - if html_escape:
> - t = html.escape(html.escape(text))
> + if unescape:
> + t = html.unescape(html.unescape(text))
> + reg = re.compile('<.*?>|&([a-z0-9]+|#[0-9]{1,6}|#x[0-9a-f]{1,6});')
> + t = reg.sub('', t)
> return t
You call html.unescape() twice, then remove all HTML tags and entity
references. I don't understand why we should do that. Can you explain
what is going on here?
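
To make the question concrete, here is a minimal sketch of what the new
code path appears to do to a few inputs. The helper name
strip_after_double_unescape() and the sample strings are hypothetical, and
the sketch assumes the regex substitution runs right after the double
unescape; only the regex and the html.unescape() calls come from the patch.

```python
import html
import re

# Same pattern as in the quoted patch.
TAG_OR_ENTITY = re.compile(r'<.*?>|&([a-z0-9]+|#[0-9]{1,6}|#x[0-9a-f]{1,6});')


def strip_after_double_unescape(text: str) -> str:
    # Hypothetical helper: unescape twice, then drop anything that looks
    # like an HTML tag or an entity reference (assumed order of steps).
    t = html.unescape(html.unescape(text))
    return TAG_OR_ENTITY.sub('', t)


samples = [
    # Double-escaped markup: both escape layers and the tags are removed.
    "&amp;lt;b&amp;gt;bold&amp;lt;/b&amp;gt;",
    # Triple-escaped "<": an entity survives two unescapes and is dropped.
    "&amp;amp;lt;",
    # Plain text with angle brackets: the bracketed part is treated as a tag.
    "Signed-off-by: Name <user@example.com>",
    "a < b and c > d",
]

for s in samples:
    print(repr(s), "->", repr(strip_after_double_unescape(s)))
```

Note that the last two samples lose real content: anything between a "<"
and the next ">" is removed, including email addresses in trailers and
comparison operators in code.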
--
Ammar Faizi