* [PATCH v2 0/1 RESEND] block: fix conversion of GPT partition name to 7-bit
@ 2025-02-18 13:59 Olivier Gayot
2025-02-18 14:01 ` [PATCH v2 1/1 " Olivier Gayot
0 siblings, 1 reply; 2+ messages in thread
From: Olivier Gayot @ 2025-02-18 13:59 UTC (permalink / raw)
To: Davidlohr Bueso, Jens Axboe, Ming Lei, Pavel Begunkov, linux-efi,
linux-block, linux-kernel, io-uring
Cc: olivier.gayot, daniel.bungert
Dear maintainers,
This is a resend of a patch that I originally sent in May 2023.
I am resending it because the list of recipients has since been updated.
Original submission:
https://lore.kernel.org/linux-efi/[email protected]/T/#t
http://www.uwsg.indiana.edu/hypermail/linux/kernel/2305.2/08638.html
--
While investigating a userspace issue, we noticed that the PARTNAME udev
property for GPT partitions is not always valid ASCII / UTF-8.
The value of the PARTNAME property for GPT partitions is initially set
by the kernel using the utf16_le_to_7bit function.
This function does a very basic conversion from UTF-16 to 7-bit ASCII by
dropping the high-order byte of each UTF-16 code unit and replacing the
remaining byte with "!" if it is not printable.
Essentially, it means that characters outside the ASCII range get
"converted" to other characters which are unrelated. Using this function
for data that is presented in userspace feels questionable and using a
proper conversion to UTF-8 would probably be preferable. However, the
patch attached does not attempt to change this design.
The patch attached actually addresses an implementation issue in the
utf16_le_to_7bit function, which causes the output of the function to
not always be valid 7-bit ASCII.
Olivier Gayot (1):
block: fix conversion of GPT partition name to 7-bit ASCII
block/partitions/efi.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Thanks,
Olivier
* [PATCH v2 1/1 RESEND] block: fix conversion of GPT partition name to 7-bit
From: Olivier Gayot @ 2025-02-18 14:01 UTC (permalink / raw)
To: Davidlohr Bueso, Jens Axboe, Ming Lei, Pavel Begunkov, linux-efi,
linux-block, linux-kernel, io-uring
Cc: daniel.bungert, Olivier Gayot
The utf16_le_to_7bit function claims to naively convert a UTF-16
string to a 7-bit ASCII string. By naively, we mean that it:
* drops the high-order byte of every code unit in the original UTF-16
string
* checks whether each remaining byte is printable and, if not, replaces
it with an exclamation mark "!".
This means that theoretically, all characters outside the 7-bit ASCII
range should be replaced by another character. Examples:
* latin small letter turned alpha (ɒ) 0x0252 becomes 0x52 (R)
* ligature OE (œ) 0x0153 becomes 0x53 (S)
* hangul letter pieup (ㅂ) 0x3142 becomes 0x42 (B)
* latin capital letter gamma (Ɣ) 0x0194 becomes 0x94 (not printable) so
gets replaced by "!"
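For illustration, the naive conversion described above can be modelled in
user space roughly as follows. This is only a sketch, not the kernel code:
naive_utf16_to_7bit() and kernel_like_isprint() are hypothetical names, the
input is assumed to already be in host byte order, and the isprint stand-in
is an approximation of the kernel's ctype table (printable ASCII plus
bytes >= 0xa0).

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Assumption: approximate stand-in for the kernel's isprint(), which,
 * unlike a C-locale isprint(), also accepts most bytes above 0x7f.
 */
static int kernel_like_isprint(uint8_t c)
{
	return (c >= 0x20 && c <= 0x7e) || c >= 0xa0;
}

/*
 * Hypothetical model of the conversion before the patch: keep the
 * low-order byte of each UTF-16 code unit and replace unprintable
 * bytes with '!'. `out` must have room for size + 1 bytes.
 */
static void naive_utf16_to_7bit(const uint16_t *in, size_t size, uint8_t *out)
{
	size_t i;

	out[size] = 0;
	for (i = 0; i < size; i++) {
		uint8_t c = in[i] & 0xff;

		if (c && !kernel_like_isprint(c))
			c = '!';
		out[i] = c;
	}
}
```

Feeding it the four code units above (0x0252, 0x0153, 0x3142, 0x0194)
yields the string "RSB!", matching the examples.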
The result of this conversion for the GPT partition name is passed to
user-space as PARTNAME via udev, which is confusing and feels questionable.
However, there is a flaw in the conversion function itself. By dropping
one byte of each character and using isprint() to check if the remaining
byte corresponds to a printable character, we do not actually guarantee
that the resulting character is 7-bit ASCII.
This happens because we pass 8-bit values to isprint(), which in the
kernel returns 1 for many values > 0x7f, as defined in ctype.c.
As a result, many values that should be replaced by "!" are kept as-is,
despite not being valid 7-bit ASCII. Examples:
* e with acute accent (é) 0x00E9 becomes 0xE9 - kept as-is because
isprint(0xE9) returns 1.
* euro sign (€) 0x20AC becomes 0xAC - kept as-is because isprint(0xAC)
returns 1.
Fix this by using a mask of 7 bits instead of 8 bits before calling
isprint().
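The effect of the mask can be sketched in user space as below. This is an
illustration under assumptions, not the kernel implementation:
convert_unit() and isprint_kernel_approx() are hypothetical helpers, and
the isprint stand-in only approximates the kernel's ctype table.

```c
#include <stdint.h>

/*
 * Assumption: approximation of the kernel's isprint(), which accepts
 * printable ASCII as well as most bytes >= 0xa0.
 */
static int isprint_kernel_approx(uint8_t c)
{
	return (c >= 0x20 && c <= 0x7e) || c >= 0xa0;
}

/*
 * Hypothetical helper: convert a single UTF-16 code unit (host byte
 * order) using the given mask, 0xff before the patch or 0x7f after.
 */
static uint8_t convert_unit(uint16_t unit, uint8_t mask)
{
	uint8_t c = unit & mask;

	if (c && !isprint_kernel_approx(c))
		c = '!';
	return c;
}
```

With the 0xff mask, 0x00E9 and 0x20AC come out as 0xE9 and 0xAC. With the
0x7f mask the result always stays below 0x80: in this sketch 0x00E9 maps
to 0x69 (i) and 0x20AC to 0x2C (,), while 0x0194 maps to 0x14, which is
not printable and so becomes "!".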
Signed-off-by: Olivier Gayot <[email protected]>
---
V1 -> V2: No change - resubmitted with subsystem maintainers in CC
block/partitions/efi.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/block/partitions/efi.c b/block/partitions/efi.c
index 5e9be13a56a8..7acba66eed48 100644
--- a/block/partitions/efi.c
+++ b/block/partitions/efi.c
@@ -682,7 +682,7 @@ static void utf16_le_to_7bit(const __le16 *in, unsigned int size, u8 *out)
 	out[size] = 0;

 	while (i < size) {
-		u8 c = le16_to_cpu(in[i]) & 0xff;
+		u8 c = le16_to_cpu(in[i]) & 0x7f;

 		if (c && !isprint(c))
 			c = '!';