From: matoro <[email protected]>
To: Helge Deller <[email protected]>
Cc: Jens Axboe <[email protected]>,
[email protected], Linux Ia64 <[email protected]>,
[email protected], Sam James <[email protected]>
Subject: Re: [PATCH 1/5] io_uring: Adjust mapping wrt architecture aliasing requirements
Date: Wed, 12 Jul 2023 13:28:45 -0400 [thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <ZK7TrdVXc0O6HMpQ@ls3530>
On 2023-07-12 12:24, Helge Deller wrote:
> Hi Matoro,
>
> * matoro <[email protected]>:
>> On 2023-03-14 13:16, Jens Axboe wrote:
>> > From: Helge Deller <[email protected]>
>> >
>> > Some architectures have memory cache aliasing requirements (e.g. parisc)
>> > if memory is shared between userspace and kernel. This patch fixes the
>> > kernel to return an aliased address when asked by userspace via mmap().
>> >
>> > Signed-off-by: Helge Deller <[email protected]>
>> > Signed-off-by: Jens Axboe <[email protected]>
>> > ---
>> > io_uring/io_uring.c | 51 +++++++++++++++++++++++++++++++++++++++++++++
>> > 1 file changed, 51 insertions(+)
>> >
>> > diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
>> > index 722624b6d0dc..3adecebbac71 100644
>> > --- a/io_uring/io_uring.c
>> > +++ b/io_uring/io_uring.c
>> > @@ -72,6 +72,7 @@
>> > #include <linux/io_uring.h>
>> > #include <linux/audit.h>
>> > #include <linux/security.h>
>> > +#include <asm/shmparam.h>
>> >
>> > #define CREATE_TRACE_POINTS
>> > #include <trace/events/io_uring.h>
>> > @@ -3317,6 +3318,54 @@ static __cold int io_uring_mmap(struct file
>> > *file, struct vm_area_struct *vma)
>> > return remap_pfn_range(vma, vma->vm_start, pfn, sz,
>> > vma->vm_page_prot);
>> > }
>> >
>> > +static unsigned long io_uring_mmu_get_unmapped_area(struct file *filp,
>> > + unsigned long addr, unsigned long len,
>> > + unsigned long pgoff, unsigned long flags)
>> > +{
>> > + const unsigned long mmap_end = arch_get_mmap_end(addr, len, flags);
>> > + struct vm_unmapped_area_info info;
>> > + void *ptr;
>> > +
>> > + /*
>> > + * Do not allow to map to user-provided address to avoid breaking the
>> > + * aliasing rules. Userspace is not able to guess the offset address
>> > of
>> > + * kernel kmalloc()ed memory area.
>> > + */
>> > + if (addr)
>> > + return -EINVAL;
>> > +
>> > + ptr = io_uring_validate_mmap_request(filp, pgoff, len);
>> > + if (IS_ERR(ptr))
>> > + return -ENOMEM;
>> > +
>> > + info.flags = VM_UNMAPPED_AREA_TOPDOWN;
>> > + info.length = len;
>> > + info.low_limit = max(PAGE_SIZE, mmap_min_addr);
>> > + info.high_limit = arch_get_mmap_base(addr, current->mm->mmap_base);
>> > +#ifdef SHM_COLOUR
>> > + info.align_mask = PAGE_MASK & (SHM_COLOUR - 1UL);
>> > +#else
>> > + info.align_mask = PAGE_MASK & (SHMLBA - 1UL);
>> > +#endif
>> > + info.align_offset = (unsigned long) ptr;
>> > +
>> > + /*
>> > + * A failed mmap() very likely causes application failure,
>> > + * so fall back to the bottom-up function here. This scenario
>> > + * can happen with large stack limits and large mmap()
>> > + * allocations.
>> > + */
>> > + addr = vm_unmapped_area(&info);
>> > + if (offset_in_page(addr)) {
>> > + info.flags = 0;
>> > + info.low_limit = TASK_UNMAPPED_BASE;
>> > + info.high_limit = mmap_end;
>> > + addr = vm_unmapped_area(&info);
>> > + }
>> > +
>> > + return addr;
>> > +}
>> > +
>> > #else /* !CONFIG_MMU */
>> >
>> > static int io_uring_mmap(struct file *file, struct vm_area_struct *vma)
>> > @@ -3529,6 +3578,8 @@ static const struct file_operations io_uring_fops
>> > = {
>> > #ifndef CONFIG_MMU
>> > .get_unmapped_area = io_uring_nommu_get_unmapped_area,
>> > .mmap_capabilities = io_uring_nommu_mmap_capabilities,
>> > +#else
>> > + .get_unmapped_area = io_uring_mmu_get_unmapped_area,
>> > #endif
>> > .poll = io_uring_poll,
>> > #ifdef CONFIG_PROC_FS
>>
>> Hi Jens, Helge - I've bisected a regression with io_uring on ia64 to
>> this
>> patch in 6.4. Unfortunately this breaks userspace programs using
>> io_uring,
>> the easiest one to test is cmake with an io_uring enabled libuv (i.e.,
>> libuv
>> >= 1.45.0) which will hang.
>>
>> I am aware that ia64 is in a vulnerable place right now which I why I
>> am
>> keeping this spread limited. Since this clearly involves
>> architecture-specific changes for parisc,
>
> it isn't so much architecture-specific... (just one ifdef)
>
>> is there any chance of looking at
>> what is required to do the same for ia64? I looked at
>> 0ef36bd2b37815719e31a72d2beecc28ca8ecd26 ("parisc: change value of
>> SHMLBA
>> from 0x00400000 to PAGE_SIZE") and tried to replicate the SHMLBA ->
>> SHM_COLOUR change, but it made no difference.
>>
>> If hardware is necessary for testing, I can provide it, including
>> remote BMC
>> access for restarts/kernel debugging. Any takers?
>
> I won't have time to test myself, but maybe you could test?
>
> Basically we should try to find out why
> io_uring_mmu_get_unmapped_area()
> doesn't return valid addresses, while arch_get_unmapped_area()
> [in arch/ia64/kernel/sys_ia64.c] does.
>
> You could apply this patch first:
> It introduces a memory leak (as it requests memory twice), but maybe we
> get an idea?
> The ia64 arch_get_unmapped_area() searches for memory from bottom
> (flags=0), while io_uring function tries top-down first. Maybe that's
> the problem. And I don't understand the offset_in_page() check right
> now.
>
> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
> index 3bca7a79efda..93b1964d2bbb 100644
> --- a/io_uring/io_uring.c
> +++ b/io_uring/io_uring.c
> @@ -3431,13 +3431,17 @@ static unsigned long
> io_uring_mmu_get_unmapped_area(struct file *filp,
> * can happen with large stack limits and large mmap()
> * allocations.
> */
> +/* compare to arch_get_unmapped_area() in arch/ia64/kernel/sys_ia64.c
> */
> addr = vm_unmapped_area(&info);
> - if (offset_in_page(addr)) {
> +printk("io_uring_mmu_get_unmapped_area() address 1 is: %px\n", addr);
> + addr = NULL;
> + if (!addr) {
> info.flags = 0;
> info.low_limit = TASK_UNMAPPED_BASE;
> info.high_limit = mmap_end;
> addr = vm_unmapped_area(&info);
> }
> +printk("io_uring_mmu_get_unmapped_area() returns address %px\n",
> addr);
>
> return addr;
> }
>
>
> Another option is to disable the call to
> io_uring_nommu_get_unmapped_area())
> with the next patch. Maybe you could add printks() to ia64's
> arch_get_unmapped_area()
> and check what it returns there?
>
> @@ -3654,6 +3658,8 @@ static const struct file_operations io_uring_fops
> = {
> #ifndef CONFIG_MMU
> .get_unmapped_area = io_uring_nommu_get_unmapped_area,
> .mmap_capabilities = io_uring_nommu_mmap_capabilities,
> +#elif 0 /* IS_ENABLED(CONFIG_IA64) */
> + .get_unmapped_area = NULL,
> #else
> .get_unmapped_area = io_uring_mmu_get_unmapped_area,
> #endif
>
> Helge
Thanks Helge. Sample output from that first patch:
[Wed Jul 12 13:09:50 2023] io_uring_mmu_get_unmapped_area() address 1
is: 1ffffffffff40000
[Wed Jul 12 13:09:50 2023] io_uring_mmu_get_unmapped_area() returns
address 2000000001e40000
[Wed Jul 12 13:09:50 2023] io_uring_mmu_get_unmapped_area() address 1
is: 1ffffffffff20000
[Wed Jul 12 13:09:50 2023] io_uring_mmu_get_unmapped_area() returns
address 2000000001f20000
[Wed Jul 12 13:09:50 2023] io_uring_mmu_get_unmapped_area() address 1
is: 1ffffffffff30000
[Wed Jul 12 13:09:50 2023] io_uring_mmu_get_unmapped_area() returns
address 2000000001f30000
[Wed Jul 12 13:09:50 2023] io_uring_mmu_get_unmapped_area() address 1
is: 1ffffffffff90000
[Wed Jul 12 13:09:50 2023] io_uring_mmu_get_unmapped_area() returns
address 2000000001f90000
This pattern seems to be pretty stable, I tried instead just directly
returning the result of a call to arch_get_unmapped_area() at the end of
the function and it seems similar:
[Wed Jul 12 13:27:07 2023] io_uring_mmu_get_unmapped_area() would return
address 1ffffffffffd0000
[Wed Jul 12 13:27:07 2023] but arch_get_unmapped_area() would return
address 2000000001f00000
[Wed Jul 12 13:27:07 2023] io_uring_mmu_get_unmapped_area() would return
address 1ffffffffff00000
[Wed Jul 12 13:27:07 2023] but arch_get_unmapped_area() would return
address 1ffffffffff00000
[Wed Jul 12 13:27:07 2023] io_uring_mmu_get_unmapped_area() would return
address 1fffffffffe20000
[Wed Jul 12 13:27:07 2023] but arch_get_unmapped_area() would return
address 2000000002000000
[Wed Jul 12 13:27:07 2023] io_uring_mmu_get_unmapped_area() would return
address 1fffffffffe30000
[Wed Jul 12 13:27:07 2023] but arch_get_unmapped_area() would return
address 2000000002100000
Is that enough of a clue to go on?
next prev parent reply other threads:[~2023-07-12 17:29 UTC|newest]
Thread overview: 36+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-03-14 17:16 [PATCHSET 0/5] User mapped provided buffer rings Jens Axboe
2023-03-14 17:16 ` [PATCH 1/5] io_uring: Adjust mapping wrt architecture aliasing requirements Jens Axboe
2023-07-12 4:43 ` matoro
2023-07-12 16:24 ` Helge Deller
2023-07-12 17:28 ` matoro [this message]
2023-07-12 19:05 ` Helge Deller
2023-07-12 20:30 ` Helge Deller
2023-07-13 0:35 ` matoro
2023-07-13 7:27 ` Helge Deller
2023-07-13 23:57 ` matoro
2023-07-16 6:54 ` Helge Deller
2023-07-16 18:03 ` matoro
2023-07-16 20:54 ` Helge Deller
2023-03-14 17:16 ` [PATCH 2/5] io_uring/kbuf: move pinning of provided buffer ring into helper Jens Axboe
2023-03-14 17:16 ` [PATCH 3/5] io_uring/kbuf: add buffer_list->is_mapped member Jens Axboe
2023-03-14 17:16 ` [PATCH 4/5] io_uring/kbuf: rename struct io_uring_buf_reg 'pad' to'flags' Jens Axboe
2023-03-14 17:16 ` [PATCH 5/5] io_uring: add support for user mapped provided buffer ring Jens Axboe
2023-03-16 18:07 ` Ammar Faizi
2023-03-16 18:42 ` Jens Axboe
2023-03-15 20:03 ` [PATCHSET 0/5] User mapped provided buffer rings Helge Deller
2023-03-15 20:07 ` Helge Deller
2023-03-15 20:38 ` Jens Axboe
2023-03-15 21:04 ` John David Anglin
2023-03-15 21:08 ` Jens Axboe
2023-03-15 21:18 ` Jens Axboe
2023-03-16 10:18 ` Helge Deller
2023-03-16 17:00 ` Jens Axboe
2023-03-16 19:08 ` John David Anglin
2023-03-16 19:46 ` Jens Axboe
2023-03-17 2:09 ` Jens Axboe
2023-03-17 2:17 ` Jens Axboe
2023-03-17 15:36 ` John David Anglin
2023-03-17 15:57 ` Jens Axboe
2023-03-17 16:15 ` John David Anglin
2023-03-17 16:37 ` Jens Axboe
2023-03-15 20:11 ` Jens Axboe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox