* [PATCH v2] io_uring: Adjust mapping wrt architecture aliasing requirements
@ 2023-02-16  8:09 Helge Deller
  2023-02-16 16:11 ` Jens Axboe
  2023-06-27 14:14 ` Jiri Slaby
  0 siblings, 2 replies; 8+ messages in thread
From: Helge Deller @ 2023-02-16  8:09 UTC
  To: io-uring, Jens Axboe, linux-parisc, John David Anglin

Some architectures have memory cache aliasing requirements (e.g. parisc)
if memory is shared between userspace and kernel. This patch fixes the
kernel to return an aliased address when asked by userspace via mmap().

Signed-off-by: Helge Deller <[email protected]>
---
v2: Do not allow to map to a user-provided address. This forces
programs to write portable code, as usually on x86 mapping to any
address will succeed, while it will fail for most provided addresses if
used on stricter architectures.

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 862e05e6691d..01fe7437a071 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -72,6 +72,7 @@
 #include <linux/io_uring.h>
 #include <linux/audit.h>
 #include <linux/security.h>
+#include <asm/shmparam.h>
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/io_uring.h>
@@ -3059,6 +3060,54 @@ static __cold int io_uring_mmap(struct file *file, struct vm_area_struct *vma)
 	return remap_pfn_range(vma, vma->vm_start, pfn, sz, vma->vm_page_prot);
 }
 
+static unsigned long io_uring_mmu_get_unmapped_area(struct file *filp,
+			unsigned long addr, unsigned long len,
+			unsigned long pgoff, unsigned long flags)
+{
+	const unsigned long mmap_end = arch_get_mmap_end(addr, len, flags);
+	struct vm_unmapped_area_info info;
+	void *ptr;
+
+	/*
+	 * Do not allow to map to user-provided address to avoid breaking the
+	 * aliasing rules. Userspace is not able to guess the offset address of
+	 * kernel kmalloc()ed memory area.
+	 */
+	if (addr)
+		return -EINVAL;
+
+	ptr = io_uring_validate_mmap_request(filp, pgoff, len);
+	if (IS_ERR(ptr))
+		return -ENOMEM;
+
+	info.flags = VM_UNMAPPED_AREA_TOPDOWN;
+	info.length = len;
+	info.low_limit = max(PAGE_SIZE, mmap_min_addr);
+	info.high_limit = arch_get_mmap_base(addr, current->mm->mmap_base);
+#ifdef SHM_COLOUR
+	info.align_mask = PAGE_MASK & (SHM_COLOUR - 1UL);
+#else
+	info.align_mask = PAGE_MASK & (SHMLBA - 1UL);
+#endif
+	info.align_offset = (unsigned long) ptr;
+
+	/*
+	 * A failed mmap() very likely causes application failure,
+	 * so fall back to the bottom-up function here. This scenario
+	 * can happen with large stack limits and large mmap()
+	 * allocations.
+	 */
+	addr = vm_unmapped_area(&info);
+	if (offset_in_page(addr)) {
+		info.flags = 0;
+		info.low_limit = TASK_UNMAPPED_BASE;
+		info.high_limit = mmap_end;
+		addr = vm_unmapped_area(&info);
+	}
+
+	return addr;
+}
+
 #else /* !CONFIG_MMU */
 
 static int io_uring_mmap(struct file *file, struct vm_area_struct *vma)
@@ -3273,6 +3322,8 @@ static const struct file_operations io_uring_fops = {
 #ifndef CONFIG_MMU
 	.get_unmapped_area = io_uring_nommu_get_unmapped_area,
 	.mmap_capabilities = io_uring_nommu_mmap_capabilities,
+#else
+	.get_unmapped_area = io_uring_mmu_get_unmapped_area,
 #endif
 	.poll = io_uring_poll,
 #ifdef CONFIG_PROC_FS

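The alignment rule the patch sets up through info.align_mask and
info.align_offset can be illustrated in plain userspace C. This is only a
sketch of the arithmetic, not kernel code: the ex_* names are made up for
the example, the 4 KiB page size is an assumption, and the colour value is
passed in by the caller (0x00400000 is used below on the assumption that it
matches parisc; a colour equal to the page size stands in for x86).

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed page size for the example; real kernels take it from the arch. */
#define EX_PAGE_SIZE 4096UL
#define EX_PAGE_MASK (~(EX_PAGE_SIZE - 1UL))

/* Mirrors info.align_mask = PAGE_MASK & (SHM_COLOUR - 1UL) from the patch. */
static unsigned long ex_align_mask(unsigned long colour)
{
	/* The colour must be a power of two and at least one page. */
	assert(colour >= EX_PAGE_SIZE && (colour & (colour - 1UL)) == 0);
	return EX_PAGE_MASK & (colour - 1UL);
}

/* True when the user address and the kernel address share a cache colour. */
static bool ex_same_colour(unsigned long uaddr, unsigned long kaddr,
			   unsigned long colour)
{
	return ((uaddr - kaddr) & ex_align_mask(colour)) == 0;
}

/*
 * Lowest address at or above a page-aligned 'start' that has the same
 * colour as 'kaddr'. This is the kind of adjustment an aligned search
 * has to make; the real vm_unmapped_area() also has to find a free gap.
 */
static unsigned long ex_colour_align_up(unsigned long start,
					unsigned long kaddr,
					unsigned long colour)
{
	return start + ((kaddr - start) & ex_align_mask(colour));
}
```

With a colour equal to the page size the mask is zero, so every
page-aligned address qualifies. With a 4 MiB colour only one page in
every 1024 does, which is why an address picked by userspace is unlikely
to be usable on an aliasing architecture.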
* Re: [PATCH v2] io_uring: Adjust mapping wrt architecture aliasing requirements
  2023-02-16  8:09 [PATCH v2] io_uring: Adjust mapping wrt architecture aliasing requirements Helge Deller
@ 2023-02-16 16:11 ` Jens Axboe
  2023-02-16 16:33   ` Helge Deller
  2023-06-27 14:14 ` Jiri Slaby
  1 sibling, 1 reply; 8+ messages in thread
From: Jens Axboe @ 2023-02-16 16:11 UTC
  To: Helge Deller, io-uring, linux-parisc, John David Anglin

On 2/16/23 1:09 AM, Helge Deller wrote:
> [...]
> +	/*
> +	 * Do not allow to map to user-provided address to avoid breaking the
> +	 * aliasing rules. Userspace is not able to guess the offset address of
> +	 * kernel kmalloc()ed memory area.
> +	 */
> +	if (addr)
> +		return -EINVAL;

Can we relax this so that if the address is correctly aligned, it will
allow it? The reported issue with sqpoll-cancel-hang.t is due to it
crashing because it's a weird syzbot thing that does mmap() with
MAP_FIXED and an address given.

-- 
Jens Axboe

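The two policies under discussion can be put side by side in a small
userspace sketch. The strict check follows the posted v2 hunk; the relaxed
check is only a guess at what "correctly aligned" could mean, not code from
any version of the patch. The ex_* names, the 4 KiB page size and the
colour parameter are assumptions made for the example.

```c
#include <errno.h>

/* Assumed page size for the example. */
#define EX_PAGE_SIZE 4096UL

/* v2 as posted: any caller-supplied address is rejected. */
static long ex_check_strict(unsigned long addr)
{
	return addr ? -EINVAL : 0;
}

/*
 * Hypothetical relaxed variant: accept a caller-supplied address only
 * if it has the same cache colour as the kernel address 'kaddr'.
 */
static long ex_check_relaxed(unsigned long addr, unsigned long kaddr,
			     unsigned long colour)
{
	unsigned long mask = ~(EX_PAGE_SIZE - 1UL) & (colour - 1UL);

	if (addr && ((addr - kaddr) & mask))
		return -EINVAL;
	return 0;
}
```

When the colour equals the page size, the relaxed check accepts every
page-aligned address, so a program tested only on such a machine would
never see the failure that a 4 MiB colour produces for the same address.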
* Re: [PATCH v2] io_uring: Adjust mapping wrt architecture aliasing requirements
  2023-02-16 16:11 ` Jens Axboe
@ 2023-02-16 16:33   ` Helge Deller
  2023-02-16 16:46     ` Jens Axboe
  0 siblings, 1 reply; 8+ messages in thread
From: Helge Deller @ 2023-02-16 16:33 UTC
  To: Jens Axboe, io-uring, linux-parisc, John David Anglin

On 2/16/23 17:11, Jens Axboe wrote:
> On 2/16/23 1:09 AM, Helge Deller wrote:
>> [...]
>> +	/*
>> +	 * Do not allow to map to user-provided address to avoid breaking the
>> +	 * aliasing rules. Userspace is not able to guess the offset address of
>> +	 * kernel kmalloc()ed memory area.
>> +	 */
>> +	if (addr)
>> +		return -EINVAL;
>
> Can we relax this so that if the address is correctly aligned, it will
> allow it?

My previous patch had it relaxed, but after some more thought I removed
it again in this v2 version.

The idea behind it is good, but I see a huge disadvantage in allowing
correctly aligned addresses: People usually develop their code on x86,
which has no such alignment requirements, as it just needs to be PAGE_SIZE aligned.
So their code will always work fine on x86, but as soon as the same code
is built on other platforms it will break. As you know, on parisc it's pure luck
if the program chooses an address which is correctly aligned.
I'm one of the Debian maintainers for parisc, and I've seen similar
mmap issues in other programs as well. Every time I've found it to be wrong,
you have to explain to the developers what's wrong and sometimes it's
not easy to fix it.
So, if we can keep people from assuming their code to be correct, I think
we can save a lot of additional work afterwards.
That said, I think it's better to be strict now, unless someone comes
up with a really good reason why it needs to be less strict.

> The reported issue with sqpoll-cancel-hang.t is due to it
> crashing because it's a weird syzbot thing that does mmap() with
> MAP_FIXED and an address given.

Ok, but nevertheless I think it's better to be strict.

Helge

* Re: [PATCH v2] io_uring: Adjust mapping wrt architecture aliasing requirements
  2023-02-16 16:33   ` Helge Deller
@ 2023-02-16 16:46     ` Jens Axboe
  2023-02-16 17:52       ` Helge Deller
  0 siblings, 1 reply; 8+ messages in thread
From: Jens Axboe @ 2023-02-16 16:46 UTC
  To: Helge Deller, io-uring, linux-parisc, John David Anglin

On 2/16/23 9:33 AM, Helge Deller wrote:
> On 2/16/23 17:11, Jens Axboe wrote:
>> On 2/16/23 1:09 AM, Helge Deller wrote:
>>> [...]
>>> +	/*
>>> +	 * Do not allow to map to user-provided address to avoid breaking the
>>> +	 * aliasing rules. Userspace is not able to guess the offset address of
>>> +	 * kernel kmalloc()ed memory area.
>>> +	 */
>>> +	if (addr)
>>> +		return -EINVAL;
>>
>> Can we relax this so that if the address is correctly aligned, it will
>> allow it?
>
> My previous patch had it relaxed, but after some more thought I removed
> it again in this v2 version.
>
> The idea behind it is good, but I see a huge disadvantage in allowing
> correctly aligned addresses: People usually develop their code on x86,
> which has no such alignment requirements, as it just needs to be PAGE_SIZE aligned.
> So their code will always work fine on x86, but as soon as the same code
> is built on other platforms it will break. As you know, on parisc it's pure luck
> if the program chooses an address which is correctly aligned.
> I'm one of the Debian maintainers for parisc, and I've seen similar
> mmap issues in other programs as well. Every time I've found it to be wrong,
> you have to explain to the developers what's wrong and sometimes it's
> not easy to fix it.
> So, if we can keep people from assuming their code to be correct, I think
> we can save a lot of additional work afterwards.
> That said, I think it's better to be strict now, unless someone comes
> up with a really good reason why it needs to be less strict.

I don't disagree with the reasoning at all, but the problem is that it
may introduce breakage if someone IS doing the right thing. Is it
guaranteed to be true? No, certainly not. But someone could very well be
writing perfectly portable code and mapping a ring into a specific
address, and this will now break.

AFAICT, this is actually the case with the syzbot case. In fact, with
the patch applied, it'll obviously start crashing on all archs as the
mmaps will now return -EINVAL rather than work.

-- 
Jens Axboe

* Re: [PATCH v2] io_uring: Adjust mapping wrt architecture aliasing requirements
  2023-02-16 16:46     ` Jens Axboe
@ 2023-02-16 17:52       ` Helge Deller
  2023-02-16 18:00         ` Jens Axboe
  0 siblings, 1 reply; 8+ messages in thread
From: Helge Deller @ 2023-02-16 17:52 UTC
  To: Jens Axboe, io-uring, linux-parisc, John David Anglin

On 2/16/23 17:46, Jens Axboe wrote:
> On 2/16/23 9:33 AM, Helge Deller wrote:
>> On 2/16/23 17:11, Jens Axboe wrote:
>>> On 2/16/23 1:09 AM, Helge Deller wrote:
>>>> [...]
>>>> +	/*
>>>> +	 * Do not allow to map to user-provided address to avoid breaking the
>>>> +	 * aliasing rules. Userspace is not able to guess the offset address of
>>>> +	 * kernel kmalloc()ed memory area.
>>>> +	 */
>>>> +	if (addr)
>>>> +		return -EINVAL;
>>>
>>> Can we relax this so that if the address is correctly aligned, it will
>>> allow it?
>>
>> My previous patch had it relaxed, but after some more thought I removed
>> it again in this v2 version.
>>
>> The idea behind it is good, but I see a huge disadvantage in allowing
>> correctly aligned addresses: People usually develop their code on x86,
>> which has no such alignment requirements, as it just needs to be PAGE_SIZE aligned.
>> So their code will always work fine on x86, but as soon as the same code
>> is built on other platforms it will break. As you know, on parisc it's pure luck
>> if the program chooses an address which is correctly aligned.
>> I'm one of the Debian maintainers for parisc, and I've seen similar
>> mmap issues in other programs as well. Every time I've found it to be wrong,
>> you have to explain to the developers what's wrong and sometimes it's
>> not easy to fix it.
>> So, if we can keep people from assuming their code to be correct, I think
>> we can save a lot of additional work afterwards.
>> That said, I think it's better to be strict now, unless someone comes
>> up with a really good reason why it needs to be less strict.
>
> I don't disagree with the reasoning at all, but the problem is that it
> may introduce breakage if someone IS doing the right thing. Is it
> guaranteed to be true? No, certainly not. But someone could very well be
> writing perfectly portable code and mapping a ring into a specific
> address, and this will now break.

We will find out if there are such users if we keep it strict now and
open it up if it's really necessary. If you open it up now, you won't
be able to turn it stricter later.

> AFAICT, this is actually the case with the syzbot case. In fact, with
> the patch applied, it'll obviously start crashing on all archs as the
> mmaps will now return -EINVAL rather than work.

Yes, but it's not a real user and just an (invalid) testcase.
For that I think it's OK to just disable it.

Helge

* Re: [PATCH v2] io_uring: Adjust mapping wrt architecture aliasing requirements
  2023-02-16 17:52       ` Helge Deller
@ 2023-02-16 18:00         ` Jens Axboe
  0 siblings, 0 replies; 8+ messages in thread
From: Jens Axboe @ 2023-02-16 18:00 UTC
  To: Helge Deller, io-uring, linux-parisc, John David Anglin

On 2/16/23 10:52 AM, Helge Deller wrote:
> On 2/16/23 17:46, Jens Axboe wrote:
>> On 2/16/23 9:33 AM, Helge Deller wrote:
>>> On 2/16/23 17:11, Jens Axboe wrote:
>>>> On 2/16/23 1:09 AM, Helge Deller wrote:
>>>>> [...]
>>>>> +	/*
>>>>> +	 * Do not allow to map to user-provided address to avoid breaking the
>>>>> +	 * aliasing rules. Userspace is not able to guess the offset address of
>>>>> +	 * kernel kmalloc()ed memory area.
>>>>> +	 */
>>>>> +	if (addr)
>>>>> +		return -EINVAL;
>>>>
>>>> Can we relax this so that if the address is correctly aligned, it will
>>>> allow it?
>>>
>>> My previous patch had it relaxed, but after some more thought I removed
>>> it again in this v2 version.
>>>
>>> The idea behind it is good, but I see a huge disadvantage in allowing
>>> correctly aligned addresses: People usually develop their code on x86,
>>> which has no such alignment requirements, as it just needs to be PAGE_SIZE aligned.
>>> So their code will always work fine on x86, but as soon as the same code
>>> is built on other platforms it will break. As you know, on parisc it's pure luck
>>> if the program chooses an address which is correctly aligned.
>>> I'm one of the Debian maintainers for parisc, and I've seen similar
>>> mmap issues in other programs as well. Every time I've found it to be wrong,
>>> you have to explain to the developers what's wrong and sometimes it's
>>> not easy to fix it.
>>> So, if we can keep people from assuming their code to be correct, I think
>>> we can save a lot of additional work afterwards.
>>> That said, I think it's better to be strict now, unless someone comes
>>> up with a really good reason why it needs to be less strict.
>>
>> I don't disagree with the reasoning at all, but the problem is that it
>> may introduce breakage if someone IS doing the right thing. Is it
>> guaranteed to be true? No, certainly not. But someone could very well be
>> writing perfectly portable code and mapping a ring into a specific
>> address, and this will now break.
>
> We will find out if there are such users if we keep it strict now and
> open it up if it's really necessary. If you open it up now, you won't
> be able to turn it stricter later.

But it has been open up until now, that's the issue. And you're now
trying to make it stricter, which is indeed later...

>> AFAICT, this is actually the case with the syzbot case. In fact, with
>> the patch applied, it'll obviously start crashing on all archs as the
>> mmaps will now return -EINVAL rather than work.
>
> Yes, but it's not a real user and just an (invalid) testcase.
> For that I think it's OK to just disable it.

Totally agree, and I did just disable it, but that part of the test is
not invalid. I don't care about this particular test case, it's more of
a general concern.

-- 
Jens Axboe

* Re: [PATCH v2] io_uring: Adjust mapping wrt architecture aliasing requirements
  2023-02-16  8:09 [PATCH v2] io_uring: Adjust mapping wrt architecture aliasing requirements Helge Deller
  2023-02-16 16:11 ` Jens Axboe
@ 2023-06-27 14:14 ` Jiri Slaby
  2023-06-27 19:24   ` Helge Deller
  1 sibling, 1 reply; 8+ messages in thread
From: Jiri Slaby @ 2023-06-27 14:14 UTC
  To: Helge Deller, io-uring, Jens Axboe, linux-parisc, John David Anglin

On 16. 02. 23, 9:09, Helge Deller wrote:
> [...]
> +static unsigned long io_uring_mmu_get_unmapped_area(struct file *filp,
> +			unsigned long addr, unsigned long len,
> +			unsigned long pgoff, unsigned long flags)
> +{
> +	const unsigned long mmap_end = arch_get_mmap_end(addr, len, flags);
> +	struct vm_unmapped_area_info info;
> +	void *ptr;
> +
> +	/*
> +	 * Do not allow to map to user-provided address to avoid breaking the
> +	 * aliasing rules. Userspace is not able to guess the offset address of
> +	 * kernel kmalloc()ed memory area.
> +	 */
> +	if (addr)
> +		return -EINVAL;
> +
> +	ptr = io_uring_validate_mmap_request(filp, pgoff, len);
> +	if (IS_ERR(ptr))
> +		return -ENOMEM;
> +
> +	info.flags = VM_UNMAPPED_AREA_TOPDOWN;
> +	info.length = len;
> +	info.low_limit = max(PAGE_SIZE, mmap_min_addr);
> +	info.high_limit = arch_get_mmap_base(addr, current->mm->mmap_base);

Hi,

this breaks compat (x86_32) on x86_64 in 6.4. When you run most liburing
tests, you'll get ENOMEM, as this high_limit is something in 64-bit space...

> +#ifdef SHM_COLOUR
> +	info.align_mask = PAGE_MASK & (SHM_COLOUR - 1UL);
> +#else
> +	info.align_mask = PAGE_MASK & (SHMLBA - 1UL);
> +#endif
> +	info.align_offset = (unsigned long) ptr;
> +
> +	/*
> +	 * A failed mmap() very likely causes application failure,
> +	 * so fall back to the bottom-up function here. This scenario
> +	 * can happen with large stack limits and large mmap()
> +	 * allocations.
> +	 */
> +	addr = vm_unmapped_area(&info);

So the found addr here is > TASK_SIZE - len for 32-bit binaries. And
get_unmapped_area() returns ENOMEM.

> +	if (offset_in_page(addr)) {
> +		info.flags = 0;
> +		info.low_limit = TASK_UNMAPPED_BASE;
> +		info.high_limit = mmap_end;
> +		addr = vm_unmapped_area(&info);
> +	}
> +
> +	return addr;
> +}

Reverting the whole commit helps of course. Even this completely
incorrect hack helps:
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -3398,7 +3398,7 @@ static unsigned long io_uring_mmu_get_unmapped_area(struct file *filp,
 			unsigned long addr, unsigned long len,
 			unsigned long pgoff, unsigned long flags)
 {
-	const unsigned long mmap_end = arch_get_mmap_end(addr, len, flags);
+	const unsigned long mmap_end = in_32bit_syscall() ? task_size_32bit() : arch_get_mmap_end(addr, len, flags);
 	struct vm_unmapped_area_info info;
 	void *ptr;
 
@@ -3417,7 +3417,7 @@ static unsigned long io_uring_mmu_get_unmapped_area(struct file *filp,
 	info.flags = VM_UNMAPPED_AREA_TOPDOWN;
 	info.length = len;
 	info.low_limit = max(PAGE_SIZE, mmap_min_addr);
-	info.high_limit = arch_get_mmap_base(addr, current->mm->mmap_base);
+	info.high_limit = in_32bit_syscall() ? task_size_32bit() : arch_get_mmap_base(addr, current->mm->mmap_base);
 #ifdef SHM_COLOUR
 	info.align_mask = PAGE_MASK & (SHM_COLOUR - 1UL);
 #else

Any ideas? Note that the compat mmap apparently uses bottom-up expansion.
See:
	if (!in_32bit_syscall() && (flags & MAP_32BIT))
		goto bottomup;

in arch_get_unmapped_area_topdown().

thanks,
-- 
js
suse labs

* Re: [PATCH v2] io_uring: Adjust mapping wrt architecture aliasing requirements
  2023-06-27 14:14 ` Jiri Slaby
@ 2023-06-27 19:24   ` Helge Deller
  0 siblings, 0 replies; 8+ messages in thread
From: Helge Deller @ 2023-06-27 19:24 UTC
  To: Jiri Slaby, io-uring, Jens Axboe, linux-parisc, John David Anglin

On 6/27/23 16:14, Jiri Slaby wrote:
> On 16. 02. 23, 9:09, Helge Deller wrote:
>> [...]
>> +	info.flags = VM_UNMAPPED_AREA_TOPDOWN;
>> +	info.length = len;
>> +	info.low_limit = max(PAGE_SIZE, mmap_min_addr);
>> +	info.high_limit = arch_get_mmap_base(addr, current->mm->mmap_base);
>
> Hi,
>
> this breaks compat (x86_32) on x86_64 in 6.4. When you run most liburing tests, you'll get ENOMEM, as this high_limit is something in 64-bit space...
>
>> [...]
>> +	addr = vm_unmapped_area(&info);
>
> So the found addr here is > TASK_SIZE - len for 32-bit binaries. And get_unmapped_area() returns ENOMEM.
>
>> [...]
>
> Reverting the whole commit helps of course. Even this completely incorrect hack helps:
> --- a/io_uring/io_uring.c
> +++ b/io_uring/io_uring.c
> @@ -3398,7 +3398,7 @@ static unsigned long io_uring_mmu_get_unmapped_area(struct file *filp,
>  			unsigned long addr, unsigned long len,
>  			unsigned long pgoff, unsigned long flags)
>  {
> -	const unsigned long mmap_end = arch_get_mmap_end(addr, len, flags);
> +	const unsigned long mmap_end = in_32bit_syscall() ? task_size_32bit() : arch_get_mmap_end(addr, len, flags);
>  	struct vm_unmapped_area_info info;
>  	void *ptr;
>
> @@ -3417,7 +3417,7 @@ static unsigned long io_uring_mmu_get_unmapped_area(struct file *filp,
>  	info.flags = VM_UNMAPPED_AREA_TOPDOWN;
>  	info.length = len;
>  	info.low_limit = max(PAGE_SIZE, mmap_min_addr);
> -	info.high_limit = arch_get_mmap_base(addr, current->mm->mmap_base);
> +	info.high_limit = in_32bit_syscall() ? task_size_32bit() : arch_get_mmap_base(addr, current->mm->mmap_base);

I think your "incorrect hack" is actually correct.
If it's the compat case which breaks, then task_size_32bit() might be right.
Maybe adding arch_get_mmap_base() and arch_get_mmap_end() macros to handle
the compat case into arch/x86/include/asm/* would work, e.g.

#define arch_get_mmap_base(addr, base) \
	(in_32bit_syscall() ? task_size_32bit() : base)

?

Helge

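The compat failure and the effect of the suggested macro can be modelled
with a toy userspace calculation. Everything here is an assumption made
for illustration: the ex_* names are invented, the address space is treated
as empty, and the two constants are example values chosen to resemble a
32-bit task size and a 64-bit mmap base on x86. It is a sketch of the
reasoning in the two mails above, not of the real get_unmapped_area() path.

```c
#include <errno.h>
#include <stdbool.h>

/* Example values only; the real ones come from the kernel at runtime. */
#define EX_TASK_SIZE_32  0xFFFFE000ULL
#define EX_MMAP_BASE_64  0x00007F0000000000ULL
#define EX_PAGE_SIZE     4096ULL

/* Models the suggested macro: compat callers get the 32-bit task size. */
static unsigned long long ex_mmap_base(bool compat, unsigned long long base)
{
	return compat ? EX_TASK_SIZE_32 : base;
}

/* Toy top-down search in an empty address space: highest page-aligned fit. */
static unsigned long long ex_topdown(unsigned long long high_limit,
				     unsigned long long len)
{
	return (high_limit - len) & ~(EX_PAGE_SIZE - 1ULL);
}

/* Models the range check that turns an out-of-range address into -ENOMEM. */
static long long ex_range_check(unsigned long long addr,
				unsigned long long len,
				unsigned long long task_size)
{
	if (addr > task_size - len)
		return -ENOMEM;
	return 0;
}
```

Searching below the 64-bit base on behalf of a 32-bit caller yields an
address far above the 32-bit task size, so the range check fails. Using
the compat-aware limit keeps the result inside the 32-bit address space.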