From: Helge Deller <[email protected]>
To: Jens Axboe <[email protected]>,
John David Anglin <[email protected]>,
[email protected], [email protected],
James Bottomley <[email protected]>
Subject: Re: io_uring failure on parisc with VIPT caches
Date: Wed, 15 Feb 2023 17:02:11 +0100
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>
On 2/15/23 16:56, Jens Axboe wrote:
> On 2/15/23 8:52 AM, Helge Deller wrote:
>> On 2/15/23 16:16, Jens Axboe wrote:
>>> On 2/14/23 7:12 PM, John David Anglin wrote:
>>>> On 2023-02-14 6:29 p.m., Jens Axboe wrote:
>>>>> On 2/14/23 4:09 PM, Helge Deller wrote:
>>>>>> * John David Anglin<[email protected]>:
>>>>>>> On 2023-02-13 5:05 p.m., Helge Deller wrote:
>>>>>>>> On 2/13/23 22:05, Jens Axboe wrote:
>>>>>>>>> On 2/13/23 1:59 PM, Helge Deller wrote:
>>>>>>>>>>> Yep sounds like it. What's the caching architecture of parisc?
>>>>>>>>>> parisc is Virtually Indexed, Physically Tagged (VIPT).
>>>>>>>>> That's what I assumed, so virtual aliasing is what we're dealing with
>>>>>>>>> here.
>>>>>>>>>
>>>>>>>>>> Thanks for the patch!
>>>>>>>>>> Sadly it doesn't fix the problem, as the kernel still sees
>>>>>>>>>> ctx->rings->sq.tail as being 0.
>>>>>>>>>> Interestingly it worked once (not reproducible) directly after bootup,
>>>>>>>>>> which indicates that we at least look at the right address from the kernel side.
>>>>>>>>>>
>>>>>>>>>> So, still needs more debugging/testing.
>>>>>>>>> It's not like this is untested stuff, so yeah it'll generally be
>>>>>>>>> correct, it just seems that parisc is a bit odd in that the virtual
>>>>>>>>> aliasing occurs between the kernel and userspace addresses too. At least
>>>>>>>>> that's what it seems like.
>>>>>>>> True.
>>>>>>>>
>>>>>>>>> But I wonder if what needs flushing is the user side, not the kernel
>>>>>>>>> side? Either that, or my patch is not flushing the right thing on the
>>>>>>>>> kernel side.
>>>>>> The patch below seems to fix the issue.
>>>>>>
>>>>>> I've successfully tested it with the io_uring-test testcase on
>>>>>> physical parisc machines with 32- and 64-bit 6.1.11 kernels.
>>>>>>
>>>>>> The idea is similar to how a file is mmapped shared by two
>>>>>> userspace processes: the lower bits of the virtual address are
>>>>>> kept the same.
>>>>>>
>>>>>> Cache flushes from userspace don't seem to be needed.
>>>>> Are they from the kernel side, if the lower bits mean we end up
>>>>> with the same coloring? Because I think this is a bit of a big
>>>>> hammer, in terms of overhead for flushing. As an example, on arm64
>>>>> that is perfectly fine with the existing code, it's about a 20-25%
>>>>> performance hit.
>>>>
>>>> The io_uring-test testcase still works on rp3440 with the kernel
>>>> flushes removed.
>>>
>>> That's what I suspected, the important bit here is just aligning it for
>>> identical coloring. Can you confirm if the below works for you? Had to
>>> fiddle it a bit to get it to work without coloring.
>>
>> Yes, the patch works for me on 32- and 64-bit, even with PA8900 CPUs...
>>
>> Is there a more detailed testcase somewhere that I could try too?
>
> Just git clone liburing:
>
> git clone git://git.kernel.dk/liburing
>
> and run make && make runtests in there, that'll go through the whole
> regression suite.
Thanks!
I'll test.
>> Some nits below...
>>
>>> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
>>> index db623b3185c8..1d4562067949 100644
>>> --- a/io_uring/io_uring.c
>>> +++ b/io_uring/io_uring.c
>>> @@ -72,6 +72,7 @@
>>> #include <linux/io_uring.h>
>>> #include <linux/audit.h>
>>> #include <linux/security.h>
>>> +#include <asm/shmparam.h>
>>>
>>> #define CREATE_TRACE_POINTS
>>> #include <trace/events/io_uring.h>
>>> @@ -3200,6 +3201,51 @@ static __cold int io_uring_mmap(struct file *file, struct vm_area_struct *vma)
>>> return remap_pfn_range(vma, vma->vm_start, pfn, sz, vma->vm_page_prot);
>>> }
>>>
>>> +static unsigned long io_uring_mmu_get_unmapped_area(struct file *filp,
>>> +			unsigned long addr, unsigned long len,
>>> +			unsigned long pgoff, unsigned long flags)
>>> +{
>>> +	const unsigned long mmap_end = arch_get_mmap_end(addr, len, flags);
>>> +	struct vm_unmapped_area_info info;
>>> +	void *ptr;
>>> +
>>> +	ptr = io_uring_validate_mmap_request(filp, pgoff, len);
>>> +	if (IS_ERR(ptr))
>>> +		return -ENOMEM;
>>> +
>>> +	/* we do not support requesting a specific address */
>>> +	if (addr)
>>> +		return -EINVAL;
>>
>> With this ^ we disallow users from providing a proposed address.
>> I think this is ok and I suggest keeping it that way.
>>
>> Alternatively one could check the given address against the
>> alignment which is calculated below, but this would IMHO make the
>> code unnecessarily bigger.
>
> liburing won't provide an address, so I'd say let's just keep it as-is.
Good.
>>> +
>>> +	info.flags = VM_UNMAPPED_AREA_TOPDOWN;
>>> +	info.length = len;
>>> +	info.low_limit = max(PAGE_SIZE, mmap_min_addr);
>>> +	info.high_limit = arch_get_mmap_base(addr, current->mm->mmap_base);
>>> +	info.align_mask = PAGE_MASK;
>>> +	info.align_offset = (unsigned long) ptr;
>>
>> For parisc I introduced SHM_COLOUR because it allows userspace
>> to map a shared file initially at any PAGE_SIZE-aligned address.
>> Only when a second user then maps the same file is the aliasing enforced.
>>
>> Other platforms just have SHMLBA, and for some of them SHMLBA is > PAGE_SIZE.
>> So, instead of the above code, this untested code might be better for those
>> other platforms?
>>	info.align_mask = PAGE_MASK & (SHMLBA - 1);
>>	info.align_offset = (unsigned long)ptr & (SHMLBA - 1);
>
> Yeah, I did peek at SHMLBA as well and it seems more common. Could you
> test that and send out a "real" patch so we can get it queued up?
Sure, will do.
Helge
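
For readers following the thread, the colouring argument above can be
sketched as a small stand-alone C fragment. This is only an illustration
under stated assumptions, not the kernel's code: the EX_PAGE_SIZE and
EX_SHMLBA values and the helpers ex_colour(), ex_pick_topdown() and
ex_self_check() are hypothetical names made up for this example. The
subtraction shows one way a top-down search could apply an
align_mask/align_offset pair, assuming page-aligned inputs and an
otherwise free address range.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative values only (assumptions, not taken from any arch header). */
#define EX_PAGE_SIZE 0x1000ULL              /* 4 KiB pages */
#define EX_PAGE_MASK (~(EX_PAGE_SIZE - 1))  /* clears the in-page offset bits */
#define EX_SHMLBA    0x400000ULL            /* 4 MiB aliasing granule */

/* The "colour" is the set of address bits between page size and granule. */
static uint64_t ex_colour(uint64_t addr)
{
	return addr & EX_PAGE_MASK & (EX_SHMLBA - 1);
}

/*
 * Pick the highest start address at or below (high_limit - len) whose
 * colour matches that of the kernel address kaddr. All three inputs are
 * assumed to be page aligned.
 */
static uint64_t ex_pick_topdown(uint64_t high_limit, uint64_t len,
				uint64_t kaddr)
{
	uint64_t align_mask = EX_PAGE_MASK & (EX_SHMLBA - 1);
	uint64_t align_offset = kaddr & (EX_SHMLBA - 1);
	uint64_t addr = high_limit - len;

	/* Step down until the colour bits equal those of align_offset. */
	addr -= (addr - align_offset) & align_mask;
	return addr;
}

/* Hypothetical self-check: the chosen user address shares kaddr's colour. */
static void ex_self_check(void)
{
	uint64_t kaddr = 0x40123000ULL;
	uint64_t uaddr = ex_pick_topdown(0xf0000000ULL, 0x2000ULL, kaddr);

	assert(ex_colour(uaddr) == ex_colour(kaddr));
	assert(uaddr + 0x2000ULL <= 0xf0000000ULL);
}
```

With these example values the chosen user address is 0xefd23000, whose
bits 12..21 equal those of the kernel pointer 0x40123000, so both
mappings would index the same cache lines and no extra flushing is
needed to keep them coherent.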