public inbox for [email protected]
 help / color / mirror / Atom feed
From: Helge Deller <[email protected]>
To: Jens Axboe <[email protected]>,
	[email protected], [email protected],
	John David Anglin <[email protected]>,
	James Bottomley <[email protected]>
Subject: Re: io_uring failure on parisc with VIPT caches
Date: Wed, 15 Feb 2023 00:09:53 +0100	[thread overview]
Message-ID: <Y+wUwVxeN87gqN6o@p100> (raw)
In-Reply-To: <[email protected]>

* John David Anglin <[email protected]>:
> On 2023-02-13 5:05 p.m., Helge Deller wrote:
> > On 2/13/23 22:05, Jens Axboe wrote:
> > > On 2/13/23 1:59?PM, Helge Deller wrote:
> > > > > Yep sounds like it. What's the caching architecture of parisc?
> > > >
> > > > parisc is Virtually Indexed, Physically Tagged (VIPT).
> > >
> > > That's what I assumed, so virtual aliasing is what we're dealing with
> > > here.
> > >
> > > > Thanks for the patch!
> > > > Sadly it doesn't fix the problem, as the kernel still sees
> > > > ctx->rings->sq.tail as being 0.
> > > > Interestingly it worked once (not reproduceable) directly after bootup,
> > > > which indicates that we at least look at the right address from kernel side.
> > > >
> > > > So, still needs more debugging/testing.
> > >
> > > It's not like this is untested stuff, so yeah it'll generally be
> > > correct, it just seems that parisc is a bit odd in that the virtual
> > > aliasing occurs between the kernel and userspace addresses too. At least
> > > that's what it seems like.
> >
> > True.
> >
> > > But I wonder if what needs flushing is the user side, not the kernel
> > > side? Either that, or my patch is not flushing the right thing on the
> > > kernel side.


The patch below seems to fix the issue.

I've successfuly tested it with the io_uring-test testcase on
physical parisc machines with 32- and 64-bit 6.1.11 kernels.

The idea is similiar on how a file is mmapped shared by two
userspace processes by keeping the lower bits of the virtual address
the same.

Cache flushes from userspace don't seem to be needed.

I think similiar code is needed for mips (uses SHMLBA 0x40000) and
some other architectures....

Helge


From efde7ed7ad380a924448b8ab8ea30d52782aa8e6 Mon Sep 17 00:00:00 2001
From: Helge Deller <[email protected]>
Date: Tue, 14 Feb 2023 23:41:14 +0100
Subject: [PATCH] io_uring: DRAFT Fix io_uring on machines with VIPT caches

This is a DRAFT patch to fix io_uring to function on machines
with VIPT caches (like PA-RISC).
It will currently only compile on parisc, because of the usage
of the SHM_COLOUR constant.

Basic idea is to ensure that the page colour matches between the kernel
ring address and mmap'ed userspace address and by flushing the caches
before accessing the rings.

Signed-off-by: Helge Deller <[email protected]>

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 862e05e6691d..606e23671453 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2208,6 +2208,8 @@ static const struct io_uring_sqe *io_get_sqe(struct io_ring_ctx *ctx)
 	unsigned head, mask = ctx->sq_entries - 1;
 	unsigned sq_idx = ctx->cached_sq_head++ & mask;

+	flush_dcache_page(virt_to_page(ctx->sq_array + sq_idx));
+
 	/*
 	 * The cached sq head (or cq tail) serves two purposes:
 	 *
@@ -2238,6 +2240,9 @@ int io_submit_sqes(struct io_ring_ctx *ctx, unsigned int nr)
 	unsigned int left;
 	int ret;

+	struct io_rings *rings = ctx->rings;
+	flush_dcache_page(virt_to_page(rings));
+
 	if (unlikely(!entries))
 		return 0;
 	/* make sure SQ entry isn't read before tail */
@@ -3059,6 +3067,51 @@ static __cold int io_uring_mmap(struct file *file, struct vm_area_struct *vma)
 	return remap_pfn_range(vma, vma->vm_start, pfn, sz, vma->vm_page_prot);
 }

+unsigned long
+io_uring_mmu_get_unmapped_area(struct file *filp, unsigned long addr,
+				  unsigned long len, unsigned long pgoff,
+				  unsigned long flags)
+{
+	struct mm_struct *mm = current->mm;
+	const unsigned long mmap_end = arch_get_mmap_end(addr, len, flags);
+	struct vm_unmapped_area_info info;
+	void *ptr;
+
+	ptr = io_uring_validate_mmap_request(filp, pgoff, len);
+	if (IS_ERR(ptr))
+		return -ENOMEM;
+
+
+	/* we do not support requesting a specific address */
+	if (addr)
+		return -EINVAL;
+
+	info.flags = VM_UNMAPPED_AREA_TOPDOWN;
+	info.length = len;
+	info.low_limit = max(PAGE_SIZE, mmap_min_addr);
+	info.high_limit = arch_get_mmap_base(addr, mm->mmap_base);
+	info.align_mask = PAGE_MASK & (SHM_COLOUR - 1);
+	info.align_offset = (unsigned long)ptr & (SHM_COLOUR - 1);
+
+	addr = vm_unmapped_area(&info);
+
+	/*
+	 * A failed mmap() very likely causes application failure,
+	 * so fall back to the bottom-up function here. This scenario
+	 * can happen with large stack limits and large mmap()
+	 * allocations.
+	 */
+	if (offset_in_page(addr)) {
+		VM_BUG_ON(addr != -ENOMEM);
+		info.flags = 0;
+		info.low_limit = TASK_UNMAPPED_BASE;
+		info.high_limit = mmap_end;
+		addr = vm_unmapped_area(&info);
+	}
+
+	return addr;
+}
+
 #else /* !CONFIG_MMU */

 static int io_uring_mmap(struct file *file, struct vm_area_struct *vma)
@@ -3273,6 +3326,8 @@ static const struct file_operations io_uring_fops = {
 #ifndef CONFIG_MMU
 	.get_unmapped_area = io_uring_nommu_get_unmapped_area,
 	.mmap_capabilities = io_uring_nommu_mmap_capabilities,
+#else
+	.get_unmapped_area = io_uring_mmu_get_unmapped_area,
 #endif
 	.poll		= io_uring_poll,
 #ifdef CONFIG_PROC_FS
diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index 90b675c65b84..b8bc682ef240 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -204,6 +204,7 @@ static inline void io_ring_submit_lock(struct io_ring_ctx *ctx,
 static inline void io_commit_cqring(struct io_ring_ctx *ctx)
 {
 	/* order cqe stores with ring update */
+	flush_dcache_page(virt_to_page(ctx->rings));
 	smp_store_release(&ctx->rings->cq.tail, ctx->cached_cq_tail);
 }


  reply	other threads:[~2023-02-14 23:10 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-12  9:47 io_uring failure on parisc (32-bit userspace and 64-bit kernel) Helge Deller
2023-02-12 13:16 ` Jens Axboe
2023-02-12 13:28   ` Helge Deller
2023-02-12 13:35     ` Jens Axboe
2023-02-12 14:00       ` Jens Axboe
2023-02-12 14:03       ` Helge Deller
2023-02-12 19:35         ` Helge Deller
2023-02-12 19:42           ` Jens Axboe
2023-02-12 20:01             ` Helge Deller
2023-02-12 21:48               ` Jens Axboe
2023-02-12 22:20                 ` Helge Deller
2023-02-12 22:31                   ` Helge Deller
2023-02-13 16:15                     ` Jens Axboe
2023-02-13 20:59                       ` Helge Deller
2023-02-13 21:05                         ` Jens Axboe
2023-02-13 22:05                           ` Helge Deller
2023-02-13 22:50                             ` John David Anglin
2023-02-14 23:09                               ` Helge Deller [this message]
2023-02-14 23:29                                 ` io_uring failure on parisc with VIPT caches Jens Axboe
2023-02-15  2:12                                   ` John David Anglin
2023-02-15 15:16                                     ` Jens Axboe
2023-02-15 15:52                                       ` Helge Deller
2023-02-15 15:56                                         ` Jens Axboe
2023-02-15 16:02                                           ` Helge Deller
2023-02-15 16:04                                             ` Jens Axboe
2023-02-15 21:40                                               ` Helge Deller
2023-02-15 23:04                                                 ` Jens Axboe
2023-02-15 16:38                                           ` John David Anglin
2023-02-15 17:01                                             ` Jens Axboe
2023-02-15 19:00                                               ` Jens Axboe
2023-02-15 19:16                                                 ` Jens Axboe
2023-02-15 20:27                                                   ` John David Anglin
2023-02-15 20:37                                                     ` Jens Axboe
2023-02-15 21:06                                                       ` John David Anglin
2023-02-15 21:38                                                         ` Jens Axboe
2023-02-15 21:39                                                         ` John David Anglin
2023-02-15 22:10                                                           ` John David Anglin
2023-02-15 23:02                                                             ` Jens Axboe
2023-02-15 23:43                                                               ` John David Anglin
2023-02-16  2:40                                                               ` John David Anglin
2023-02-16  2:50                                                                 ` Jens Axboe
2023-02-16  8:24                                                                   ` Helge Deller
2023-02-16 15:22                                                                     ` Jens Axboe
2023-02-16 20:35                                                                     ` John David Anglin
2023-02-15 23:03                                                           ` Jens Axboe
2023-02-15 19:20                                                 ` John David Anglin
2023-02-15 19:24                                                   ` Jens Axboe
2023-02-15 16:18                                         ` John David Anglin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y+wUwVxeN87gqN6o@p100 \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox