public inbox for io-uring@vger.kernel.org
From: Jens Axboe <axboe@kernel.dk>
To: io-uring@vger.kernel.org,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Subject: Re: [PATCH] io_uring: take page references for NOMMU pbuf_ring mmaps
Date: Tue, 21 Apr 2026 19:56:13 -0600	[thread overview]
Message-ID: <f1b43e56-4724-4635-b18b-bae2add37936@kernel.dk> (raw)
In-Reply-To: <dec29d85-9e79-42df-ae3d-9af65134283c@kernel.dk>

On 4/21/26 7:17 PM, Jens Axboe wrote:
> On 4/21/26 11:39 AM, Jens Axboe wrote:
>>
>> On Tue, 21 Apr 2026 15:46:16 +0200, Greg Kroah-Hartman wrote:
>>> Under !CONFIG_MMU, io_uring_get_unmapped_area() returns the kernel
>>> virtual address of the io_mapped_region's backing pages directly;
>>> the user's VMA aliases the kernel allocation. io_uring_mmap() then
>>> just returns 0 -- it takes no page references.
>>>
>>> The CONFIG_MMU path uses vm_insert_pages(), which takes a reference on
>>> each inserted page.  Those references are released when the VMA is torn
>>> down (zap_pte_range -> put_page). io_free_region() -> release_pages()
>>> drops the io_uring-side references, but the pages survive until munmap
>>> drops the VMA-side references.
>>>
>>> [...]
>>
>> Applied, thanks!
>>
>> [1/1] io_uring: take page references for NOMMU pbuf_ring mmaps
>>       commit: d9b7b3d9c5286a786c7fe8220c55a6e012088c2e
> 
> Actually, I take that back - what prevents the io_mmap_get_region()
> in the newly added io_uring_nommu_vm_close() from getting the same
> region that we initially referenced the pages from in the nommu
> variant of io_uring_mmap()?

I think we can get rid of that and simplify the code at the same
time. Rather than needing to re-look up the buffer list, we can just
iterate over the pages mapped in the vma. Since this is a file-backed
mapping and io_uring doesn't allow remaps, that should always be the
same set of pages.

Greg, can you test this? I will fold this in.


diff --git a/io_uring/memmap.c b/io_uring/memmap.c
index 6818e9abf3b3..e80f9eed6efc 100644
--- a/io_uring/memmap.c
+++ b/io_uring/memmap.c
@@ -367,45 +367,18 @@ unsigned long io_uring_get_unmapped_area(struct file *filp, unsigned long addr,
 #else /* !CONFIG_MMU */
 
 /*
- * Under NOMMU, get_unmapped_area returns the kernel virtual address of
- * the io_mapped_region's backing pages directly -- the user's VMA
- * aliases the kernel allocation rather than holding its own copy or
- * page-table entries. The CONFIG_MMU path's vm_insert_pages() takes
- * page references that survive until munmap; this path takes none, so
- * io_unregister_pbuf_ring() -> io_free_region() -> release_pages()
- * frees the pages while the user's VMA still maps them. The user can
- * then write into whatever the buddy allocator hands out next.
- *
- * Mirror the MMU lifetime by taking page references in io_uring_mmap()
- * and releasing them in vm_ops->close. We re-derive the region from
- * vm_pgoff (same lookup get_unmapped_area used) so we know which pages
- * to grab.
+ * Drop the pages that were initially referenced and added in
+ * io_uring_mmap(). We cannot have had a mremap() as that isn't supported,
+ * hence the vma should be identical to the one we initially referenced and
+ * mapped, and partial unmaps and splitting isn't possible on a file backed
+ * mapping.
  */
-
 static void io_uring_nommu_vm_close(struct vm_area_struct *vma)
 {
-	struct io_ring_ctx *ctx = vma->vm_file->private_data;
-	struct io_mapped_region *region;
-	unsigned long i;
+	unsigned long index;
 
-	guard(mutex)(&ctx->mmap_lock);
-	region = io_mmap_get_region(ctx, vma->vm_pgoff);
-	/*
-	 * The region may have been unregistered (memset to zero in
-	 * io_free_region()) between mmap and munmap. The page refs we
-	 * took in io_uring_mmap() are what kept the pages alive; release
-	 * them via the VMA range since the region->pages array is gone.
-	 */
-	if (region && region->pages) {
-		for (i = 0; i < region->nr_pages; i++)
-			put_page(region->pages[i]);
-	} else {
-		/* Region cleared; walk the VMA range. */
-		unsigned long a;
-
-		for (a = vma->vm_start; a < vma->vm_end; a += PAGE_SIZE)
-			put_page(virt_to_page((void *)a));
-	}
+	for (index = vma->vm_start; index < vma->vm_end; index += PAGE_SIZE)
+		put_page(virt_to_page((void *) index));
 }
 
 static const struct vm_operations_struct io_uring_nommu_vm_ops = {

-- 
Jens Axboe


Thread overview: 15+ messages
2026-04-21 13:46 [PATCH] io_uring: take page references for NOMMU pbuf_ring mmaps Greg Kroah-Hartman
2026-04-21 13:50 ` Jens Axboe
2026-04-21 13:55   ` Greg Kroah-Hartman
2026-04-21 14:02     ` Jens Axboe
2026-04-21 16:01     ` Greg Kroah-Hartman
2026-04-21 16:05       ` Jens Axboe
2026-04-21 16:21         ` Jens Axboe
2026-04-21 16:24           ` Greg Kroah-Hartman
2026-04-21 16:41             ` Jens Axboe
2026-04-21 17:04               ` Jens Axboe
2026-04-21 17:38                 ` Jens Axboe
2026-04-21 17:39 ` Jens Axboe
2026-04-22  1:17   ` Jens Axboe
2026-04-22  1:56     ` Jens Axboe [this message]
2026-04-22  2:26       ` Jens Axboe
