From: John Hubbard <jhubbard@nvidia.com>
To: David Hildenbrand <david@redhat.com>, linux-kernel@vger.kernel.org
Cc: Alexander Potapenko <glider@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
Brendan Jackman <jackmanb@google.com>,
Christoph Lameter <cl@gentwo.org>,
Dennis Zhou <dennis@kernel.org>,
Dmitry Vyukov <dvyukov@google.com>,
dri-devel@lists.freedesktop.org, intel-gfx@lists.freedesktop.org,
iommu@lists.linux.dev, io-uring@vger.kernel.org,
Jason Gunthorpe <jgg@nvidia.com>, Jens Axboe <axboe@kernel.dk>,
Johannes Weiner <hannes@cmpxchg.org>,
kasan-dev@googlegroups.com, kvm@vger.kernel.org,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
linux-arm-kernel@axis.com, linux-arm-kernel@lists.infradead.org,
linux-crypto@vger.kernel.org, linux-ide@vger.kernel.org,
linux-kselftest@vger.kernel.org, linux-mips@vger.kernel.org,
linux-mmc@vger.kernel.org, linux-mm@kvack.org,
linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org,
linux-scsi@vger.kernel.org,
Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
Marco Elver <elver@google.com>,
Marek Szyprowski <m.szyprowski@samsung.com>,
Michal Hocko <mhocko@suse.com>, Mike Rapoport <rppt@kernel.org>,
Muchun Song <muchun.song@linux.dev>,
netdev@vger.kernel.org, Oscar Salvador <osalvador@suse.de>,
Peter Xu <peterx@redhat.com>, Robin Murphy <robin.murphy@arm.com>,
Suren Baghdasaryan <surenb@google.com>, Tejun Heo <tj@kernel.org>,
virtualization@lists.linux.dev, Vlastimil Babka <vbabka@suse.cz>,
wireguard@lists.zx2c4.com, x86@kernel.org,
Zi Yan <ziy@nvidia.com>
Subject: Re: [PATCH v2 19/37] mm/gup: remove record_subpages()
Date: Sat, 6 Sep 2025 22:14:19 -0700
Message-ID: <0a28adde-acaf-4d55-96ba-c32d6113285f@nvidia.com>
In-Reply-To: <85e760cf-b994-40db-8d13-221feee55c60@redhat.com>
On 9/5/25 11:56 PM, David Hildenbrand wrote:
> On 06.09.25 03:05, John Hubbard wrote:
>> On 9/1/25 8:03 AM, David Hildenbrand wrote:
...
> Well, there is a lot I dislike about record_subpages() to go back there.
> Starting with "as Willy keeps explaining, the concept of subpages does
> not exist" and ending with "why do we fill out the array even on failure".
>
> :)
I am also very glad to see the entire concept of subpages disappear.
>>
>> Now it's been returned to its original, cryptic form.
>>
>
> The code in the caller was so uncryptic that both me and Lorenzo missed
> that magical addition. :P
>
>> Just my take on it, for whatever that's worth. :)
>
> As always, appreciated.
>
> I could of course keep the simple loop in some "record_folio_pages"
> function and clean up what I dislike about record_subpages().
>
> But I much rather want the call chain to be cleaned up instead, if
> possible.
>
Right! The primary way that record_subpages() helped was in showing
what was going on: a function call helps a lot to self-document,
sometimes.
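
To make the self-documenting point concrete, here is a minimal userspace sketch of what a small "record_folio_pages"-style helper could look like. It is only an illustration: struct sketch_page and sketch_record_folio_pages() are hypothetical stand-ins, not kernel types or functions, and the caller is assumed to invoke the helper only after every check has passed, so the array is never filled on failure.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; not the kernel type. */
struct sketch_page {
	int id;
};

/*
 * Hypothetical helper: store nr_pages consecutive page pointers, starting
 * at 'first', into 'pages'. Assumes the pages are contiguous in memory and
 * that the caller has already validated the range.
 */
static unsigned long sketch_record_folio_pages(struct sketch_page *first,
					       unsigned long nr_pages,
					       struct sketch_page **pages)
{
	unsigned long i;

	assert(first && pages);

	for (i = 0; i < nr_pages; i++)
		pages[i] = first + i;

	/* Returning the count lets the caller advance its own cursor. */
	return nr_pages;
}
```

The name at the call site says what the loop does, which is the main benefit being discussed here.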
>
> Roughly, what I am thinking (limiting it to pte+pmd case) about is the
> following:
The code below looks much cleaner, that's great!
thanks,
--
John Hubbard
>
>
> From d6d6d21dbf435d8030782a627175e36e6c7b2dfb Mon Sep 17 00:00:00 2001
> From: David Hildenbrand <david@redhat.com>
> Date: Sat, 6 Sep 2025 08:33:42 +0200
> Subject: [PATCH] tmp
>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
> mm/gup.c | 79 ++++++++++++++++++++++++++------------------------------
> 1 file changed, 36 insertions(+), 43 deletions(-)
>
> diff --git a/mm/gup.c b/mm/gup.c
> index 22420f2069ee1..98907ead749c0 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -2845,12 +2845,11 @@ static void __maybe_unused gup_fast_undo_dev_pagemap(int *nr, int nr_start,
> * also check pmd here to make sure pmd doesn't change (corresponds to
> * pmdp_collapse_flush() in the THP collapse code path).
> */
> -static int gup_fast_pte_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr,
> - unsigned long end, unsigned int flags, struct page **pages,
> - int *nr)
> +static unsigned long gup_fast_pte_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr,
> + unsigned long end, unsigned int flags, struct page **pages)
> {
> struct dev_pagemap *pgmap = NULL;
> - int ret = 0;
> + unsigned long nr_pages = 0;
> pte_t *ptep, *ptem;
>
> ptem = ptep = pte_offset_map(&pmd, addr);
> @@ -2908,24 +2907,20 @@ static int gup_fast_pte_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr,
> * details.
> */
> if (flags & FOLL_PIN) {
> - ret = arch_make_folio_accessible(folio);
> - if (ret) {
> + if (arch_make_folio_accessible(folio)) {
> gup_put_folio(folio, 1, flags);
> goto pte_unmap;
> }
> }
> folio_set_referenced(folio);
> - pages[*nr] = page;
> - (*nr)++;
> + pages[nr_pages++] = page;
> } while (ptep++, addr += PAGE_SIZE, addr != end);
>
> - ret = 1;
> -
> pte_unmap:
> if (pgmap)
> put_dev_pagemap(pgmap);
> pte_unmap(ptem);
> - return ret;
> + return nr_pages;
> }
> #else
>
> @@ -2938,21 +2933,24 @@ static int gup_fast_pte_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr,
> * get_user_pages_fast_only implementation that can pin pages. Thus it's still
> * useful to have gup_fast_pmd_leaf even if we can't operate on ptes.
> */
> -static int gup_fast_pte_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr,
> - unsigned long end, unsigned int flags, struct page **pages,
> - int *nr)
> +static unsigned long gup_fast_pte_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr,
> + unsigned long end, unsigned int flags, struct page **pages)
> {
> return 0;
> }
> #endif /* CONFIG_ARCH_HAS_PTE_SPECIAL */
>
> -static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr,
> - unsigned long end, unsigned int flags, struct page **pages,
> - int *nr)
> +static unsigned long gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr,
> + unsigned long end, unsigned int flags, struct page **pages)
> {
> + const unsigned long nr_pages = (end - addr) >> PAGE_SHIFT;
> struct page *page;
> struct folio *folio;
> - int refs;
> + unsigned long i;
> +
> + /* See gup_fast_pte_range() */
> + if (pmd_protnone(orig))
> + return 0;
>
> if (!pmd_access_permitted(orig, flags & FOLL_WRITE))
> return 0;
> @@ -2960,33 +2958,30 @@ static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr,
> if (pmd_special(orig))
> return 0;
>
> - refs = (end - addr) >> PAGE_SHIFT;
> page = pmd_page(orig) + ((addr & ~PMD_MASK) >> PAGE_SHIFT);
>
> - folio = try_grab_folio_fast(page, refs, flags);
> + folio = try_grab_folio_fast(page, nr_pages, flags);
> if (!folio)
> return 0;
>
> if (unlikely(pmd_val(orig) != pmd_val(*pmdp))) {
> - gup_put_folio(folio, refs, flags);
> + gup_put_folio(folio, nr_pages, flags);
> return 0;
> }
>
> if (!gup_fast_folio_allowed(folio, flags)) {
> - gup_put_folio(folio, refs, flags);
> + gup_put_folio(folio, nr_pages, flags);
> return 0;
> }
> if (!pmd_write(orig) && gup_must_unshare(NULL, flags, &folio->page)) {
> - gup_put_folio(folio, refs, flags);
> + gup_put_folio(folio, nr_pages, flags);
> return 0;
> }
>
> - pages += *nr;
> - *nr += refs;
> - for (; refs; refs--)
> + for (i = 0; i < nr_pages; i++)
> *(pages++) = page++;
> folio_set_referenced(folio);
> - return 1;
> + return nr_pages;
> }
>
> static int gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr,
> @@ -3033,11 +3028,11 @@ static int gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr,
> return 1;
> }
>
> -static int gup_fast_pmd_range(pud_t *pudp, pud_t pud, unsigned long addr,
> - unsigned long end, unsigned int flags, struct page **pages,
> - int *nr)
> +static unsigned long gup_fast_pmd_range(pud_t *pudp, pud_t pud, unsigned long addr,
> + unsigned long end, unsigned int flags, struct page **pages)
> {
> - unsigned long next;
> + unsigned long cur_nr_pages, next;
> + unsigned long nr_pages = 0;
> pmd_t *pmdp;
>
> pmdp = pmd_offset_lockless(pudp, pud, addr);
> @@ -3046,23 +3041,21 @@ static int gup_fast_pmd_range(pud_t *pudp, pud_t pud, unsigned long addr,
>
> next = pmd_addr_end(addr, end);
> if (!pmd_present(pmd))
> - return 0;
> + break;
>
> - if (unlikely(pmd_leaf(pmd))) {
> - /* See gup_fast_pte_range() */
> - if (pmd_protnone(pmd))
> - return 0;
> + if (unlikely(pmd_leaf(pmd)))
> + cur_nr_pages = gup_fast_pmd_leaf(pmd, pmdp, addr, next, flags, pages);
> + else
> + cur_nr_pages = gup_fast_pte_range(pmd, pmdp, addr, next, flags, pages);
>
> - if (!gup_fast_pmd_leaf(pmd, pmdp, addr, next, flags,
> - pages, nr))
> - return 0;
> + nr_pages += cur_nr_pages;
> + pages += cur_nr_pages;
>
> - } else if (!gup_fast_pte_range(pmd, pmdp, addr, next, flags,
> - pages, nr))
> - return 0;
> + if (cur_nr_pages != (next - addr) >> PAGE_SHIFT)
> + break;
> } while (pmdp++, addr = next, addr != end);
>
> - return 1;
> + return nr_pages;
> }
>
> static int gup_fast_pud_range(p4d_t *p4dp, p4d_t p4d, unsigned long addr,
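
For readers following along, the calling convention in the quoted sketch (each level returns how many pages it recorded, and the caller stops on the first short count) can be modeled in plain userspace C. This is a simplified illustration under stated assumptions, not the kernel code: every sketch_* name is hypothetical, "pages" are just addresses, ranges are page-aligned, and chunking is a plain addr + chunk step rather than pmd_addr_end().

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SHIFT 12UL	/* assumed 4 KiB pages for the model */

/*
 * Hypothetical leaf handler: records one entry per page in [addr, next)
 * and stops early when it reaches 'fail_at'. Returns the number recorded.
 */
static unsigned long sketch_leaf(unsigned long addr, unsigned long next,
				 unsigned long fail_at, unsigned long *out)
{
	unsigned long n = 0;

	for (; addr != next; addr += 1UL << SKETCH_PAGE_SHIFT) {
		if (addr == fail_at)
			break;
		out[n++] = addr;
	}
	return n;
}

/*
 * Hypothetical range walker: accumulates per-chunk counts and stops as
 * soon as a chunk comes back short. The comparison uses the count of the
 * current chunk, not the running total.
 */
static unsigned long sketch_range(unsigned long addr, unsigned long end,
				  unsigned long chunk, unsigned long fail_at,
				  unsigned long *out)
{
	unsigned long nr = 0, cur, next;

	assert(chunk && addr < end);

	do {
		next = (end - addr > chunk) ? addr + chunk : end;
		cur = sketch_leaf(addr, next, fail_at, out);

		nr += cur;
		out += cur;

		if (cur != (next - addr) >> SKETCH_PAGE_SHIFT)
			break;
	} while (addr = next, addr != end);

	return nr;
}
```

With this shape the caller needs neither an int *nr out-parameter nor a separate success flag: a short total is the failure signal, and the entries recorded before the failure remain valid.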