public inbox for io-uring@vger.kernel.org
 help / color / mirror / Atom feed
From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
To: David Hildenbrand <david@redhat.com>
Cc: linux-kernel@vger.kernel.org,
	Alexander Potapenko <glider@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Brendan Jackman <jackmanb@google.com>,
	Christoph Lameter <cl@gentwo.org>,
	Dennis Zhou <dennis@kernel.org>,
	Dmitry Vyukov <dvyukov@google.com>,
	dri-devel@lists.freedesktop.org, intel-gfx@lists.freedesktop.org,
	iommu@lists.linux.dev, io-uring@vger.kernel.org,
	Jason Gunthorpe <jgg@nvidia.com>, Jens Axboe <axboe@kernel.dk>,
	Johannes Weiner <hannes@cmpxchg.org>,
	John Hubbard <jhubbard@nvidia.com>,
	kasan-dev@googlegroups.com, kvm@vger.kernel.org,
	"Liam R. Howlett" <Liam.Howlett@oracle.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	linux-arm-kernel@axis.com, linux-arm-kernel@lists.infradead.org,
	linux-crypto@vger.kernel.org, linux-ide@vger.kernel.org,
	linux-kselftest@vger.kernel.org, linux-mips@vger.kernel.org,
	linux-mmc@vger.kernel.org, linux-mm@kvack.org,
	linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org,
	linux-scsi@vger.kernel.org, Marco Elver <elver@google.com>,
	Marek Szyprowski <m.szyprowski@samsung.com>,
	Michal Hocko <mhocko@suse.com>, Mike Rapoport <rppt@kernel.org>,
	Muchun Song <muchun.song@linux.dev>,
	netdev@vger.kernel.org, Oscar Salvador <osalvador@suse.de>,
	Peter Xu <peterx@redhat.com>, Robin Murphy <robin.murphy@arm.com>,
	Suren Baghdasaryan <surenb@google.com>, Tejun Heo <tj@kernel.org>,
	virtualization@lists.linux.dev, Vlastimil Babka <vbabka@suse.cz>,
	wireguard@lists.zx2c4.com, x86@kernel.org,
	Zi Yan <ziy@nvidia.com>
Subject: Re: [PATCH v2 19/37] mm/gup: remove record_subpages()
Date: Fri, 5 Sep 2025 12:34:39 +0100	[thread overview]
Message-ID: <b7544f6d-beac-46af-aa43-27da6d96467e@lucifer.local> (raw)
In-Reply-To: <5090355d-546a-4d06-99e1-064354d156b5@redhat.com>

On Fri, Sep 05, 2025 at 08:41:23AM +0200, David Hildenbrand wrote:
> On 01.09.25 17:03, David Hildenbrand wrote:
> > We can just cleanup the code by calculating the #refs earlier,
> > so we can just inline what remains of record_subpages().
> >
> > Calculate the number of references/pages ahead of times, and record them
> > only once all our tests passed.
> >
> > Signed-off-by: David Hildenbrand <david@redhat.com>

So strange I thought I looked at this...!

> > ---
> >   mm/gup.c | 25 ++++++++-----------------
> >   1 file changed, 8 insertions(+), 17 deletions(-)
> >
> > diff --git a/mm/gup.c b/mm/gup.c
> > index c10cd969c1a3b..f0f4d1a68e094 100644
> > --- a/mm/gup.c
> > +++ b/mm/gup.c
> > @@ -484,19 +484,6 @@ static inline void mm_set_has_pinned_flag(struct mm_struct *mm)
> >   #ifdef CONFIG_MMU
> >   #ifdef CONFIG_HAVE_GUP_FAST
> > -static int record_subpages(struct page *page, unsigned long sz,
> > -			   unsigned long addr, unsigned long end,
> > -			   struct page **pages)
> > -{
> > -	int nr;
> > -
> > -	page += (addr & (sz - 1)) >> PAGE_SHIFT;
> > -	for (nr = 0; addr != end; nr++, addr += PAGE_SIZE)
> > -		pages[nr] = page++;
> > -
> > -	return nr;
> > -}
> > -
> >   /**
> >    * try_grab_folio_fast() - Attempt to get or pin a folio in fast path.
> >    * @page:  pointer to page to be grabbed
> > @@ -2967,8 +2954,8 @@ static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr,
> >   	if (pmd_special(orig))
> >   		return 0;
> > -	page = pmd_page(orig);
> > -	refs = record_subpages(page, PMD_SIZE, addr, end, pages + *nr);
> > +	refs = (end - addr) >> PAGE_SHIFT;
> > +	page = pmd_page(orig) + ((addr & ~PMD_MASK) >> PAGE_SHIFT);
> >   	folio = try_grab_folio_fast(page, refs, flags);
> >   	if (!folio)
> > @@ -2989,6 +2976,8 @@ static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr,
> >   	}
> >   	*nr += refs;
> > +	for (; refs; refs--)
> > +		*(pages++) = page++;
> >   	folio_set_referenced(folio);
> >   	return 1;
> >   }
> > @@ -3007,8 +2996,8 @@ static int gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr,
> >   	if (pud_special(orig))
> >   		return 0;
> > -	page = pud_page(orig);
> > -	refs = record_subpages(page, PUD_SIZE, addr, end, pages + *nr);
> > +	refs = (end - addr) >> PAGE_SHIFT;
> > +	page = pud_page(orig) + ((addr & ~PUD_MASK) >> PAGE_SHIFT);
> >   	folio = try_grab_folio_fast(page, refs, flags);
> >   	if (!folio)
> > @@ -3030,6 +3019,8 @@ static int gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr,
> >   	}
> >   	*nr += refs;
> > +	for (; refs; refs--)
> > +		*(pages++) = page++;
> >   	folio_set_referenced(folio);
> >   	return 1;
> >   }
>
> Okay, this code is nasty. We should rework this code to just return the nr and receive a the proper
> pages pointer, getting rid of the "*nr" parameter.
>
> For the time being, the following should do the trick:
>
> commit bfd07c995814354f6b66c5b6a72e96a7aa9fb73b (HEAD -> nth_page)
> Author: David Hildenbrand <david@redhat.com>
> Date:   Fri Sep 5 08:38:43 2025 +0200
>
>     fixup: mm/gup: remove record_subpages()
>     pages is not adjusted by the caller, but idnexed by existing *nr.
>     Signed-off-by: David Hildenbrand <david@redhat.com>
>
> diff --git a/mm/gup.c b/mm/gup.c
> index 010fe56f6e132..22420f2069ee1 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -2981,6 +2981,7 @@ static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr,
>                 return 0;
>         }
> +       pages += *nr;
>         *nr += refs;
>         for (; refs; refs--)
>                 *(pages++) = page++;
> @@ -3024,6 +3025,7 @@ static int gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr,
>                 return 0;
>         }
> +       pages += *nr;
>         *nr += refs;
>         for (; refs; refs--)
>                 *(pages++) = page++;

This looks correct.

But.

This is VERY nasty. Before we'd call record_subpages() with pages + *nr, where
it was clear we were offsetting by this, now we're making things imo way more
confusing.

This makes me less in love with this approach to be honest.

But perhaps it's the least worst thing for now until we can do a bigger
refactor...

So since this seems correct to me, and for the sake of moving things forward
(was this one patch dropped from mm-new or does mm-new just have an old version?
Confused):

Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>

For this patch obviously with the fix applied.

But can we PLEASE revisit this :)

>
>
> --
>
> Cheers
>
> David / dhildenb
>

Cheers, Lorenzo

  parent reply	other threads:[~2025-09-05 11:35 UTC|newest]

Thread overview: 71+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-01 15:03 [PATCH v2 00/37] mm: remove nth_page() David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 01/37] mm: stop making SPARSEMEM_VMEMMAP user-selectable David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 02/37] arm64: Kconfig: drop superfluous "select SPARSEMEM_VMEMMAP" David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 03/37] s390/Kconfig: " David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 04/37] x86/Kconfig: " David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 05/37] wireguard: selftests: remove CONFIG_SPARSEMEM_VMEMMAP=y from qemu kernel config David Hildenbrand
2025-09-08 16:48   ` Jason A. Donenfeld
2025-09-01 15:03 ` [PATCH v2 06/37] mm/page_alloc: reject unreasonable folio/compound page sizes in alloc_contig_range_noprof() David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 07/37] mm/memremap: reject unreasonable folio/compound page sizes in memremap_pages() David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 08/37] mm/hugetlb: check for unreasonable folio sizes when registering hstate David Hildenbrand
2025-10-09  7:14   ` (bisected) " Christophe Leroy
2025-10-09  7:22     ` David Hildenbrand
2025-10-09  7:44       ` Christophe Leroy
2025-10-09  8:04       ` Christophe Leroy
2025-10-09  8:14         ` David Hildenbrand
2025-10-09  9:16           ` Christophe Leroy
2025-10-09  9:20             ` David Hildenbrand
2025-10-09 10:01               ` Christophe Leroy
2025-10-09 10:27                 ` David Hildenbrand
2025-10-09 12:08                   ` Christophe Leroy
2025-10-09 13:05                     ` David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 09/37] mm/mm_init: make memmap_init_compound() look more like prep_compound_page() David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 10/37] mm: sanity-check maximum folio size in folio_set_order() David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 11/37] mm: limit folio/compound page sizes in problematic kernel configs David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 12/37] mm: simplify folio_page() and folio_page_idx() David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 13/37] mm/hugetlb: cleanup hugetlb_folio_init_tail_vmemmap() David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 14/37] mm/mm/percpu-km: drop nth_page() usage within single allocation David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 15/37] fs: hugetlbfs: remove nth_page() usage within folio in adjust_range_hwpoison() David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 16/37] fs: hugetlbfs: cleanup " David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 17/37] mm/pagewalk: drop nth_page() usage within folio in folio_walk_start() David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 18/37] mm/gup: drop nth_page() usage within folio when recording subpages David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 19/37] mm/gup: remove record_subpages() David Hildenbrand
2025-09-05  6:41   ` David Hildenbrand
2025-09-05 11:26     ` Jens Axboe
2025-09-05 11:34     ` Lorenzo Stoakes [this message]
2025-09-05 11:38       ` David Hildenbrand
2025-09-05 23:00     ` Eric Biggers
2025-09-06  6:57       ` David Hildenbrand
2025-09-09  4:25         ` Andrew Morton
2025-09-06  1:05   ` John Hubbard
2025-09-06  6:56     ` David Hildenbrand
2025-09-06  7:00       ` David Hildenbrand
2025-09-07  5:14       ` John Hubbard
2025-09-08  8:00         ` David Hildenbrand
2025-09-08 12:25       ` Lorenzo Stoakes
2025-09-08 12:53         ` David Hildenbrand
2025-09-08 17:12           ` John Hubbard
2025-09-08 15:16   ` Mark Brown
2025-09-08 15:22     ` David Hildenbrand
2025-09-08 15:28       ` Mark Brown
2025-09-01 15:03 ` [PATCH v2 20/37] io_uring/zcrx: remove nth_page() usage within folio David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 21/37] mips: mm: convert __flush_dcache_pages() to __flush_dcache_folio_pages() David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 22/37] mm/cma: refuse handing out non-contiguous page ranges David Hildenbrand
2025-09-09  9:55   ` David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 23/37] dma-remap: drop nth_page() in dma_common_contiguous_remap() David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 24/37] scatterlist: disallow non-contigous page ranges in a single SG entry David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 25/37] ata: libata-sff: drop nth_page() usage within " David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 26/37] drm/i915/gem: " David Hildenbrand
2025-09-02  9:22   ` Tvrtko Ursulin
2025-09-02  9:42     ` David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 27/37] mspro_block: " David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 28/37] memstick: " David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 29/37] mmc: " David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 30/37] scsi: scsi_lib: " David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 31/37] scsi: sg: " David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 32/37] vfio/pci: " David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 33/37] crypto: remove " David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 34/37] mm/gup: drop nth_page() usage in unpin_user_page_range_dirty_lock() David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 35/37] kfence: drop nth_page() usage David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 36/37] block: update comment of "struct bio_vec" regarding nth_page() David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 37/37] mm: remove nth_page() David Hildenbrand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b7544f6d-beac-46af-aa43-27da6d96467e@lucifer.local \
    --to=lorenzo.stoakes@oracle.com \
    --cc=Liam.Howlett@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=axboe@kernel.dk \
    --cc=cl@gentwo.org \
    --cc=david@redhat.com \
    --cc=dennis@kernel.org \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=dvyukov@google.com \
    --cc=elver@google.com \
    --cc=glider@google.com \
    --cc=hannes@cmpxchg.org \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=io-uring@vger.kernel.org \
    --cc=iommu@lists.linux.dev \
    --cc=jackmanb@google.com \
    --cc=jgg@nvidia.com \
    --cc=jhubbard@nvidia.com \
    --cc=kasan-dev@googlegroups.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-arm-kernel@axis.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-crypto@vger.kernel.org \
    --cc=linux-ide@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=linux-mips@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-mmc@vger.kernel.org \
    --cc=linux-riscv@lists.infradead.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=m.szyprowski@samsung.com \
    --cc=mhocko@suse.com \
    --cc=muchun.song@linux.dev \
    --cc=netdev@vger.kernel.org \
    --cc=osalvador@suse.de \
    --cc=peterx@redhat.com \
    --cc=robin.murphy@arm.com \
    --cc=rppt@kernel.org \
    --cc=surenb@google.com \
    --cc=tj@kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=vbabka@suse.cz \
    --cc=virtualization@lists.linux.dev \
    --cc=wireguard@lists.zx2c4.com \
    --cc=x86@kernel.org \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox