From: Sidhartha Kumar <[email protected]>
To: "Matthew Wilcox (Oracle)" <[email protected]>,
Andrew Morton <[email protected]>
Cc: Jens Axboe <[email protected]>,
[email protected], [email protected],
Yanteng Si <[email protected]>
Subject: Re: [PATCH v2 03/13] mm: Convert free_huge_page() to free_huge_folio()
Date: Wed, 16 Aug 2023 13:14:18 -0700
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>
On 8/16/23 8:11 AM, Matthew Wilcox (Oracle) wrote:
> Pass a folio instead of the head page to save a few instructions.
> Update the documentation, at least in English.
>
> Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
> Cc: Yanteng Si <[email protected]>
> ---
> Documentation/mm/hugetlbfs_reserv.rst | 14 +++---
> .../zh_CN/mm/hugetlbfs_reserv.rst | 4 +-
> include/linux/hugetlb.h | 2 +-
> mm/hugetlb.c | 48 +++++++++----------
> mm/page_alloc.c | 2 +-
> 5 files changed, 34 insertions(+), 36 deletions(-)
>
Reviewed-by: Sidhartha Kumar <[email protected]>
> diff --git a/Documentation/mm/hugetlbfs_reserv.rst b/Documentation/mm/hugetlbfs_reserv.rst
> index d9c2b0f01dcd..4914fbf07966 100644
> --- a/Documentation/mm/hugetlbfs_reserv.rst
> +++ b/Documentation/mm/hugetlbfs_reserv.rst
> @@ -271,12 +271,12 @@ to the global reservation count (resv_huge_pages).
> Freeing Huge Pages
> ==================
>
> -Huge page freeing is performed by the routine free_huge_page(). This routine
> -is the destructor for hugetlbfs compound pages. As a result, it is only
> -passed a pointer to the page struct. When a huge page is freed, reservation
> -accounting may need to be performed. This would be the case if the page was
> -associated with a subpool that contained reserves, or the page is being freed
> -on an error path where a global reserve count must be restored.
> +Huge pages are freed by free_huge_folio(). It is only passed a pointer
> +to the folio as it is called from the generic MM code. When a huge page
> +is freed, reservation accounting may need to be performed. This would
> +be the case if the page was associated with a subpool that contained
> +reserves, or the page is being freed on an error path where a global
> +reserve count must be restored.
>
> The page->private field points to any subpool associated with the page.
> If the PagePrivate flag is set, it indicates the global reserve count should
> @@ -525,7 +525,7 @@ However, there are several instances where errors are encountered after a huge
> page is allocated but before it is instantiated. In this case, the page
> allocation has consumed the reservation and made the appropriate subpool,
> reservation map and global count adjustments. If the page is freed at this
> -time (before instantiation and clearing of PagePrivate), then free_huge_page
> +time (before instantiation and clearing of PagePrivate), then free_huge_folio
> will increment the global reservation count. However, the reservation map
> indicates the reservation was consumed. This resulting inconsistent state
> will cause the 'leak' of a reserved huge page. The global reserve count will
> diff --git a/Documentation/translations/zh_CN/mm/hugetlbfs_reserv.rst b/Documentation/translations/zh_CN/mm/hugetlbfs_reserv.rst
> index b7a0544224ad..0f7e7fb5ca8c 100644
> --- a/Documentation/translations/zh_CN/mm/hugetlbfs_reserv.rst
> +++ b/Documentation/translations/zh_CN/mm/hugetlbfs_reserv.rst
> @@ -219,7 +219,7 @@ vma_commit_reservation()之间,预留映射有可能被改变。如果hugetlb_
> 释放巨页
> ========
>
> -巨页释放是由函数free_huge_page()执行的。这个函数是hugetlbfs复合页的析构器。因此,它只传
> +巨页释放是由函数free_huge_folio()执行的。这个函数是hugetlbfs复合页的析构器。因此,它只传
> 递一个指向页面结构体的指针。当一个巨页被释放时,可能需要进行预留计算。如果该页与包含保
> 留的子池相关联,或者该页在错误路径上被释放,必须恢复全局预留计数,就会出现这种情况。
>
> @@ -387,7 +387,7 @@ region_count()在解除私有巨页映射时被调用。在私有映射中,预
>
> 然而,有几种情况是,在一个巨页被分配后,但在它被实例化之前,就遇到了错误。在这种情况下,
> 页面分配已经消耗了预留,并进行了适当的子池、预留映射和全局计数调整。如果页面在这个时候被释放
> -(在实例化和清除PagePrivate之前),那么free_huge_page将增加全局预留计数。然而,预留映射
> +(在实例化和清除PagePrivate之前),那么free_huge_folio将增加全局预留计数。然而,预留映射
> 显示报留被消耗了。这种不一致的状态将导致预留的巨页的 “泄漏” 。全局预留计数将比它原本的要高,
> 并阻止分配一个预先分配的页面。
>
> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> index 5a1dfaffbd80..5b2626063f4f 100644
> --- a/include/linux/hugetlb.h
> +++ b/include/linux/hugetlb.h
> @@ -26,7 +26,7 @@ typedef struct { unsigned long pd; } hugepd_t;
> #define __hugepd(x) ((hugepd_t) { (x) })
> #endif
>
> -void free_huge_page(struct page *page);
> +void free_huge_folio(struct folio *folio);
>
> #ifdef CONFIG_HUGETLB_PAGE
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index e327a5a7602c..086eb51bf845 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1706,10 +1706,10 @@ static void add_hugetlb_folio(struct hstate *h, struct folio *folio,
> zeroed = folio_put_testzero(folio);
> if (unlikely(!zeroed))
> /*
> - * It is VERY unlikely soneone else has taken a ref on
> - * the page. In this case, we simply return as the
> - * hugetlb destructor (free_huge_page) will be called
> - * when this other ref is dropped.
> + * It is VERY unlikely someone else has taken a ref
> + * on the folio. In this case, we simply return as
> + * free_huge_folio() will be called when this other ref
> + * is dropped.
> */
> return;
>
> @@ -1875,13 +1875,12 @@ struct hstate *size_to_hstate(unsigned long size)
> return NULL;
> }
>
> -void free_huge_page(struct page *page)
> +void free_huge_folio(struct folio *folio)
> {
> /*
> * Can't pass hstate in here because it is called from the
> * compound page destructor.
> */
> - struct folio *folio = page_folio(page);
> struct hstate *h = folio_hstate(folio);
> int nid = folio_nid(folio);
> struct hugepage_subpool *spool = hugetlb_folio_subpool(folio);
> @@ -1936,7 +1935,7 @@ void free_huge_page(struct page *page)
> spin_unlock_irqrestore(&hugetlb_lock, flags);
> update_and_free_hugetlb_folio(h, folio, true);
> } else {
> - arch_clear_hugepage_flags(page);
> + arch_clear_hugepage_flags(&folio->page);
> enqueue_hugetlb_folio(h, folio);
> spin_unlock_irqrestore(&hugetlb_lock, flags);
> }
> @@ -2246,7 +2245,7 @@ static int alloc_pool_huge_page(struct hstate *h, nodemask_t *nodes_allowed,
> folio = alloc_fresh_hugetlb_folio(h, gfp_mask, node,
> nodes_allowed, node_alloc_noretry);
> if (folio) {
> - free_huge_page(&folio->page); /* free it into the hugepage allocator */
> + free_huge_folio(folio); /* free it into the hugepage allocator */
> return 1;
> }
> }
> @@ -2429,13 +2428,13 @@ static struct folio *alloc_surplus_hugetlb_folio(struct hstate *h,
> * We could have raced with the pool size change.
> * Double check that and simply deallocate the new page
> * if we would end up overcommiting the surpluses. Abuse
> - * temporary page to workaround the nasty free_huge_page
> + * temporary page to workaround the nasty free_huge_folio
> * codeflow
> */
> if (h->surplus_huge_pages >= h->nr_overcommit_huge_pages) {
> folio_set_hugetlb_temporary(folio);
> spin_unlock_irq(&hugetlb_lock);
> - free_huge_page(&folio->page);
> + free_huge_folio(folio);
> return NULL;
> }
>
> @@ -2547,8 +2546,7 @@ static int gather_surplus_pages(struct hstate *h, long delta)
> __must_hold(&hugetlb_lock)
> {
> LIST_HEAD(surplus_list);
> - struct folio *folio;
> - struct page *page, *tmp;
> + struct folio *folio, *tmp;
> int ret;
> long i;
> long needed, allocated;
> @@ -2608,21 +2606,21 @@ static int gather_surplus_pages(struct hstate *h, long delta)
> ret = 0;
>
> /* Free the needed pages to the hugetlb pool */
> - list_for_each_entry_safe(page, tmp, &surplus_list, lru) {
> + list_for_each_entry_safe(folio, tmp, &surplus_list, lru) {
> if ((--needed) < 0)
> break;
> /* Add the page to the hugetlb allocator */
> - enqueue_hugetlb_folio(h, page_folio(page));
> + enqueue_hugetlb_folio(h, folio);
> }
> free:
> spin_unlock_irq(&hugetlb_lock);
>
> /*
> * Free unnecessary surplus pages to the buddy allocator.
> - * Pages have no ref count, call free_huge_page directly.
> + * Pages have no ref count, call free_huge_folio directly.
> */
> - list_for_each_entry_safe(page, tmp, &surplus_list, lru)
> - free_huge_page(page);
> + list_for_each_entry_safe(folio, tmp, &surplus_list, lru)
> + free_huge_folio(folio);
> spin_lock_irq(&hugetlb_lock);
>
> return ret;
> @@ -2836,11 +2834,11 @@ static long vma_del_reservation(struct hstate *h,
> * 2) No reservation was in place for the page, so hugetlb_restore_reserve is
> * not set. However, alloc_hugetlb_folio always updates the reserve map.
> *
> - * In case 1, free_huge_page later in the error path will increment the
> - * global reserve count. But, free_huge_page does not have enough context
> + * In case 1, free_huge_folio later in the error path will increment the
> + * global reserve count. But, free_huge_folio does not have enough context
> * to adjust the reservation map. This case deals primarily with private
> * mappings. Adjust the reserve map here to be consistent with global
> - * reserve count adjustments to be made by free_huge_page. Make sure the
> + * reserve count adjustments to be made by free_huge_folio. Make sure the
> * reserve map indicates there is a reservation present.
> *
> * In case 2, simply undo reserve map modifications done by alloc_hugetlb_folio.
> @@ -2856,7 +2854,7 @@ void restore_reserve_on_error(struct hstate *h, struct vm_area_struct *vma,
> * Rare out of memory condition in reserve map
> * manipulation. Clear hugetlb_restore_reserve so
> * that global reserve count will not be incremented
> - * by free_huge_page. This will make it appear
> + * by free_huge_folio. This will make it appear
> * as though the reservation for this folio was
> * consumed. This may prevent the task from
> * faulting in the folio at a later time. This
> @@ -3232,7 +3230,7 @@ static void __init gather_bootmem_prealloc(void)
> if (prep_compound_gigantic_folio(folio, huge_page_order(h))) {
> WARN_ON(folio_test_reserved(folio));
> prep_new_hugetlb_folio(h, folio, folio_nid(folio));
> - free_huge_page(page); /* add to the hugepage allocator */
> + free_huge_folio(folio); /* add to the hugepage allocator */
> } else {
> /* VERY unlikely inflated ref count on a tail page */
> free_gigantic_folio(folio, huge_page_order(h));
> @@ -3264,7 +3262,7 @@ static void __init hugetlb_hstate_alloc_pages_onenode(struct hstate *h, int nid)
> &node_states[N_MEMORY], NULL);
> if (!folio)
> break;
> - free_huge_page(&folio->page); /* free it into the hugepage allocator */
> + free_huge_folio(folio); /* free it into the hugepage allocator */
> }
> cond_resched();
> }
> @@ -3542,7 +3540,7 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
> while (count > persistent_huge_pages(h)) {
> /*
> * If this allocation races such that we no longer need the
> - * page, free_huge_page will handle it by freeing the page
> + * page, free_huge_folio will handle it by freeing the page
> * and reducing the surplus.
> */
> spin_unlock_irq(&hugetlb_lock);
> @@ -3658,7 +3656,7 @@ static int demote_free_hugetlb_folio(struct hstate *h, struct folio *folio)
> prep_compound_page(subpage, target_hstate->order);
> folio_change_private(inner_folio, NULL);
> prep_new_hugetlb_folio(target_hstate, inner_folio, nid);
> - free_huge_page(subpage);
> + free_huge_folio(inner_folio);
> }
> mutex_unlock(&target_hstate->resize_lock);
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 548c8016190b..b569fd5562aa 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -620,7 +620,7 @@ void destroy_large_folio(struct folio *folio)
> enum compound_dtor_id dtor = folio->_folio_dtor;
>
> if (folio_test_hugetlb(folio)) {
> - free_huge_page(&folio->page);
> + free_huge_folio(folio);
> return;
> }
>