From: Chenliang Li <[email protected]>
To: [email protected]
Cc: [email protected], [email protected], [email protected],
	[email protected], [email protected],
	[email protected], [email protected],
	[email protected]
Subject: Re: [PATCH v4 3/4] io_uring/rsrc: add init and account functions for coalesced imus
Date: Tue, 18 Jun 2024 11:11:10 +0800
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>

On Sun, 16 Jun 2024 18:43:13 +0100, Pavel Begunkov wrote:
> On 6/17/24 04:12, Chenliang Li wrote:
>> On Sun, 16 Jun 2024 19:04:38 +0100, Pavel Begunkov wrote:
>>> On 5/14/24 08:54, Chenliang Li wrote:
>>>> Introduce helper functions to check whether a buffer can
>>>> be coalesced or not, and gather folio data for later use.
>>>>
>>>> The coalescing optimizes time and space consumption caused
>>>> by mapping and storing multi-hugepage fixed buffers.
>>>>
>>>> A coalescable multi-hugepage buffer should fully cover its folios
>>>> (except potentially the first and last one), and these folios should
>>>> have the same size. These requirements simplify later processing;
>>>> we also need same-sized chunks in io_import_fixed for fast iov_iter
>>>> adjustment.
>>>>
>>>> Signed-off-by: Chenliang Li <[email protected]>
>>>> ---
>>>>    io_uring/rsrc.c | 78 +++++++++++++++++++++++++++++++++++++++++++++++++
>>>>    io_uring/rsrc.h | 10 +++++++
>>>>    2 files changed, 88 insertions(+)
>>>>
>>>> diff --git a/io_uring/rsrc.c b/io_uring/rsrc.c
>>>> index 65417c9553b1..d08224c0c5b0 100644
>>>> --- a/io_uring/rsrc.c
>>>> +++ b/io_uring/rsrc.c
>>>> @@ -871,6 +871,84 @@ static int io_buffer_account_pin(struct io_ring_ctx *ctx, struct page **pages,
>>>>    	return ret;
>>>>    }
>>>>    
>>>> +static bool __io_sqe_buffer_try_coalesce(struct page **pages, int nr_pages,
>>>> +					 struct io_imu_folio_data *data)
>>> io_can_coalesce_buffer(), you're not actually trying to
>>> do it here.
>> 
>> Will change it.
>> 
>>>> +static bool io_sqe_buffer_try_coalesce(struct page **pages, int nr_pages,
>>>> +				       struct io_imu_folio_data *data)
>>>> +{
>>>> +	int i, j;
>>>> +
>>>> +	if (nr_pages <= 1 ||
>>>> +		!__io_sqe_buffer_try_coalesce(pages, nr_pages, data))
>>>> +		return false;
>>>> +
>>>> +	/*
>>>> +	 * The pages are bound to the folio, it doesn't
>>>> +	 * actually unpin them but drops all but one reference,
>>>> +	 * which is usually put down by io_buffer_unmap().
>>>> +	 * Note, needs a better helper.
>>>> +	 */
>>>> +	if (data->nr_pages_head > 1)
>>>> +		unpin_user_pages(&pages[1], data->nr_pages_head - 1);
>>> Should be pages[0]. pages[1] can be in another folio, and even
>>> though data->nr_pages_head > 1 protects against touching it,
>>> it's still flimsy.
>> 
>> But here it is unpinning the tail pages inside those coalescable folios.
>> I think we only unpin pages[0] on failure, am I right? And in
>> __io_sqe_buffer_try_coalesce we have ensured that pages[1:nr_pages_head] are
>> in the same folio and contiguous.
>
> We want the entire folio to still be pinned, but want to
> leave just one reference, so that down the line we don't have to
> care how many refcounts / etc. have to be put down.
> 
> void unpin_user_page(struct page *page)
> {
> 	sanity_check_pinned_pages(&page, 1);
> 	gup_put_folio(page_folio(page), 1, FOLL_PIN);
> }
>
> All of that operates on the folio as a single object, so it doesn't
> really matter which page you pass. Anyway, let's leave it
> as is then. I wish there were an unpin_folio_nr(), but there
> is unpin_user_page_range_dirty_lock(), which resembles it.

I see. Thanks for the explanation.
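
A toy user-space model of that point, not kernel code: every toy_* name
below is hypothetical and only mimics the idea that a pin is accounted on
the folio, so dropping three pins through pages[0..2] or pages[1..3] of
the same 4-page folio leaves the same single pin behind.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: the real accounting lives in mm/gup.c. */
struct toy_folio {
	int pincount;			/* one counter shared by all pages of the folio */
};

struct toy_page {
	struct toy_folio *folio;	/* owning folio, standing in for page_folio() */
};

/* Pin each page once; every pin lands on the owning folio's counter. */
static void toy_pin_pages(struct toy_page *pages, size_t nr)
{
	for (size_t i = 0; i < nr; i++)
		pages[i].folio->pincount++;
}

/* Drop one pin through any page of the folio. */
static void toy_unpin_page(struct toy_page *page)
{
	assert(page->folio->pincount > 0);
	page->folio->pincount--;
}

/*
 * Pin a 4-page folio, drop three pins through pages[start..start+2]
 * (start must be 0 or 1), and return the pins left on the folio.
 */
static int toy_remaining_pins(size_t start)
{
	struct toy_folio folio = { .pincount = 0 };
	struct toy_page pages[4];

	for (size_t i = 0; i < 4; i++)
		pages[i].folio = &folio;

	toy_pin_pages(pages, 4);
	for (size_t i = 0; i < 3; i++)
		toy_unpin_page(&pages[start + i]);

	return folio.pincount;
}
```

In this model the starting index does not change the result, which is the
only thing the sketch is meant to show.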

>>>> +
>>>> +	j = data->nr_pages_head;
>>>> +	nr_pages -= data->nr_pages_head;
>>>> +	for (i = 1; i < data->nr_folios; i++) {
>>>> +		unsigned int nr_unpin;
>>>> +
>>>> +		nr_unpin = min_t(unsigned int, nr_pages - 1,
>>>> +					data->nr_pages_mid - 1);
>>>> +		if (nr_unpin == 0)
>>>> +			break;
>>>> +		unpin_user_pages(&pages[j+1], nr_unpin);
>>> same
>>>> +		j += data->nr_pages_mid;
>>> And instead of duplicating this voodoo iteration later,
>>> please just assemble a new compacted ->nr_folios sized
>>> page array.
>> 
>> Indeed, a new page array would make things a lot easier.
>> If alloc overhead is not a concern here, then yeah I'll change it.
>
> It's not, and the upside is reducing memory footprint,
> which would be noticeable with huge pages. It's also
> kvmalloc'ed, so compacting also improves the TLB situation.

OK, I'll assemble a new compacted, ->nr_folios sized page array.
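
A rough user-space sketch of what such a compacted array could look like.
This is not the eventual patch: toy_folio_data and toy_compact_pages() are
hypothetical stand-ins, pages are modelled as plain void pointers, and the
layout assumed is the one from the loop above (the first folio contributes
nr_pages_head pages, each later folio starts nr_pages_mid pages after the
previous one).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical mirror of the fields used in the quoted loop. */
struct toy_folio_data {
	unsigned int nr_pages_head;	/* pages covered in the first folio */
	unsigned int nr_pages_mid;	/* pages per folio after the first */
	unsigned int nr_folios;		/* folios spanned by the buffer */
};

/*
 * Build an array with one entry per folio, holding the first page of
 * the buffer that falls into that folio. Returns a malloc'ed array of
 * data->nr_folios entries, or NULL on allocation failure or when the
 * folio data does not fit the page array.
 */
static void **toy_compact_pages(void **pages, unsigned int nr_pages,
				const struct toy_folio_data *data)
{
	void **out;
	unsigned int i, j = 0;

	assert(pages != NULL && data != NULL);

	if (!data->nr_folios || !data->nr_pages_head ||
	    data->nr_pages_head > nr_pages)
		return NULL;
	if (data->nr_folios > 1 && !data->nr_pages_mid)
		return NULL;

	out = malloc(data->nr_folios * sizeof(*out));
	if (!out)
		return NULL;

	for (i = 0; i < data->nr_folios; i++) {
		if (j >= nr_pages) {
			free(out);
			return NULL;
		}
		out[i] = pages[j];
		/* Step over the rest of this folio's pages. */
		j += i ? data->nr_pages_mid : data->nr_pages_head;
	}
	return out;
}
```

With the per-folio stepping done once here, later code can walk the
compacted array with a plain index instead of repeating the head/mid
arithmetic.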

Thanks,
Chenliang Li
