From: Pavel Begunkov <[email protected]>
To: Chenliang Li <[email protected]>, [email protected]
Cc: [email protected], [email protected],
	[email protected], [email protected],
	[email protected], [email protected]
Subject: Re: [PATCH v5 1/3] io_uring/rsrc: add hugepage fixed buffer coalesce helpers
Date: Tue, 9 Jul 2024 14:09:12 +0100
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>

On 6/28/24 09:44, Chenliang Li wrote:
> Introduce helper functions to check and coalesce hugepage-backed fixed
> buffers. The coalescing optimizes both time and space consumption caused
> by mapping and storing multi-hugepage fixed buffers. Currently we only
> have single-hugepage buffer coalescing, so add support for multi-hugepage
> fixed buffer coalescing.
> 
> A coalescable multi-hugepage buffer should fully cover its folios
> (except potentially the first and last one), and these folios should
> have the same size. These requirements are for easier processing later,
> also we need same size'd chunks in io_import_fixed for fast iov_iter
> adjust.
> 
> Signed-off-by: Chenliang Li <[email protected]>
> ---
>   io_uring/rsrc.c | 87 +++++++++++++++++++++++++++++++++++++++++++++++++
>   io_uring/rsrc.h |  9 +++++
>   2 files changed, 96 insertions(+)
> 
> diff --git a/io_uring/rsrc.c b/io_uring/rsrc.c
> index 60c00144471a..c88ce8c38515 100644
> --- a/io_uring/rsrc.c
> +++ b/io_uring/rsrc.c
> @@ -849,6 +849,93 @@ static int io_buffer_account_pin(struct io_ring_ctx *ctx, struct page **pages,
>   	return ret;
>   }
>   
> +static bool io_do_coalesce_buffer(struct page ***pages, int *nr_pages,
> +				struct io_imu_folio_data *data, int nr_folios)
> +{
> +	struct page **page_array = *pages, **new_array = NULL;
> +	int nr_pages_left = *nr_pages, i, j;
> +
> +	/* Store head pages only*/
> +	new_array = kvmalloc_array(nr_folios, sizeof(struct page *),
> +					GFP_KERNEL);
> +	if (!new_array)
> +		return false;
> +
> +	new_array[0] = page_array[0];
> +	/*
> +	 * The pages are bound to the folio, it doesn't
> +	 * actually unpin them but drops all but one reference,
> +	 * which is usually put down by io_buffer_unmap().
> +	 * Note, needs a better helper.
> +	 */
> +	if (data->nr_pages_head > 1)
> +		unpin_user_pages(&page_array[1], data->nr_pages_head - 1);
> +
> +	j = data->nr_pages_head;
> +	nr_pages_left -= data->nr_pages_head;
> +	for (i = 1; i < nr_folios; i++) {
> +		unsigned int nr_unpin;
> +
> +		new_array[i] = page_array[j];
> +		nr_unpin = min_t(unsigned int, nr_pages_left - 1,
> +					data->nr_pages_mid - 1);
> +		if (nr_unpin)
> +			unpin_user_pages(&page_array[j+1], nr_unpin);
> +		j += data->nr_pages_mid;
> +		nr_pages_left -= data->nr_pages_mid;
> +	}
> +	kvfree(page_array);
> +	*pages = new_array;
> +	*nr_pages = nr_folios;
> +	return true;
> +}
> +
> +static bool io_try_coalesce_buffer(struct page ***pages, int *nr_pages,
> +					 struct io_imu_folio_data *data)

I believe an unused static function will trigger a compiler warning.
We don't want that, especially since builds that treat warnings as
errors are a thing.

You can either reshuffle the patches so the helper is added together
with its first caller, or at least add a __maybe_unused attribute.
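
A minimal userspace sketch of the second option, assuming GCC or Clang.
The helper name is hypothetical, and the macro is redefined here only so
the snippet builds outside the kernel tree, where __maybe_unused expands
to the same compiler attribute:

```c
#include <assert.h>

/*
 * Userspace stand-in for the kernel's __maybe_unused. Assumption: the
 * compiler is GCC or Clang, which both accept this attribute.
 */
#ifndef __maybe_unused
#define __maybe_unused __attribute__((__unused__))
#endif

/*
 * Hypothetical static helper that has no caller in this file. Without
 * the attribute, -Wunused-function (enabled by -Wall) would flag it,
 * and -Werror would turn that warning into a build failure.
 */
static __maybe_unused int helper_without_caller(int nr_pages)
{
	assert(nr_pages >= 0);
	/* Mirrors the "more than one page" precondition, nothing more. */
	return nr_pages > 1;
}
```

The attribute only silences the warning. Once a later patch adds a
caller, it can be dropped again.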


> +{
> +	struct page **page_array = *pages;
> +	struct folio *folio = page_folio(page_array[0]);
> +	unsigned int count = 1, nr_folios = 1;
> +	int i;
> +
> +	if (*nr_pages <= 1)
> +		return false;
> +
> +	data->nr_pages_mid = folio_nr_pages(folio);
> +	if (data->nr_pages_mid == 1)
> +		return false;
> +
> +	data->folio_shift = folio_shift(folio);
> +	data->folio_size = folio_size(folio);
> +	/*
> +	 * Check if pages are contiguous inside a folio, and all folios have
> +	 * the same page count except for the head and tail.
> +	 */
> +	for (i = 1; i < *nr_pages; i++) {
> +		if (page_folio(page_array[i]) == folio &&
> +			page_array[i] == page_array[i-1] + 1) {
> +			count++;
> +			continue;
> +		}

It seems the first and last folios are not required to be border
aligned here. The first chunk should end at a folio_size boundary,
and the last one should start at the beginning of its folio.

It's not really a bug, but we might run into problems optimising
calculations down the road if we don't restrict it.
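
The restriction can be sketched in userspace with a toy model. This is
not the kernel code: pages are modelled as page frame numbers, a folio
is assumed to be a naturally aligned run of folio_pages frames, and the
function name is hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy model (assumption of this sketch): each page is a page frame
 * number (pfn), and pfn / folio_pages identifies its folio.
 * Returns true if every folio transition happens exactly at a folio
 * border, so only the start of the first chunk and the end of the
 * last chunk may be unaligned.
 */
static bool buffer_is_border_aligned(const unsigned long *pfns, size_t nr,
				     unsigned long folio_pages)
{
	size_t i;

	assert(folio_pages > 0);
	if (nr == 0)
		return false;

	for (i = 1; i < nr; i++) {
		bool same_folio =
			pfns[i] / folio_pages == pfns[i - 1] / folio_pages;

		if (same_folio) {
			/* Pages inside one folio must be contiguous. */
			if (pfns[i] != pfns[i - 1] + 1)
				return false;
			continue;
		}
		/* The chunk being left must end on its folio's last page. */
		if (pfns[i - 1] % folio_pages != folio_pages - 1)
			return false;
		/* The chunk being entered must start on its folio's first page. */
		if (pfns[i] % folio_pages != 0)
			return false;
	}
	return true;
}
```

With 4-page folios, frames 6..13 pass, while {5, 6, 8, 9} is rejected
because the first chunk stops short of the end of its folio. The loop
in the patch would count that second layout as a head chunk of two
pages and accept it.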

> +
> +		if (nr_folios == 1)
> +			data->nr_pages_head = count;
> +		else if (count != data->nr_pages_mid)
> +			return false;
> +
> +		folio = page_folio(page_array[i]);
> +		if (folio_size(folio) != data->folio_size)
> +			return false;
> +
> +		count = 1;
> +		nr_folios++;
> +	}
> +	if (nr_folios == 1)
> +		data->nr_pages_head = count;
> +
> +	return io_do_coalesce_buffer(pages, nr_pages, data, nr_folios);
> +}
> +
>   static int io_sqe_buffer_register(struct io_ring_ctx *ctx, struct iovec *iov,
>   				  struct io_mapped_ubuf **pimu,
>   				  struct page **last_hpage)
> diff --git a/io_uring/rsrc.h b/io_uring/rsrc.h
> index c032ca3436ca..cc66323535f6 100644
> --- a/io_uring/rsrc.h
> +++ b/io_uring/rsrc.h
> @@ -50,6 +50,15 @@ struct io_mapped_ubuf {
>   	struct bio_vec	bvec[] __counted_by(nr_bvecs);
>   };
>   
> +struct io_imu_folio_data {
> +	/* Head folio can be partially included in the fixed buf */
> +	unsigned int	nr_pages_head;
> +	/* For non-head/tail folios, has to be fully included */
> +	unsigned int	nr_pages_mid;
> +	unsigned int	folio_shift;
> +	size_t		folio_size;
> +};
> +
>   void io_rsrc_node_ref_zero(struct io_rsrc_node *node);
>   void io_rsrc_node_destroy(struct io_ring_ctx *ctx, struct io_rsrc_node *ref_node);
>   struct io_rsrc_node *io_rsrc_node_alloc(struct io_ring_ctx *ctx);

-- 
Pavel Begunkov
