public inbox for [email protected]
 help / color / mirror / Atom feed
From: Pavel Begunkov <[email protected]>
To: Keith Busch <[email protected]>, Kanchan Joshi <[email protected]>
Cc: [email protected], [email protected], [email protected],
	[email protected], [email protected], [email protected],
	[email protected], [email protected],
	[email protected], [email protected],
	[email protected], [email protected],
	[email protected], [email protected],
	Anuj Gupta <[email protected]>
Subject: Re: [PATCH v6 06/10] io_uring/rw: add support to send metadata along with read/write
Date: Thu, 31 Oct 2024 14:39:09 +0000	[thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>

On 10/30/24 21:09, Keith Busch wrote:
> On Wed, Oct 30, 2024 at 11:31:08PM +0530, Kanchan Joshi wrote:
>> diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
>> index 024745283783..48dcca125db3 100644
>> --- a/include/uapi/linux/io_uring.h
>> +++ b/include/uapi/linux/io_uring.h
>> @@ -105,6 +105,22 @@ struct io_uring_sqe {
>>   		 */
>>   		__u8	cmd[0];
>>   	};
>> +	/*
>> +	 * If the ring is initialized with IORING_SETUP_SQE128, then
>> +	 * this field is starting offset for 64 bytes of data. For meta io
>> +	 * this contains 'struct io_uring_meta_pi'
>> +	 */
>> +	__u8	big_sqe[0];
>> +};

I don't think zero sized arrays are good as a uapi regardless of
cmd[0] above, let's just do

sqe = get_sqe();
big_sqe = (void *)(sqe + 1)

with an appropriate helper.

>> +
>> +/* this is placed in SQE128 */
>> +struct io_uring_meta_pi {
>> +	__u16		pi_flags;
>> +	__u16		app_tag;
>> +	__u32		len;
>> +	__u64		addr;
>> +	__u64		seed;
>> +	__u64		rsvd[2];
>>   };
> 
> On the previous version, I was more questioning if it aligns with what

I missed that discussion, let me know if I need to look it up

> Pavel was trying to do here. I didn't quite get it, so I was more
> confused than saying it should be this way now.

The point is, SQEs don't have nearly enough space to accommodate all
such optional features, especially when it's taking so much space and
not applicable to all reads but rather some specific  use cases and
files. Consider that there might be more similar extensions and we might
even want to use them together.

1. SQE128 makes it big for all requests, intermixing with requests that
don't need additional space wastes space. SQE128 is fine to use but at
the same time we should be mindful about it and try to avoid enabling it
if feasible.

2. This API hard codes io_uring_meta_pi into the extended part of the
SQE. If we want to add another feature it'd need to go after the meta
struct. SQE256? And what if the user doesn't need PI but only the second
feature?

In short, the uAPI need to have a clear vision of how it can be used
with / extended to multiple optional features and not just PI.

One option I mentioned before is passing a user pointer to an array of
structures, each would will have the type specifying what kind of
feature / meta information it is, e.g. META_TYPE_PI. It's not a
complete solution but a base idea to extend upon. I separately
mentioned before, if copy_from_user is expensive we can optimise it
with pre-registering memory. I think Jens even tried something similar
with structures we pass as waiting parameters.

I didn't read through all iterations of the series, so if there is
some other approach described that ticks the boxes and flexible
enough, I'd be absolutely fine with it.


-- 
Pavel Begunkov

  reply	other threads:[~2024-10-31 14:38 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <CGME20241030180957epcas5p3312b0a582e8562f8c2169e64d41592b2@epcas5p3.samsung.com>
2024-10-30 18:01 ` [PATCH v6 00/10] Read/Write with metadata/integrity Kanchan Joshi
     [not found]   ` <CGME20241030181000epcas5p2bfb47a79f1e796116135f646c6f0ccc7@epcas5p2.samsung.com>
2024-10-30 18:01     ` [PATCH v6 01/10] block: define set of integrity flags to be inherited by cloned bip Kanchan Joshi
     [not found]   ` <CGME20241030181002epcas5p2b44e244bcd0c49d0a379f0f4fe07dc3f@epcas5p2.samsung.com>
2024-10-30 18:01     ` [PATCH v6 02/10] block: copy back bounce buffer to user-space correctly in case of split Kanchan Joshi
     [not found]   ` <CGME20241030181005epcas5p43b40adb5af1029c9ffaecde317bf1c5d@epcas5p4.samsung.com>
2024-10-30 18:01     ` [PATCH v6 03/10] block: modify bio_integrity_map_user to accept iov_iter as argument Kanchan Joshi
2024-10-31  4:33       ` kernel test robot
     [not found]   ` <CGME20241030181008epcas5p333603fdbf3afb60947d3fc51138d11bf@epcas5p3.samsung.com>
2024-10-30 18:01     ` [PATCH v6 04/10] fs, iov_iter: define meta io descriptor Kanchan Joshi
2024-10-31  6:55       ` Christoph Hellwig
     [not found]   ` <CGME20241030181010epcas5p2c399ecea97ed6d0e5fb228b5d15c2089@epcas5p2.samsung.com>
2024-10-30 18:01     ` [PATCH v6 05/10] fs: introduce IOCB_HAS_METADATA for metadata Kanchan Joshi
     [not found]   ` <CGME20241030181013epcas5p2762403c83e29c81ec34b2a7755154245@epcas5p2.samsung.com>
2024-10-30 18:01     ` [PATCH v6 06/10] io_uring/rw: add support to send metadata along with read/write Kanchan Joshi
2024-10-30 21:09       ` Keith Busch
2024-10-31 14:39         ` Pavel Begunkov [this message]
2024-11-01 17:54           ` Kanchan Joshi
2024-11-07 17:23             ` Pavel Begunkov
2024-11-10 17:41               ` Kanchan Joshi
2024-11-12  0:54                 ` Pavel Begunkov
2024-11-10 18:36               ` Kanchan Joshi
2024-11-12  1:32                 ` Pavel Begunkov
2024-10-31  6:55       ` Christoph Hellwig
     [not found]   ` <CGME20241030181016epcas5p3da284aa997e81d9855207584ab4bace3@epcas5p3.samsung.com>
2024-10-30 18:01     ` [PATCH v6 07/10] block: introduce BIP_CHECK_GUARD/REFTAG/APPTAG bip_flags Kanchan Joshi
     [not found]   ` <CGME20241030181019epcas5p135961d721959d80f1f60bd4790ed52cf@epcas5p1.samsung.com>
2024-10-30 18:01     ` [PATCH v6 08/10] nvme: add support for passing on the application tag Kanchan Joshi
     [not found]   ` <CGME20241030181021epcas5p1c61b7980358f3120014b4f99390d1595@epcas5p1.samsung.com>
2024-10-30 18:01     ` [PATCH v6 09/10] scsi: add support for user-meta interface Kanchan Joshi
2024-10-31  5:09       ` kernel test robot
2024-10-31  5:10       ` kernel test robot
     [not found]   ` <CGME20241030181024epcas5p3964697a08159f8593a6f94764f77a7f3@epcas5p3.samsung.com>
2024-10-30 18:01     ` [PATCH v6 10/10] block: add support to pass user meta buffer Kanchan Joshi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox