public inbox for [email protected]
From: Kanchan Joshi <[email protected]>
To: Keith Busch <[email protected]>, Christoph Hellwig <[email protected]>
Cc: Anuj gupta <[email protected]>,
	Anuj Gupta <[email protected]>,
	[email protected], [email protected],
	[email protected], [email protected], [email protected],
	[email protected], [email protected],
	[email protected], [email protected],
	[email protected], [email protected],
	[email protected], [email protected]
Subject: Re: [PATCH v7 06/10] io_uring/rw: add support to send metadata along with read/write
Date: Tue, 5 Nov 2024 22:20:41 +0530
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>

On 11/5/2024 9:53 PM, Keith Busch wrote:
> On Tue, Nov 05, 2024 at 05:00:51PM +0100, Christoph Hellwig wrote:
>> On Tue, Nov 05, 2024 at 09:21:27PM +0530, Kanchan Joshi wrote:
>>> Can add the documentation (if this version is palatable for Jens/Pavel),
>>> but this was discussed in previous iteration:
>>>
>>> 1. Each meta type may have different space requirement in SQE.
>>>
>>> Only for PI, we need so much space that we can't fit that in first SQE.
>>> The SQE128 requirement is only for PI type.
>>> Another different meta type may just fit into the first SQE. For that we
>>> don't have to mandate SQE128.
>>
>> Ok, I'm really confused now.  The way I understood Anuj was that this
>> is NOT about block level metadata, but about other uses of the big SQE.
>>
>> Which version is right?  Or did I just completely misunderstand Anuj?
> 
> Let's not call this "meta_type". Can we use something that has a less
> overloaded meaning, like "sqe_extended_capabilities", or "ecap", or
> something like that.
>   

Right, something like that. We need to change it.
It seems a useful mechanism is not being recognized as such because of its name.

>>> 2. If two meta types are known not to co-exist, they can be kept in the
>>> same place within SQE. Since each meta-type is a flag, we can check what
>>> combinations are valid within io_uring and throw the error in case of
>>> incompatibility.
>>
>> And this sounds like what you refer to is not actually block metadata
>> as in this patchset or nvme, (or weirdly enough integrity in the block
>> layer code).
>>
>>> 3. The previous version relied on the SQE128 flag. If the user set up the
>>> ring that way, it was assumed that PI information was sent.
>>> This is conveyed more explicitly now: if the user passed the META_TYPE_PI
>>> flag, it has sent the PI. This comment in the code:
>>>
>>> +       /* if sqe->meta_type is META_TYPE_PI, last 32 bytes are for PI */
>>> +       union {
>>>
>>> If this flag is not passed, parsing of the second SQE is skipped. That
>>> matches the current behavior, since one can already send a regular
>>> (non-PI) read/write on an SQE128 ring.
>>
>> And while I don't understand how this threads in with the previous
>> statements, this makes sense.  If you only want to send a pointer (+len)
>> to metadata you can use the normal 64-byte SQE.  If you want to send
>> a PI tuple you need SQE128.  Is that what the various above statements
>> try to express?  If so the right API to me would be to have two flags:
>>
>>   - a flag that a pointer to metadata is passed.  This can work with
>>     a 64-byte SQE.
>>   - another flag that a PI tuple is passed.  This requires a 128-byte
>>     SQE and also the previous flag.
> 
> I don't think anything done so far aligns with what Pavel had in mind.
> Let me try to lay out what I think he's going for. Just bear with me,
> this is just a hypothetical example.

I have the same example in mind.


>    This patch adds a PI extension.
>    Later, let's say write streams needs another extension.
>    Then key per-IO wants another extension.
>    Then someone else adds wizbang-awesome-feature extension.
> 
> Let's say you have a device that can do all 4, or any combination of them.
> Pavel wants a solution that is future proof to such a scenario. So not
> just a single new "meta_type" with its structure, but a list of types in
> no particular order, and their structures.
> 
> That list can exist either in the extended SQE, or in some other user
> address that the kernel will need to copy.

That list is the meta_type bit-flags this series creates.

For some future meta_type there can be a "META_TYPE_XYZ_INDIRECT" flag,
which will mean that the extra information needs to be fetched via
copy_from_user.
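
To make the bit-flag idea concrete, here is a rough userspace sketch of the
kind of check described in points 2 and 3 above. It is not the code from this
series. Only META_TYPE_PI is a name used in the thread; the bit values,
META_TYPE_XYZ, the two helper functions and the ring_is_sqe128 parameter are
all assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit assignments; the real values are whatever the series defines. */
#define META_TYPE_PI           (1u << 0) /* PI tuple in the last 32 bytes of a 128-byte SQE */
#define META_TYPE_XYZ          (1u << 1) /* hypothetical type that fits in the first SQE */
#define META_TYPE_XYZ_INDIRECT (1u << 2) /* hypothetical: payload fetched via copy_from_user */

#define META_TYPE_KNOWN \
	(META_TYPE_PI | META_TYPE_XYZ | META_TYPE_XYZ_INDIRECT)

/* Each meta type is its own bit, so types can be combined freely. */
static_assert((META_TYPE_PI & META_TYPE_XYZ) == 0, "meta types must be distinct bits");

/*
 * Hypothetical validation helper: returns 0 if the combination of flags is
 * acceptable for this ring, -EINVAL otherwise.
 */
static int meta_type_validate(uint32_t meta_type, bool ring_is_sqe128)
{
	/* Reject bits this sketch does not know about. */
	if (meta_type & ~META_TYPE_KNOWN)
		return -EINVAL;

	/* Only PI needs the second half of the big SQE. */
	if ((meta_type & META_TYPE_PI) && !ring_is_sqe128)
		return -EINVAL;

	/*
	 * Assumed here: the inline and indirect forms of XYZ share the same
	 * place in the SQE, so they cannot co-exist.
	 */
	if ((meta_type & META_TYPE_XYZ) && (meta_type & META_TYPE_XYZ_INDIRECT))
		return -EINVAL;

	return 0;
}

/* The second SQE is parsed only when the PI flag is set. */
static bool meta_type_parses_second_sqe(uint32_t meta_type)
{
	return (meta_type & META_TYPE_PI) != 0;
}
```

With this shape, a regular read/write on an SQE128 ring passes meta_type == 0
and the second SQE is never looked at, while a type that fits in the first
SQE does not force SQE128 on the user.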
