From: Pavel Begunkov <[email protected]>
To: "Darrick J. Wong" <[email protected]>,
Matthew Wilcox <[email protected]>
Cc: Christoph Hellwig <[email protected]>, Anuj Gupta <[email protected]>,
[email protected], [email protected], [email protected],
[email protected], [email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
Kanchan Joshi <[email protected]>
Subject: Re: [PATCH v9 06/11] io_uring: introduce attributes for read/write and PI support
Date: Thu, 21 Nov 2024 13:45:58 +0000 [thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <20241120173517.GQ9425@frogsfrogsfrogs>
On 11/20/24 17:35, Darrick J. Wong wrote:
> On Fri, Nov 15, 2024 at 06:04:01PM +0000, Matthew Wilcox wrote:
>> On Thu, Nov 14, 2024 at 01:09:44PM +0000, Pavel Begunkov wrote:
>>> With SQE128 it's also a problem that now all SQEs are 128 bytes regardless
>>> of whether a particular request needs it or not, and the user will need
>>> to zero them for each request.
>>
>> The way we handled this in NVMe was to use a bit in the command that
>> was called (iirc) FUSED, which let you use two consecutive entries for
>> a single command.
>>
>> Some variant on that could surely be used for io_uring. Perhaps a
>> special opcode that says "the real opcode is here, and this is a two-slot
>> command". Processing gets a little spicy when one slot is the last in
>> the buffer and the next is the the first in the buffer, but that's a SMOP.
>
> I like willy's suggestion -- what's the difficulty in having a SQE flag
> that says "...and keep going into the next SQE"? I guess that
> introduces the problem that you can no longer react to the observation
> of 4 new SQEs by creating 4 new contexts to process those SQEs and throw
> all 4 of them at background threads, since you don't know how many IOs
> are there.
Some variation on "variable size SQE" was discussed back in the day
as an option instead of SQE128. I don't remember why it was refused
exactly, but I'd think it was exactly the "spicy" moment Matthew
mentioned, especially since nvme passthrough was spanning its payload
across both parts of the SQE.
I'm pretty sure I can find more than a couple of downsides, like for
it to be truly generic you need a flag in each SQE and finding a bit
is not that easy, and also in terms of some overhead to everyone else
while this extension is not even needed. By the end of the day, the
main concern is how it's placed and not where specifically,
SQE / user memory / etc.
--
Pavel Begunkov
next prev parent reply other threads:[~2024-11-21 13:45 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <CGME20241114105326epcas5p103b2c996293fa680092b97c747fdbd59@epcas5p1.samsung.com>
2024-11-14 10:45 ` [PATCH v9 00/11] Read/Write with meta/integrity Anuj Gupta
[not found] ` <CGME20241114105352epcas5p109c1742fa8a6552296b9c104f2271308@epcas5p1.samsung.com>
2024-11-14 10:45 ` [PATCH v9 01/11] block: define set of integrity flags to be inherited by cloned bip Anuj Gupta
[not found] ` <CGME20241114105354epcas5p49a73947c3d37be4189023f66fb7ba413@epcas5p4.samsung.com>
2024-11-14 10:45 ` [PATCH v9 02/11] block: copy back bounce buffer to user-space correctly in case of split Anuj Gupta
[not found] ` <CGME20241114105357epcas5p41fd14282d4abfe564e858b37babe708a@epcas5p4.samsung.com>
2024-11-14 10:45 ` [PATCH v9 03/11] block: modify bio_integrity_map_user to accept iov_iter as argument Anuj Gupta
[not found] ` <CGME20241114105400epcas5p270b8062a0c4f26833a5b497f057d65a7@epcas5p2.samsung.com>
2024-11-14 10:45 ` [PATCH v9 04/11] fs, iov_iter: define meta io descriptor Anuj Gupta
[not found] ` <CGME20241114105402epcas5p41b1f6054a557f1bda2cfddfdfb9a9477@epcas5p4.samsung.com>
2024-11-14 10:45 ` [PATCH v9 05/11] fs: introduce IOCB_HAS_METADATA for metadata Anuj Gupta
[not found] ` <CGME20241114105405epcas5p24ca2fb9017276ff8a50ef447638fd739@epcas5p2.samsung.com>
2024-11-14 10:45 ` [PATCH v9 06/11] io_uring: introduce attributes for read/write and PI support Anuj Gupta
2024-11-14 12:16 ` Christoph Hellwig
2024-11-14 13:09 ` Pavel Begunkov
2024-11-14 15:19 ` Christoph Hellwig
2024-11-15 16:40 ` Pavel Begunkov
2024-11-15 17:12 ` Christoph Hellwig
2024-11-15 17:44 ` Jens Axboe
2024-11-15 18:00 ` Christoph Hellwig
2024-11-15 19:03 ` Pavel Begunkov
2024-11-18 12:49 ` Christoph Hellwig
2024-11-15 18:04 ` Matthew Wilcox
2024-11-20 17:35 ` Darrick J. Wong
2024-11-21 6:54 ` Christoph Hellwig
2024-11-21 13:45 ` Pavel Begunkov [this message]
2024-11-15 13:29 ` Anuj gupta
2024-11-16 0:00 ` Pavel Begunkov
2024-11-16 0:32 ` Pavel Begunkov
2024-11-18 12:50 ` Christoph Hellwig
2024-11-18 16:59 ` Pavel Begunkov
2024-11-18 17:03 ` Christoph Hellwig
2024-11-18 17:45 ` Pavel Begunkov
2024-11-19 12:49 ` Christoph Hellwig
2024-11-21 13:29 ` Pavel Begunkov
2024-11-21 8:59 ` Anuj Gupta
2024-11-21 15:45 ` Pavel Begunkov
2024-11-16 23:09 ` kernel test robot
[not found] ` <CGME20241114105408epcas5p3c77cda2faf7ccb37abbfd8e95b4ad1f5@epcas5p3.samsung.com>
2024-11-14 10:45 ` [PATCH v9 07/11] io_uring: inline read/write attributes and PI Anuj Gupta
[not found] ` <CGME20241114105410epcas5p1c6a4e6141b073ccfd6277288f7d5e28b@epcas5p1.samsung.com>
2024-11-14 10:45 ` [PATCH v9 08/11] block: introduce BIP_CHECK_GUARD/REFTAG/APPTAG bip_flags Anuj Gupta
[not found] ` <CGME20241114105413epcas5p2d7da8675df2de0d1efba3057144e691d@epcas5p2.samsung.com>
2024-11-14 10:45 ` [PATCH v9 09/11] nvme: add support for passing on the application tag Anuj Gupta
[not found] ` <CGME20241114105416epcas5p3a7aa552775cfe50f60ef89f7d982ea12@epcas5p3.samsung.com>
2024-11-14 10:45 ` [PATCH v9 10/11] scsi: add support for user-meta interface Anuj Gupta
[not found] ` <CGME20241114105418epcas5p1537d72b9016d10670cf97751704e2cc8@epcas5p1.samsung.com>
2024-11-14 10:45 ` [PATCH v9 11/11] block: add support to pass user meta buffer Anuj Gupta
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox