From: "[email protected]" <[email protected]>
To: Jens Axboe <[email protected]>
Cc: Pavel Begunkov <[email protected]>,
Damien Le Moal <[email protected]>,
Kanchan Joshi <[email protected]>,
"[email protected]" <[email protected]>,
"[email protected]" <[email protected]>,
"[email protected]" <[email protected]>,
"[email protected]" <[email protected]>,
"[email protected]" <[email protected]>,
"[email protected]" <[email protected]>,
"[email protected]" <[email protected]>,
"[email protected]" <[email protected]>,
"[email protected]" <[email protected]>
Subject: Re: [PATCH 3/3] io_uring: add support for zone-append
Date: Sun, 21 Jun 2020 20:52:07 +0200
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>
On 19.06.2020 09:02, Jens Axboe wrote:
>On 6/19/20 8:59 AM, Pavel Begunkov wrote:
>> On 19/06/2020 17:15, Jens Axboe wrote:
>>> On 6/19/20 3:41 AM, [email protected] wrote:
>>>> Jens,
>>>>
>>>> Would you have time to answer a question below in this thread?
>>>>
>>>> On 18.06.2020 11:11, [email protected] wrote:
>>>>> On 18.06.2020 08:47, Damien Le Moal wrote:
>>>>>> On 2020/06/18 17:35, [email protected] wrote:
>>>>>>> On 18.06.2020 07:39, Damien Le Moal wrote:
>>>>>>>> On 2020/06/18 2:27, Kanchan Joshi wrote:
>>>>>>>>> From: Selvakumar S <[email protected]>
>>>>>>>>>
>>>>>>>>> Introduce three new opcodes for zone-append -
>>>>>>>>>
>>>>>>>>> IORING_OP_ZONE_APPEND : non-vectored, similar to IORING_OP_WRITE
>>>>>>>>> IORING_OP_ZONE_APPENDV : vectored, similar to IORING_OP_WRITEV
>>>>>>>>> IORING_OP_ZONE_APPEND_FIXED : append using fixed-buffers
>>>>>>>>>
>>>>>>>>> Repurpose cqe->flags to return zone-relative offset.
>>>>>>>>>
>>>>>>>>> Signed-off-by: SelvaKumar S <[email protected]>
>>>>>>>>> Signed-off-by: Kanchan Joshi <[email protected]>
>>>>>>>>> Signed-off-by: Nitesh Shetty <[email protected]>
>>>>>>>>> Signed-off-by: Javier Gonzalez <[email protected]>
>>>>>>>>> ---
>>>>>>>>> fs/io_uring.c | 72 +++++++++++++++++++++++++++++++++++++++++--
>>>>>>>>> include/uapi/linux/io_uring.h | 8 ++++-
>>>>>>>>> 2 files changed, 77 insertions(+), 3 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>>>>>>>>> index 155f3d8..c14c873 100644
>>>>>>>>> --- a/fs/io_uring.c
>>>>>>>>> +++ b/fs/io_uring.c
>>>>>>>>> @@ -649,6 +649,10 @@ struct io_kiocb {
>>>>>>>>> unsigned long fsize;
>>>>>>>>> u64 user_data;
>>>>>>>>> u32 result;
>>>>>>>>> +#ifdef CONFIG_BLK_DEV_ZONED
>>>>>>>>> + /* zone-relative offset for append, in bytes */
>>>>>>>>> + u32 append_offset;
>>>>>>>>
>>>>>>>> this can overflow. u64 is needed.
>>>>>>>
>>>>>>> We chose to do it this way to start with because struct io_uring_cqe
>>>>>>> only has space for u32 when we reuse the flags.
>>>>>>>
>>>>>>> We can of course create a new cqe structure, but that will come with
>>>>>>> larger changes to io_uring for supporting append.
>>>>>>>
>>>>>>> Do you believe this is a better approach?
>>>>>>
>>>>>> The problem is that zone sizes are 32 bits in the kernel, as a number
>>>>>> of sectors. So any device that has a zone size smaller than or equal
>>>>>> to 2^31 512B sectors can be accepted. Using a zone-relative offset in
>>>>>> bytes for returning the zone append result is OK-ish, but to match the
>>>>>> kernel-supported range of possible zone sizes, you need 31+9 bits...
>>>>>> 32 does not cut it.
>>>>>
>>>>> Agree. Our initial assumption was that u32 would cover current zone size
>>>>> requirements, but if this is a no-go, we will take the longer path.
>>>>
>>>> Converting to u64 will require a new version of io_uring_cqe, where we
>>>> extend at least 32 bits. I believe this will need a whole new allocation
>>>> and probably ioctl().
>>>>
>>>> Is this an acceptable change for you? We will of course add support for
>>>> liburing when we agree on the right way to do this.
>>>
>>> If you need 64 bits of return value, then it's not going to work. Even
>>> with the existing patches, reusing cqe->flags isn't going to fly, as
>>> it would conflict with eg doing zone append writes with automatic
>>> buffer selection.
>>
>> Buffer selection is for reads/recv kind of requests, but appends
>> are writes. In theory they can co-exist using cqe->flags.
>
>Yeah good point, since it's just writes, doesn't matter. But the other
>point still stands, it could potentially conflict with other flags, but
>I guess only to the extent where both flags would need extra storage in
>->flags. So not a huge concern imho.
Very good point Pavel!
If co-existing with the current flags is an option, I'll explore this
for the next version.
Thanks Jens and Pavel for the time and ideas!
Javier