From: Jens Axboe <[email protected]>
To: Hao Xu <[email protected]>, [email protected]
Cc: [email protected]
Subject: Re: [PATCHSET 0/4] Allow allocated direct descriptors
Date: Tue, 10 May 2022 06:26:48 -0600 [thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
On 5/9/22 11:35 PM, Hao Xu wrote:
> ? 2022/5/9 ??10:49, Jens Axboe ??:
>> On 5/9/22 7:20 AM, Hao Xu wrote:
>>> ? 2022/5/9 ??7:49, Jens Axboe ??:
>>>> Hi,
>>>>
>>>> Currently using direct descriptors with open or accept requires the
>>>> application to manage the descriptor space, picking which slot to use
>>>> for any given file. However, there are cases where it's useful to just
>>>> get a direct descriptor and not care about which value it is, instead
>>>> just return it like a normal open or accept would.
>>>>
>>>> This will also be useful for multishot accept support, where allocated
>>>> direct descriptors are a requirement to make that feature work with
>>>> these kinds of files.
>>>>
>>>> This adds support for allocating a new fixed descriptor. This is chosen
>>>> by passing in UINT_MAX as the fixed slot, which otherwise has a limit
>>>> of INT_MAX like any file descriptor does.
>>>>
>>>> fs/io_uring.c | 100 +++++++++++++++++++++++++++++++++++++++++++++++---
>>>> 1 file changed, 94 insertions(+), 6 deletions(-)
>>>>
>>> Hi Jens,
>>> I've read this idea of leveraging bitmap, it looks great. a small flaw
>>> of it is that when the file_table is very long, the bitmap searching
>>> seems to be O({length of table}/BITS_PER_LONG), to make the time
>>> complexity stable, I did a linked list version, could you have a look
>>> when you're avalible. totally untested, just to show my idea. Basically
>>> I use a list to link all the free slots, when we need a slot, just get
>>> the head of it.
>>> https://github.com/HowHsu/linux/commits/for-5.19/io_uring_multishot_accept_v5
>>>
>>> (borrowed some commit message from your patches)
>>
>> While that's certainly true, I'm skeptical that the list management will
>> be faster for most cases. It's worth nothing that the regular file
>> allocator is very much the same thing. A full scan is unlikely unless
>> you already got -ENFILE. Any clear in between will reset the hint and
>> it'll be O(1) again. So yes, the pathological case of having no
>
> it's not O(1) actually, and a full bitmap is not the only worst case.
> For instance, the bitmap is like:
> hint
> |
> 1111111111111111111111111110000
>
> then a bit is cleared and hint is updated:
> hint
> |
> 1110111111111111111111111110000
>
> then next time the complexity is high
Next time it's fine, since the hint is that bit. If you do do, then yes
the second would be a slower.
> So in this kind of scenario(first allocate many in order, then clear
> low bit and allocation goes on in turn), it would be slow. And I think
> these cases are not rare since people usually allocate many fds then
> free the early used fds from time to time.
It's by no means perfect, but if it's good enough for the normal file
allocator, then I don't think it'd be wise to over-engineer this one
until there's a proven need to do so.
The single list items tracking free items is most certainly a LOT slower
for the common cases, so I don't think that's a good approach at all.
My suggestion would be to stick with the proposed approach until there's
evidence that the allocator needs improving. I did write a benchmark
that uses a 500K map and does opens and closes, and I don't see anything
to worry about in terms of overhead. The bitmap handling doesn't even
really register, dwarfed by the rest of the open path.
>> If the case of finding a new descriptor is slow for a mostly full space,
>> in the past I've done something like axmap [1] in fio, where you each
>> 64-bit entry is representing by a single bit a layer up. That still has
>> very good space utilization and good cache layout, which the list very
>> much does not. But given the above, I don't think we need to worry about
>> that really.
>>
>> As a side note, I do think we need to just bump the size of the max
>> direct descriptors we can have. With the file table potentially being
>> vmalloc backed, there's no reason to limit it to the current 32K.
>
> Agree.
>
>>
>> [1] https://git.kernel.dk/cgit/fio/tree/lib/axmap.c
>>
> Cool, I'll have a look.
It can get boiled down to something a bit simpler as the fio
implementation supports a variety of different use cases. For example, I
think it should be implemented as a single indexed array that holds all
the levels, rather than separate is it's done there. In short, it just
condenses everything down to one qword eventually, and finding a free
bit is always O(log64(N)).
--
Jens Axboe
next prev parent reply other threads:[~2022-05-10 12:28 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-05-08 23:49 [PATCHSET 0/4] Allow allocated direct descriptors Jens Axboe
2022-05-08 23:49 ` [PATCH 1/4] io_uring: track fixed files with a bitmap Jens Axboe
2022-05-09 13:26 ` Hao Xu
2022-05-09 14:55 ` Jens Axboe
2022-05-08 23:49 ` [PATCH 2/4] io_uring: add basic fixed file allocator Jens Axboe
2022-05-08 23:49 ` [PATCH 3/4] io_uring: allow allocated fixed files for openat/openat2 Jens Axboe
2022-05-08 23:49 ` [PATCH 4/4] io_uring: allow allocated fixed files for accept Jens Axboe
2022-05-09 13:20 ` [PATCHSET 0/4] Allow allocated direct descriptors Hao Xu
2022-05-09 14:49 ` Jens Axboe
2022-05-10 5:35 ` Hao Xu
2022-05-10 12:26 ` Jens Axboe [this message]
2022-06-10 13:41 ` Victor Stewart
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox