public inbox for [email protected]
 help / color / mirror / Atom feed
From: Jens Axboe <[email protected]>
To: Hao Xu <[email protected]>, [email protected]
Cc: [email protected]
Subject: Re: [PATCHSET 0/4] Allow allocated direct descriptors
Date: Tue, 10 May 2022 06:26:48 -0600	[thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>

On 5/9/22 11:35 PM, Hao Xu wrote:
> ? 2022/5/9 ??10:49, Jens Axboe ??:
>> On 5/9/22 7:20 AM, Hao Xu wrote:
>>> ? 2022/5/9 ??7:49, Jens Axboe ??:
>>>> Hi,
>>>>
>>>> Currently using direct descriptors with open or accept requires the
>>>> application to manage the descriptor space, picking which slot to use
>>>> for any given file. However, there are cases where it's useful to just
>>>> get a direct descriptor and not care about which value it is, instead
>>>> just return it like a normal open or accept would.
>>>>
>>>> This will also be useful for multishot accept support, where allocated
>>>> direct descriptors are a requirement to make that feature work with
>>>> these kinds of files.
>>>>
>>>> This adds support for allocating a new fixed descriptor. This is chosen
>>>> by passing in UINT_MAX as the fixed slot, which otherwise has a limit
>>>> of INT_MAX like any file descriptor does.
>>>>
>>>>    fs/io_uring.c | 100 +++++++++++++++++++++++++++++++++++++++++++++++---
>>>>     1 file changed, 94 insertions(+), 6 deletions(-)
>>>>
>>> Hi Jens,
>>> I've read this idea of leveraging bitmap, it looks great. a small flaw
>>> of it is that when the file_table is very long, the bitmap searching
>>> seems to be O({length of table}/BITS_PER_LONG), to make the time
>>> complexity stable, I did a linked list version, could you have a look
>>> when you're avalible. totally untested, just to show my idea. Basically
>>> I use a list to link all the free slots, when we need a slot, just get
>>> the head of it.
>>> https://github.com/HowHsu/linux/commits/for-5.19/io_uring_multishot_accept_v5
>>>
>>> (borrowed some commit message from your patches)
>>
>> While that's certainly true, I'm skeptical that the list management will
>> be faster for most cases. It's worth nothing that the regular file
>> allocator is very much the same thing. A full scan is unlikely unless
>> you already got -ENFILE. Any clear in between will reset the hint and
>> it'll be O(1) again. So yes, the pathological case of having no
> 
> it's not O(1) actually, and a full bitmap is not the only worst case.
> For instance, the bitmap is like:
>                              hint
>                               |
>    1111111111111111111111111110000
> 
> then a bit is cleared and hint is updated:
>      hint
>       |
>    1110111111111111111111111110000
> 
> then next time the complexity is high

Next time it's fine, since the hint is that bit. If you do do, then yes
the second would be a slower.

> So in this kind of scenario(first allocate many in order, then clear
> low bit and allocation goes on in turn), it would be slow. And I think
> these cases are not rare since people usually allocate many fds then
> free the early used fds from time to time.

It's by no means perfect, but if it's good enough for the normal file
allocator, then I don't think it'd be wise to over-engineer this one
until there's a proven need to do so.

The single list items tracking free items is most certainly a LOT slower
for the common cases, so I don't think that's a good approach at all.

My suggestion would be to stick with the proposed approach until there's
evidence that the allocator needs improving. I did write a benchmark
that uses a 500K map and does opens and closes, and I don't see anything
to worry about in terms of overhead. The bitmap handling doesn't even
really register, dwarfed by the rest of the open path.

>> If the case of finding a new descriptor is slow for a mostly full space,
>> in the past I've done something like axmap [1] in fio, where you each
>> 64-bit entry is representing by a single bit a layer up. That still has
>> very good space utilization and good cache layout, which the list very
>> much does not. But given the above, I don't think we need to worry about
>> that really.
>>
>> As a side note, I do think we need to just bump the size of the max
>> direct descriptors we can have. With the file table potentially being
>> vmalloc backed, there's no reason to limit it to the current 32K.
> 
> Agree.
> 
>>
>> [1] https://git.kernel.dk/cgit/fio/tree/lib/axmap.c
>>
> Cool, I'll have a look.

It can get boiled down to something a bit simpler as the fio
implementation supports a variety of different use cases. For example, I
think it should be implemented as a single indexed array that holds all
the levels, rather than separate is it's done there. In short, it just
condenses everything down to one qword eventually, and finding a free
bit is always O(log64(N)).

-- 
Jens Axboe


  reply	other threads:[~2022-05-10 12:28 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-08 23:49 [PATCHSET 0/4] Allow allocated direct descriptors Jens Axboe
2022-05-08 23:49 ` [PATCH 1/4] io_uring: track fixed files with a bitmap Jens Axboe
2022-05-09 13:26   ` Hao Xu
2022-05-09 14:55     ` Jens Axboe
2022-05-08 23:49 ` [PATCH 2/4] io_uring: add basic fixed file allocator Jens Axboe
2022-05-08 23:49 ` [PATCH 3/4] io_uring: allow allocated fixed files for openat/openat2 Jens Axboe
2022-05-08 23:49 ` [PATCH 4/4] io_uring: allow allocated fixed files for accept Jens Axboe
2022-05-09 13:20 ` [PATCHSET 0/4] Allow allocated direct descriptors Hao Xu
2022-05-09 14:49   ` Jens Axboe
2022-05-10  5:35     ` Hao Xu
2022-05-10 12:26       ` Jens Axboe [this message]
2022-06-10 13:41         ` Victor Stewart

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox