From: Pavel Begunkov <[email protected]>
To: Jens Axboe <[email protected]>, Josh Triplett <[email protected]>
Cc: [email protected],
"David S . Miller" <[email protected]>,
Jakub Kicinski <[email protected]>,
[email protected], [email protected],
Stefan Metzmacher <[email protected]>
Subject: Re: [PATCH v3 0/4] open/accept directly into io_uring fixed file table
Date: Tue, 24 Aug 2021 10:48:01 +0100
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>
On 8/23/21 8:40 PM, Jens Axboe wrote:
> On 8/23/21 1:13 PM, Josh Triplett wrote:
>> On Sat, Aug 21, 2021 at 08:18:12PM -0600, Jens Axboe wrote:
>>> On 8/21/21 9:52 AM, Pavel Begunkov wrote:
>>>> Add an optional feature to open/accept directly into io_uring's fixed
>>>> file table, bypassing the normal file table. Same behaviour as the
>>>> snippet below, but in one operation:
>>>>
>>>> sqe = prep_[open,accept](...);
>>>> cqe = submit_and_wait(sqe);
>>>> io_uring_register_files_update(uring_idx, (fd = cqe->res));
>>>> close((fd = cqe->res));
>>>>
>>>> The idea is pretty old, and was brought up and implemented a year ago
>>>> by Josh Triplett, though it hasn't seen the light of day for some reason.
>>>>
>>>> The behaviour is controlled by setting sqe->file_index, where 0 implies
>>>> the old behaviour. If non-zero value is specified, then it will behave
>>>> as described and place the file into a fixed file slot
>>>> sqe->file_index - 1. A file table should be already created, the slot
>>>> should be valid and empty, otherwise the operation will fail.
>>>>
>>>> we can't use IOSQE_FIXED_FILE to switch between modes, because accept
>>>> takes a file, and it already uses the flag with a different meaning.
>>>>
>>>> since RFC:
>>>> - added attribution
>>>> - updated descriptions
>>>> - rebased
>>>>
>>>> since v1:
>>>> - EBADF if slot is already used (Josh Triplett)
>>>> - alias index with splice_fd_in (Josh Triplett)
>>>> - fix a bound check bug
>>>
>>> With the prep series, this looks good to me now. Josh, what do you
>>> think?
>>
>> I would still like to see this using a union with the `nofile` field in
>> io_open and io_accept, rather than overloading the 16-bit buf_index
>> field. That would avoid truncating to 16 bits, and make less work for
>> expansion to more than 16 bits of fixed file indexes.
>>
>> (I'd also like that to actually use a union, rather than overloading the
>> meaning of buf_index/nofile.)
>
> Agree, and in fact there's room in the open and accept command parts, so
> we can just make it a separate entry there instead of using ->buf_index.
> Then just pass in the index to io_install_fixed_file() instead of having
> it pull it from req->buf_index.

Those are internal details and can be expanded at will in the future if
we ever need larger tables. ->buf_index already holds indexes to
different resources just fine.

Aliasing with nofile would be rather ugly, so the only option, as you
mentioned, is to grab some space from the open/accept structs, but I
don't see why we'd want that when there is a more convenient alternative.

>> I personally still feel that using non-zero to signify index-plus-one is
>> both error-prone and not as future-compatible. I think we could do
>> better with no additional overhead. But I think the final call on that
>> interface is up to you, Jens. Do you think it'd be worth spending a flag
>> bit or using a different opcode, to get a cleaner interface? If you
>> don't, then I'd be fine with seeing this go in with just the io_open and
>> io_accept change.
>
> I'd be inclined to go the extra opcode route instead, as the flag only
> really would make sense to requests that instantiate file descriptors.
> For this particular case, we'd need 3 new opcodes for
> openat/openat2/accept, which is probably a worthwhile expenditure.
>
> Pavel, what do you think? Switch to using a different opcode for the new
> requests, and just grab some space in io_open and io_accept for the fd
> and pass it in to install.

I don't get it: why is it even called hackish? How is that any better?
To me the feature looks like a natural extension to the operations, just
as a read can be tuned with flags, so creating new opcodes seems a bit
ugly, unnecessarily taking space from the opcode range and adding
duplication (even if both versions call the same handler).

First, why is it not future-compatible? That's a serious argument, but I
don't see where it comes from. Am I missing something?

It's u32 now, and so will easily cover all indexes. SQE fields should
always be zeroed; that's a rule, liburing follows it, and users not
honoring it would already have run into lots of problems. And there
will be a helper hiding all the index conversions for convenience.

	void io_uring_prep_open_direct(sqe, index, ...)
	{
		io_uring_prep_open(sqe, ...);
		sqe->file_index = index + 1;
	}
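
To make the index conversion concrete, here is a minimal self-contained
sketch of the encoding, assuming only what is described above: a zeroed
file_index means the old behaviour, and a non-zero value N selects fixed
slot N - 1. The mock_sqe struct and the mock_* functions are hypothetical
stand-ins for illustration, not the real struct io_uring_sqe or liburing
API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in holding only the field discussed here;
 * the real struct io_uring_sqe has many more fields. */
struct mock_sqe {
	uint32_t file_index;	/* 0 = regular fd, N = fixed slot N - 1 */
};

/* Plain prep: zero the whole SQE, following the "SQE fields should
 * always be zeroed" rule, so file_index == 0 (old behaviour). */
static void mock_prep_open(struct mock_sqe *sqe)
{
	memset(sqe, 0, sizeof(*sqe));
}

/* Direct prep: hide the +1 conversion from the caller. */
static void mock_prep_open_direct(struct mock_sqe *sqe, uint32_t slot)
{
	assert(slot != UINT32_MAX);	/* slot + 1 would wrap around to 0 */
	mock_prep_open(sqe);
	sqe->file_index = slot + 1;
}

/* Decode side: returns true and stores the zero-based slot if the
 * request asked for a fixed file slot, false for the old behaviour. */
static bool mock_wants_fixed_slot(const struct mock_sqe *sqe, uint32_t *slot)
{
	if (sqe->file_index == 0)
		return false;
	*slot = sqe->file_index - 1;
	return true;
}
```

With this shape the caller only ever deals with zero-based slot numbers,
and an SQE prepared by the plain helper keeps the old behaviour without
any extra flag.
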
--
Pavel Begunkov