From: Pavel Begunkov <[email protected]>
To: Jens Axboe <[email protected]>, Vito Caputo <[email protected]>,
[email protected]
Subject: Re: relative openat dirfd reference on submit
Date: Tue, 3 Nov 2020 00:41:29 +0000
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>
On 03/11/2020 00:34, Jens Axboe wrote:
> On 11/2/20 5:17 PM, Pavel Begunkov wrote:
>> On 03/11/2020 00:05, Jens Axboe wrote:
>>> On 11/2/20 1:52 PM, Vito Caputo wrote:
>>>> Hello list,
>>>>
>>>> I've been tinkering a bit with some async continuation passing style
>>>> IO-oriented code employing liburing. This exposed a kind of awkward
>>>> behavior I suspect could be better from an ergonomics perspective.
>>>>
>>>> Imagine a bunch of OPENAT SQEs have been prepared, and they're all
>>>> relative to a common dirfd. Once io_uring_submit() has consumed all
>>>> these SQEs across the syscall boundary, logically it seems the dirfd
>>>> should be safe to close, since these dirfd-dependent operations have
>>>> all been submitted to the kernel.
>>>>
>>>> But when I attempted this, the subsequent OPENAT CQE results were all
>>>> -EBADF errors. It appeared the submit didn't add any references to
>>>> the dependent dirfd.
>>>>
>>>> To work around this, I resorted to stowing the dirfd and maintaining a
>>>> shared refcount in the closures associated with these SQEs and
>>>> executed on their CQEs. This effectively forced replicating the
>>>> batched relationship implicit in the shared parent dirfd, where I
>>>> otherwise had zero need to. Just so I could defer closing the dirfd
>>>> until once all these closures had run on their respective CQE arrivals
>>>> and the refcount for the batch had reached zero.
>>>>
>>>> It doesn't seem right. If I ensure sufficient queue depth and
>>>> explicitly flush all the dependent SQEs beforehand
>>>> w/io_uring_submit(), it seems like I should be able to immediately
>>>> close(dirfd) and have the close be automagically deferred until the
>>>> last dependent CQE removes its reference from the kernel side.
>>>
>>> We pass the 'dfd' straight on, and only the async part acts on it.
>>> Which is why it needs to be kept open. But I wonder if we can get
>>> around it by just pinning the fd for the duration. Since you didn't
>>> include a test case, can you try with this patch applied? Totally
>>> untested...
>>
>> afaik this doesn't pin an fd in a file table, so the app closes the
>> dfd right after submit and then do_filp_open() tries to look up the
>> closed dfd. Doesn't seem to work, and we need to pass that struct
>> file to do_filp_open().
>
> Yeah, I just double checked, and it's just referenced, but close() will
> still make it NULL in the file table. So won't work... We'll have to
> live with it for now, I'm afraid.
Is there a problem with passing in a struct file? Apart from it
being used deep in open callchains?
--
Pavel Begunkov
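
For readers following the thread, the userspace workaround Vito describes
(keep the dirfd open and drop one reference per CQE, closing it when the
count reaches zero) might look roughly like the sketch below. It is not
code from the thread: struct dirfd_batch, dirfd_batch_new() and
dirfd_batch_put() are hypothetical names invented for illustration, and
the ring handling is left out so the block stays self-contained.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for O_DIRECTORY under strict -std modes */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: one instance per batch of SQEs sharing a dirfd. */
struct dirfd_batch {
    int dirfd;          /* directory fd the OPENAT SQEs are relative to */
    unsigned refs;      /* completions still outstanding */
};

/*
 * Open `path` as a directory and expect `nr_sqes` completions against it.
 * `nr_sqes` must be greater than zero. Returns NULL on failure.
 */
static struct dirfd_batch *dirfd_batch_new(const char *path, unsigned nr_sqes)
{
    struct dirfd_batch *b = malloc(sizeof(*b));

    if (!b)
        return NULL;
    b->dirfd = open(path, O_RDONLY | O_DIRECTORY | O_CLOEXEC);
    if (b->dirfd < 0) {
        free(b);
        return NULL;
    }
    b->refs = nr_sqes;
    return b;
}

/*
 * Call once for every CQE belonging to the batch. The last caller closes
 * the dirfd and frees the batch. Returns 1 if this call closed the dirfd,
 * 0 otherwise.
 */
static int dirfd_batch_put(struct dirfd_batch *b)
{
    assert(b->refs > 0);
    if (--b->refs)
        return 0;
    close(b->dirfd);
    free(b);
    return 1;
}
```

With liburing, the batch pointer could travel with each request via
io_uring_sqe_set_data() and be recovered with io_uring_cqe_get_data()
before calling dirfd_batch_put(). This sketch is single-threaded; a
multi-threaded reaper would need an atomic counter.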
Thread overview: 8+ messages
2020-11-02 20:52 relative openat dirfd reference on submit Vito Caputo
2020-11-03 0:05 ` Jens Axboe
2020-11-03 0:17 ` Pavel Begunkov
2020-11-03 0:34 ` Jens Axboe
2020-11-03 0:41 ` Pavel Begunkov [this message]
2020-11-04 23:43 ` Jens Axboe
2020-11-05 8:45 ` Stefan Metzmacher
2020-11-05 14:09 ` Jens Axboe