From: Jens Axboe <[email protected]>
To: Pavel Begunkov <[email protected]>,
io-uring <[email protected]>
Cc: Jann Horn <[email protected]>
Subject: Re: [PATCH for-next] io_uring: ensure IOSQE_ASYNC file table grabbing works, with SQPOLL
Date: Thu, 10 Sep 2020 17:04:55 -0600 [thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
On 9/10/20 4:11 PM, Jens Axboe wrote:
> On 9/10/20 3:01 PM, Jens Axboe wrote:
>> On 9/10/20 12:18 PM, Jens Axboe wrote:
>>> On 9/10/20 7:11 AM, Jens Axboe wrote:
>>>> On 9/10/20 6:37 AM, Pavel Begunkov wrote:
>>>>> On 09/09/2020 19:07, Jens Axboe wrote:
>>>>>> On 9/9/20 9:48 AM, Pavel Begunkov wrote:
>>>>>>> On 09/09/2020 16:10, Jens Axboe wrote:
>>>>>>>> On 9/9/20 1:09 AM, Pavel Begunkov wrote:
>>>>>>>>> On 09/09/2020 01:54, Jens Axboe wrote:
>>>>>>>>>> On 9/8/20 3:22 PM, Jens Axboe wrote:
>>>>>>>>>>> On 9/8/20 2:58 PM, Pavel Begunkov wrote:
>>>>>>>>>>>> On 08/09/2020 20:48, Jens Axboe wrote:
>>>>>>>>>>>>> Fd instantiating commands like IORING_OP_ACCEPT now work with SQPOLL, but
>>>>>>>>>>>>> we have an error in grabbing that if IOSQE_ASYNC is set. Ensure we assign
>>>>>>>>>>>>> the ring fd/file appropriately so we can defer grab them.
>>>>>>>>>>>>
>>>>>>>>>>>> IIRC, for fcheck() in io_grab_files() to work it should be under fdget(),
>>>>>>>>>>>> that isn't the case with SQPOLL threads. Am I mistaken?
>>>>>>>>>>>>
>>>>>>>>>>>> And it looks strange that the following snippet will effectively disable
>>>>>>>>>>>> such requests.
>>>>>>>>>>>>
>>>>>>>>>>>> fd = dup(ring_fd)
>>>>>>>>>>>> close(ring_fd)
>>>>>>>>>>>> ring_fd = fd
>>>>>>>>>>>
>>>>>>>>>>> Not disagreeing with that, I think my initial posting made it clear
>>>>>>>>>>> it was a hack. Just piled it in there for easier testing in terms
>>>>>>>>>>> of functionality.
>>>>>>>>>>>
>>>>>>>>>>> But the next question is how to do this right...>
>>>>>>>>>> Looking at this a bit more, and I don't necessarily think there's a
>>>>>>>>>> better option. If you dup+close, then it just won't work. We have no
>>>>>>>>>> way of knowing if the 'fd' changed, but we can detect if it was closed
>>>>>>>>>> and then we'll end up just EBADF'ing the requests.
>>>>>>>>>>
>>>>>>>>>> So right now the answer is that we can support this just fine with
>>>>>>>>>> SQPOLL, but you better not dup and close the original fd. Which is not
>>>>>>>>>> ideal, but better than NOT being able to support it.
>>>>>>>>>>
>>>>>>>>>> Only other option I see is to to provide an io_uring_register()
>>>>>>>>>> command to update the fd/file associated with it. Which may be useful,
>>>>>>>>>> it allows a process to indeed to this, if it absolutely has to.
>>>>>>>>>
>>>>>>>>> Let's put aside such dirty hacks, at least until someone actually
>>>>>>>>> needs it. Ideally, for many reasons I'd prefer to get rid of
>>>>>>>>
>>>>>>>> BUt it is actually needed, otherwise we're even more in a limbo state of
>>>>>>>> "SQPOLL works for most things now, just not all". And this isn't that
>>>>>>>> hard to make right - on the flush() side, we just need to park/stall the
>>>>>>>
>>>>>>> I understand that it isn't hard, but I just don't want to expose it to
>>>>>>> the userspace, a) because it's a userspace API, so couldn't probably be
>>>>>>> killed in the future, b) works around kernel's problems, and so
>>>>>>> shouldn't really be exposed to the userspace in normal circumstances.
>>>>>>>
>>>>>>> And it's not generic enough because of a possible "many fds -> single
>>>>>>> file" mapping, and there will be a lot of questions and problems.
>>>>>>>
>>>>>>> e.g. if a process shares a io_uring with another process, then
>>>>>>> dup()+close() would require not only this hook but also additional
>>>>>>> inter-process synchronisation. And so on.
>>>>>>
>>>>>> I think you're blowing this out of proportion. Just to restate the
>>>>>
>>>>> I just think that if there is a potentially cleaner solution without
>>>>> involving userspace, we should try to look for it first, even if it
>>>>> would take more time. That was the point.
>>>>
>>>> Regardless of whether or not we can eliminate that need, at least it'll
>>>> be a relaxing of the restriction, not an increase of it. It'll never
>>>> hurt to do an extra system call for the case where you're swapping fds.
>>>> I do get your point, I just don't think it's a big deal.
>>>
>>> BTW, I don't see how we can ever get rid of a need to enter the kernel,
>>> we'd need some chance at grabbing the updated ->files, for instance.
>>> Might be possible to hold a reference to the task and grab it from
>>> there, though feels a bit iffy to hold a task reference from the ring on
>>> the task that holds a reference to the ring. Haven't looked too close,
>>> should work though as this won't hold a file/files reference, it's just
>>> a freeing reference.
>>
>> Sort of half assed attempt...
>>
>> Idea is to assign a ->files sequence before we grab files, and then
>> compare with the current one once we need to use the files. If they
>> mismatch, we -ECANCELED the request.
>>
>> For SQPOLL, don't grab ->files upfront, grab a reference to the task
>> instead. Use the task reference to assign files when we need it.
>>
>> Adding Jann to help poke holes in this scheme. I'd be surprised if it's
>> solid as-is, but hopefully we can build on this idea and get rid of the
>> fcheck().
>
> Split it into two, to make it easier to reason about. Added a few
> comments, etc.
Pushed to a temp branch, made a few more edits (forgot to wire up open).
https://git.kernel.dk/cgit/linux-block/log/?h=io_uring-files_struct
--
Jens Axboe
next prev parent reply other threads:[~2020-09-10 23:05 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-09-08 17:48 [PATCH for-next] io_uring: ensure IOSQE_ASYNC file table grabbing works, with SQPOLL Jens Axboe
2020-09-08 20:58 ` Pavel Begunkov
2020-09-08 21:22 ` Jens Axboe
2020-09-08 22:54 ` Jens Axboe
2020-09-09 0:48 ` Josef
2020-09-09 7:09 ` Pavel Begunkov
2020-09-09 13:10 ` Jens Axboe
2020-09-09 13:53 ` Jens Axboe
2020-09-09 15:48 ` Pavel Begunkov
2020-09-09 16:07 ` Jens Axboe
2020-09-10 12:37 ` Pavel Begunkov
2020-09-10 13:11 ` Jens Axboe
2020-09-10 18:18 ` Jens Axboe
2020-09-10 21:01 ` Jens Axboe
2020-09-10 22:11 ` Jens Axboe
2020-09-10 23:04 ` Jens Axboe [this message]
2020-09-11 19:23 ` Pavel Begunkov
2020-09-11 20:06 ` Jens Axboe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox